RocketRAID 2680 and FreeBSD, A tragic story

davidgurvich said:
Have you tried initializing the drives prior to attaching them to the 2680? You might be able to access the bare drives with JBOD after that.

Not sure what initializing would involve, but I have tried making a single-disk JBOD. If you look at one of the earlier posts in this thread, I mention it. Also, the problem as I understand it is that the driver for the controller card isn't working, and as a result the system doesn't find the card. And from the system's point of view, if there is no card, there can be no disks connected to it. If you understand what I mean... :\
 
Bobbla said:
So, basically you're saying give up? That sounds defeatist..
At some point you have to evaluate what your time is worth and whether you can get a working system for less money than continuing to fight with this card.
Also, more off topic (maybe even another topic).. but does this mean an HDD is about to die?
Code:
ad8: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=343212162
ad8: FAILURE - READ_DMA48 status=51<READY,DSC,ERROR> error=40<UNCORRECTABLE> LBA=343212162
If the error is always on a particular block, then you likely have a media error. A range of different blocks could indicate multiple media errors or a controller that has become confused.
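For what it's worth, the drive keeps its own error and self-test logs that you can read with sysutils/smartmontools. A minimal sketch, assuming the disk is ad8:
Code:
# print SMART health status, attributes and the drive's error log
smartctl -a /dev/ad8
# optionally start the drive's built-in long self-test...
smartctl -t long /dev/ad8
# ...and read the result once it has finished
smartctl -l selftest /dev/ad8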
Also, I know of the Hardware Notes, but is there anywhere else I can obtain good info on what works with FreeBSD? When I say good info, I mean stuff that has been tested and gave awesome results. I also know about Google, but still.. anyone?
Asking here or on the FreeBSD mailing lists is a good start.

Personally, I've had excellent experiences with the various 3Ware 9xxx controllers (twa driver). I also have a number of systems running with various LSI 1068-based controllers (mpt driver). I prefer the 3Ware because of the management tools available, but both of these drivers seem to work quite well.
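If it helps, each driver's manual page has a HARDWARE section listing the controllers it claims, so you can cross-check a card before buying; for example:
Code:
# the HARDWARE section lists supported 3Ware and LSI controllers
man twa
man mpt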
 
Terry_Kennedy said:
If the error is always on a particular block, then you likely have a media error.

Well, the LBA is the same, so I guess it is a media error. What is a "media error"? Is it just a small section of the disk that has gone bad, and no problem? Or is this a sign of disk failure and big trouble?

And again, what's the difference between the "clear" and "online" zpool commands? To me they both sound like ways to avoid a problem. "clear": there is no problem, move along. And "online": problem? Pff, who cares. Which one should I use? Should I just scrub and clear?
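For reference, a rough sketch of the three commands in question, assuming a pool named tank and a disk ad8 (the names are placeholders):
Code:
# clear the accumulated error counters on the pool (or on one device)
zpool clear tank
# bring a previously offlined device back into service
zpool online tank ad8
# read and verify every allocated block, repairing from redundancy where possible
zpool scrub tank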

Terry_Kennedy said:
I also have a number of systems running with various LSI 1068-based controllers (mpt driver).

I've been looking around a little and this one doesn't seem so bad? AOC-USAS-L8i. However, I wonder what the UIO part means and whether it is possible for me to use it with regular (m)ATX hardware..?
 
Format the drives with a different controller, then attach them to the 2680 and configure the controller to see them as JBOD. That might pass them through to FreeBSD as regular hard drives.
 
davidgurvich said:
Format the drives with a different controller, then attach them to the 2680 and configure the controller to see them as JBOD. That might pass them through to FreeBSD as regular hard drives.

If I remember correctly, I formatted the disks on a Windows machine in order to make them show as legacy in the HighPoint BIOS thingy.. :\ I don't think I would still be here if it had worked.. :(

Maybe I have misunderstood something, but doesn't the OS need contact with the controller card before it can contact the hard drives that are connected to it???
 
Depends on how much load the controller card takes from the CPU. If you were lucky you had a really crappy one which would act as just more ports and give you the functionality of a basic adapter. So all the drives were initialized and the controller BIOS saw them? Have you tried the same system with OpenBSD or Linux?
 
davidgurvich said:
Depends on how much load the controller card takes from the CPU. If you were lucky you had a really crappy one which would act as just more ports and give you the functionality of a basic adapter. So all the drives were initialized and the controller BIOS saw them? Have you tried the same system with OpenBSD or Linux?

Why does the load matter? Also, I only need more ports, as I use ZFS to make a raidz. Furthermore, I can make a RAID on the hardware side in the controller BIOS, but again that's not what I want. Just look at my 3rd post on page 1; you can see that something is detected by something. No, I have not tried the controller with any other OS, and I don't think it would help, because it is the driver that is at fault as I understand it.
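One quick way to check whether FreeBSD at least sees the card on the PCI bus, even when no driver attaches, is pciconf; a sketch (the vendor string is a guess):
Code:
# list every PCI device with vendor/device names; entries starting with none0@ have no driver attached
pciconf -lv
# or narrow it down, assuming the card identifies itself as HighPoint
pciconf -lv | grep -i -B1 -A3 highpoint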
 
Bobbla said:
Well, the LBA is the same, so I guess it is a media error. What is a "media error"? Is it just a small section of the disk that has gone bad, and no problem? Or is this a sign of disk failure and big trouble?
It depends what caused the problem. If the drive was trying to write when power failed, it may have corrupted the sector headers. Or, if there is a physical imperfection at that point inside the drive, it may damage the head and/or media each time the head moves past it.

If you have a backup, I'd try erasing it (write 0's to the whole thing) and see if you get an error when trying to write to that spot (or when trying to read it back later).
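A minimal sketch of that kind of destructive test, assuming the disk is ad8 and has already been detached from the pool (this wipes everything on it):
Code:
# WARNING: destroys all data on ad8 - write zeros across the whole drive
dd if=/dev/zero of=/dev/ad8 bs=1m
# then read everything back and watch the console for errors at the same LBA
dd if=/dev/ad8 of=/dev/null bs=1m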

I've never had a drive manufacturer refuse an in-warranty replacement for even a single bad sector.
I've been looking around a little and this one doesn't seem so bad? AOC-USAS-L8i. However, I wonder what the UIO part means and whether it is possible for me to use it with regular (m)ATX hardware..?
The UIO cards go into a special slot in some Supermicro systems. Some people have reported success using them in other systems. The issue, as I understand it, is that the card mounting bracket is on the "wrong side" of the board, so it doesn't just drop in to a regular slot.
 
Terry_Kennedy said:
It depends what caused the problem. If the drive was trying to write when power failed, it may have corrupted the sector headers.

There was no power failure. I dunno what it is, but sometimes the screen turns black and everything freezes. And the orange light (indicating HDD activity, I think) on the front of the computer case stays constantly lit. So it may not be a power failure, but something fails.. I was transferring some files to the server at that moment, so I know the failure happened during a file transfer. Also, wouldn't SMART or something "lock out" bad sectors? I don't like the idea of writing stuff to a hard drive inside a RAID... If I use scrub, won't it detect errors? And I still don't know what to think of the zfs commands...

Also no backup, I hoped the raidz would be enough....... :\
 
Bobbla said:
Also, wouldn't SMART or something "lock out" bad sectors?
Nope. It just reports errors to you (assuming you're listening with something like sysutils/smartmontools).
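A minimal example of "listening", assuming smartmontools is installed from ports and the drive is ad8 (the mail address is a placeholder):
Code:
# /usr/local/etc/smartd.conf
/dev/ad8 -a -m root@localhost   # monitor all attributes, mail root when something trips
# /etc/rc.conf
smartd_enable="YES"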
I don't like the idea of writing stuff to a hard drive inside a RAID...
Well, as I said you'd need to have a useful backup (see below) and then remove the drive from the ZFS pool and test it.
If I use scrub, won't it detect errors? And I still don't know what to think of the zfs commands...
Not necessarily. If the error isn't in a spot that ZFS thinks is allocated space (either file or metadata), then it won't see the error until you try to access it.
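To see what a scrub does find, roughly (assuming the pool is called tank):
Code:
# read and verify every allocated block in the pool
zpool scrub tank
# afterwards, check for any files with unrecoverable errors
zpool status -v tank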
Also no backup, I hoped the raidz would be enough....... :\
RAID is not backup. And neither is ZFS (or at least, a single ZFS pool). You might want to look at this thread where the nitty-gritty details have been discussed. The (at the present time) last post in that thread has pointers to two other discussions where ZFS pools became (at least temporarily) unrecoverable.
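If you ever do want something closer to a backup, snapshots plus zfs send to a separate disk or machine are one cheap route; a sketch with made-up pool and host names:
Code:
# snapshot a dataset and copy it to a pool on another machine over ssh
zfs snapshot tank@backup1
zfs send tank@backup1 | ssh backuphost zfs receive backuppool/tank-backup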
 
Terry_Kennedy said:
Well, as I said you'd need to have a useful backup (see below) and then remove the drive from the ZFS pool and test it.

I read both pages of the thread and some of the links. So, what I mean is: I may not have a complete backup. However, I know the chance of permanently corrupted data declines with the use of raidz. And quite frankly, I am a student without a job, meaning no real income, and I don't really want to buy a lot of extra stuff. The whole reason I have this homemade NAS is that it was the cheapest way to get some kind of "anti" corruption mechanism. And as I've only got personal data on this, I don't have the problem of disappearing cities, thermonuclear visits, alien invasions or other cataclysmic events. For I will probably die in any of the listed events, and dead people don't complain. (usually)

Terry_Kennedy said:
Not necessarily. If the error isn't in a spot that ZFS thinks is allocated space (either file or metadata), then it won't see the error until you try to access it.

Well, ZFS reports a "bad intent log" (see page 1). So I think it's fair to assume that the error is located somewhere in ZFS country. I bet metadata...

durrr...
 