Simba7 said:
I have bumped into this issue with an old DL100 G1 server. The Adaptec 2410SA controller does not like arrays above 1 TB (the controller kernel crashes even with the latest firmware), and I have 4x 640 GB drives in it. So, I was thinking of doing RAID5 in hardware, making 2 pairs of ~900 GB arrays and striping them together in software. I can't really afford a new controller, so I have to make do with what I have. I will be using FreeBSD and ZFS.
I'm not sure I understand how you want to use your RAID card.
In raw hardware, you have 4 x 640 GB = 2560 GB. Really dumb RAID controllers can only make disk arrays out of whole disks. With RAID-5, you always take one parity disk and use the rest for data, so this would give you a 1920 GB array (except, as you say, your controller doesn't like that).
I think what you are suggesting is that the controller is not completely dumb, and can build logical units or arrays that use parts of a disk, namely two 3+P arrays built out of 320 GB half-disks, each with 960 GB useful capacity. You seem to be implying that your controller can handle them, because they're smaller than 1 TB. Good. Then you want to strap those two together using ZFS, fundamentally just concatenating them logically, and ending up with 1920 GB useful capacity. Performance will be a little weird, because ZFS will think that it has two independent volumes to work against, and when doing IO scheduling, it won't know that sending an IO to array1 also slows down array2. And if ZFS tries to stripe sequential data across array1 and array2, that will lead to a terrible zig-zag access pattern on the real disk drives, which would utterly kill performance. But all this could work.
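If you do go that route, strapping the two hardware arrays together in ZFS is a one-liner: a pool built from two plain top-level vdevs stripes across them. A sketch, assuming the two Adaptec arrays show up via the aac(4) driver as aacd0 and aacd1 (the device names are an assumption; check `geom disk list` on your box):

```shell
# Two hardware RAID-5 arrays presented as two block devices.
# Listing them as two separate top-level vdevs makes ZFS stripe across them:
zpool create tank aacd0 aacd1

# Confirm the layout: both arrays should appear as independent vdevs.
zpool status tank
```

Note that ZFS has no redundancy of its own here; it is trusting the Adaptec entirely for that.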
Now, is it a good idea? You say that you want to do the heavy lifting of parity-based RAID in the hardware, because your CPU is an old P4. Certainly a sensible argument. But: small cheap PCI RAID controllers can be remarkably crappy, for example in performance. And modern software RAID implementations have learned quite a few tricks.
The alternative is obvious: Tell the Adaptec to get out of the way, and just present the four drives as four block devices to the OS (fundamentally, put the RAID controller into JBOD mode). Then use ZFS to turn the four drives into a 3+P array.
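The JBOD alternative is equally short. A sketch, assuming the four raw disks appear as da0 through da3 (hypothetical names; again, check `geom disk list`):

```shell
# Single-parity raidz over the four raw disks -- the 3+P layout,
# roughly 1920 GB usable out of 4x 640 GB:
zpool create tank raidz da0 da1 da2 da3

zpool status tank
```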
You should probably benchmark both approaches, before letting your (sensible) prejudices drive you to a decision. I would not be surprised at all if ZFS in the CPU, running a sensible disk scheduler and a great cache management algorithm, outperformed the Adaptec hardware even on an initial benchmark. Obviously the benchmark should use a workload similar to what you really do. If your intended use is as a build machine running parallel make, and your benchmark is a sequential dd of a terabyte, you're fooling yourself (and vice versa).
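For a quick first-pass sequential number (with the caveat above that sequential dd may not resemble your real workload), something like this works against whichever pool you built, assuming it's mounted at /tank:

```shell
# Crude sequential write, then read-back. Replace with bonnie++, fio,
# or better yet an actual run of your real workload before deciding.
dd if=/dev/zero of=/tank/testfile bs=1m count=4096   # ~4 GB write
dd if=/tank/testfile of=/dev/null bs=1m              # sequential read

rm /tank/testfile
```

Run each test a few times and ignore the first run (caches warm up), and make the file comfortably bigger than RAM so you're measuring disks, not memory.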
What would I do? The obvious answer is: buy, cheat or steal yourself a modern disk controller and a modern disk, and get on with life. The problem with this approach is that you might get caught stealing it, and you said this is not an option. So forget this. My serious advice would be this: if you can survive with 1280 GB of capacity, then just use the four disks in mirrored mode, using ZFS for mirroring. If you absolutely need the 1920 GB of capacity, then use ZFS to do the RAID-Z 3+P array using four raw disks. If the Adaptec hardware RAID is actually faster, and you absolutely need the extra speed, then go with your 2x 960 GB solution.
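To make the recommended option concrete, here is a minimal sketch of the mirrored layout, again assuming da0 through da3 as hypothetical device names:

```shell
# Two mirrored pairs, striped together: 1280 GB usable from 4x 640 GB.
# Any single disk (or one disk from each pair) can fail without data loss.
zpool create tank mirror da0 da1 mirror da2 da3
```

Mirrors also rebuild much faster than parity arrays (a straight copy, no parity math), which matters a lot for the data-loss arithmetic below.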
Here's why. I really don't like small hardware RAID controllers. To begin with, they tend not to be terribly fast in RAID mode, in particular on small writes. They also tend to have serious issues in case of power loss or controller firmware failure; those tend to cause data loss more often than one likes. In contrast, ZFS is remarkably stable and safe.

Another issue is the following: the data loss rate of a RAID array depends crucially on the rebuild time of the array. And I wonder whether, after a disk failure, your P4 CPU with ZFS might be able to rebuild onto a spare disk faster than the Adaptec can. This is particularly true if your file system isn't completely full: ZFS only needs to rebuild allocated space, while the Adaptec has to rebuild every block (it doesn't know whether the space holds a file or is free space).

Furthermore, ZFS has very good scrubbing, which will help find bad sectors much earlier. With big disks, the biggest cause of data loss is a "strip kill": you suffer one disk failure, and while rebuilding onto a spare disk, you find a bad sector on one of the good disks, and now your data is toast. Regular scrubbing is the best prevention of strip kills.

One other argument is this: if you use ZFS to do the RAID, you can easily set up a monitoring system which alerts you quickly when a disk fails (because the sooner you get there with a spare and start the rebuild, the less likely it is that a second disk goes down before you get there). With a hardware RAID solution, you need to set up complex and flimsy monitoring software to accomplish the same thing. The worst thing that can happen to a RAID array is that a disk fails, the array runs in degraded mode (waiting for a spare to rebuild onto) for weeks or months, and nobody knows about it.
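Both scrubbing and monitoring are cheap to set up with ZFS. A sketch (pool name "tank" assumed; the cron line and mail recipient are illustrative, adjust to taste):

```shell
# Scrub periodically (e.g. weekly from cron) to find latent bad sectors
# while the array is still healthy, instead of during a rebuild:
zpool scrub tank

# FreeBSD's periodic(8) can fold pool health into the daily status mail;
# add this line to /etc/periodic.conf:
#   daily_status_zfs_enable="YES"

# Or a minimal cron check: `zpool status -x` prints "all pools are healthy"
# unless something is wrong, so only mail when there is trouble:
zpool status -x | grep -qv 'all pools are healthy' && \
    zpool status -x | mail -s 'ZFS pool degraded' root
```

That last line is exactly the alert that prevents the weeks-in-degraded-mode scenario above.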
Also, I don't believe that parity-based RAID codes are a good idea, either for small hardware-based controllers (parity-based RAID is actually horrendously complex, and implementations on boards tend to have holes in both performance and reliability), or for household or amateur workloads (which tend to be heavy on small writes, and bursty and uneven). RAID-5 is great for big sequential stuff, like data mining or video processing, or supercomputers. It should be left to highly trained experts in white lab coats (and with a good liability insurance policy). But your workload is probably not that, nor do you look good in a white lab coat.
Hope this helps. And if it doesn't, it might at least amuse you.