Solved Software RAID-0 (or ZFS)?

Dear Forum. I have a task that needs a substantial amount of space to write to. It doesn't need to be contiguous, just large(ish) enough. So I ended up with a bunch of NVMe SSDs in my system, all UFS. Now, I could simply run said jobs on each SSD separately, and in fact I have been doing exactly that for now. It then occurred to me that another option would be to put them in software RAID-0 and use it as one big drive.

Two questions.
Would RAID-0 like that help any? Does it have any benefits that targeting separate drives would not (other than "total contiguous space" available)?
Would ZFS pooling (in some way or other) offer any benefit? I understand ZFS not at all tbh.

In the above, please assume write-intensive IO tasks, mostly sequential.
Disk failure can be tolerated and in fact may happen, in which case all I would do is replace the failed disk and start over. Once the task is complete, the result of its computation (the data of value) is moved to better storage.

Thank you
 
Does it have any benefits that targeting separate drives would not (other than "total contiguous space" available)?
Speed. That's typically the reason why RAID0 is used, space is a secondary reason. The downside of a RAID0 is that if one drive dies the whole RAID0 is dead.
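
To put a rough number on that downside: a stripe needs every member alive, so if you assume drives fail independently with some probability p over the length of the job, the whole array survives with probability (1 - p)^n. A minimal sketch of that arithmetic follows; the 2% figure and the drive counts are made up for illustration, not measurements of any real hardware.

```python
def stripe_survival(p_fail: float, n_drives: int) -> float:
    """Chance that an n-drive stripe survives, assuming independent failures.

    p_fail is the (assumed) probability that a single drive dies during the job.
    """
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be between 0 and 1")
    if n_drives < 1:
        raise ValueError("need at least one drive")
    # Every member has to survive, so the probabilities multiply.
    return (1.0 - p_fail) ** n_drives


# Illustrative numbers only: 2% per-drive failure chance over the job.
for n in (1, 2, 4, 8):
    print(f"{n} drive(s): survival probability {stripe_survival(0.02, n):.3f}")
```

With separate drives the same failure only costs you the job that was running on that one drive, which is worth weighing against the speed gain.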

Would ZFS pooling (in some way or other) offer any benefit? I understand ZFS not at all tbh.
ZFS is just a software solution for, amongst others, RAID0. Or mirroring, or RAIDZ (ZFS equivalent of RAID5). The added benefit of ZFS is ease of management and error correction (if you have redundant data). Hardware RAID only has error detection, no correction.
 
If you use ZFS, use it for RAID as well. If you have enough disks, consider RAID-5 (in ZFS terms RAID-Z) instead, so you have at least *some* redundancy at the expense of the capacity of ONE disk.
 
So, in your opinion, should I go with RAID-0 following the handbook, or research the equivalent in ZFS if I were to attempt that sort of "striping"?
Yes, no, maybe. It really depends on your load. RAID0 or striping just balances the writes/reads over multiple disks, thereby speeding up your data transfers. You're not bottlenecking on a single drive. The added benefit of ZFS here is that you can easily create separate filesystems (datasets). While you could do that on a traditional UFS system too, with ZFS each filesystem can use the whole pool. So you're not limited to the size of a partition as would be the case with UFS. Other benefits of ZFS are snapshots and easy backups and/or moving of complete datasets using zfs-send(8)/zfs-receive(8). I can highly recommend playing with ZFS to get a feel for it. Once you understand how to use it you're going to wonder why you've never used it before.
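
For what "balances the writes/reads over multiple disks" means in practice, here is a minimal sketch of the textbook fixed-stripe layout: consecutive chunks of the logical volume go round-robin to the member disks. The function name, the 64 KiB stripe size and the 4-disk count are assumptions for illustration. As far as I understand it, ZFS does not use a fixed mapping like this but spreads writes over its vdevs dynamically, so treat this as a picture of classic RAID0 only.

```python
def locate(offset: int, n_disks: int, stripe_size: int) -> tuple[int, int]:
    """Map a logical byte offset to (disk index, byte offset on that disk)."""
    if offset < 0 or n_disks < 1 or stripe_size < 1:
        raise ValueError("offset must be >= 0, n_disks and stripe_size >= 1")
    # Which stripe chunk the offset falls in, and how far into that chunk.
    stripe_no, within = divmod(offset, stripe_size)
    # Chunks are dealt out round-robin across the disks.
    disk = stripe_no % n_disks
    # Each full pass over all disks advances every disk by one chunk.
    disk_offset = (stripe_no // n_disks) * stripe_size + within
    return disk, disk_offset


STRIPE = 64 * 1024  # assumed stripe size, illustration only
for chunk in range(6):
    disk, disk_off = locate(chunk * STRIPE, n_disks=4, stripe_size=STRIPE)
    print(f"logical chunk {chunk} -> disk {disk}, offset {disk_off}")
```

A long sequential write therefore touches all disks in turn, which is where the speedup comes from.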
 
Yes, no, maybe.
I don't think there's a "no" or "maybe" concerning the actual question: If you decide to use ZFS and you want striping, RAID, whatever, use ZFS for that.

Of course, if the question is "should I use raid-0, raid-5, raid-6, raid-10?", there's a lot of "yes, no maybe" ;)
 
Well, I'd use ZFS in all cases. But some people still prefer to stick to UFS. On the other hand, there's also the question of whether the risk of a RAID0 outweighs the speed benefits, regardless of whether it's UFS or ZFS. For certain loads it can be quite beneficial, and the risk of blowing up the whole stack is simply accepted (temporary data should be fine, or something that can easily be rebuilt, for example).
 
The added benefit of ZFS here is that you can easily create separated filesystems (datasets).
This for me is the true magic of ZFS. I have the dubious benefit of having spent hours resizing partitions and trying to get LVM to behave over in Linux land. ZFS is objectively superior.
Well, I'd use ZFS in all cases.
I almost agree. I've read that ZFS doesn't play nice with sendfile(2). Do not use it if you're hosting Varnish or Kafka.
 
Two distinct advantages of ZFS over RAID0 come to mind:

1. Integrity. ZFS checksums every block, so it will not (should not) return data that has been corrupted somehow. RAID will just blindly pass through raw data from the drive.

2. Compression. If your application data is compressible, using a fast compression algorithm (such as lz4) on an I/O bound application could increase overall I/O speed.
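
The integrity point can be illustrated with a toy example: keep a checksum next to each block when writing, and refuse to hand back data whose checksum no longer matches. This is only a sketch of the idea and not ZFS's actual on-disk logic. SHA-256 is used because it ships with Python, and `write_block`/`read_block` are hypothetical helper names.

```python
import hashlib


def write_block(data: bytes) -> tuple[bytes, bytes]:
    """Return the block together with a checksum computed at write time."""
    return data, hashlib.sha256(data).digest()


def read_block(data: bytes, checksum: bytes) -> bytes:
    """Return the block only if it still matches its stored checksum."""
    if hashlib.sha256(data).digest() != checksum:
        raise OSError("checksum mismatch: block is corrupt")
    return data


block, csum = write_block(b"result of a long computation")

# Simulate silent corruption on the drive by flipping one bit.
corrupted = bytes([block[0] ^ 0x01]) + block[1:]

try:
    read_block(corrupted, csum)
except OSError as exc:
    print("detected:", exc)
```

A plain stripe has no such check, so the flipped bit would be returned to the application as if nothing had happened. Note that without redundancy the checksum only lets the filesystem detect the damage, not repair it.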
 
Two distinct advantages of ZFS over RAID0 come to mind:

2. Compression. If your application data is compressible, using a fast compression algorithm (such as lz4) on an I/O bound application could increase overall I/O speed.
It's a bunch of hashes, not compressible at all. In fact I'd rather my filesystem not even try to compress, because I know for a fact that's effort wasted.
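
A quick way to check this for yourself is a rough sketch like the one below. It uses zlib from Python's standard library as a stand-in, since lz4 is not in the standard library, and the sample data (SHA-256 digests of a counter, plus a repeated sentence for contrast) is made up and is not my actual job output.

```python
import hashlib
import zlib


def compression_ratio(payload: bytes) -> float:
    """Compressed size divided by original size (1.0 means no gain)."""
    # Level 1 is zlib's fastest setting, the closest stand-in for a "fast" codec.
    return len(zlib.compress(payload, 1)) / len(payload)


# 4096 SHA-256 digests back to back: looks like random bytes to a compressor.
hashes = b"".join(
    hashlib.sha256(i.to_bytes(8, "little")).digest() for i in range(4096)
)
# Highly repetitive text for comparison.
text = b"the quick brown fox jumps over the lazy dog\n" * 3000

print(f"hashes: {compression_ratio(hashes):.3f}")
print(f"text:   {compression_ratio(text):.3f}")
```

The ratio for the hashes stays at roughly 1.0, so any compression pass over them is wasted CPU. On ZFS, compression is a per-dataset property and can be switched off with `compression=off`.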
 
OK, so the results are in and I guess ZFS is out. Substantial slowdown compared to just a bunch of UFS drives. Haven't tried RAID0 yet.

*update* RAID0 performs better than ZFS, worse than JBOD.
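
For anyone who wants to repeat this kind of comparison, below is a minimal sketch of a sequential write test. It is not the workload behind the results above, just an assumed stand-in: it writes incompressible random data to a temporary file in the given directory, forces it to disk, and reports MiB/s. The function name and the default sizes are arbitrary.

```python
import os
import tempfile
import time


def sequential_write_mib_s(directory: str, total_mib: int = 256,
                           chunk_mib: int = 1) -> float:
    """Write about total_mib of random data sequentially; return MiB per second."""
    if total_mib < chunk_mib or chunk_mib < 1:
        raise ValueError("need total_mib >= chunk_mib >= 1")
    # Random bytes, so a compressing filesystem cannot flatter the result.
    chunk = os.urandom(chunk_mib * 1024 * 1024)
    n_chunks = total_mib // chunk_mib
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(chunk)
            f.flush()
            # Make sure the data reached the device, not just the page cache.
            os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    finally:
        os.remove(path)
    return (n_chunks * chunk_mib) / elapsed
```

Call it once per mount point (one UFS drive, the stripe, the ZFS pool) with a total size well beyond any cache, and for the separate-drives case add up the per-drive numbers measured while the jobs run in parallel.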
 