Solved 2.5" HDD, what's the best choice?

Hello,

I'm considering a replacement for my currently hosted 1U server. This server holds 4x 3.5" HDDs and 1x 2.5" SSD, with this topology:
- first zpool: mirror of 2x 1 TB + SSD L2ARC
- second zpool: mirror of 2x 2 TB

My ideal next server would have 8x 2.5" SATA slots and 1x NVMe. I would like to maximize storage and performance, and minimize cost :). So I will go for large 2.5" HDDs, maybe 1 or 2 SSDs, but I'm not sure. The NVMe could act as log device and/or L2ARC, as sketched below. I plan to put in 4x 32 GB RAM (= 128 GB).
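
Concretely, something like this is what I have in mind for the NVMe (just a sketch; the pool name "tank" and FreeBSD-style device names nvd0 / gpt labels are placeholders):

# carve the NVMe into a small SLOG partition and a large L2ARC partition
gpart create -s gpt nvd0
gpart add -t freebsd-zfs -s 16G -l slog nvd0
gpart add -t freebsd-zfs -l l2arc nvd0
# attach them to the pool
zpool add tank log gpt/slog
zpool add tank cache gpt/l2arc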

What would you suggest as (affordable) 2.5" HDD brand/model and as ZFS topology?

(edit: corrected confusion between mirror and raidz1)
 
What are you trying to maximize? Storage capacity? In that case, the highest capacity disks are 3.5" drives. Performance? In that case go SSD all the way. Capacity per dollar? Again, 3.5" wins. Today, there is not much of a niche left for 2.5" drives, except in a narrow segment for medium-performance per $.

Another argument: If you like your data, you need redundancy. Which means buying at least 2 disk drives (at which point your efficiency is 50%, and you are only protected against a single fault). Which is how I run my system at home, even if professionally I know that this is not dependable enough for high-value data. If you are maximizing for some combination of capacity and data dependability, you are probably best off with roughly a half dozen drives, which gives you tolerance to 1 to 3 faults (depending on how you set it up), and somewhere between 50% and 85% storage efficiency.
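
To make that efficiency range concrete: with n drives and p parity drives (RAID-Z tolerating p faults), the usable fraction is (n - p) / n. For half a dozen drives:

6 drives, raidz1: (6-1)/6 ≈ 83% usable, tolerates 1 fault
6 drives, raidz2: (6-2)/6 ≈ 67% usable, tolerates 2 faults
6 drives, raidz3: (6-3)/6 = 50% usable, tolerates 3 faults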

I think in reality, you probably have an amount of data you need to store in mind; knowing that amount might make it easier to think about system design.
 
As I wrote, I'm trying to achieve the impossible: ++storage ++perf --cost :)
At least I'm pragmatic (and experienced); I know I can only tend toward that result. ZFS performance can be had with lots of RAM, so I'm not too worried, as I plan to put 128 GB in that box.
I'm perfectly aware that huge capacities come only in 3.5" HDDs, and that those have the best GB/$ too.

But:
- 1U servers offer only a very limited number of 3.5" slots -> very limited topology options
- 2.5" HDDs available on the consumer market don't offer "enterprise" reliability

My current setup is 2x 1 TB (mirror) + 2x 2 TB (mirror), so 50% storage efficiency. It's not full, so my current need is quite low, but I want to be future-proof.

Let's say I get 8x 2 TB Seagate BarraCuda Mobile (90€ each -> 720€ total) and create a raidz3. I'd get a 10 TB theoretical zpool, probably around 9 TB usable capacity. That's not bad compared to my current ~2.7 TB usable capacity.
For about the same price I can get 2x 10 TB WD Ultrastar 3.5" HDDs, which would yield a ~9 TB mirror.
The BarraCuda Mobile line tops out at 5 TB (200€), too expensive for me to buy 8 of, but I could consider buying 2 for a mirror pool (local backup). That would give me: 6x 2 TB arranged in raidz2 (~7 TB) for prod + 2x 5 TB arranged in a mirror (~4.5 TB) for local backup.
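
Summing up the arithmetic (raw capacity = (n - p) x drive size; usable is roughly 10% less after ZFS overhead, going by my own pools):

option A: 8x 2 TB raidz3 -> (8-3) x 2 TB = 10 TB raw, ~9 TB usable, 3-fault tolerant, 720€
option B: 2x 10 TB mirror -> 10 TB raw, ~9 TB usable, 1-fault tolerant, about the same price
option C: 6x 2 TB raidz2 + 2x 5 TB mirror -> ~7 TB prod (2-fault) + ~4.5 TB backup (1-fault)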

This is what I want to discuss: topology, possible choices of brand/model, etc. What if, for example, Seagate BarraCuda Mobile is absolute crap?
 
Let's say I get 8x 2 TB Seagate BarraCuda Mobile (90€ each -> 720€ total) and create a raidz3. I'd get a 10 TB theoretical zpool, probably around 9 TB usable capacity. ...
For about the same price I can get 2x 10 TB WD Ultrastar 3.5" HDDs, which would yield a ~9 TB mirror.
Assuming all other things being equal, the setup with 8 drives and RAID-Z3 is far preferable: it can handle anything from a single to a triple fault. That is actually really important today because of the large size of disk drives: if one drive fails, you need to read one drive's worth of data from the rest of the drives. But the probability of a read error (a single-sector error) when reading O(10 TB) of data is approaching one (take the 10^-14 per-bit error rate of disk drives times the number of bits read). So in reality, a significant fraction of single faults are becoming double faults today, which is why 2-fault-tolerant RAID is becoming important.
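
To put a number on that, using the usual 10^-14 unrecoverable-read-error rate quoted for consumer drives:

bits to read    ≈ 10 TB x 8 x 10^12 bits/TB = 8 x 10^13 bits
expected errors ≈ 8 x 10^13 x 10^-14 = 0.8
P(at least one) ≈ 1 - e^(-0.8) ≈ 55%

So with single parity, the odds that a rebuild trips over at least one more unreadable sector are better than even, and they grow quickly with capacity.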

The setup with 8 drives will also have significantly better performance, when doing random IO (which most IO is, excluding highly professional applications like supercomputers and cloud data centers), because you have 8 disk arms working in parallel, instead of just 2.

The downside is also obvious: More cabling, more complexity, more opportunity for making configuration or administration mistakes (the single biggest source of data loss!). And then the extremely difficult question, to which I don't know the answer: How reliable are inexpensive 2.5" mobile drives really?

This is what I want to discuss: topology, possible choices of brand/model, etc. What if, for example, Seagate BarraCuda Mobile is absolute crap?
That's incredibly hard to find out. For amateurs, the best data available is what is published by Backblaze; I actually go by that when buying disk drives for home. Inside the giant computer manufacturers (Dell, HP, IBM, NetApp, Oracle...) and the cloud companies (FAANG), much more data on drive reliability is available, but it is never shared with the public. And all that data applies to nearline 3.5" drives being used in the intended fashion; don't extrapolate from it to other models, and in particular not to other drive series.

Every disk manufacturer has had bad models. Seagate had the infamous stiction problem (long solved); IBM (whose disk business now belongs to Hitachi) had drives that forgot to perform some writes (also long solved). Seagate then had the Barracudas of the ~1 TB generation that failed faster than the police allowed, and WD had a whole model of nearline drives whose platters looked like mountainous terrain. Being the biggest drive manufacturer, Seagate also had the biggest share of bad press. Giving advice here is virtually impossible.
 
For checking out brand/model reliability, I like to browse the Backblaze reports...

I love the Backblaze reports, but unfortunately they don't use 2.5" HDDs, and quite often they don't use the same 3.5" HDDs as I do (they use higher-capacity drives).

For 2.5" HDDs I prefer the WD Black, but its max size is 1 TB.

I probably wouldn't even have asked my question here if they offered a 2 TB model; I'm a fan of WD Black HDDs and they have never disappointed me over the years. :)

Assuming all other things being equal, the setup with 8 drives and RAID-Z3 is far preferable: it can handle anything from a single to a triple fault. That is actually really important today, ../..

The setup with 8 drives will also have significantly better performance, when doing random IO ../..

../.. And then the extremely difficult question, to which I don't know the answer: How reliable are inexpensive 2.5" mobile drives really?

That's exactly my reasoning behind the temptation to move from 4x 3.5" to 8x 2.5", but as you wrote, those are mostly inexpensive mobile HDDs. They are really not built to sustain server workloads, even though my server's load is quite light.

This, and what you wrote earlier, made me rethink my objective. It's impossible to be future-proof: eventually something will break, burn, or become obsolete. My current server has been racked in a DC ~500 km from home since July 2013. A fan inside is dead, but the temperature is still under control, and smartd alerts me from time to time about unrecoverable bad sectors. If I buy a new server with 128 GB RAM and ~10 TB of storage, I won't run short anytime soon, maybe not even during the total life of the server. Odds are it will fail or become useless before I fill its storage.
Hence later expansion and evolution are kinda out of the picture (except maybe a CPU upgrade when higher-end parts become more affordable).
The more I think about it, the less I trust 2.5" consumer HDDs, and SSDs are too expensive.
So, even though 8x 2.5" HDDs is way more appealing than 4x 3.5" HDDs, the fact is I can get, for about the same price, a server with 4x 6 TB WD Ultrastar SAS HDDs (plus full RMA and warranty). I would feel safer with enterprise-grade SAS HDDs than with consumer slow/low-power laptop drives. As Calomel puts it (<https://calomel.org/zfs_raid_speed_capacity.html>), there is no speed benefit from SAS vs. SATA; it's mostly a reliability choice.
With 4x HDDs I would set up a raidz2 pool, which is safe enough considering I've also got remote backups. The biggest drawback for me with a 4x 3.5" HDD setup is that it's impossible to add a couple of log or cache SSDs to the pool, but the server can use an internal NVMe SSD, something like the sketch below.
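
In zpool terms, the plan would be roughly this (device names da0..da3 for the SAS disks and nvd0 for the NVMe are placeholders):

# 4-disk raidz2 pool, survives any 2 drive failures
zpool create tank raidz2 da0 da1 da2 da3
# internal NVMe partitions as log and cache devices
zpool add tank log nvd0p1
zpool add tank cache nvd0p2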
 
the fact is I can get, for about the same price, a server with 4x 6 TB WD Ultrastar SAS HDDs (plus full RMA and warranty). I would feel safer with enterprise-grade SAS HDDs than with consumer slow/low-power laptop drives. As Calomel puts it (<https://calomel.org/zfs_raid_speed_capacity.html>), there is no speed benefit from SAS vs. SATA; it's mostly a reliability choice.
I like it! As a matter of fact, my home server is somewhat similar, except I have even less redundancy: I have two enterprise nearline 3.5" drives. They are quite old by now (about 3 and 5 years), but they were of reasonably current capacity when I bought them.

Just one warning: About 15 years ago, everyone "knew" that SCSI=SAS drives are enterprise grade and IDE=SATA drives are consumer grade; my colleague Erik Riedel even published a nice research paper about that. For enterprise nearline drives today, that is no longer true. You can buy the same drive model with either a SAS or SATA interface, and it has the same guts, the same quality, the same performance, the same reliability. If SAS costs you extra, don't bother.

WARNING: I did not say that in general all SATA drives are as good as all SAS drives. At the low end, mobile and consumer drives only exist in SATA; and high-end SSDs only exist in SAS. But in the middle, in the nearline segment, most drives are available with either interface.

... I would set up a raidz2 pool, which is safe enough considering I've also got remote backups.
THIS! You nailed the argument. Data dependability or survival depends on many factors, and first-level RAID redundancy is only one of them. If an unmanaged RAID is all you have to guarantee survival, then single-fault tolerance is not enough today for high-value data (where losing the data would have significant consequences: loss of money, or loss of lots of time). But 2-fault-tolerant RAID is probably sufficient for small systems with a handful of disks. The moment you go to thousands or millions of disks (such systems do exist), you probably want at least 3-fault-tolerant RAID.

But all that assumes you rely ONLY on RAID. To begin with, you can make a disk system considerably more reliable by running the disks at reasonable and stable temperatures (which requires monitoring the temperatures and controlling fans and/or climate control). Scrubbing also helps significantly, because it finds latent sector errors earlier and reduces the time window in which they can metastasize. Monitoring disk health (with SMART) is also good, because you want to replace a failing disk while it is still readable. All of these make it more likely that an "only 2-fault-tolerant" system will survive.
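
Both scrubbing and SMART monitoring are only a few lines of configuration; for example, on FreeBSD (assuming smartmontools from ports; paths and thresholds below are just examples):

# /etc/periodic.conf -- scrub every pool whose last scrub is >30 days old
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_default_threshold="30"

# /usr/local/etc/smartd.conf -- watch health and temperature, run a long
# self-test every Sunday at 03:00, mail root on trouble
/dev/da0 -a -W 4,45,50 -s L/../../7/03 -m root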

And if it doesn't survive, that's when good backups come in. For example, I have only a 1-fault-tolerant RAID at home (a 2-disk mirror), but my first backup is never more than 1 hour out of date and sits right at home, on a disk in a fire- and burglar-protected safe at a very constant temperature. The second backup is about a week out of date on average, but sits in a different building many miles away. With that combination I have 4 copies of the data in 3 different locations, and that's good enough for my home stuff. If I really cared about my data (if it were worth a lot of money, say), I would keep an extra copy at a cloud provider I trust, but for now that's too much hassle.
 