Help me build my ZFS box

Hey guys,

Have slowly been researching this for the past little while, and this forum has been a very nice source of info - thanks to all so far.

I want to build myself a ZFS server, most likely using Sub.Mesa's distro. This would be a back-up server; I already have a file server with 10 TB (consisting of five 2TB drives that Windows 7 sees as "one" with its built-in disk manager), but I have no redundancy at all in there. I'm only looking for something to serve as a back-up box for music, movies, documents, photos, my music studio sessions, etc.; therefore, a 10 TB pool is what I'm looking for. No streaming from the ZFS box. Performance is not that big of a concern, but reliability is.

I do have a few questions, though, for now mostly relating to hardware, specifically the motherboard and hard drives themselves.

First off, I've already got an Athlon X2 250, a Scythe Shuriken cooler for it, and a Seasonic 300W power supply lying around. So, I'd like to use those. With that:

1. Mobo: I was thinking the Gigabyte GA-890GPA-UD3H. This has 8 built-in SATA ports. I know this doesn't support ECC memory; can you suggest an AM3 board that does? The board would (much preferably) have an HDMI port, since all I've got right now for an extra monitor is a 32" 1080p TV. (BTW, the plan would be for this to run headless once configured.)

2. Memory: ECC or non-ECC? I know the whole thing about doing it once and doing it right, but is it worth it? My readings seem contradictory on this.

Hard drives: This is also a big one. I'd like to run RAIDZ2, which means either 6 or 10 drives, as I recall, for optimum performance (or does that only apply to the 4K-sector thing?). I suppose the easiest way to reach the 10 TB goal is with 4 data drives and 2 parity, which points to 3TB drives - but am I missing something? I guess I could also run nine 1.5TB drives in RAIDZ. (The numbers for RAIDZ were, as I recall, either 2, 3, 5 or 9 drives for optimum performance?) Not that performance is THAT big a deal, as I've mentioned previously...

So anyhow:

3. What drive size/number combination would you recommend?
4. Using which specific hard drives?

Awesome, thanks!
 
Some points.

Go with ECC. The premium is not much, and if reliability is truly a concern then you should be using ECC. ZFS can only ensure that data is correct once it has been written to disk; if your RAM is faulty, you may unknowingly be writing errant data to disk, which ZFS won't do anything about.

From memory, Asus motherboards have ECC support as standard (assuming the processor supports it). I'd check there first.

The first link has some comments about drives. My choice is Hitachi or Samsung. If you have lots of data, 3TB makes things easier.

Also, if reliability is truly as big a concern as you say, then IMO the 5*2TB of non-redundant data that sounds like it's important to you is an accident waiting to happen, unless backed up frequently. You can use SMART data as a poor man's method for determining data corruption, but it won't catch everything and obviously it won't self-heal.
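
For example, with sysutils/smartmontools installed, the counters worth watching can be pulled with something like this (the device name is just a placeholder):
Code:
smartctl -a /dev/ada0 | egrep -i "reallocated|pending|uncorrect"
Non-zero reallocated or pending sector counts are a decent early warning that a drive is on its way out.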
 
Cool, thanks.

Yeah, I was actually looking at the Asus 880G series of motherboards - I've got one in one of my computers right now! - and I think they'd do fine. I'd need a SAS card though, probably the BR10i.

Either way - for ECC RAM, is this what I'd need?

http://www.ncix.com/products/?sku=61737&vpn=kvr1333d3d8r9s/4g&manufacture=Kingston

or

http://www.ncix.com/products/?sku=51397&vpn=KVR1066D3D8R7SK2/8G&manufacture=Kingston

I understand that in this application (as in most, I guess), the more RAM, the better - so, maybe two of the latter item...?

And yes, I understand that my current setup is far less than ideal. =) Hence the desire to set up a back-up box, ASAP... =)
 
SilverJS said:
2. Memory: ECC or non-ECC? I know the whole thing about doing it once and doing it right, but is it worth it? My readings seem contradictory on this;

Personally, I would not go for ECC here. I once had a box with ECC memory; all later boxes were without ECC, and I do not see the point in paying extra for it.

SilverJS said:
Hard drives: This is also a big one. I'd like to run RAIDZ2, which means either 6 or 10 drives, as I recall
RAIDZ2 (similar to RAID6) requires at least 4 drives, so 4/5/6/7/8/9/10/11/${MOAR} will do.

For comparison, RAIDZ (similar to RAID5) requires only 3 drives.

You should ask yourself if 2 x RAID5 (STRIPE) + HOT SPARE would not be better: (2*4 + 1) or (2*5 + 1).

(or is this only for the 4K corrected thing)
I personally stick with 512B drives, currently 2TB Seagate Barracuda LP (I do not need performance on that box as these are 'low power' drives).

3. What drive size/number combination would you recommend?

I would go for one of these:

STRIPE[ RAIDZ(4 * 2TB) + RAIDZ(4 * 2TB) ] + HOTSPARE(1) = 6TB + 6TB = 12TB with 8 drives + hotspare

STRIPE[ RAIDZ(3 * 2TB) + RAIDZ(3 * 2TB) ] + HOTSPARE(1) = 4TB + 4TB = 8TB with 6 drives + hotspare
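
Just as a sketch, the first layout would be created something like this (ada0 through ada8 are placeholder device names):
Code:
zpool create tank \
    raidz ada0 ada1 ada2 ada3 \
    raidz ada4 ada5 ada6 ada7 \
    spare ada8
Multiple raidz vdevs in one pool are striped automatically, so the STRIPE part needs no extra step.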


4. Using which specific hard drives?
I would avoid 4K drives. Check lists.freebsd.org for how the 3TB drives are working - but they are all 4K drives.
 
vermaden said:
RAIDZ2 (similar to RAID6) ... For comparison, RAIDZ (similar to RAID5)

You should ask yourself if 2 x RAID5 (STRIPE) + HOT SPARE would not be better: (2*4 + 1) or (2*5 + 1).

The main reason not to choose RAIDZ1 is that with very large disks (2TB etc.), in the event of a disk failure the RAID set can take over 24 hours to rebuild, which leaves all the data in your pool vulnerable to a second disk failure. It's obviously not very likely to have 2 disks fail within a day or two, but nonetheless it's a real risk.
If you can't or won't take regular backups, can't afford to lose any data (i.e. data changed between backups), or if uptime is really critical, then go for RAIDZ2.

cheers Andy.
 
vermaden said:
RAIDZ2 (similar to RAID6) requires at least 4 drives, so 4/5/6/7/8/9/10/11/${MOAR} will do.

For comparison, RAIDZ (similar to RAID5) requires only 3 drives.

This is not correct.

RAIDZ1 requires N+1 drives, so the smallest RAIDZ1 array is 2 drives.
RAIDZ2 requires N+2 drives, so the smallest RAIDZ2 array is 3 drives.

Today's large capacity commodity drives are rather flaky, and there is a high risk of drive failure while rebuilding. For a system of this size, you are risking too much in having only one parity disk. You may well consider using 8-stable, which already has ZFS v28 support and RAIDZ3 (3 parity drives per vdev).
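
As a sketch, a RAIDZ3 pool with 6 data + 3 parity drives would be created like this (pool and device names are placeholders):
Code:
zpool create tank raidz3 da0 da1 da2 da3 da4 da5 da6 da7 da8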

'RAID' was really designed to use a large number of small disks, where rebuild times etc. are much, much smaller - but obviously for a home system you need to conserve drive bays, drive ports and power.
 
Awesome - thanks for the replies, guys. Still not sure about ECC or not - I'm not sure what I should get for ECC. Although I love Asus boards, they do have a reputation for being very finicky about memory, and even though they list ECC support, there are very few ECC modules in their memory support lists (one particular board only had one!).

So I can do RAIDZ3 in FreeBSD 8? I didn't know that! Heck, while we're at it, might as well. So, it would be a total of 9 drives - 6 data, 3 parity? Then I'll have to research the concept of a hot spare; I'm not too sure how to implement that in FreeBSD. But, if I have the room in the case, the SATA port, and the money, is there any disadvantage to using a hot spare?

Cheers!
 
SilverJS said:
Awesome - thanks for the replies, guys. Still not sure about ECC or not - I'm not sure what I should get for ECC. Although I love Asus boards, they do have a reputation for being very finicky about memory, and even though they list ECC support, there's very few modules in their memory support list that are ECC (a particular board only had one!).
Here's what I do. I go to one of the big memory vendors with a good reputation (e.g. I use Kingston), use their memory tool, select the motherboard I would use, and it spits out a list of suitable memory sticks. I compare with the motherboard's pdf manual to make sure of the rules about how many I need, where to put them and any constraints e.g. quad/dual channel etc. I then find prices from a parts vendor I trust.

A mobo manufacturer only has an incentive to do so much validation, and often only with the memory that exists at the time. A memory vendor will produce new memory modules and validate them against existing motherboards, because if they don't, people won't buy their product. The memory manufacturers will guarantee that their sticks work with your motherboard, so do you really need the blessing of two companies?

Another reason I like ECC, btw, is that your motherboard will log ECC errors and whether they were correctable or not. It's nice to be able to check that out. With non-ECC RAM, the only way you'd know is if you had a reason to suspect it and were willing to run memtest over a weekend.
 
carlton_draught said:
Here's what I do. I go to one of the big memory vendors with a good reputation (e.g. I use Kingston), use their memory tool, select the motherboard I would use, and it spits out a list of suitable memory sticks. I compare with the motherboard's pdf manual to make sure of the rules about how many I need, where to put them and any constraints e.g. quad/dual channel etc. I then find prices from a parts vendor I trust.
I discovered that for at least some high-end modules, "Kingston" memory is actually some other brand with a Kingston label on it. In particular, I purchased Kingston KVR1333D3D4R9S/8G which was actually re-labeled Hynix HMT31GR7AFR4C-H9. I had purchased it based on a "this seems like the right part" gut feeling, as the Kingston part number wasn't listed as supported on the Supermicro motherboard (X8DTH-iF) I was using. Once I received the parts and noticed the Hynix part number, I checked and that part was listed as supported by Supermicro.

Getting the exact part isn't as important when using more common memory modules - 8GB registered modules are somewhat unusual.

Another reason I like ECC, btw, is that your motherboard will log ECC errors and whether they were correctable or not. It's nice to be able to check that out. With non-ECC RAM, the only way you'd know is if you had a reason to suspect it and were willing to run memtest over a weekend.
Some people will say that memory errors will never happen. This is from one of my other servers (not the Kingston modules) on Sunday:
Code:
+MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000
+MCA: Vendor "GenuineIntel", ID 0x6b4, APIC ID 0
+MCA: CPU 1 UNCOR PCC OVER BUSL0 Source RD Memory
+MCA: Address 0x703ae14
+MCA: Bank 0, Status 0xf624210022200810
+MCA: Vendor "GenuineIntel", ID 0x6b4, APIC ID 0
+MCA: CPU 1 UNCOR PCC OVER BUSL0 Source RD Memory
+MCA: Address 0x703ae14
+MCA: Bank 0, Status 0xb601a00022000800
If someone is going to use checksummed RAID (like ZFS), it makes sense to also use ECC memory. Otherwise ZFS could be reliably storing garbled data.
 
carlton_draught said:
Here's what I do. I go to one of the big memory vendors with a good reputation (e.g. I use Kingston), use their memory tool, select the motherboard I would use, and it spits out a list of suitable memory sticks. I compare with the motherboard's pdf manual to make sure of the rules about how many I need, where to put them and any constraints e.g. quad/dual channel etc. I then find prices from a parts vendor I trust.

Now why hadn't I thought of that? Perfect! Thanks for that.

So would this be feasible, then? RAIDZ3 plus hot spare? 10 drives total, I guess? (Plus boot drive)
 
OK - so I'll be making the drive down this weekend to pick everything up. Memory Express (in Edmonton) has the 3TB 5K3000 drives, and also some Seagate Barracuda 2TB drives, which I believe are 4K.

Do you guys see any issues with using 6 x 3TB drives in a RAIDZ2 setup? The board I want to use (M4A88T-M) has six SATA ports, so that's awesome. I'd use some random IDE drive on the board's PATA port for a boot drive.

Any show stoppers here? I could also get the 2TB drives, along with a PCI-E SATA expansion card, but I think the less hardware, the better...?
 
SilverJS said:
3TB 5K3000 drives, or also some Seagate Barracuda 2TB drives, which I believe are 4K.

I have Seagate Barracuda LP 2TB drives and they are 512B (not 4K), but there are more Barracudas than just the LP, so check your exact model.
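
You can check what a drive actually reports on FreeBSD with something like this (ada0 is a placeholder):
Code:
diskinfo -v ada0             # reports sector size, media size, etc.
camcontrol identify ada0     # newer versions also show the physical sector size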
 
vermaden said:
Check here: http://consumer.media.seagate.com/2010/06/the-digital-den/advanced-format-drives-with-smartalign/

Thanks - it still just seems to say it's automagic. Well, if it really works, it's great! It does beg the question why all 4K drive manufacturers haven't used something similar; it would have saved a lot of problems!
 
rusty said:
Just installed a Seagate Green SATA-III 2TB; these are 4K and use SmartAlign.

A standard zpool on a single disk and no messing with ashift etc.
Transfers of a 10GB .mkv give:
read 132MB/s
write 115MB/s

Not bad at all for a 5900 rpm drive.

Hey,

Gotta ask you, since I've gotten burned myself once using a 3TB WD Green - have you also scrubbed that pool afterwards, and tried to use zfs send/recv? I got no indication of error until I did just that; I got as many checksum errors as I had files stored on it. =) Then, after I used gpart and created an aligned partition, I haven't had any more issues with it. It could be that it's been fixed in firmware, or that it's WD Green only, but I thought I'd at least give you a heads up, just in case.
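
For reference, the aligned setup was roughly this (ada0 and the label are placeholders - and note it wipes the drive):
Code:
gpart create -s gpt ada0
gpart add -t freebsd-zfs -b 2048 -l disk0 ada0   # start at sector 2048 = 1 MiB, so 4K-aligned
zpool create tank /dev/gpt/disk0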

/Sebulon
 
I did a scrub after the original rsync of 534GB to the pool, and just tried a zfs send | zfs recv with no errors, I'm relieved to say. :)
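
For anyone wanting to run the same check, it was essentially this (pool and dataset names made up):
Code:
zpool scrub tank
zpool status -v tank                   # after the scrub finishes, look for CKSUM errors
zfs snapshot tank/data@check
zfs send tank/data@check | zfs recv tank/check-copy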
 
vermaden said:
I would go for one of these:

STRIPE[ RAIDZ(4 * 2TB) + RAIDZ(4 * 2TB) ] + HOTSPARE(1) = 6TB + 6TB = 12TB with 8 drives + hotspare

STRIPE[ RAIDZ(3 * 2TB) + RAIDZ(3 * 2TB) ] + HOTSPARE(1) = 4TB + 4TB = 8TB with 6 drives + hotspare

OK, I'm now at the point where I have to decide how to allocate all those drives. =) I've just received the hardware, which is 8 3TB drives. I plan on using 7, and keeping the 8th as a spare for when one fails. I am seriously considering the second option above; I think that is the best balance between performance, redundancy, and resilvering time if required.

I had actually started another thread on these forums about this, but then I remembered that somebody had posted some info about it in one of my previous threads, and here it is. =) Thanks.

So, what do you think? [RAIDZ (3 * 3TB) + RAIDZ (3 * 3TB)] + HOTSPARE?
 
SilverJS said:
[RAIDZ (3 * 3TB) + RAIDZ (3 * 3TB)] + HOTSPARE?

Let's get rid of the marketing first: 3 * 1000^4 / 1024^4 = 2.72 TB (instead of 3 TB)

The average WRITE speed of, for example, a Hitachi 7K3000 is about 115 MB/s: http://www.pcdiy.com.tw/cont_img/107291_4.jpg

You have about 2.72 TB = 2785 GB = 2851840 MB of data to resilver if a drive fails. Let's see how much time that would take: 2851840 / 115 / 60^2 = 6.9 (about 7 hours)

Which is quite acceptable to me, so if you have 8 drives, then RAIDZ(3) + RAIDZ(3) + hotspare(2) should be OK.
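
About the hot spare question from earlier: a spare just sits idle in the pool until needed, and you can pull it in by hand (device names are placeholders):
Code:
zpool status tank              # spares are listed in their own 'spares' section
zpool replace tank ada2 ada6   # if ada2 fails, resilver onto the spare ada6
zpool detach tank ada2         # after the resilver, drop the dead disk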
 
I do have eight drives, but one of them is not physically installed; I plan on using that as a cold spare to replace a physically failed drive.

So is there a performance advantage to going with two RAIDZ(3) + hotspare versus one RAIDZ2(6) + hotspare?
 
SilverJS said:
So is there a performance advantage to going with two RAIDZ(3)+hotspare vice one RAIDZ2(6)+hotspare?
Yes, with 2 x RAIDZ(3) you will get about 2x the performance of RAIDZ2(6), because the two RAIDZ(3) vdevs are striped.
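
You can watch the load being spread across both vdevs with (pool name is a placeholder):
Code:
zpool iostat -v tank 5    # per-vdev I/O statistics, refreshed every 5 seconds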
 
I see! I'm just starting to get into all of this, and a LOT more research is required - but so far, using FreeNAS 8.0.1 BETA3 and a RAIDZ2(6) + hotspare, I'm getting write and read speeds of about 11-12 MB per second, which to me seems abysmally low... again, I'll have to research whether this is due to a network bottleneck or whatever (I'm very green!), but we'll see. Either way, I agree, a 7-hour resilver is not a lot of downtime, unless I'm away, which I tend to be somewhat often with the job. I guess that's another consideration.
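
I suppose a quick local test, something like this (the path is made up), would at least tell me whether it's the pool or the network:
Code:
dd if=/dev/zero of=/mnt/tank/testfile bs=1m count=4096   # raw write speed (only meaningful with compression off)
dd if=/mnt/tank/testfile of=/dev/null bs=1m              # raw read speed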
 