Doubts on ZFS

I have some doubts about using ZFS on FreeBSD (at home).
I would like to create a backup server running FreeBSD (just to have some OS redundancy, since my main server is Debian and I planned to use Solaris 11 for my NAS - although I have a lot of doubts about the latter).
Since I want full disk encryption, I think I should go with geli + ZFS.
Since I'm planning to use more than 10TB of storage I would like to stick with a raidz2 or raidz3 configuration.

From what I understood vdevs cannot be grown, is that correct? How can I therefore add a disk to the pool and grow it? Should I add disks in "packs" of two and put two (or more) HDDs into a separate raidz2 vdev? Basically I would like to increase my available disk space while keeping the RAID safety and encryption. Any tips on that?

Another doubt arises when looking at how to replace a failed HDD. Some say you should simply "zpool attach" a new drive, others say you should use something like "zpool replace". Can you please explain and/or provide a code sample?

Right now I'm really a noob in the BSD world ... I've been using Debian for more than 5 years (and a bit of Gentoo as well) but have no FreeBSD / Solaris experience. I'd appreciate it if you could clarify my doubts (this also concerns my NAS setup, since it'll use the same filesystem).
Thank you very much.
 
luckylinux said:
From what I understood vdevs cannot be grown, is that correct?

Correct. You cannot change the number of disks in a raidz vdev. You can replace the drives in the vdev with larger ones to increase the total storage size of the vdev (swap 2 TB drives in for 1 TB drives), though.

How can I therefore add a disk to the pool and grow it?

You can't.

Should I add disks in "packs" of two and put two (or more) HDDs into a separate raidz2 vdev?

Correct. You add new raidz vdevs into the pool. Start with 1 raidz vdev. As it gets full, add another raidz vdev. As those get full, add another raidz vdev. Etc.

If you start with a 6-disk raidz2 vdev, then be sure to add a 6-disk raidz2 vdev to keep things balanced.
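
For example, a rough sketch of that growth path (pool name "tank" and the da0-da11 device names are just placeholders, not anything from this thread):
Code:
# zpool create tank raidz2 da0 da1 da2 da3 da4 da5       (initial 6-disk raidz2 vdev)
# zpool add tank raidz2 da6 da7 da8 da9 da10 da11        (later: stripe a second 6-disk raidz2 vdev into the pool)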

Another doubt arises when looking at how to replace a failed HDD. Some say you should simply "zpool attach" a new drive, others say you should use something like "zpool replace". Can you please explain and/or provide a code sample?

"zpool attach" is used to attach a drive to an existing single-disk vdev (thus creating a 2-disk mirror vdev) or to an existing mirror vdev (thus turning an n-way mirror into n+1-way mirror). Once the mirror vdev has finished resilvering, you can use "zpool detach" to remove a disk from the mirror vdev.

"zpool replace" is used to replace a disk in a raidz vdev.
 
Thank you for your quick answer, phoenix.

phoenix said:
If you start with a 6-disk raidz2 vdev, then be sure to add a 6-disk raidz2 vdev to keep things balanced.
Why is that? If I have a 6-disk raidz2 vdev, why can't I add a 4-disk raidz2 vdev? I mean, I'd rather not buy HDDs in packs of 10 now that they cost so much.

phoenix said:
Correct. You cannot change the number of disks in a raidz vdev. You can replace the drives in the vdev with larger ones to increase the total storage size of the vdev (swap 2 TB drives in for 1 TB drives), though.
I'm not sure I understood: if I have, for instance, a raidz2 configuration of 6 drives of 2TB each and I replace only one drive with a 3TB one, does the capacity increase? Or do I have to replace all drives with 3TB ones before that happens (similarly to Linux mdadm's RAID-6)? At least this way of not allowing arrays to grow means no rebuild times of > 24 hours like RAID-6 on Linux.

Is it possible to convert a raidz2 to a raidz3 setup (and vice versa)? It could be useful, but I doubt it's supported.

phoenix said:
"zpool attach" is used to attach a drive to an existing single-disk vdev (thus creating a 2-disk mirror vdev) or to an existing mirror vdev (thus turning an n-way mirror into an (n+1)-way mirror). Once the mirror vdev has finished resilvering, you can use "zpool detach" to remove a disk from the mirror vdev.

"zpool replace" is used to replace a disk in a raidz vdev.
I see, so zpool replace should be used in raidz vdevs when a disk fails. Should I put the new HDD in before I remove the faulty one (so that, e.g., the new one doesn't get corrupted), or doesn't it matter?

Another question though. The concept of a "pool" seems very broad. For instance, software (on GNU/Linux) like Greyhole or UnRAID claims that a "pool" consists of one "directory / mount point" and that the files put there will be copied onto N drives (N-way redundancy). For FreeBSD the concept of a "pool" seems to be more like RAID-0, isn't it? I seem to recall having read that it's a kind of stripe. So why isn't ZFS getting much performance out of it?


A few other questions (I hope I'm not bothering you too much): does ZFS work "kind of like" mdadm (i.e. is there metadata on the disks that identifies which HDD belongs to which vdev, so that if I swap cables, change SATA controller or change motherboard, ZFS will still recognise the disks even if their names under /dev change - on GNU/Linux you sometimes have e.g. /dev/sda, and after adding a SATA controller it becomes /dev/sde)?

I don't plan to use deduplication (not sure if it is supported, though). I was thinking of setting up BackupPC on the backup server running FreeBSD, as it seems there are a few howtos online. Is it well supported, or do you suggest other alternatives?

As for Geli ... how should I encrypt the drive, i.e. before adding it to the vdev or afterwards?

The server is a home-made AMD FX-8120 3.1GHz (8 "cores"), 16GB Kingston 1333MHz ECC RAM (4x4GB) and some hard drives (I'd really love to put them inside a server-like case like Norco's, but that's a bit too expensive for now). Should this be powerful enough (I will do some virtualisation, though; since I'm at home I can simply use VirtualBox, I think)? By the way, I'd like to know some not-too-expensive SATA controllers which run well under FreeBSD (I bought an Adaptec RAID 1430SA [4 ports] for around 120$ at the time, but 3 of those are a bit too much for me). Maybe I'll switch to Norco's cases with >16 hot-swap HDD bays, but for now >500$ is just too much for me. However, online I read something about using a SAS 36-pin connector which then splits into 4 SATA connectors. Another alternative seemed to be to use a second case with a PSU and motherboard (no RAM or CPU though) and connect the two PCs with a special 12Gbps cable (sorry, I forgot the protocol's name, something like SSG **87 / **88) to a card which goes into a x16 PCIe slot. Do you have an opinion on this, or other suggestions? Thank you very much. Sorry for the bunch of questions, but I'd like to know before I start setting it up.

Last but not least: given HDD prices here, I thought of using 1 (or 2, in a mirror configuration) SSD(s) of 60GB. I want to encrypt them, though. Is there something like the "dropbear hack" that allows you to unlock the partition at boot via SSH (but actually works)?

Thank you very much. Looking forward to this build. I am open to other opinions and configurations. Please note: the CPU, case, RAM and motherboard for this server have already been paid for and assembled. I have some HDDs to throw in, but I still don't know how many.
 
You can add different sized vdevs together, but it warns you, so you need to use the -f option. I assume you'll find someone telling you it kills performance or something.

To avoid needing to buy 6-10 disks at a time, you can use striped mirrors and just add 2 disks at a time, but it would mean less maximum space (and possibly sequential write speed) compared to raidz.

If you replace small disks with bigger ones, you need to do it to all disks in the vdev (not the whole pool from my understanding) before it increases your space (and you must set autoexpand=on).
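
Roughly, that grow-by-replacing procedure would be (assuming a pool named "tank" and made-up device names; wait for each resilver to finish before the next replace):
Code:
# zpool set autoexpand=on tank
# zpool replace tank da0 da6      (repeat for every disk in the vdev, one at a time)
# zpool online -e tank da6        (only needed if the extra space doesn't show up by itself)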

I don't think you can convert a raidz2 to a raidz3 without recreating the pool and copying. This would be nice, though... a properly sized raidz2 that is fine with today's disks could suddenly be unsafe with larger future disks. You either want raidz3 or fewer disks per vdev when the disks are bigger. Maybe some day...

It is up to you whether you want to remove the old before adding the new. I prefer to switch them to spares, remove the old, then replace again with the new. This can be a big waste of time, but is safer in case another disk fails while you remove a partially working disk. If the failing disk is horribly slow (check ms/r in gstat), you should remove or offline it first or your replacing will take much longer. (and speaking of spares, they are not automatically replaced in FreeBSD, in case you were wondering)
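
The spare-based route would look something like this (made-up pool/disk names; and as noted, the spare won't jump in by itself on FreeBSD):
Code:
# zpool add tank spare da10       (register da10 as a hot spare)
# zpool replace tank da3 da10     (manually swap the spare in for the failing da3)
# zpool detach tank da3           (after resilvering, detach da3 so da10 becomes a permanent member)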

A pool is whatever you make of it. A 5 way mirror for example would have bad write performance. A stripe has great performance. A pool just means a thing that is built of smaller parts (vdevs).

About dedup, phoenix can tell you much, and I'm sure he would encourage you to use it if your system has the RAM/cache, as he posted here. My experience with older FreeBSD 8.2 builds and dedup wasn't so positive, but briefly testing a build from Feb. 2012, it seems to work well (I got about a 1.65x dedup ratio on satellite data files; backups from multiple servers probably dedup better).

I don't know anything about Geli, but I think you can only combine it with zfs if you encrypt first and put the encrypted devices in vdevs.
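
A minimal sketch of that order of operations, assuming a plain passphrase-based geli setup and made-up device names (key files, sector size choice and rc.conf wiring are up to you):
Code:
# geli init -e AES-XTS -l 256 -s 4096 /dev/da0      (repeat for da1 ... da5; these parameters are just one common choice)
# geli attach /dev/da0                               (repeat for da1 ... da5)
# zpool create tank raidz2 da0.eli da1.eli da2.eli da3.eli da4.eli da5.eli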

About the RAID controller: with ZFS you don't need one, but if you do use a RAID card you should be using IT (initiator target) firmware instead of IR (integrated RAID), or use a non-RAID controller such as the LSI 9211-8i. (That's what I use... I have 4 36-disk boxes (not all filled to 36) using those, but haven't tried other HBAs.) That card is much more expensive than $120 though. It also comes in a 4-port version (maybe this is a rebranded one). I converted a hardware RAID system into a ZFS system, and I couldn't get the RAID card to act properly... for example, smartctl couldn't get direct access to the disks. But I'm no expert on things like passthrough, which I assume others would use. And again I don't really know, but with a RAID card that doesn't simply have an "AHCI" mode (which mine didn't), it is possible that if your card died and you put in a new, slightly different one to replace it, your disks wouldn't work correctly because of whatever nonstandard format the RAID card used. And some cards might not have this issue. I have no experience with it, and the things I've read are not clear, so I don't know. (For example, these two threads are unanswered: Possible: Configure SAS Sun Jbod J4200? and To JBOD or just to pass, that is the question)
 
luckylinux said:
Why is that? If I have a 6-disk raidz2 vdev, why can't I add a 4-disk raidz2 vdev? I mean, I'd rather not buy HDDs in packs of 10 now that they cost so much.

You can (I had a pool with a 3-disk raidz1 vdev and a 2-disk mirror vdev), but it's not recommended and you have to force (-f) the addition of the new vdev. The reason is that it imbalances the pool, as you now have vdevs of different sizes and different speeds striped together. It works, but it's not optimal.

I'm not sure I understood: if I have, for instance, a raidz2 configuration of 6 drives of 2TB each and I replace only one drive with a 3TB one, does the capacity increase? Or do I have to replace all drives with 3TB ones before that happens (similarly to Linux mdadm's RAID-6)? At least this way of not allowing arrays to grow means no rebuild times of > 24 hours like RAID-6 on Linux.

You have to replace every drive in the vdev with larger ones before the extra space becomes available.

Is it possible to convert a raidz2 to a raidz3 setup (and vice versa)? It could be useful, but I doubt it's supported.

No, you cannot change anything about the setup of a raidz vdev. Once it's created, it's locked in stone.

The only vdev you can manipulate like that is the mirror vdev. You can start with 1 disk (bare vdev). Attach a second disk, thus creating a 2-way mirror vdev. Attach another disk, thus creating a 3-way mirror. Detach a disk, thus creating a 2-way mirror vdev. Detach a disk, thus creating a bare drive vdev. And so on.
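
In commands, that walk-through would be roughly (hypothetical pool/disk names):
Code:
# zpool create tank da0           (bare single-disk vdev)
# zpool attach tank da0 da1       (now a 2-way mirror)
# zpool attach tank da0 da2       (now a 3-way mirror)
# zpool detach tank da2           (back to a 2-way mirror)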

I see, so zpool replace should be used in raidz vdevs when a disk fails. Should I put the new HDD in before I remove the faulty one (so that, e.g., the new one doesn't get corrupted), or doesn't it matter?

Depends on whether or not you have the drive slots for it. The general process is:
Code:
# zpool offline <poolname> <disk device>
<physically remove the disk from the system>
<insert new disk into system, do any configuration needed>
# zpool replace <poolname> <old disk device> <new disk device>
If you have spare slots, you can do step 2 last.
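
As a concrete, made-up example with a pool called "tank" and a failed disk da3 whose replacement comes up under the same device name:
Code:
# zpool offline tank da3
<swap the physical disk>
# zpool replace tank da3          (or "zpool replace tank da3 da9" if the new disk shows up as da9)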

Another question though. The concept of a "pool" seems very broad. For instance, software (on GNU/Linux) like Greyhole or UnRAID claims that a "pool" consists of one "directory / mount point" and that the files put there will be copied onto N drives (N-way redundancy). For FreeBSD the concept of a "pool" seems to be more like RAID-0, isn't it? I seem to recall having read that it's a kind of stripe. So why isn't ZFS getting much performance out of it?

ZFS gets great performance, depending on the pool layout (all mirror vdevs is the fastest), whether or not you have L2ARC (cache) or ZIL (log) devices, how much RAM you have, controller speeds, workload, etc.

For example, my home ZFS server (a crappy 2.0 GHz P4 with 2 GB RAM using onboard SATA ports) uses 4x 500 GB SATA disks in 2x mirror vdevs. I can get a little under 2500 IOps from the pool during port upgrades (mainly updating the pkg db), a little more than 100 MBps of reads and a little under 100 MBps of writes. Considering just how crappy of a system this thing is, and how little RAM there is, that's pretty damned fast.

And, our storage boxes at work are limited by network speeds (tops out around 300 Mbps across 9 connections), but can burst reads up to 350 MBps and writes up to 200 MBps (multiple raidz2 vdevs, SSD cache, tonnes of RAM, dedupe enabled).

A few other questions (I hope I'm not bothering you too much): does ZFS work "kind of like" mdadm (i.e. is there metadata on the disks that identifies which HDD belongs to which vdev, so that if I swap cables, change SATA controller or change motherboard, ZFS will still recognise the disks even if their names under /dev change - on GNU/Linux you sometimes have e.g. /dev/sda, and after adding a SATA controller it becomes /dev/sde)?

Yes.

The server is a home-made AMD FX-8120 3.1GHz (8 "cores"), 16GB Kingston 1333MHz ECC RAM (4x4GB) and some hard drives (I'd really love to put them inside a server-like case like Norco's, but that's a bit too expensive for now). Should this be powerful enough

That's similar to what we use for our storage boxes (8-core AMD Opteron 6000 CPU, 16 GB RAM to start). Works quite nicely, even with LZJB and GZip compression enabled. You won't be CPU-limited. :)

By the way, I'd like to know some not-too-expensive SATA controllers which run well under FreeBSD (I bought an Adaptec RAID 1430SA [4 ports] for around 120$ at the time, but 3 of those are a bit too much for me).

We've standardised on SuperMicro AOC-USAS2-L8i (8-port multi-lane) SAS/SATA controllers. They're under $200 CDN, fully supported by the mps(4) driver in FreeBSD 8+ (and the mps(4) driver in FreeBSD 9-STABLE is actually from LSI, the maker of the chipset on these).
 
Peetaur, thank you for your answer, first of all.

peetaur said:
About the RAID controller: with ZFS you don't need one, but if you do use a RAID card you should be using IT (initiator target) firmware instead of IR (integrated RAID), or use a non-RAID controller such as the LSI 9211-8i. (That's what I use... I have 4 36-disk boxes (not all filled to 36) using those, but haven't tried other HBAs.) That card is much more expensive than $120 though. It also comes in a 4-port version (maybe this is a rebranded one). I converted a hardware RAID system into a ZFS system, and I couldn't get the RAID card to act properly... for example, smartctl couldn't get direct access to the disks. But I'm no expert on things like passthrough, which I assume others would use. And again I don't really know, but with a RAID card that doesn't simply have an "AHCI" mode (which mine didn't), it is possible that if your card died and you put in a new, slightly different one to replace it, your disks wouldn't work correctly because of whatever nonstandard format the RAID card used.
The controller doesn't have to be RAID since I'll use raidz with ZFS anyway (and so, if the controller dies, I won't have to buy an identical one just to get my data back).

It doesn't cost as much as I thought ... not nearly as much (240$ here). There might be an error with the pricing, though. What I was looking for was a controller setup that would not cost over 500$ or fill up all the PCIe slots. Using Adaptecs like I did, it would have cost 5x100 = 500$ and left no free PCIe slots. With the LSI you suggested, it'll take only one PCIe slot and cost half that. That's good enough for me if it works flawlessly.

peetaur said:
About dedup, phoenix can tell you much, and I'm sure he would encourage you to use it if your system has the RAM/cache, as he posted here. My experience with older FreeBSD 8.2 builds and dedup wasn't so positive, but briefly testing a build from Feb. 2012, it seems to work well (I got about a 1.65x dedup ratio on satellite data files; backups from multiple servers probably dedup better).
The thing is: I will eventually have (a lot more than) 10TB of data and can only put 16GB of ECC RAM in that board (8GB ECC unbuffered modules are way too expensive).

However, I just found a good-quality case from Supermicro at the price of a Norco here. It seems that here in Switzerland they sometimes put out prices that are too low (600$ instead of over 1600$ on Amazon or Newegg) and I just happened along at the right time. I bought a Konica Minolta 4750DN at half price too (from another retailer), only to see the price go up 5 days later (when they noticed their mistake). However, I'll need to pair it with a dual-processor motherboard supporting registered ECC RAM ... that's >400$ for just the motherboard, >200$ for each CPU (dual core; six cores are >500$) and 80$ for each stick of 8GB registered ECC. That's a lot of money. Supermicro seems to have a far better reputation than Norco (and for the same price I get 50% more HDD bays), and surely it's better to have a hot-swap case, even though it'll sit under the desk :e with no rack mounting. 36 hard drives! What case are you using, BTW?

Dedup shouldn't really be necessary though, because BackupPC already does dedup pretty well (see their website), reducing the used space by a factor of 10. **If** it works with FreeBSD.

Thank you for your answers.

Do you have any experience with Supermicro chassis (or the one I specifically linked)?
 
phoenix said:
We've standardised on SuperMicro AOC-USAS2-L8i (8-port multi-lane) SAS/SATA controllers. They're under $200 CDN, fully supported by the mps(4) driver in FreeBSD 8+ (and the mps(4) driver in FreeBSD 9-STABLE is actually from LSI, the maker of the chipset on these).
That's about 60$ less than what peetaur recommended. Either way I'm fine with it (considering I spent 100$ on a simple 4-port SATA controller, a ~200$ 8-port SAS controller compatible with up to 36 SATA devices is pretty damn good). The cables may be costly, though.

Any opinions on SSH unlocking at boot (since I plan on encrypting the whole system drive as well as all the drives which will be part of a vdev), or do you not have experience with that? Any suggested alternatives or workarounds?

See my previous post (~10 minutes ago) in which I detailed an alternative setup to peetaur's.

Thank you very much again ;) I guess this is the case where you should really plan everything from the start (like I do most of the time) and not "just get the box running". In this respect I find ZFS a bit "strange" (in the sense that you cannot grow a vdev); you basically just have to keep that in mind.
 
Bought the case, need something to put in it

OK, I ordered the 36-bay hot-swap case from Supermicro since it sounded like a good deal.
Now it's time to see which Mobo & CPU to put in there.
Since phoenix suggested that ZFS can eat MUCH RAM (>> 16GB, or was that only with dedup enabled?) I was thinking I should get a server-grade motherboard.
The backup server (with 36 HDDs) should perform well and ideally stay on 24/7 in my bedroom, without producing too much heat or making my energy bill explode (and considering I sleep next to a Thermaltake Frio with two fans at take-off power, I don't think the noise will be a problem :e).

That said, assuming a configuration with >64GB of ECC registered RAM (here I can get it at ~$60 per 8GB module), I think there are currently only three configurations possible:
a) (Cheapest, worst energy efficiency) AMD G34 single socket (like for instance the Supermicro H8SGL-F for ~ $300) with AMD Opteron 6234 [12 x 2.4GHz for ~ $400] or AMD Opteron 6212 [8 x 2.6GHz for ~ $300]. Total cost: 700$ / 600$ + RAM
b) (More expensive, best energy efficiency) INTEL 2011 single socket (like for instance the Supermicro X9SRi-F for ~ $400) with INTEL Xeon E5-1620 for ~ $300 (this should be on par with the i7-3820, though it's not yet available). A quad core should be enough. Total cost: 700$ + RAM
c) (Most expensive, best performance) INTEL 2011 dual socket (like for instance the Supermicro X9DR3-F for ~ $600) with 2 x INTEL Xeon E5-2620 (2 x 6 cores, 2 x 12 threads) for ~$450 each. Total cost: 1500$ (dual processor) / 1050$ (single processor)

This would be used at home, theoretically only for storage purposes, though virtualisation with VirtualBox could also be implemented (where configuration c would be useful).
Based on the items above and the fact that there'll come a time :e when the ZFS pool will contain 72-108 TB (yes, terabytes - although only theoretically for now), that encryption with geli will be used, and that there will be no dedup with ZFS (only dedup with BackupPC), which configuration would you recommend (if any) and how much RAM? 32GB? 64GB? 128GB? Unfortunately, if you need more than 16GB of ECC you need registered memory, and then you need expensive motherboards (a single 8GB ECC unbuffered stick costs more than twice its registered equivalent, i.e. ~160$ per 8GB unbuffered stick).

Thank you very much. You can also recommend any other brand of motherboard or any other CPU, as long as I have enough computing power and registered ECC is supported. Let's say I want to keep the budget for this under control. ATX and E-ATX boards can be used.
 
luckylinux said:
Based on the items above and the fact that there'll come a time :e when the ZFS pool will contain 72-108 TB (yes, terabytes - although only theoretically for now), that encryption with geli will be used, and that there will be no dedup with ZFS (only dedup with BackupPC), which configuration would you recommend (if any) and how much RAM? 32GB? 64GB? 128GB? Unfortunately, if you need more than 16GB of ECC you need registered memory, and then you need expensive motherboards (a single 8GB ECC unbuffered stick costs more than twice its registered equivalent, i.e. ~160$ per 8GB unbuffered stick).

Thank you very much. You can also recommend any other brand of motherboard or any other CPU, as long as I have enough computing power and registered ECC is supported. Let's say I want to keep the budget for this under control. ATX and E-ATX boards can be used.
I've been using the Supermicro X8DTH-iF motherboards in my large ZFS servers for some time. They have worked out quite well. It supports up to 192GB of RAM, and I'm running them with 48GB (6 x 8GB). One of the best features of this board is that all of the slots are PCIe x8 (with a x16 connector), so expansion boards can be rearranged for optimum airflow. This was my biggest complaint with the Tyan board I used in my previous generation of servers. The X8DTH-iF also has a large number of fan connectors and lots of hardware monitoring, including PMBus. It also includes full remote management (virtual console and media). Here is a large picture of the assembled system.

This is a Socket 1366 system, which is a bit old if you're starting a new build. Prices on the E55xx CPUs haven't dropped as much as I'd expect, so you should compare the price for that configuration against one with newer parts.

One thing that has dropped a lot in price is the memory. When I built these, the cost for the 8GB memory modules was around $250 each. They're now around $75 each. That's for the fancy ones with thermal monitoring and so on.

Regardless of what hardware you get, I'd suggest going with a smaller number of higher-capacity memory modules, since that will let you expand simply by installing more of the same type later on. Depending on the motherboard and CPU used, certain numbers of modules will give a performance boost compared with the same amount of memory in a different number of modules. That's why I'm using 6 modules in my servers, for example.
 
Terry_Kennedy, thank you for your answer.
Unfortunately yes, since I'm starting a new build I'd like to avoid the 1366 socket, for reasons of power consumption and the fact that CPU prices are way too high. A 6C E5-2620 at 2.0GHz costs 100$ less than its 1366 counterpart at 2.4GHz.
I just found the Supermicro X9DR3-F interesting, since it can be had for 500$: 3 PCIe 3.0 x16 slots as well as 3 PCIe 3.0 x8 slots. It's not like yours, but I can cope with that. Anyway, big SATA/SAS controllers will fit into the x16 slots, while Ethernet NICs will fit in the x8 slots (well, they're x1 for the single NIC and x4 for the dual INTEL NIC).
This will require about 500$ + 2x450$ ~ 1450$ (please note that I'm doing a fast currency conversion and prices are higher here anyway). This will be a 16 C / 32 T beast and at that point I think it's a pretty good idea to do virtualisation as well.

Unfortunately, though, I don't see 16GB ECC sticks at online retailers in my country. Do you have some product numbers for those (other than these, which are quite expensive and cannot ship to my country)? Otherwise I think 16 x 8GB = 128GB of ECC registered RAM is pretty damn good (though that'll cost another 16 x 70$ ~ 1120$), so I think for the moment I'm only going to buy 8 of those sticks.
Do you think a setup like this one is going to work? Or are there less costly alternatives that could work? I'm still waiting for the 36-bay Supermicro case-beast right now, though it should arrive in a few days.
 
If you're worried about costs, look into the AMD boards. You'll save on the cost of the motherboard, the CPU, and especially on the RAM (standard DDR3 is cheap like borscht right now, and you only need pairs of DIMMs).

A SuperMicro H8DGi-F motherboard, an AMD Opteron 6128 CPU (8 cores), and 16 GB of DDR3 RAM should come out to about half of the Intel option. Plus, you can stuff 256 GB of RAM into the board. :)

A SuperMicro 836 chassis (24 hot-swap bays with multi-lane backplane), 24x 2.0 TB WD drives, 32 GB SSD, 3x AOC-USAS2-8Li controllers (with multi-lane cables), and the above mobo/CPU and 32 GB RAM comes out to under $3000 CDN.
 
Strange

&quot said:
If you're worried about costs, look into the AMD boards. You'll save on the cost of the motherboard, the CPU, and especially on the RAM (standard DDR3 is cheap like borscht right now, and you only need pairs of DIMMs).
I would like this to be true.
Not sure how things are where you live, but here unfortunately the prices are almost the same.
Sorry for putting the prices in USD, but I think it's the standard currency for this kind of discussion.

&quot said:
A SuperMicro H8DGi-F motherboard, an AMD Opteron 6218 CPU (8 cores), and 16 GB of DDR3 RAM should come out to about half of the Intel option. Plus, you can stuff 256 GB of RAM into the board.
Where did you find that CPU? Did you misspell it, maybe? I see there are the 6212 and the 6220 for 8 cores.

For AMD setup (Interlagos)
1 x SuperMicro H8DGi-F ~ 650$
2 x AMD Opteron 6234 (12x2.4 GHz) ~ 2 x 430$ ~ 860$

For INTEL setup (Sandy Bridge-E)
1 x Supermicro X9DR3-F ~ 570$
2 x INTEL XEON E5-2620 (6x2.0 GHz / 8 Threads) ~ 2 x 470$ ~ 940$

Therefore it's not such a big difference.
Power consumption, though not the most important factor, is a concern.
The Interlagos series matched the power consumption of the 1366 dual-socket series. The Sandy Bridge-E chips are about 20-30% less power hungry than the previous (INTEL) generation at idle (and processors are known to be in the idle state most of the time ...).

Concerning memory, I did not understand your point. Do you mean using non-ECC memory?
Or ECC unbuffered memory?
In the case of ECC unbuffered memory the price difference is just ridiculous: about 200$ for unbuffered vs. about 80$ for registered (for ONE 8GB DIMM). Comparing 1x8GB ECC registered vs. 2x4GB ECC unbuffered, the price difference is small.

I really want to use ECC because I'd like to avoid memory corruption. It is furthermore known that ECC registered memory tends to be more stable and reliable thanks to the "register" mechanism. Non-ECC memory costs about 60$, not that much of a difference IMHO.

phoenix said:
A SuperMicro 836 chassis (24 hot-swap bays with multi-lane backplane), 24x 2.0 TB WD drives, 32 GB SSD, 3x AOC-USAS2-8Li controllers (with multi-lane cables), and the above mobo/CPU/RAM comes out to under $3000 CDN.
What can I say other than that I should move to Canada :e
How come you find 2TB drives so cheap? Do you buy them in huge quantities from special sellers, or do you get a special price? Here a 2TB drive costs about 165$ (standard online retailers), so 24x2TB ~ 4000$ by themselves. The SuperMicro 836 costs about 1200$ here.
The 36-bay Supermicro SuperChassis 847E16-R1400UB cost me about 700$ (special price / discounted). Do you think it's worth buying another one for the future? That price seems damn good for what it is (with a 1400W PSU).
 
Sorry, wrong CPU. It's a 6128 (G34 socket), Magny-Cours architecture, not Interlagos. And we only use 1 CPU (the second CPU isn't needed yet, as we don't have more than 128 GB of RAM, or a need for more than 3 PCIe slots; the other CPU slot manages 128 GB of RAM and 3 PCIe slots).

RAM is generic DDR3 ECC registered DIMMs. For instance, Kingston KVR1333D3D4R9SK3/24G (8 GB DIMMs, kit of 3, and then we bought one more DIMM to make 32 GB RAM total).

We buy from a local computer shop (Kamloops Computer Centre) that special orders parts for us. Or we buy direct from CDW, an online retailer. We do buy in bulk, though, so we do get some discounts.

I don't have the purchase order anymore (that goes upstairs), but the last one I saw was under $3000 CDN for the hardware in my last post.
 
OK, so you buy in bulk. Well, I don't think this is possible for a private individual like me ... unless I order 24 HDDs in one order, then maybe. I'll have to ask a friend of mine about a very cheap seller ;)

However, the point is that the price difference between AMD and INTEL's latest generation here is next to nothing (both for single socket and dual socket). I don't know about PCIe slots; maybe using all of them will come in handy. As for the RAM, I can "only" use 64GB with one CPU using 8GB registered ECC memory.

I would like to ask again:
a) Do you think it's worth buying another Supermicro case like the one I cited in my previous post (36 x 3.5'' HDD bays - 24 on the front and 12 at the rear, for ~700$)? Amazon and Newegg list it for 1600$. At this point I'm asking myself what is wrong with this case ... other than its good price (here).

b) What 8GB modules were you suggesting? For comparison 8GB ECC registered modules are about 80$ here, while the 16GB ECC registered modules are (rare) at about 240$

c) The CPU you suggest is indeed cheap. See here for my (local) choices. I still think an E5-2609 or another 4C/8T part will be better on power consumption since it's the same price (well ... not quite compared to the 6128, but compared to every other G34 CPU, yes). Is it my imagination, or did INTEL do some good pricing work here? Or are AMD prices just bad where I live? INTEL's performance per watt and performance per dollar seem good.

Please note: I own several AMD machines, so I'm not an INTEL fanboy. However, if for the same price I end up with better performance, a non-dead socket (G34 will be replaced in the next architecture) and lower electricity bills, I don't see why not.

d) What do you think about the case? Should I buy another one for the future since it's so cheap?

Thank you very much again.
 
Sorry phoenix.
I just saw the part about the ECC memory. Sorry about that. I think I'm going to order that or something similar.

Last (hopefully) question: in one of your previous posts (as well as in one of peetaur's previous posts) you both suggested a SATA/SAS controller with 8 ports, claiming that it could manage up to 32 drives. I'm really a noob in this area as I'm using standard SATA HDDs, but how is it possible that an 8-port controller can manage 32 drives? Some claim they can support 122 drives or something like that. The SAS interface can't be split into 4 SATA channels, can it?
Now you're suggesting 3 x 8-port controllers; however, last time you said I could manage 32 drives with only one ~200$ controller. Am I missing something (special SFF-8087 or SFF-8088 cables)?
 
Wasn't me that mentioned using 1 controller to manage more than 8 drives.

Some SAS controllers support "SAS Expanders" where 1 SAS channel can be used by (I think, might be more) 16 harddrives. You plug the drives into the expander, you connect the expander to the controller, and voila! Obviously, the bandwidth of the channel is shared amongst all the drives connected to it, so it's not a good setup if you want raw performance.

Some SATA controllers support "SATA Port Multipliers" where 1 SATA channel can be used by (I think, might be more) 8 harddrives. The support for SATA PMs is not as robust as for SAS Expanders, though.

Personally, we use 1 SATA channel per harddrive. We've switched from using discrete backplanes (individual SATA connector for each harddrive bay) to multi-lane backplanes (4-lane cable from controller to backplane to support 4 harddrive bays) to eliminate some cable clutter.

A 36-bay chassis for $700 sounds pretty good to me. :) More drive bays is always better. :)
 
It might be a good idea to set up a test virtual machine (in VirtualBox, for example) with small virtual disks so you can get used to all the concepts and commands available when working with ZFS.
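Inside such a test VM (or on any scratch FreeBSD box) you can also build a throwaway pool out of plain files instead of real disks; a quick practice sketch, with arbitrary paths and not meant for real data:
Code:
# truncate -s 1G /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3
# zpool create testpool raidz2 /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3
# zpool status testpool
# zpool destroy testpool
# rm /tmp/d0 /tmp/d1 /tmp/d2 /tmp/d3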
 
phoenix said:
Personally, we use 1 SATA channel per harddrive. We've switched from using discrete backplanes (individual SATA connector for each harddrive bay) to multi-lane backplanes (4-lane cable from controller to backplane to support 4 harddrive bays) to eliminate some cable clutter.
I'm not really looking for high-end performance. However, if a controller fails, how does ZFS behave? I mean, something like 8 HDDs going offline at once. Replacing the controller with a new one and connecting the cables back should do it (provided the server is turned off as soon as the failing controller is detected)?
Unfortunately the controller from Supermicro you suggested is not available where I live. Any recommended alternatives?

phoenix said:
A 36-bay chassis for $700 sounds pretty good to me. More drive bays is always better.
OK, I'll order a second one if available. 2 for the price of 1, kind of ...
This way I won't have to bother when I have to expand it/them.
 
rabfulton said:
It might be a good idea to set up a test virtual machine (in VirtualBox, for example) with small virtual disks so you can get used to all the concepts and commands available when working with ZFS.
I agree with you. Right now there is this hardware deal going on, so a bunch of questions came up as well.
I'll do a VirtualBox test as you suggested, though, before doing any real-world usage.
 
luckylinux said:
Do you have any experience with Supermicro chassis (or the one I specifically linked)?

I have FreeBSD and ZFS on 4 of what looks like the chassis you linked/equivalent. The part number for 2 of them is 847E16-R1400LPB (others possibly same). One of them is 1.5 years old, upgraded to ZFS, I think with SATA 3Gbps expanders. The first ZFS one was installed around October. The old one was upgraded less than a month ago.

One thing I love about Supermicro boards (talking about boards now, not cases) is that the onboard IPMI stuff includes KVM over IP with a web interface, and there is some software you can install that manages it all in one app instead of a separate web interface for each.

And I'm not so sure you should be so picky about getting cheap CPUs... the disks cost more than the rest of the machine. ;)
 
peetaur said:
I have FreeBSD and ZFS on 4 of what looks like the chassis you linked/equivalent. The part number for 2 of them is 847E16-R1400LPB (others possibly same). One of them is 1.5 years old, upgraded to ZFS, I think with SATA 3Gbps expanders. The first ZFS one was installed around October. The old one was upgraded less than a month ago.

One thing I love about Supermicro boards (talking about boards now, not cases) is that the onboard IPMI stuff includes KVM over IP with a web interface, and there is some software you can install that manages it all in one app instead of a separate web interface for each.

And I'm not so sure you should be so picky about getting cheap CPUs... the disks cost more than the rest of the machine. ;)
It's not exactly the same then (mine is 847E16-R1400UB).
Yeah the IPMI feature is nice and all, but I don't think that is really a must for me.

About the disks, I agree they are expensive (especially with all this speculation right now due to low stocks); however, I'm going to buy only the ones I need right now and more when they're (a lot) cheaper. Still, I don't think I need a hyper-powerful CPU and, for both INTEL and AMD, getting 100-200MHz more per core ends up costing more than 50% extra compared to similar CPUs (see INTEL Xeon E5-2630 vs E5-2620, AMD Opteron 6276 vs Opteron 6272, ...), and that's just nuts in my opinion.

I think for hard drives it's better to find a less expensive site to buy from ... my preferred shop is super expensive for HDDs right now.
 
luckylinux said:
I'm not really looking for high-end performance. However, if a controller fails, how does ZFS behave? I mean, something like 8 HDDs going offline at once. Replacing the controller with a new one and connecting the cables back should do it (provided the server is turned off as soon as the failing controller is detected)?
It should "just work", possibly with you needing to issue a command to get ZFS to go check to see if disks have reappeared. For best results, glabel each of your disk drives with a unique label (possibly including the controller number and chassis slot), then when you create the ZFS pool, specify the members as labeled devices. That should let you shuffle drives between controllers and mounting slots and have ZFS figure it out for you.

Unfortunately the controller from Supermicro you suggested is not available where I live. Any recommended alternatives?
I assume this is one of the bracket-on-the-wrong-side LSI cards that Supermicro offers? I think Intel sells them with the right-side brackets. I'm not sure why those are considered a great card for FreeBSD - the drivers emit rather obscure messages. As an example, the mpt driver says things like:
Code:
Jul 20 14:00:57 new-gate kernel: mpt0: mpt_cam_event: 0x21
and thinks that is a perfectly useful error message. And the management tools aren't that great either, particularly compared with some of the other products from the same manufacturer. I like the 3Ware 9650 family of cards. While they're a true hardware raid controller, they're perfectly happy exporting each drive individually (which is what you want for ZFS). Just make sure that you export them as created volumes, not JBOD, or you won't be able to take advantage of the controller's cache memory (and optional BBU). The 3Ware drivers for FreeBSD (twa, twe, and tws) get active support from 3ware engineers. There's both a CLI and a web-based GUI for managing everything having to do with the controller, drives, etc. And if you're using a compatible chassis, the 3Ware controller will communicate with the drive bays to show real-time drive status. On the systems I'm using, there are 3 different colors of LEDs to show drive enabled, drive activity, and drive error.

OK, I'll order a second one if available. 2 for the price of 1, kind of ...
This way I won't have to bother when I have to expand it/them.
Yes, once you're happy with a server-style chassis, order a second one (or more) right away if you plan on expanding or needing any spares. One well-known vendor shipped me 3 chassis (same model number) over a period of around 6 months and no 2 were the same - one even was 3" shallower than the other two!

I think for hard drives it's better to find a less expensive site to buy from ... my preferred shop is super expensive for HDDs right now.
The combination of the flooding in Thailand and the consolidation going on in the HDD industry has had a huge effect on pricing and availability.

Whenever there's a shortage or high prices, some shady characters come out of the woodwork. A common tactic is to sell OEM versions of the drives (which may have different firmware, and almost always have a shorter warranty than retail / distributor models). Everything looks good until you go to the manufacturer's web site in a few months to RMA a drive that's failed, and when you enter the model and serial number, you get told that it was sold to an OEM and that all warranty claims need to go through that OEM. And neither the manufacturer nor the discounter that sold you the drive will help you.

Another thing that's been showing up lately is batches of drives marked "Refurbished", "Recertified", or so on. These may have undergone inspection by the manufacturer, or some enterprising store along the line may have simply looked at the SMART stats and went "Yup, looks OK, slap a certified sticker on it".

Lastly, the manufacturers have been decreasing the quality of their shipping materials. Generally, the larger the number of drives in a box, the better job of packing they do. Of course, when those drives get to a reseller, all bets are off. I've received bare drives in antistatic bags rattling around in boxes, drives with packing peanuts or air pouches (both of which get crushed when the drive starts moving around). I try to buy drives for my servers in manufacturer 20-packs, from a reseller that won't break them down.
 
Thank you for your useful information Terry_Kennedy.

Do you know how to look for resellers? Sorry for the dumb question, but the retailers and resellers suggested by WD and Samsung have the same prices as regular shops (at least here in Switzerland). When buying in 20-pack quantities, how much of a discount do you generally expect? And for a 10-pack? Here in Switzerland I can't find retailers with big discounts on large quantities. Since you're (far) more in the business than me, may I ask if you know some places (or some in Europe)?

Unfortunately the 3Ware 9650 series is expensive like no other. Just too much. It's more than 2 times the price of the LSI (basically you get 3 drives for the price of 8).
Does something less expensive (but with fairly good quality and support on FreeBSD) exist? I'm not asking for top-notch (enterprise) quality since this is for home, but good support would be nice. However, I cannot afford 3-4 x 600$ SAS/SATA controllers (plus 2x CPU, motherboard, RAM, chassis, HDDs).
 
luckylinux said:
Do you know how to look for resellers? Sorry for the dumb question, but the retailers and resellers suggested by WD and Samsung have the same prices as regular shops (at least here in Switzerland). When buying in 20-pack quantities, how much of a discount do you generally expect? And for a 10-pack? Here in Switzerland I can't find retailers with big discounts on large quantities. Since you're (far) more in the business than me, may I ask if you know some places (or some in Europe)?
I tend to use Tech Data for this sort of thing. In the US, you need to have a business tax ID and provide credit references. I'm not sure how the requirements vary in their international offices.

Unfortunately the 3Ware 9650 series is expensive like no other. Just too much. It's more than 2 times the price of the LSI (basically you get 3 drives for the price of 8).
Does something less expensive (but with fairly good quality and support on FreeBSD) exist? I'm not asking for top-notch (enterprise) quality since this is for home, but good support would be nice. However, I cannot afford 3-4 x 600$ SAS/SATA controllers (plus 2x CPU, motherboard, RAM, chassis, HDDs).
I understand you completely. While some of my servers are in my home (office), they're all mission-critical and the remote management the 3Ware cards provide* is some of the best in the business. Product support has also been, overall, excellent - I've reported the occasional hard-to-reproduce issue to them and they've taken the time to track it down and fix it, and have published the beta firmware for anyone to test if they like. They also provide full-source drivers to the FreeBSD source tree, as well as binaries of FreeBSD-native management utilities.

I do have some systems which use the LSI SASxxxx family of controllers. In fact, the 32TB servers with the 9650 cards also use 2 each of SAS1068 cards - 1 on the PCIe SSD and one on the external SAS ports that connect the servers to the drive ports in the robotic tape library. I'd characterize the driver and utility support for these cards as "adequate" - the one on the SSD reports an error when asked about RAID properties and doesn't pass SMART commands through, and the one that talks to the tape drive tends to emit those cryptic "event 0xNN" messages.

* I footnoted the 9650 management utility because it has been broken since January's Microsoft Patch Tuesday. Attempting to connect to 3dm2 gets you an Internet Explorer "page cannot be found" error. I reported it the night Microsoft released the patch, with the first reply being "this just hit us - stay tuned" and one 2 weeks later saying that a new release was coming "imminently". The other day I asked them to pin down "imminently" and they said they couldn't say. In the meantime, Firefox adopted the same code fix as IE, so the old workaround of using Firefox to view these pages doesn't work now, either.
 
Terry_Kennedy said:
I tend to use Tech Data for this sort of thing. In the US, you need to have a business tax ID and provide credit references. I'm not sure how the requirements vary in their international offices.
Unfortunately Tech Data requires tax & company details which I don't have, because I'm a private individual. I don't understand why they do this ... I mean, only companies can buy 10+ HDDs at a good price, while a private individual has to spend a lot more at regular retailers?
Do you think there is an alternative (for private individuals like me)?
 