ZFS Large file server

Hi,
I need to deploy a large file server on a physical server (15 x 750 GB sata drives).
I can't rely on hardware raid, so I will probably use ZFS.
I never used this filesystem before; how can I be aware of physical raid failure, for example?
thank you very much
 
Use zpool status, or zpool status -x if you want something that stays quiet when everything is OK (it prints only "all pools are healthy").

You can also put daily_status_zfs_enable="YES" into /etc/periodic.conf to get pool status in the daily logs. Also consider daily_scrub_zfs_enable="YES" for automatic scrubs (consistency checks), which run every 35 days by default.
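For reference, a minimal sketch of what that /etc/periodic.conf could look like. The threshold line is optional and only spells out the 35-day default mentioned above.

```shell
# /etc/periodic.conf
daily_status_zfs_enable="YES"           # report pool status in the daily output
daily_scrub_zfs_enable="YES"            # scrub pools once they pass the threshold
daily_scrub_zfs_default_threshold="35"  # days between scrubs (this is the default)
```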
 
I believe FreeNAS has built-in alerts for failures. In FreeBSD you can enable the periodic script, which will check every day; that is "good enough" for most users. (It also wouldn't take much to write a script to run from cron every 5 minutes to check status and email faults.)**
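A rough sketch of the kind of cron script described above, assuming mail(1) can deliver to root on the box. The needs_alert helper and the script path in the usage note are made up for illustration. Note that a machine with no pools at all would also trigger a mail, since the output then differs from the healthy message.

```shell
#!/bin/sh
# Hypothetical cron job: mail root when "zpool status -x" reports a problem.

# "zpool status -x" prints exactly this line when every pool is fine.
HEALTHY_MSG="all pools are healthy"

# Succeeds (exit 0) when the given status text should trigger an alert.
needs_alert() {
    [ "$1" != "$HEALTHY_MSG" ]
}

# Only run the check where zpool exists, so the file can be sourced safely.
if command -v zpool >/dev/null 2>&1; then
    status=$(zpool status -x 2>&1)
    if needs_alert "$status"; then
        printf '%s\n' "$status" | mail -s "ZFS problem on $(hostname)" root
    fi
fi
```

Saved as, say, /usr/local/sbin/zpool-alert.sh (a hypothetical path), it could be run from root's crontab with a line like: */5 * * * * /usr/local/sbin/zpool-alert.sh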

If you're going to use ZFS for the first time, though, make sure you read up on it, and maybe even familiarise yourself with the commands first. I played around with memory disks for quite a while before I actually used ZFS on a real system.

With 15 drives, your redundancy options are striped mirrors or raidz. Striped mirrors are fairly straightforward: you'll have 7 mirror pairs plus a spare. If you want to run raidz, I would suggest 2 or 3 vdevs, and possibly raidz2 if it's an important system. I generally like raidz2 (roughly RAID 6) because it protects you during a rebuild. With raidz1 (roughly RAID 5), once a disk is down and you're rebuilding, you have no redundancy left, so an error from another disk will be unrecoverable. (ZFS will still finish the rebuild, but it will report that you have errors.)
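To put rough numbers on those layouts, here is a small sketch that compares raw usable space for 15 x 750 GB drives. The usable_gb helper is made up for illustration, and the figures ignore ZFS metadata and other overhead, so real pools will show somewhat less.

```shell
#!/bin/sh
# usable_gb VDEVS DISKS_PER_VDEV REDUNDANT_PER_VDEV DISK_GB
# Raw usable space: data disks per vdev, times number of vdevs, times disk size.
# For a 2-way mirror, one of the two disks counts as redundant.
usable_gb() {
    echo $(( $1 * ($2 - $3) * $4 ))
}

echo "7 x 2-way mirror + 1 spare : $(usable_gb 7 2 1 750) GB"
echo "3 x 5-disk raidz1          : $(usable_gb 3 5 1 750) GB"
echo "3 x 5-disk raidz2          : $(usable_gb 3 5 2 750) GB"
echo "2 x 7-disk raidz2 + 1 spare: $(usable_gb 2 7 2 750) GB"
```

That works out to 5250, 9000, 6750 and 7500 GB respectively, so raidz2 across three 5-disk vdevs costs about a quarter of the raidz1 capacity in exchange for surviving a second failure per vdev.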

Also, if it's anything other than temporary working data, you'll want to get backups working before you start putting terabytes of data on it. zfs send/recv is the best option as it's very quick and doesn't have to thrash disks looking for changes. That does mean you need a second system running ZFS though. For that you could probably get away with 2 big SATA disks in a mirror, then just add more mirrors as you need space. The alternative is a file-based backup such as rsync, which means you can store the backup pretty much anywhere. rsync will take a long time on every run, though, if you have several TB across a few million files.
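As a sketch of what the send/recv approach looks like, the script below only prints the commands so they can be reviewed before running anything. The dataset tank/data, the host backuphost, the target backup/data and the snapshot names are all placeholders, and incremental_cmd is a made-up helper.

```shell
#!/bin/sh
# Placeholder names; adjust to your own pool, host and datasets.
SRC="tank/data"
DST_HOST="backuphost"
DST="backup/data"

# Print the pipeline that sends only the changes between two snapshots.
# -R includes child datasets; -i makes the stream incremental from $1 to $2.
incremental_cmd() {
    echo "zfs send -R -i $SRC@$1 $SRC@$2 | ssh $DST_HOST zfs receive -F $DST"
}

# First run: snapshot everything and send a full stream.
echo "zfs snapshot -r $SRC@monday"
echo "zfs send -R $SRC@monday | ssh $DST_HOST zfs receive -F $DST"

# Later runs: take a new snapshot, then send only the delta.
echo "zfs snapshot -r $SRC@tuesday"
incremental_cmd monday tuesday
```

The incremental stream is built from the differences between the two snapshots, which is why it does not need to walk the whole file tree the way rsync does.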

**Looks like 11-REL might have zfsd, which is supposed to report/handle errors as they happen, although I've not used it yet so I don't know much about it.
 
zfs send/recv is the best option as it's very quick and doesn't have to thrash disks looking for changes. That does mean you need a second system running ZFS though.

????? o_O ????? ZFS can back up to a spare disk just fine without needing a second computer.
 
Yeah I guess you could run a second pool on the same server. I guess I'm just a stickler for having backups stored on entirely different hardware, ideally in a different location.
 
Hi,
I need to deploy a large file server on a physical server (15 x 750 GB sata drives).
I can probably help, as I've built a bunch of 128TB ZFS servers. 8-)
I can't rely on hardware raid, so I will probably use ZFS.
That is likely your best choice.
I never used this filesystem before; how can I be aware of physical raid failure, for example?
1) Monitor your drives with sysutils/smartmontools.
2) Add daily periodic scripts that report on your controller, enclosure, ZFS, SMART, etc. I have a bunch of scripts for this, but they're pretty simple. I can add them to the above article if you'd like.
3) If your controller/chassis supports it, run sesd(8).
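As a rough illustration of point 1, a periodic check could lean on smartctl's exit status, where bit 3 (value 8) means the SMART status check returned "DISK FAILING". This is only a sketch: the smart_failing helper is made up, and kern.disks may list devices (optical drives, memory disks) that do not support SMART.

```shell
#!/bin/sh
# smartctl encodes results as bits in its exit status; bit 3 (value 8)
# is set when the SMART status check returned "DISK FAILING".
smart_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}

# Check every disk the kernel lists (FreeBSD's kern.disks sysctl).
if command -v smartctl >/dev/null 2>&1; then
    for disk in $(sysctl -n kern.disks 2>/dev/null); do
        rc=0
        smartctl -H "/dev/$disk" >/dev/null 2>&1 || rc=$?
        if smart_failing "$rc"; then
            echo "SMART reports $disk as failing"
        fi
    done
fi
```

For continuous monitoring, the smartd daemon that ships with sysutils/smartmontools is probably the better fit than a hand-rolled loop.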

One thing to watch out for is that "set autoreplace=on" isn't connected to anything in FreeBSD (and probably other non-Oracle operating systems). There was some talk of having devd(8) look for errors and trigger the replacement, but I don't know if anything came of that.

Regarding backups, you'll hear a lot of "you don't need them, ZFS has snapshots" and similar. What you need is up to you, not others. My systems replicate both locally and offsite, as well as get backed up to LTO6 tapes which are sent to a different offsite location. My scripts for these operations are in the above article. I get about 700MByte/sec (yup, bytes not bits) replicating over a 10GbE link.
 
One thing to watch out for is that "set autoreplace=on" isn't connected to anything in FreeBSD (and probably other non-Oracle operating systems). There was some talk of having devd(8) look for errors and trigger the replacement, but I don't know if anything came of that.

zfsd(8) seems to be in base now, although I've not managed to try it myself or come across anything from anyone else who's used it.
 