Questions about users' experience with ZFS on FreeBSD

bigearsbilly said:
Anecdotal evidence:
I've used Linuxes for about 10 years, Solaris 10 for a while, and for the last 2 years or so primarily BSD.
The only catastrophic loss of data I have ever had was with ZFS on Solaris 10.


Not sure why. I still have the same disks but a new mobo.
One disk has uncorrectable sectors (reported by smartd).

Also, be careful not to rm the zpool.cache file.

Yeah, that's part of why I started this thread. I follow the ZFS mailing list for Solaris and every other thread is about some sort of pool loss or data loss. I've watched the threads in this forum and it's just not even close. I use ZFS under FreeBSD and it's rock solid for me. I'm sure part of the reason is that it is experimental and only used by a few people on FreeBSD, and those users probably go into it expecting there might be issues, whereas on Solaris it's the default and people expect it to just work. But even taking that into account, it seems to me that it simply doesn't have the same issues on FreeBSD. I wonder if it's due to better device drivers?
 
"works great until it doesn't" if you know what i mean

Yes! I know what you mean. I'm in the "doesn't" stage right now! :)
Does anyone have experience with trying to import a Solaris v13 ZFS mirror into FreeBSD 7.2-STABLE? I get 'corrupt gpt tables'. I'm a bit worried about doing anything to the disks because 1) I don't know what the errors mean, and 2) there's no backup of my mirror; in fact, the mirror *was* the backup. :)
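As far as I understand, running zpool import with no arguments only scans for and lists importable pools without actually importing or modifying anything, so that much should be safe to try:
Code:
# list pools that are available for import without importing them
zpool import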
 
I've been running it for quite a while. I had 7.2 with a single-vdev pool for a long time just to play around, upgraded to 8.0 in the beta stages a few months ago, made a 4-drive raidz1, and it's great.

Current specs:
Athlon 64 3200+
Supermicro H8SSL-i
2 GB (4x 512 MB) DDR266 RAM
4x 1.5 TB Seagate 7200.11 SATA in raidz1
1x 120 GB Western Digital IDE (system drive)

The only problem I've really had is that it seemed to run out of RAM running rtorrent with 600 torrents alongside AFP/SMB/NFS/iSCSI. I had to get rid of the gmirror for my system drive, since GEOM ate even more memory. Unless anyone has a better suggestion, I've tweaked the memory usage down; it costs some performance, but it no longer runs out of RAM.

Code:
# /boot/loader.conf -- the tuning I settled on to keep memory usage in check
vm.kmem_size_max="1024M"      # upper bound for the kernel memory (kmem) map
vfs.zfs.arc_max="512M"        # limit the ZFS ARC to 512 MB
vfs.zfs.prefetch_disable="1"  # turn off ZFS prefetch
 
I'm setting up a new mail server right now, and I'm planning to use ZFS for the storage.

Currently I'm waiting for two more 500 GB drives so that I can use RAID-Z. I didn't do any tuning (apart from setting kern.maxvnodes=400000 in /etc/sysctl.conf) and it has worked nicely so far.
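Once the other two drives arrive, the plan is something along these lines (pool and device names below are just placeholders; the drives may show up differently on this box):
Code:
# single-parity RAID-Z across the three 500 GB drives
zpool create tank raidz da0 da1 da2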

My specs (HP DL320 G5p):
Code:
FreeBSD 7.2-RELEASE amd64
Intel Xeon X3210 2.13GHz
4 GB RAM
 
Yeah, I'm LOVING my raidz media NAS.

Currently I have the following setup:
Intel Q9550
8 GB DDR2-800
12x 1 TB hard drives in 3 raidz vdevs with 4 drives each (3x4=12)
FreeBSD is installed to a gmirror of CompactFlash cards; /, /usr and /etc are on the CF cards, while
/usr/local, /usr/ports, /usr/src, /tmp and /var are on ZFS filesystems, along with a lot of other filesystems. Multiple jails run from ZFS clones, which is REALLY cool (see the sketch below). Performance is much better than I'd imagined it would be with such cheap hardware and so much going on: I'm running 7 jails and a TON of music/video, and it's been great. I have about 10 clients on my network, with a peak of about 6 at a time accessing the data (a lot of it 1080p video), and I haven't had any problems to speak of.
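In case anyone is curious, the clone-per-jail trick is roughly this (the pool and dataset names here are just examples, not my real layout):
Code:
# snapshot a prepared base jail once
zfs snapshot tank/jails/template@base
# every new jail is a nearly free clone of that snapshot
zfs clone tank/jails/template@base tank/jails/www
zfs clone tank/jails/template@base tank/jails/mail
Clones share blocks with the snapshot, so a new jail takes almost no extra space until it starts to diverge.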
 
killasmurf86 said:
On my desktop PC (25 GB RAM, 1x 160 GB + 1x 250 GB HDD) I have almost no problems.

I do have some small lags when writing fast to the HDD, but they have been much smaller and less frequent since BETA1.

Also I have this problem:
http://www.freebsd.org/cgi/query-pr.cgi?pr=137037
But even with that, I still use ZFS, and I have never lost even a single bit of my data.


I don't plan to switch back to the old GPT/UFS setup anymore.


Have you solved your "small lags" problem yet?
I believe I have/had the same problem and found this on the OpenSolaris mailing list:

http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26485.html

I solved my problem with vfs.zfs.txg.synctime=1 in /boot/loader.conf.


I don't know if this was the right solution, but there are no stalls anymore.
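For anyone else hitting this, it is just a loader tunable, so the line in /boot/loader.conf looks like this:
Code:
# shorten the transaction group sync time to reduce write stalls
vfs.zfs.txg.synctime="1"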
 
Is it possible to make a ZFS vdev from a network volume like an NFS share or iSCSI?

The traditional ZFS setup seems to be a single chassis with lots of drives, but I am trying to think of a high-availability solution using multiple chassis, say 3 of them, so that if one fails from, say, a motherboard fault, the other two can keep going.

Obviously a network volume is going to have a performance hit, but that's the price you pay.

- Ernie.
 
Yes, you can create vdevs using iSCSI exports. And if Solaris/FreeBSD ever get support for AoE (ATA over Ethernet), FCoE (Fibre Channel over Ethernet), and other such storage technologies, then those could be used as well.

You can create vdevs using any block device (physical hard drive, iSCSI export, hardware RAID array/LUN, etc).

You can also create vdevs using files; however, that puts you at the mercy of the host filesystem. This is generally recommended only for testing purposes.
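A quick sketch of the file-backed, testing-only case (the file paths and pool name are made up):
Code:
# create two 1 GB backing files
truncate -s 1g /var/tmp/vdev0
truncate -s 1g /var/tmp/vdev1
# build a throwaway mirrored pool on top of them
zpool create testpool mirror /var/tmp/vdev0 /var/tmp/vdev1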

One could create an NFS or SMB/CIFS share, create files on those shares, and then create vdevs using those files. But that would defeat the end-to-end checksumming and other features of ZFS, as it would no longer control the entire storage stack.
 
Just my 2c (currently managing around 30 TB of redundant data under FreeBSD and NetBSD): don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in a production environment. Always make sure you have the latest firmware, a working cache battery and a UPS. If that doesn't sound like a home setup, then stop complaining or go to some cheap ReiserFS.
 
VictorM said:
Just my 2c (currently managing around 30 TB of redundant data under FreeBSD and NetBSD): don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in a production environment. Always make sure you have the latest firmware, a working cache battery and a UPS. If that doesn't sound like a home setup, then stop complaining or go to some cheap ReiserFS.

My understanding is that with RAIDZ there is no RAID-5 write hole, so worrying about keeping a battery-backed cache to protect your data is no longer an issue. Right?
 
VictorM said:
Just my 2c (currently managing around 30 TB of redundant data under FreeBSD and NetBSD): don't use ZFS with off-the-shelf controllers (integrated and/or cheap). 3ware for bootable RAID devices and Marvell for non-bootable JBODs are the entry level for reliable operation in a production environment. Always make sure you have the latest firmware, a working cache battery and a UPS. If that doesn't sound like a home setup, then stop complaining or go to some cheap ReiserFS.

I don't think there is a problem with using cheap or dumb controllers.


ZFS works well at many levels. I know PLENTY of home users who use cheap 4-port PCI cards across 3 or 4 PCI slots, or
8-port PCI-X cards in ordinary PCI slots, and have no problems.

They don't get massive speeds or anything, but all they care about is being able to watch a single HD stream (4-10 MB/s) over Samba.

I think it is all about what you need and what you're willing to put into it.
 
dmdx86 said:
My understanding is that with RAIDZ there is no RAID-5 write hole, so worrying about keeping a battery-backed cache to protect your data is no longer an issue. Right?

I might very well be misinformed, but if you're using SSD devices for the ZIL you will want to turn off the SSDs' write cache, because it's volatile (unless you have really expensive battery-backed SSDs), and a power loss will very likely end in disaster. The downside of turning off the write cache is a huge loss in IOPS, hence the need for a battery-backed write cache (RAID card) that takes care of the IOPS problem.
 
After days of downtime because of corrupted 3ware RAID arrays, I decided to move all of these setups to ZFS. All for the better, especially with regard to storage management! The only downside is the need for more RAM, but that is only a problem with old servers using now-exotic RAM types; RAM is very cheap these days.

In my opinion, the only significant benefit of using a battery-backed RAID controller with ZFS is the added layer of management. You may, for example, verify each drive separately, but you can do the same with smartmontools on any controller. The IOPS benefit is something I have not observed, but I plan to test that more extensively soon.
Of course, there is nothing wrong with using 3ware-type RAID controllers with ZFS, even if you don't use any of their RAID functionality. If you need to attach more disks, this is the way to go.
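For example, you can run the drives' own self-tests with smartmontools (the device names and 3ware unit number here are made up):
Code:
# long self-test on a directly attached disk
smartctl -t long /dev/ad4
# same for a disk behind a 3ware controller (unit 0)
smartctl -t long -d 3ware,0 /dev/twa0
# read the results later
smartctl -l selftest /dev/ad4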

For the ZIL, probably the best solution is a battery-backed RAM disk. You don't need huge capacity here, and small, fast SSDs are probably not easy to find anymore. For lower-end/cheap systems, any flash device is good for this purpose. Trouble is, you cannot remove the ZIL after you start using it.

The same goes for the L2ARC. Here an SSD is wonderful, but on lower-end systems you may use USB flash sticks, even many of them; commodity motherboards come with 6 or more USB ports. If you get, say, 20 MB/s of reads from a $10 4 GB flash stick, then with 6 of them you get 120 MB/s and 24 GB for $60. Not bad, eh? Still, at larger sizes SSDs win.
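Adding them is a one-liner (pool and device names made up):
Code:
# add three USB sticks as L2ARC cache devices
zpool add tank cache da1 da2 da3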

I would like to strongly second Phoenix's advice to "use glabel"! It has saved my day several times and I now routinely label any storage media. I was wondering whether this breaks the ZFS philosophy of "use entire disk devices for better performance", but I haven't seen any degradation so far.

My boot devices of late are... USB sticks :) in a gmirror configuration, for several reasons:
- Recent motherboards don't even have PATA anymore, especially the desktop models, which makes CF cards useless.
- I happen to use a lot of Supermicro systems lately, and these almost always have two internal USB slots. Just find small USB sticks that fit there; there is no danger of disconnecting them, as there is when they hang out of the case.
- They are disposable in principle, although I have yet to see one fail on me.
There are also USB flash modules that attach directly to the motherboard pins (meant for the front-panel USB headers, for example). These also stay inside the case, are somewhat faster, and are of course more expensive.
The only trouble with USB flash as a boot device is that FreeBSD 8.0 doesn't play nicely with it at boot time. There is of course an easy fix.
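For reference, the gmirror part is just the usual two-provider mirror (assuming the sticks show up as da0 and da1):
Code:
# mirror the two USB sticks into /dev/mirror/gm0
gmirror label -v -b round-robin gm0 /dev/da0 /dev/da1
# make sure the mirror class loads at boot
echo 'geom_mirror_load="YES"' >> /boot/loader.conf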
 
danbi said:
In my opinion, the only significant benefit of using a battery-backed RAID controller with ZFS is the added layer of management. You may, for example, verify each drive separately.

With ZFS on top, you want to disable the controller's auto-verify features and let ZFS "scrub" handle that. A scrub checks the actual data on the drives and can detect (and repair) errors without impacting disk I/O as much. You really don't want the controller doing a verify while ZFS is doing a scrub while you are doing normal disk I/O. :)
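Kicking one off and watching its progress is simple (the pool name is just an example):
Code:
# start a scrub, then check its status
zpool scrub tank
zpool status tank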

Trouble is, you cannot remove the ZIL after you start using it.

ZFSv19 introduced the ability to remove log devices. Prior to ZFSv19, though, you need to make sure you use mirrored log devices: if a non-mirrored log device dies, it's possible to lose the entire pool.
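For example (pool and device names made up):
Code:
# mirrored pair of log devices, so a single dead device can't take the pool with it
zpool add tank log mirror da1 da2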

I would like to strongly second Phoenix's advice to "use glabel"! It has saved my day several times and I now routinely label any storage media. I was wondering whether this breaks the ZFS philosophy of "use entire disk devices for better performance", but I haven't seen any degradation so far.

If you label the disk device (/dev/ad0) and not a slice (/dev/ad0s1), then you are still using the entire disk. glabel only uses one sector of the disk for its metadata. :)
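The workflow is basically (device and label names are just examples):
Code:
# write a label into the disk's last sector, then build the pool on the label
glabel label -v disk0 /dev/ad4
zpool create tank /dev/label/disk0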
 