Improving ZFS Resilver time

All those variables are in the file: /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_scan.c

But they are not tunables in FreeBSD, and FreeBSD does not yet support changing these variables on the fly as Solaris does. So if you really need them, you have to change the values of the variables in the above file, compile and install a new zfs.ko, and finally reboot for the new values to take effect.
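
A minimal sketch of what that rebuild might look like, assuming a standard source tree under /usr/src (module paths and dependencies can differ between FreeBSD versions, so treat this as a rough outline rather than a recipe):

Code:
# Edit the variables in dsl_scan.c first, then rebuild only the ZFS module
cd /usr/src/sys/modules/zfs
make clean all
make install                 # installs the new zfs.ko into /boot/kernel
# zfs.ko depends on the opensolaris module; rebuild that too if headers changed
cd /usr/src/sys/modules/opensolaris && make clean all install
shutdown -r now              # reboot so the new values take effect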
 
I think I can live with recompiling the zfs module.
Do you know if it is possible to rebuild just the zfs.ko module? What steps would I need to take to do so?
 
As Savagedlight said, most of the ZFS tunables are implemented as sysctl variables in FreeBSD, which generally works quite nicely.
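
For example, you can get an overview of what is already exposed on your system (the exact set of variables depends on your FreeBSD version) with something like:

Code:
# List all ZFS-related sysctl tunables
sysctl vfs.zfs
# Narrow it down to scan/scrub/resilver-related ones
sysctl vfs.zfs | grep -E 'scan|scrub|resilver'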

Funnily enough, exactly what you are asking for was implemented on the 2nd of this month:

Code:
SVN rev 237972 on 2012-07-02 07:27:14Z by mm

Expose scrub and resilver tunables.
This allows the user to tune the priority trade-off between scrub/resilver
and other ZFS I/O.

Doesn't look like it's made it into anything other than current HEAD yet, but hopefully it'll be in STABLE soon (that still means a rebuild, although you should get a load of other fixes/improvements by upgrading) and maybe even the next release.
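
Once you're on a version that includes that revision, the knobs should be adjustable at runtime. A sketch of what that might look like (the sysctl names below are assumptions based on the variables in dsl_scan.c; check `sysctl vfs.zfs` on your system for the actual names and defaults):

Code:
# Give scrub/resilver I/O more priority by reducing the per-I/O delay (in ticks)
sysctl vfs.zfs.scrub_delay=0
sysctl vfs.zfs.resilver_delay=0
# Make the settings persistent across reboots
echo 'vfs.zfs.scrub_delay=0' >> /etc/sysctl.conf
echo 'vfs.zfs.resilver_delay=0' >> /etc/sysctl.conf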
 
That's cool! I guess I will just have to wait a bit longer then. Isn't FreeBSD 9-STABLE in a code freeze right now in preparation for the FreeBSD 9.1 release? I really hope it makes it into FreeBSD 9.1-RELEASE.
 
olav said:
That's cool! I guess I will just have to wait a bit longer then. Isn't FreeBSD 9-STABLE in a code freeze right now in preparation for the FreeBSD 9.1 release? I really hope it makes it into FreeBSD 9.1-RELEASE.
What sort of scrub / resilver times are you seeing? I have a number of systems with 32TB of disk, 22TB usable (3 x 5-drive raidz1 vdevs + a spare), about half full, and see the following results (from 3 different systems):

Code:
scan: resilvered 917G in 3h26m with 0 errors on Wed Oct 26 03:38:46 2011
scan: scrub repaired 0 in 6h44m with 0 errors on Mon Nov 21 11:36:19 2011
scan: resilvered 1019G in 4h4m with 0 errors on Sat Jul  7 08:38:09 2012

The only thing that I've found to absolutely kill scrub / resilver performance is dedup. With dedup on I see estimated completion times of several weeks, and that's with 48GB of RAM.
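
If you suspect dedup is the culprit, a quick way to check (using the pool name from the status output below) is to look at the dedup ratio and the dedup table statistics:

Code:
# Show the pool's overall dedup ratio
zpool get dedupratio data
# Show a histogram of the dedup table (DDT); a large DDT that doesn't fit
# in RAM is what drags scrubs/resilvers down
zpool status -D data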

Here's the complete status output from one of the systems, showing the pool configuration:

Code:
(0:1) host:/sysprog/terry# zpool status
  pool: data
 state: ONLINE
  scan: resilvered 1019G in 4h4m with 0 errors on Sat Jul  7 08:38:09 2012
config:

        NAME             STATE     READ WRITE CKSUM
        data             ONLINE       0     0     0
          raidz1-0       ONLINE       0     0     0
            label/twd0   ONLINE       0     0     0
            label/twd1   ONLINE       0     0     0
            label/twd2   ONLINE       0     0     0
            label/twd3   ONLINE       0     0     0
            label/twd4   ONLINE       0     0     0
          raidz1-1       ONLINE       0     0     0
            label/twd5   ONLINE       0     0     0
            label/twd6   ONLINE       0     0     0
            label/twd7   ONLINE       0     0     0
            label/twd8   ONLINE       0     0     0
            label/twd9   ONLINE       0     0     0
          raidz1-2       ONLINE       0     0     0
            label/twd10  ONLINE       0     0     0
            label/twd11  ONLINE       0     0     0
            label/twd12  ONLINE       0     0     0
            label/twd13  ONLINE       0     0     0
            label/twd14  ONLINE       0     0     0
        logs
          da0            ONLINE       0     0     0
        spares
          label/twd15    AVAIL   

errors: No known data errors
This is with very few kernel / ZFS tunables set in /boot/loader.conf:
Code:
zfs_load="YES"
vm.kmem_size_max="40G"
vm.kmem_size="36G"
vfs.zfs.arc_max="32G"
I could probably improve on it, but with scrubs / resilvers running at > 500MB/sec, I haven't bothered.
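
Those loader tunables can be checked after boot with a plain sysctl query, e.g.:

Code:
# Verify the memory/ARC tunables actually took effect (values are in bytes)
sysctl vm.kmem_size vm.kmem_size_max vfs.zfs.arc_max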

The controller is a 3Ware 9650 with 16 WD RE4 drives attached, each exported to FreeBSD as a single volume.
 
Savagedlight said:
A bit off topic... But wouldn't you risk losing all your pool data if that log device dies?
No. Aside from the offsite replication and tape backups, I actually did have the log device (a PCIe SSD) fail, back before ZFS allowed removing a log device. The pool was still readable, but any attempt to write to it caused a panic. With the latest ZFS imported into FreeBSD I can remove the log device - I've actually tried this, and it works.
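
For reference, removing a dedicated log device is a one-liner, using the pool and device names from the status output above (check `zpool status` first to confirm yours):

Code:
# Remove the dedicated log device from the pool
zpool remove data da0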
 
I have a raidz1 with five 2TB drives, 8TB usable space and 5TB used. A scrub takes up to 30 hours. I do have one filesystem of around 200GB which is deduped, though.
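
To see where those 30 hours go, it can help to watch the scrub while it runs and check how much data is actually deduplicated (the pool name tank and dataset tank/deduped-fs below are placeholders):

Code:
# Check scrub progress and the estimated time remaining
zpool status tank
# See whether the deduped filesystem is what's slowing the scrub down
zpool get dedupratio tank
zfs get dedup,used tank/deduped-fs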
 