9.0 ZFS resilver - 64K/s - 100K/s

Good Morning,

I've been having issues with ZFS v28 on FreeBSD 9. I recently upgraded from 8.2-RELEASE, and I just had a bad disk; I replaced it and it's resilvering. The thing is:

Code:
# zpool status
  pool: data
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scan: resilver in progress since Wed Jan 18 02:36:04 2012
    158M scanned out of 939M at 63.1K/s, 3h31m to go
    29.4M resilvered, 16.86% done
config:

	NAME                       STATE     READ WRITE CKSUM
	data                       DEGRADED     0     0     0
	  raidz1-0                 DEGRADED     0     0     0
	    replacing-0            DEGRADED     0     0     2
	      1771085465873421254  UNAVAIL      0     0     0  was /dev/ada0/old
	      ada0                 ONLINE       0     0     0  (resilvering)
	    ada1                   ONLINE       0     0     1  (resilvering)
	    ada2                   ONLINE       0     0     1  (resilvering)
	    ada3                   ONLINE       0     0     0
	    ada4                   ONLINE       0     0     0

errors: No known data errors

I've already played around with various loader.conf settings; mine currently looks like this:
Code:
aio_load=YES
ahci_load=YES

vm.kmem_size=4g
vm.kmem_size_max=4g
vfs.zfs.arc_min=512m
vfs.zfs.arc_max=2g
vfs.zfs.vdev.min_pending=4
vfs.zfs.vdev.max_pending=4
vfs.zfs.prefetch_disable=1
#vfs.zfs.txg.synctime_ms=5 <- if I set this to 5 (default is 1000) then it goes down to 9-15K/s.

kern.maxfiles=950000

This worked before. :(
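For reference, the values can be read back at runtime to make sure they actually took effect (same sysctl names as in the loader.conf above):
Code:
# sysctl vfs.zfs.vdev.min_pending vfs.zfs.vdev.max_pending vfs.zfs.prefetch_disable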

I know it's not much data, but what happens when I dump my 6 TB of data on there and another disk dies? At this speed, would I have to wait something like 5 weeks? :/

Any help appreciated.

best regards
 
Why would you have to wait 5 weeks? It says it will finish in 3 hours. Is that too long for you? The drive that's currently resilvering in one of our storage boxes shows 60 hours left, and the drive that just finished resilvering in the other storage box took almost 200 hours.

Anything showing on the console? In dmesg? In /var/log/messages?
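A slow resilver is often just one drive quietly timing out and retrying, so I'd look for errors along these lines (smartctl comes from sysutils/smartmontools in ports; adjust the device names to yours):
Code:
# dmesg | egrep -i 'ada|error|timeout'
# tail -n 50 /var/log/messages
# smartctl -a /dev/ada1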

Also, remove the kmem lines from your /boot/loader.conf. The default is over 64 GB.
 
phoenix said:
Also, remove the kmem lines from your /boot/loader.conf. The default is over 64 GB.

Would you mind clarifying that for me, please?

The numbers seem to be by the book. If the OP added them, there was probably a valid reason for them being there.

@fbettag

How much RAM is there in that machine, and are there any other applications running that would be competing for it?

/Sebulon
 
The default kmem_size_max on 64-bit systems is over 64 GB and has little relation to the actual amount of RAM in the system. It is just address space reserved for the kernel, not actual memory used by the kernel. Here's the output on a system with 24 GB of RAM:
Code:
$ sysctl vm. | grep kmem
vm.kmem_map_free: 7481978880
vm.kmem_map_size: 9049112576
vm.kmem_size_scale: 1
vm.kmem_size_max: 329853485875
vm.kmem_size_min: 0
vm.kmem_size: 20808007680
That's a default kmem_size_max of over 320 GB.

I'm certain that if the OP removes that entry from loader.conf and reboots, they'll find the number to be way above the 4 GB they were limiting it to, and they'll probably find a lot of strange lockups disappearing as well.
 
@phoenix

Thank you, that is good to know.

Although, in my experience, I have had to limit kmem to resolve lockups caused by ZFS allocating too much when other applications were competing for the memory.

Also, one machine with 8 GB of RAM completely locked up when trying to destroy a file system with snapshots. I had to force a reboot and tune these down:

/boot/loader.conf
Code:
vm.kmem_size_max="1024M"
vfs.zfs.arc_max="512M"
That kept it from stalling during the destroy. Before the tuning, you could clearly watch the RAM disappearing in top; the machine locked up when it reached the bottom.
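One way around such a big destroy is to delete the snapshots one at a time first, so the kernel never has to free everything in a single transaction. A sketch (tank/fs is a made-up dataset name):
Code:
# zfs list -H -r -t snapshot -o name tank/fs | xargs -n 1 zfs destroy
# zfs destroy tank/fs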

/Sebulon
 
The only tunable you may have to change on a 64-bit machine running FreeBSD 8.2 or newer is vfs.zfs.arc_max; leave the vm.kmem* settings alone.
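For a 4 GB machine like the OP's, a minimal /boot/loader.conf would then be just something like this (the 2 GB cap is only an example; pick whatever leaves enough room for your applications):
Code:
vfs.zfs.arc_max=2g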
 
Sebulon said:
Although, in my experience, I have had to limit kmem to resolve lockups caused by ZFS allocating too much when there are other applications competing for it.
We had to tune kmem_size_max on 64-bit FreeBSD pre-8.0, and we still have to tune it a bit on 32-bit FreeBSD. Pre-8.0, kmem_size_max was around 1.5x RAM. In 8.0 it was bumped to 64 GB by default, and it has been increased further since then. You should not have to touch it on 64-bit 8.x/9.x systems.
 
Wuaaah, lots of responses. Jesus ;)

The box has 4GB RAM.

My issue is not the time it takes, but the speed at which it resilvers. It's like modem speed at times, and that shouldn't be.
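To see what the individual disks are actually doing while it resilvers, I'll watch the per-device throughput like this (5-second intervals; pool name from above):
Code:
# zpool iostat -v data 5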

I'll keep you posted on what the different modifications bring.
 
Now I've put some data on it and removed the kmem lines, and a scrub is running nicely.

Code:
 scan: scrub in progress since Sat Jan 21 17:56:15 2012
    2.03T scanned out of 3.44T at 242M/s, 1h41m to go
    1.06M repaired, 58.99% done
Thanks!
 