ZFS resilvering VERY slow - 8.3 running on VMware

We had some zpool issues while running 8.2: basically every time we ran any zpool / zfs command the system would kernel panic and reboot. Upgrading to 8.3 made this go away.

I'm running 8.3-RELEASE on VMware, with two pools, a 10TB and a 6TB.
After upgrading from 8.2 to 8.3 there were no issues; however, we have NEVER scrubbed our pools because the scrub ETA under 8.2 was 7,000,000,000 days.

I started a scrub and it was going nicely, then it detected data errors on the pools and started resilvering them. It's been going for around 6 hours so far:

Code:
 int-freebsd-backup# zpool status -v
  pool: backup
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 11 18:30:32 2012
        71.4G scanned out of 9.23T at 11.8M/s, 225h23m to go
        56K resilvered, 0.76% done
config:

	NAME        STATE     READ WRITE CKSUM
	backup      ONLINE   5.33K     8     0
	  da1       ONLINE       0     0     0
	  da2       ONLINE       0     0     3
	  da3       ONLINE   5.32K   351 2.04K  (resilvering)
	  da4       ONLINE       0     0     8
	  da5       ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198a7>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198ab>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198ad>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198b9>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198bd>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198be>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198bf>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198c3>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198c4>
        backup/somecooldesktop@auto.2012-11-16-11-04:<0x198c6>

  pool: backup2
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 11 14:45:12 2012
        26.8G scanned out of 5.45T at 1/s, (scan is slow, no estimated time)
        0 resilvered, 0.48% done
config:

	NAME        STATE     READ WRITE CKSUM
	backup2     ONLINE   1.00K     9     0
	  da6       ONLINE     961     0   691
	  da7       ONLINE      64     0     0
	  da8       ONLINE    1002    35   538

errors: Permanent errors have been detected in the following files:

        <metadata>:<0x0>
        <metadata>:<0x1>
        <metadata>:<0x12e>
        <metadata>:<0x13a>
        <metadata>:<0xdc>
        <metadata>:<0xe2>
        <metadata>:<0xe3>
        backup2/db-somecoolapp@auto.2012-11-04-13-45:<0x0>
        backup2/db-somecoolapp@auto.2012-11-04-13-45:<0x5078>
        backup2/db-somecoolapp@auto.2012-11-04-13-45:<0x507f>
int-freebsd-backup#

The server has 8 cores and 18 GB of RAM, and ZFS is tuned as follows:

Code:
cat /boot/loader.conf 
# Beginning of the block added by the VMware software
vmxnet_load="YES"
# End of the block added by the VMware software
# Beginning of the block added by the VMware software
vmxnet3_load="YES"
# End of the block added by the VMware software

#ZFS Tweaks
#I have 16G of Ram
vfs.zfs.prefetch_disable=0

#If Ram = 4GB, set the value to 512M
#If Ram = 8GB, set the value to 1024M
vfs.zfs.arc_min="2048M"

#Ram x 0.5 - 512 MB
vfs.zfs.arc_max="15872M"

#Ram x 2
vm.kmem_size_max="32G"

#Ram x 1.5
vm.kmem_size="24G"


What exactly is going on here? Should I leave it for 300+ hours to do its thing, or do I need to intervene and fix something?

I'm quite new to ZFS and certainly not a BSD expert. Should resilvering be configured differently when using virtual disks rather than physical disks?

Any help would be greatly appreciated.
 
Also, there is NOTHING else running on the server and it's not using any CPU.

Code:
last pid:  5715;  load averages:  0.00,  0.02,  0.00                                                                                                                                                                  up 0+02:33:02  20:34:58
94 processes:  1 running, 93 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 38M Active, 17M Inact, 6936M Wired, 84K Cache, 59M Buf, 11G Free
Swap: 4096M Total, 4096M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1748 root        1  44    0 26320K  3984K select  0   0:04  0.00% vmtoolsd
  359 root        1  44    0  5248K  3212K select  0   0:01  0.00% devd
  645 root        1  44    0 20908K  8824K select  2   0:00  0.00% perl5.10.1
  640 nagios      1  44    0  6888K  1560K select  2   0:00  0.00% nrpe2
  461 root        1  44    0  6920K  1536K select  3   0:00  0.00% syslogd
  546 root        1  44    0 11808K  2620K select  3   0:00  0.00% ntpd
 3801 root        1  48    4  6868K  1340K biord   0   0:00  0.00% fsck_ufs
  639 postfix     1  44    0  7096K  2164K kqread  0   0:00  0.00% qmgr
  634 root        1  44    0  7092K  1900K kqread  2   0:00  0.00% master
89313 root        1  44    0 38108K  5392K select  0   0:00  0.00% sshd
89315 root        1  44    0 10336K  3104K pause   1   0:00  0.00% csh
 1642 root        1  44    0  7976K  1612K nanslp  2   0:00  0.00% cron
81817 postfix     1  44    0  7092K  1892K kqread  3   0:00  0.00% pickup
 1226 root        1  44    0 26176K  4636K select  0   0:00  0.00% sshd
 5715 root        1  44    0  9372K  2312K CPU2    2   0:00  0.00% top
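
For what it's worth, a resilver is mostly disk-bound rather than CPU-bound, so zpool iostat probably tells you more than top here (just a suggestion, not output captured from this box):

Code:
# per-vdev I/O statistics, refreshed every 5 seconds, to see whether
# the resilver is actually moving any data
zpool iostat -v 5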
 
User23 said:
Is the pool more than 80% full?

No, it is only about half full at most.
The resilvering is bogging down ZFS so badly that I now can't run any zfs / zpool commands on these two servers, and I need to restore some backups ASAP.
 
Code:
# ps -ef | grep zpool
ps: Process environment requires procfs(5)
16186   0- DN+    0:00.00  zpool status -v
19787   1- DN+    0:00.00  zpool status
35431  10  D+     0:00.00  zpool status
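
Those zpool processes are sitting in disk wait (the D state in the ps output). If it helps, procstat should be able to show the kernel stacks they are blocked in (PIDs taken from the output above):

Code:
# print the kernel stack trace of each stuck zpool command
procstat -kk 16186 19787 35431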
 
sammcj said:
I started a scrub and it was going nicely, then it detected data errors on the pools and started resilvering them. It's been going for around 6 hours so far:

It's still running "scrub". You can stop the scrub at any time:
# zpool scrub -s backup

Code:
 int-freebsd-backup# zpool status -v
  pool: backup
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 11 18:30:32 2012
        71.4G scanned out of 9.23T at 11.8M/s, 225h23m to go
        56K resilvered, 0.76% done
config:

	NAME        STATE     READ WRITE CKSUM
	backup      ONLINE   5.33K     8     0
	  da1       ONLINE       0     0     0
	  da2       ONLINE       0     0     3
	  da3       ONLINE   5.32K   351 2.04K  (resilvering)
	  da4       ONLINE       0     0     8
	  da5       ONLINE       0     0     0

Eek! No redundancy in a backup pool? Hope you have backups of your backups. All those "permanent errors" that are listed in the status output are files that are gone, corrupt, unfixable.


Code:
pool: backup2
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Dec 11 14:45:12 2012
        26.8G scanned out of 5.45T at 1/s, (scan is slow, no estimated time)
        0 resilvered, 0.48% done
config:

	NAME        STATE     READ WRITE CKSUM
	backup2     ONLINE   1.00K     9     0
	  da6       ONLINE     961     0   691
	  da7       ONLINE      64     0     0
	  da8       ONLINE    1002    35   538

Eek again!

The server has 8 cores and 18 GB of RAM, and ZFS is tuned as follows:

Remove all those ZFS "tweaks". They aren't needed.
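
Roughly speaking (my sketch, not a drop-in file), /boot/loader.conf would then shrink back to just the VMware bits and leave ZFS at its defaults:

Code:
# /boot/loader.conf with the ZFS tuning removed;
# FreeBSD sizes the ARC and kmem limits automatically
vmxnet_load="YES"
vmxnet3_load="YES"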

What exactly is going on here? Should I leave it for 300+ hours to do its thing, or do I need to intervene and fix something?

Stop the scrub. And then consider adding some redundancy to the pool.
 
I'm confused about one thing here: what does "resilvering" mean in a pool without redundancy? In a pool with redundancy it's clearly copying the good data to the just-added disk (in the case of a mirror) or filling in the parity disk using the existing data (RAIDZ).
 
If I remember correctly, metadata is stored multiple times on the pool, so the scrub is trying to repair all those metadata copies.
 
phoenix said:
Eek! No redundancy in a backup pool? Hope you have backups of your backups. All those "permanent errors" that are listed in the status output are files that are gone, corrupt, unfixable.

Right, looks like both pools are JBOD.
 
kpa said:
I'm confused about one thing here: what does "resilvering" mean in a pool without redundancy? In a pool with redundancy it's clearly copying the good data to the just-added disk (in the case of a mirror) or filling in the parity disk using the existing data (RAIDZ).

Yeah, I'm not sure if it should resilver when the disks are virtual. Then again, how does it know that it's living in VMware? And if it doesn't, maybe that's part of the problem: it's treating the already-redundant disks as physical disks.
 
sammcj said:
Yes, but they're actually in a RAID6 array at the hardware level.

ZFS can't make use of that if your pools are really set up like you listed above. What you now have is a JBOD setup without redundancy. The RAID6 arrays you have on the disks would be available to FreeBSD as two single disks if the correct drivers were used, but now there are just lots of da* devices.

Hope your backups aren't too critical because it may be very hard to recover everything off the current pools.

Time to redo your pools with redundancy. Don't change anything in the hardware setup, but create two RAIDZ pools: maybe the first one, with five disks, could be a RAIDZ2 pool and the second one a RAIDZ pool. Those with more experience with RAIDZ pools than me can maybe offer some opinions here.
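
Purely as a sketch (device names taken from the zpool status output above, and the existing pools would have to be destroyed and the data copied back in afterwards), the commands would look something like:

Code:
# five-disk RAIDZ2 (two disks' worth of parity) for the larger pool
zpool create backup raidz2 da1 da2 da3 da4 da5
# three-disk RAIDZ (single parity) for the smaller pool
zpool create backup2 raidz da6 da7 da8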
 
sammcj said:
Yes, but they're actually in a RAID6 array at the hardware level.

And now you're seeing why doing it this way is bad. :) Let ZFS handle the redundancy; otherwise it can't fix the errors the scrub detects. All those errors ... are non-recoverable, and not picked up by the hardware RAID.
 
If this box were to be built again, would there be much to gain by building it directly on the hardware rather than on top of VMware?
If we were to rebuild the server on top of VMware again, how many virtual disks would you present the disk array as?
 
I thought I'd provide a little update here...

We spoke to a VMware 'expert', who recommended the following:

In VMware, all the disks were being presented through one single virtual SCSI adapter.
We have added three more virtual adapters and balanced the disks across them.

We now have much better performance across the board:

  • Resilvering has sped up from '1/s' to 1 GB/s-3.7 GB/s.
  • zpool status (even with -v) hasn't locked up or even been slow once.
  • We haven't had to reboot FreeBSD.
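
In case it's useful to anyone doing the same, the disk-to-adapter layout can be checked from inside the guest with camcontrol (standard FreeBSD; the exact output will depend on your setup):

Code:
# lists every da device along with the scbus (virtual SCSI adapter) it sits on
camcontrol devlist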
 