ZFS: Very slow resilvering

Hi,
I am using FreeBSD 10.3.
I replaced a failing disk in a zpool some time ago; output below.
Code:
pool: data2
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Apr 30 15:59:28 2021
        4,82T scanned out of 12,4T at 696K/s, (scan is slow, no estimated time)
        1,13T resilvered, 38,85% done
config:

        NAME                        STATE     READ WRITE CKSUM
        data2                       DEGRADED     0     0     0
          raidz1-0                  DEGRADED     0     0     0
            ada2                    ONLINE       0     0     0
            label/DISK21            ONLINE       0     0     0
            label/DISK22            ONLINE       0     0     0
            replacing-3             DEGRADED     0     0     0
              15957692350125946024  OFFLINE      0     0     0  was /dev/label/DISK23/old
              7735630016806534901   OFFLINE      0     0     0  was /dev/label/DISK23/old
              label/DISK23          ONLINE       0     0     0  (resilvering)
The new disk sees fairly constant heavy I/O (%busy up to 500).
Code:
                               capacity     operations    bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
data                         101G  19,3G      0     66      0   318K
  ada0d                      101G  19,3G      0     66      0   318K
--------------------------  -----  -----  -----  -----  -----  -----
data2                       12,5T  1,96T    268      0  25,8M      0
  raidz1                    12,5T  1,96T    268      0  25,8M      0
    ada2                        -      -    185      0  7,98M      0
    label/DISK21                -      -    185      0  7,66M      0
    label/DISK22                -      -    185      0  7,98M      0
    replacing                   -      -      0     82      0   871K
      15957692350125946024      -      -      0      0      0      0
      7735630016806534901       -      -      0      0      0      0
      label/DISK23              -      -      0     47      0   871K
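For what it's worth, snapshots like the above can be watched live while the resilver runs. A minimal sketch, assuming the pool and label names from the output above (the gstat filter is just an example):
Code:
# Per-vdev throughput, refreshed every 5 seconds:
zpool iostat -v data2 5
# Per-provider %busy and latency figures from GEOM:
gstat -f 'DISK2'
And here are my vfs.zfs sysctls, in case something in there matters: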
Code:
vfs.zfs.trim.max_interval: 1
vfs.zfs.trim.timeout: 30
vfs.zfs.trim.txg_delay: 32
vfs.zfs.trim.enabled: 1
vfs.zfs.vol.unmap_enabled: 1
vfs.zfs.vol.mode: 1
vfs.zfs.version.zpl: 5
vfs.zfs.version.spa: 5000
vfs.zfs.version.acl: 1
vfs.zfs.version.ioctl: 5
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0
vfs.zfs.sync_pass_rewrite: 2
vfs.zfs.sync_pass_dont_compress: 5
vfs.zfs.sync_pass_deferred_free: 2
vfs.zfs.zio.exclude_metadata: 0
vfs.zfs.zio.use_uma: 1
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_replay_disable: 0
vfs.zfs.min_auto_ashift: 9
vfs.zfs.max_auto_ashift: 13
vfs.zfs.vdev.trim_max_pending: 10000
vfs.zfs.vdev.bio_delete_disable: 0
vfs.zfs.vdev.bio_flush_disable: 0
vfs.zfs.vdev.write_gap_limit: 4096
vfs.zfs.vdev.read_gap_limit: 32768
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.trim_max_active: 64
vfs.zfs.vdev.trim_min_active: 1
vfs.zfs.vdev.scrub_max_active: 2
vfs.zfs.vdev.scrub_min_active: 1
vfs.zfs.vdev.async_write_max_active: 10
vfs.zfs.vdev.async_write_min_active: 1
vfs.zfs.vdev.async_read_max_active: 3
vfs.zfs.vdev.async_read_min_active: 3
vfs.zfs.vdev.sync_write_max_active: 10
vfs.zfs.vdev.sync_write_min_active: 10
vfs.zfs.vdev.sync_read_max_active: 10
vfs.zfs.vdev.sync_read_min_active: 10
vfs.zfs.vdev.max_active: 1000
vfs.zfs.vdev.async_write_active_max_dirty_percent: 60
vfs.zfs.vdev.async_write_active_min_dirty_percent: 30
vfs.zfs.vdev.mirror.non_rotating_seek_inc: 1
vfs.zfs.vdev.mirror.non_rotating_inc: 0
vfs.zfs.vdev.mirror.rotating_seek_offset: 1048576
vfs.zfs.vdev.mirror.rotating_seek_inc: 5
vfs.zfs.vdev.mirror.rotating_inc: 0
vfs.zfs.vdev.trim_on_init: 1
vfs.zfs.vdev.cache.bshift: 16
vfs.zfs.vdev.cache.size: 0
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.metaslabs_per_vdev: 200
vfs.zfs.txg.timeout: 5
vfs.zfs.space_map_blksz: 4096
vfs.zfs.spa_slop_shift: 5
vfs.zfs.spa_asize_inflation: 24
vfs.zfs.deadman_enabled: 1
vfs.zfs.deadman_checktime_ms: 5000
vfs.zfs.deadman_synctime_ms: 1000000
vfs.zfs.recover: 0
vfs.zfs.spa_load_verify_data: 1
vfs.zfs.spa_load_verify_metadata: 1
vfs.zfs.spa_load_verify_maxinflight: 10000
vfs.zfs.check_hostid: 1
vfs.zfs.mg_fragmentation_threshold: 85
vfs.zfs.mg_noalloc_threshold: 0
vfs.zfs.condense_pct: 200
vfs.zfs.metaslab.bias_enabled: 1
vfs.zfs.metaslab.lba_weighting_enabled: 1
vfs.zfs.metaslab.fragmentation_factor_enabled: 1
vfs.zfs.metaslab.preload_enabled: 1
vfs.zfs.metaslab.preload_limit: 3
vfs.zfs.metaslab.unload_delay: 8
vfs.zfs.metaslab.load_pct: 50
vfs.zfs.metaslab.min_alloc_size: 33554432
vfs.zfs.metaslab.df_free_pct: 4
vfs.zfs.metaslab.df_alloc_threshold: 131072
vfs.zfs.metaslab.debug_unload: 0
vfs.zfs.metaslab.debug_load: 0
vfs.zfs.metaslab.fragmentation_threshold: 70
vfs.zfs.metaslab.gang_bang: 16777217
vfs.zfs.free_bpobj_enabled: 1
vfs.zfs.free_max_blocks: 18446744073709551615
vfs.zfs.no_scrub_prefetch: 0
vfs.zfs.no_scrub_io: 0
vfs.zfs.resilver_min_time_ms: 5000
vfs.zfs.free_min_time_ms: 1000
vfs.zfs.scan_min_time_ms: 1000
vfs.zfs.scan_idle: 5
vfs.zfs.scrub_delay: 0
vfs.zfs.resilver_delay: 0
vfs.zfs.top_maxinflight: 8192
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.max_distance: 8388608
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 1
vfs.zfs.delay_scale: 500000
vfs.zfs.delay_min_dirty_percent: 60
vfs.zfs.dirty_data_sync: 67108864
vfs.zfs.dirty_data_max_percent: 10
vfs.zfs.dirty_data_max_max: 4294967296
vfs.zfs.dirty_data_max: 1710285619
vfs.zfs.max_recordsize: 1048576
vfs.zfs.mdcomp_disable: 0
vfs.zfs.nopwrite_enabled: 1
vfs.zfs.dedup.prefetch: 1
vfs.zfs.l2c_only_size: 0
vfs.zfs.mfu_ghost_data_lsize: 0
vfs.zfs.mfu_ghost_metadata_lsize: 0
vfs.zfs.mfu_ghost_size: 0
vfs.zfs.mfu_data_lsize: 3586724864
vfs.zfs.mfu_metadata_lsize: 216798720
vfs.zfs.mfu_size: 3854173696
vfs.zfs.mru_ghost_data_lsize: 0
vfs.zfs.mru_ghost_metadata_lsize: 0
vfs.zfs.mru_ghost_size: 0
vfs.zfs.mru_data_lsize: 2467870720
vfs.zfs.mru_metadata_lsize: 461038592
vfs.zfs.mru_size: 3458391040
vfs.zfs.anon_data_lsize: 0
vfs.zfs.anon_metadata_lsize: 0
vfs.zfs.anon_size: 1748480
vfs.zfs.l2arc_norw: 1
vfs.zfs.l2arc_feed_again: 1
vfs.zfs.l2arc_noprefetch: 1
vfs.zfs.l2arc_feed_min_ms: 200
vfs.zfs.l2arc_feed_secs: 1
vfs.zfs.l2arc_headroom: 2
vfs.zfs.l2arc_write_boost: 8388608
vfs.zfs.l2arc_write_max: 8388608
vfs.zfs.arc_meta_limit: 3886932992
vfs.zfs.arc_free_target: 28182
vfs.zfs.arc_shrink_shift: 7
vfs.zfs.arc_average_blocksize: 8192
vfs.zfs.arc_min: 1943466496
vfs.zfs.arc_max: 15547731968
My pool is at 85% capacity.
What could be the problem?
Any help would be greatly appreciated. Thanks!
 
I am using FreeBSD 10.3
Please keep in mind that 10.3 has been End-of-Life since April 2018 and is not supported anymore. Please upgrade to a supported version.
Topics about unsupported FreeBSD versions

What could be the problem?
Why do you think there is a problem? Resilvering is a slow process. It will be even slower if that pool is actively being used.
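If the pool can't be idled, you can at least tip the I/O scheduler toward the scan. A hedged sketch using tunables that are already visible in your sysctl dump (values are illustrative; your dump shows several of them set this way already):
Code:
# Don't inject a delay between scan I/Os (your dump already shows 0):
sysctl vfs.zfs.resilver_delay=0
sysctl vfs.zfs.scrub_delay=0
# Minimum milliseconds per txg spent on resilver work:
sysctl vfs.zfs.resilver_min_time_ms=5000
# Allow one more concurrent scan I/O per vdev:
sysctl vfs.zfs.vdev.scrub_max_active=3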

Code:
  scan: resilver in progress since Fri Apr 30 15:59:28 2021
        4,82T scanned out of 12,4T at 696K/s, (scan is slow, no estimated time)
        1,13T resilvered, 38,85% done
4 disks, 12TB RAID-Z; can I assume these are 4 x 3TB disks? And it's 85% full, so there's a lot of data to sync. That's going to take a while, especially if the pool is in use in the meantime.
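Back-of-the-envelope: about 12.4T - 4.82T ≈ 7.6T still has to be scanned, and at the reported 696K/s that works out to roughly 7.6 × 2^30 KB ÷ 696 KB/s ≈ 11.7 million seconds, on the order of 135 days. The reported rate is an average and should improve if the pool goes idle, but it explains the "no estimated time".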
 
While I agree with SirDice that you should seriously update ASAP, I'd like to suggest you double-check whether your devices' ashift is set properly.

It's been a while since I've worked on 10.x, but I'm fairly certain you had to fiddle around with gnop to get 4K disks set up properly, at least during pool creation. I can't remember whether this was necessary when adding a replacement disk.

Some reference commands that may help (a sketch from memory; the pool and device names below are placeholders, not your actual layout):
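Code:
# Check the pool's current alignment; each top-level vdev reports its
# ashift here, and it should be 12 on 4K-sector disks:
zdb | grep ashift

# The gnop trick, as used at pool (vdev) creation time: put a transparent
# 4K provider on one member so ZFS picks ashift=12 for the whole vdev.
# Names are placeholders:
gnop create -S 4096 /dev/ada3
zpool create tank raidz /dev/ada0 /dev/ada1 /dev/ada2 /dev/ada3.nop
zpool export tank
gnop destroy /dev/ada3.nop
zpool import tank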

If your disk's sectors are not properly aligned, your performance (and your resilvering speed) will suffer tremendously, and there may be other unexpected side effects. You should be able to check via zdb | grep ashift; with 4K-sector disks your ashift should be 12.
 