We have a fairly large zpool on our production IMAP server, and we are seeing miserable performance while a replacement spindle is resilvered into one of the RAIDZ2 vdevs.
The resilver has been reporting about 28 minutes remaining for the last three days, and the percentage complete has not changed.
The machine has 256 GB of RAM, and most measurements suggest a healthy system:
Code:
$ sudo zpool status
Password:
  pool: zpool2
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Sep 19 16:05:14 2017
        31.2T scanned out of 31.3T at 43.2M/s, 0h27m to go
        1.69T resilvered, 99.78% done
config:

        NAME          STATE     READ WRITE CKSUM
        zpool2        ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            da45      ONLINE       0     0     0  (resilvering)
            da9       ONLINE       0     0     0
            da44      ONLINE       0     0     0
            da29      ONLINE       0     0     0
            da30      ONLINE       0     0     0
            da31      ONLINE       0     0     0
          raidz2-1    ONLINE       0     0     0
            da32      ONLINE       0     0     0
            da33      ONLINE       0     0     0
            da34      ONLINE       0     0     0
            da35      ONLINE       0     0     0
            da36      ONLINE       0     0     0
            da37      ONLINE       0     0     0
          raidz2-2    ONLINE       0     0     0
            da38      ONLINE       0     0     0
            da39      ONLINE       0     0     0
            da40      ONLINE       0     0     0
            da41      ONLINE       0     0     0
            da42      ONLINE       0     0     0
            da43      ONLINE       0     0     0
        logs
          ada0        ONLINE       0     0     0
        cache
          ada1        ONLINE       0     0     0
        spares
          da47        AVAIL

errors: No known data errors
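For what it's worth, the "0h27m to go" figure is just the remaining scan amount divided by the average rate since the resilver started, so once current throughput collapses the estimate stops moving. A quick back-of-the-envelope check using the (rounded) numbers from the status output above:

```python
# Sketch of the ETA arithmetic behind the zpool status scan line.
# The 43.2M/s figure is a lifetime average over the whole resilver,
# not the current rate, which is why the ETA can sit still for days.

TIB = 1024 ** 4
MIB = 1024 ** 2

scanned = 31.2 * TIB        # "31.2T scanned"
total = 31.3 * TIB          # "out of 31.3T"
avg_rate = 43.2 * MIB       # "at 43.2M/s" (average, not current)

eta_min = (total - scanned) / avg_rate / 60
print(f"~{eta_min:.0f} minutes left at the average rate")
```

Because the displayed sizes are rounded to one decimal place, this only roughly matches the reported 0h27m, but it shows the estimate is hostage to the average rate: if the device is barely being written to now, the ETA will never count down.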
Code:
$ sudo zfs-stats -a
------------------------------------------------------------------------
ZFS Subsystem Report Thu Sep 28 10:35:12 2017
------------------------------------------------------------------------
System Information:
Kernel Version: 1002000 (osreldate)
Hardware Platform: amd64
Processor Architecture: amd64
ZFS Storage pool Version: 5000
ZFS Filesystem Version: 5
FreeBSD 10.2-RELEASE-p7 #0: Mon Nov 2 14:19:39 UTC 2015 root
10:35AM up 20 days, 23:36, 11 users, load averages: 1.66, 1.55, 2.09
------------------------------------------------------------------------
System Memory:
2.99% 7.45 GiB Active, 56.46% 140.81 GiB Inact
37.94% 94.60 GiB Wired, 0.05% 115.59 MiB Cache
2.57% 6.40 GiB Free, 0.00% 1.34 MiB Gap
Real Installed: 256.00 GiB
Real Available: 99.98% 255.94 GiB
Real Managed: 97.43% 249.37 GiB
Logical Total: 256.00 GiB
Logical Used: 42.45% 108.68 GiB
Logical Free: 57.55% 147.32 GiB
Kernel Memory: 9.01 GiB
Data: 99.70% 8.98 GiB
Text: 0.30% 27.56 MiB
Kernel Memory Map: 249.37 GiB
Size: 14.55% 36.30 GiB
Free: 85.45% 213.08 GiB
------------------------------------------------------------------------
ARC Summary: (HEALTHY)
Memory Throttle Count: 0
ARC Misc:
Deleted: 204.23m
Recycle Misses: 57.43m
Mutex Misses: 54.73k
Evict Skips: 882.39m
ARC Size: 24.58% 61.05 GiB
Target Size: (Adaptive) 24.59% 61.08 GiB
Min Size (Hard Limit): 12.50% 31.05 GiB
Max Size (High Water): 8:1 248.37 GiB
ARC Size Breakdown:
Recently Used Cache Size: 93.75% 57.27 GiB
Frequently Used Cache Size: 6.25% 3.82 GiB
ARC Hash Breakdown:
Elements Max: 23.67m
Elements Current: 83.51% 19.77m
Collisions: 130.85m
Chain Max: 11
Chains: 4.05m
------------------------------------------------------------------------
ARC Efficiency: 8.40b
Cache Hit Ratio: 96.01% 8.06b
Cache Miss Ratio: 3.99% 335.10m
Actual Hit Ratio: 90.98% 7.64b
Data Demand Efficiency: 96.75% 3.23b
Data Prefetch Efficiency: 4.59% 35.17m
CACHE HITS BY CACHE LIST:
Anonymously Used: 4.52% 364.56m
Most Recently Used: 7.39% 595.82m
Most Frequently Used: 87.37% 7.05b
Most Recently Used Ghost: 0.20% 15.86m
Most Frequently Used Ghost: 0.52% 42.29m
CACHE HITS BY DATA TYPE:
Demand Data: 38.72% 3.12b
Prefetch Data: 0.02% 1.62m
Demand Metadata: 54.47% 4.39b
Prefetch Metadata: 6.79% 547.67m
CACHE MISSES BY DATA TYPE:
Demand Data: 31.26% 104.77m
Prefetch Data: 10.01% 33.56m
Demand Metadata: 33.64% 112.71m
Prefetch Metadata: 25.09% 84.06m
------------------------------------------------------------------------
L2 ARC Summary: (HEALTHY)
Passed Headroom: 112.36m
Tried Lock Failures: 908.97k
IO In Progress: 353.00k
Low Memory Aborts: 2.51k
Free on Write: 127.08k
Writes While Full: 79.91k
R/W Clashes: 7.30k
Bad Checksums: 0
IO Errors: 0
SPA Mismatch: 123.53m
L2 ARC Size: (Adaptive) 934.70 GiB
Header Size: 0.36% 3.40 GiB
L2 ARC Breakdown: 335.10m
Hit Ratio: 28.05% 94.01m
Miss Ratio: 71.95% 241.09m
Feeds: 1.81m
L2 ARC Buffer:
Bytes Scanned: 133.48 TiB
Buffer Iterations: 1.81m
List Iterations: 115.01m
NULL List Iterations: 688.18k
L2 ARC Writes:
Writes Sent: 100.00% 338.79k
------------------------------------------------------------------------
File-Level Prefetch: (HEALTHY)
DMU Efficiency: 101.15b
Hit Ratio: 88.78% 89.80b
Miss Ratio: 11.22% 11.35b
Colinear: 11.35b
Hit Ratio: 0.01% 591.54k
Miss Ratio: 99.99% 11.35b
Stride: 89.04b
Hit Ratio: 99.98% 89.03b
Miss Ratio: 0.02% 17.81m
DMU Misc:
Reclaim: 11.35b
Successes: 0.26% 29.58m
Failures: 99.74% 11.32b
Streams: 745.51m
+Resets: 0.01% 52.30k
-Resets: 99.99% 745.46m
Bogus: 0
------------------------------------------------------------------------
VDEV Cache Summary: 229.62m
Hit Ratio: 28.64% 65.76m
Miss Ratio: 59.79% 137.30m
Delegations: 11.57% 26.56m
------------------------------------------------------------------------
ZFS Tunables (sysctl):
kern.maxusers 16716
vm.kmem_size 267761856512
vm.kmem_size_scale 1
vm.kmem_size_min 0
vm.kmem_size_max 1319413950874
vfs.zfs.trim.max_interval 1
vfs.zfs.trim.timeout 30
vfs.zfs.trim.txg_delay 32
vfs.zfs.trim.enabled 1
vfs.zfs.vol.unmap_enabled 1
vfs.zfs.vol.mode 1
vfs.zfs.version.zpl 5
vfs.zfs.version.spa 5000
vfs.zfs.version.acl 1
vfs.zfs.version.ioctl 4
vfs.zfs.debug 0
vfs.zfs.super_owner 0
vfs.zfs.sync_pass_rewrite 2
vfs.zfs.sync_pass_dont_compress 5
vfs.zfs.sync_pass_deferred_free 2
vfs.zfs.zio.exclude_metadata 0
vfs.zfs.zio.use_uma 1
vfs.zfs.cache_flush_disable 0
vfs.zfs.zil_replay_disable 0
vfs.zfs.min_auto_ashift 12
vfs.zfs.max_auto_ashift 13
vfs.zfs.vdev.trim_max_pending 10000
vfs.zfs.vdev.bio_delete_disable 0
vfs.zfs.vdev.bio_flush_disable 0
vfs.zfs.vdev.write_gap_limit 4096
vfs.zfs.vdev.read_gap_limit 32768
vfs.zfs.vdev.aggregation_limit 131072
vfs.zfs.vdev.trim_max_active 64
vfs.zfs.vdev.trim_min_active 1
vfs.zfs.vdev.scrub_max_active 2
vfs.zfs.vdev.scrub_min_active 1
vfs.zfs.vdev.async_write_max_active 10
vfs.zfs.vdev.async_write_min_active 1
vfs.zfs.vdev.async_read_max_active 3
vfs.zfs.vdev.async_read_min_active 1
vfs.zfs.vdev.sync_write_max_active 10
vfs.zfs.vdev.sync_write_min_active 10
vfs.zfs.vdev.sync_read_max_active 10
vfs.zfs.vdev.sync_read_min_active 10
vfs.zfs.vdev.max_active 1000
vfs.zfs.vdev.async_write_active_max_dirty_percent 60
vfs.zfs.vdev.async_write_active_min_dirty_percent 30
vfs.zfs.vdev.mirror.non_rotating_seek_inc 1
vfs.zfs.vdev.mirror.non_rotating_inc 0
vfs.zfs.vdev.mirror.rotating_seek_offset 1048576
vfs.zfs.vdev.mirror.rotating_seek_inc 5
vfs.zfs.vdev.mirror.rotating_inc 0
vfs.zfs.vdev.trim_on_init 1
vfs.zfs.vdev.cache.bshift 16
vfs.zfs.vdev.cache.size 67108864
vfs.zfs.vdev.cache.max 65536
vfs.zfs.vdev.metaslabs_per_vdev 200
vfs.zfs.txg.timeout 5
vfs.zfs.space_map_blksz 4096
vfs.zfs.spa_slop_shift 5
vfs.zfs.spa_asize_inflation 24
vfs.zfs.deadman_enabled 1
vfs.zfs.deadman_checktime_ms 5000
vfs.zfs.deadman_synctime_ms 1000000
vfs.zfs.recover 0
vfs.zfs.spa_load_verify_data 1
vfs.zfs.spa_load_verify_metadata 1
vfs.zfs.spa_load_verify_maxinflight 10000
vfs.zfs.check_hostid 1
vfs.zfs.mg_fragmentation_threshold 85
vfs.zfs.mg_noalloc_threshold 0
vfs.zfs.condense_pct 200
vfs.zfs.metaslab.bias_enabled 1
vfs.zfs.metaslab.lba_weighting_enabled 1
vfs.zfs.metaslab.fragmentation_factor_enabled 1
vfs.zfs.metaslab.preload_enabled 1
vfs.zfs.metaslab.preload_limit 3
vfs.zfs.metaslab.unload_delay 8
vfs.zfs.metaslab.load_pct 50
vfs.zfs.metaslab.min_alloc_size 33554432
vfs.zfs.metaslab.df_free_pct 4
vfs.zfs.metaslab.df_alloc_threshold 131072
vfs.zfs.metaslab.debug_unload 0
vfs.zfs.metaslab.debug_load 0
vfs.zfs.metaslab.fragmentation_threshold 70
vfs.zfs.metaslab.gang_bang 16777217
vfs.zfs.free_max_blocks -1
vfs.zfs.no_scrub_prefetch 0
vfs.zfs.no_scrub_io 0
vfs.zfs.resilver_min_time_ms 3000
vfs.zfs.free_min_time_ms 1000
vfs.zfs.scan_min_time_ms 1000
vfs.zfs.scan_idle 50
vfs.zfs.scrub_delay 4
vfs.zfs.resilver_delay 2
vfs.zfs.top_maxinflight 32
vfs.zfs.zfetch.array_rd_sz 1048576
vfs.zfs.zfetch.block_cap 256
vfs.zfs.zfetch.min_sec_reap 2
vfs.zfs.zfetch.max_streams 8
vfs.zfs.prefetch_disable 0
vfs.zfs.delay_scale 500000
vfs.zfs.delay_min_dirty_percent 60
vfs.zfs.dirty_data_sync 67108864
vfs.zfs.dirty_data_max_percent 10
vfs.zfs.dirty_data_max_max 4294967296
vfs.zfs.dirty_data_max 4294967296
vfs.zfs.max_recordsize 1048576
vfs.zfs.mdcomp_disable 0
vfs.zfs.nopwrite_enabled 1
vfs.zfs.dedup.prefetch 1
vfs.zfs.l2c_only_size 977364976128
vfs.zfs.mfu_ghost_data_lsize 1179619840
vfs.zfs.mfu_ghost_metadata_lsize 4112842752
vfs.zfs.mfu_ghost_size 5292462592
vfs.zfs.mfu_data_lsize 1801214976
vfs.zfs.mfu_metadata_lsize 14357702656
vfs.zfs.mfu_size 26676616192
vfs.zfs.mru_ghost_data_lsize 8783111680
vfs.zfs.mru_ghost_metadata_lsize 49993475072
vfs.zfs.mru_ghost_size 58776586752
vfs.zfs.mru_data_lsize 2081698816
vfs.zfs.mru_metadata_lsize 2410155520
vfs.zfs.mru_size 6321306112
vfs.zfs.anon_data_lsize 0
vfs.zfs.anon_metadata_lsize 0
vfs.zfs.anon_size 1586417152
vfs.zfs.l2arc_norw 1
vfs.zfs.l2arc_feed_again 1
vfs.zfs.l2arc_noprefetch 1
vfs.zfs.l2arc_feed_min_ms 200
vfs.zfs.l2arc_feed_secs 1
vfs.zfs.l2arc_headroom 2
vfs.zfs.l2arc_write_boost 8388608
vfs.zfs.l2arc_write_max 8388608
vfs.zfs.arc_meta_limit 66672028672
vfs.zfs.arc_free_target 453222
vfs.zfs.arc_shrink_shift 5
vfs.zfs.arc_average_blocksize 8192
vfs.zfs.arc_min 33336014336
vfs.zfs.arc_max 266688114688
------------------------------------------------------------------------
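One thing worth noting from the tunables dump: the scan/resilver throttling sysctls are still at their defaults (vfs.zfs.resilver_delay 2, vfs.zfs.resilver_min_time_ms 3000, vfs.zfs.scan_idle 50), so the resilver yields to the mail workload. A sketch of how one might raise its priority on this FreeBSD 10.2 system; the values are illustrative, not a tested recommendation, and trading foreground I/O latency for resilver speed may or may not be acceptable here:

```shell
# All three sysctls appear in the tunables dump above; values are
# illustrative. Revert to the defaults once the resilver completes.

# Minimum time per txg reserved for resilver I/O (default 3000 ms).
sysctl vfs.zfs.resilver_min_time_ms=5000

# Delay (in ticks) injected between resilver I/Os when the pool is
# busy (default 2); 0 removes the artificial throttle entirely.
sysctl vfs.zfs.resilver_delay=0

# Window (ms) after the last user I/O before the pool counts as idle
# and the scan runs unthrottled (default 50).
sysctl vfs.zfs.scan_idle=5
```

These take effect immediately without a reboot, so they can be tried during an off-peak window and backed out if IMAP latency gets worse.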