ZFS Resilver: 100.02% done

Hello All,

I'm replacing a drive in a raidz2 vdev (raidz2-0), and the resilver is almost finished:

Code:
# zpool status
  pool: data
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Feb  1 18:08:26 2021
        6.18T scanned out of 7.21T at 298M/s, 1h0m to go
        1.23T resilvered, 85.73% done
config:

    NAME                        STATE     READ WRITE CKSUM
    data                        DEGRADED     0     0     0
      raidz2-0                  DEGRADED     0     0     0
        ada0p3                  ONLINE       0     0     0
        ada1p3                  ONLINE       0     0     0
        ada3p3                  ONLINE       0     0     0
        ada2p3                  ONLINE       0     0     0
        replacing-4             UNAVAIL      0     0     0
          17806141083394833242  UNAVAIL      0     0     0  was /dev/ada5p3/old
          ada5p3                ONLINE       0     0     0  (resilvering)

errors: No known data errors

About an hour later:

Code:
# zpool status
  pool: data
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Feb  1 18:08:26 2021
        7.21T scanned out of 7.21T at 291M/s, (scan is slow, no estimated time)
        1.44T resilvered, 100.02% done
config:

    NAME                        STATE     READ WRITE CKSUM
    data                        DEGRADED     0     0     0
      raidz2-0                  DEGRADED     0     0     0
        ada0p3                  ONLINE       0     0     0
        ada1p3                  ONLINE       0     0     0
        ada3p3                  ONLINE       0     0     0
        ada2p3                  ONLINE       0     0     0
        replacing-4             UNAVAIL      0     0     0
          17806141083394833242  UNAVAIL      0     0     0  was /dev/ada5p3/old
          ada5p3                ONLINE       0     0     0  (resilvering)

errors: No known data errors

Could someone please explain?
 
IIRC this comes from the additional metadata and parity data that need to be recalculated for raidz devices. This is also why raidz devices are (very) slow to resilver and why the time estimates are unreliable or unavailable (hence the "scan is slow, no estimated time" message).
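
To see why the estimate is fragile, here is a rough sketch of how a "time to go" figure can be derived from the numbers in the first status output. It assumes the estimate is simply the remaining data divided by the average scan rate so far, and that T and M are binary units (TiB, MiB); the function name eta_seconds is made up for illustration and is not part of ZFS.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4

def eta_seconds(total_bytes, scanned_bytes, rate_bytes_per_s):
    """Remaining time if the average scan rate so far simply continued."""
    return (total_bytes - scanned_bytes) / rate_bytes_per_s

# Figures from the first status output: 6.18T scanned out of 7.21T at 298M/s
eta = eta_seconds(7.21 * TIB, 6.18 * TIB, 298 * MIB)
hours, rem = divmod(int(eta), 3600)
print(f"{hours}h{rem // 60}m to go")  # prints "1h0m to go"
```

Any change in scan rate or in the amount of data left to process shifts that figure, so it is only as good as the assumption that the rate stays constant.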
 
Thanks for your reply. Eventually, resilvering finished completely with no errors.

This was the 3rd drive I have sequentially replaced in this array. Resilvering the first drive took almost 20 hours. The second drive took about 6 hours. This 3rd one took 7 hours 14 minutes.

Each time it was overnight or the weekend when I replaced the drives, so the server was under minimum load. Yet resilvering times have varied a great deal.
 
If you want or need fast resilvering times (and faster overall vdev and pool performance), go for mirrors. raidz is biased much more towards space efficiency and economy, but sacrifices a lot of performance.
 
Each time it was overnight or the weekend when I replaced the drives,
It might be related to the periodic scripts that run overnight, at the weekend, or at certain times of the month. Different scripts running at different times might have affected it by changing the system load and/or disk access; some of them do quite a bit of disk access. Likewise, you might have cron jobs running at certain times.
 
Could someone please explain?
Explain what? That it goes above 100%? That is normal:

ZFS is copy-on-write. So every time something is written (and that includes metadata, too), it is allocated anew in the pool. The amount of allocated space therefore grows, and a scrub or resilver has to process these newly allocated areas, which appeared while the task was already running and so could not be included in the 100% that was calculated at the beginning. (That other allocations become obsolete at the same time does not matter: those may already have been processed.)
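
As a small worked example of that effect: if the "100%" total is fixed when the resilver starts, but the amount scanned also counts blocks allocated afterwards, the ratio can end up above 100. The 7.21T total is taken from the status output; the amount written during the resilver is a hypothetical figure picked to reproduce the 100.02%, and percent_done is an illustrative helper, not a ZFS function.

```python
TIB = 1024 ** 4

def percent_done(scanned_bytes, total_at_start_bytes):
    """Progress measured against the total that was fixed when the scan began."""
    return 100.0 * scanned_bytes / total_at_start_bytes

total_at_start = int(7.21 * TIB)               # from the status output
written_during = int(0.0002 * total_at_start)  # hypothetical: about 1.5 GiB of new allocations
scanned = total_at_start + written_during      # the scan had to cover the new blocks too

print(f"{percent_done(scanned, total_at_start):.2f}% done")  # prints "100.02% done"
```

So about 1.5 GiB of new writes on a 7.21T pool over several hours would be enough to account for the extra 0.02%.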
 