ZFS: odd 'insufficient replicas' error after swapping a single disk on a Z3 raid

I've got several Z3 RAIDs, and recently had to do a routine disk swap on one of the clusters. That all went without error - but a few seconds into resilvering I got an unexpected 'insufficient replicas' error, despite this being a Z3 raid with all other disks in good order.

The process followed was:

Code:
# Swap in identical disk; boot; check initial situation
# zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
    Sufficient replicas exist for the pool to continue functioning in a
    degraded state.
action: Online the device using 'zpool online' or replace the device with
    'zpool replace'.
  scan: scrub repaired 64K in 11:43:38 with 0 errors on Tue Oct 21 00:29:02 2025
config:
    NAME        STATE     READ WRITE CKSUM
    zroot       DEGRADED     0     0     0
      raidz3-0  DEGRADED     0     0     0
        ada0p3  OFFLINE      0     0     0
        ada1p3  ONLINE       0     0     0
        ada2p3  ONLINE       0     0     0
        ada3p3  ONLINE       0     0     0
        ada4p3  ONLINE       0     0     0
        ada5p3  ONLINE       0     0     0
        ada6p3  ONLINE       0     0     0

Then add the new disk in:

Code:
gpart create -s GPT ada0
# Take the partition setup from another disk in the cluster
gpart backup ada1 > b
# And restore it
gpart restore -F ada0 < b
# Add it to the pool
zpool replace zroot ada0p3 ada0p3

All without error; but then zpool status shows:
Code:
 zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Oct 21 09:42:50 2025
    2.83T / 11.7T scanned at 4.62G/s, 290G / 11.7T issued at 474M/s
    0B resilvered, 2.43% done, no estimated completion time
config:

    NAME              STATE     READ WRITE CKSUM
    zroot             DEGRADED     0     0     0
      raidz3-0        DEGRADED     0     0     0
        replacing-0   UNAVAIL      0   309     0  insufficient replicas
          ada0p3/old  OFFLINE      0     0     0
          ada0p3      REMOVED      0     0     0
        ada1p3        ONLINE       0     0     0
        ada2p3        ONLINE       0     0     0
        ada3p3        ONLINE       0     0     0
        ada4p3        ONLINE       0     0     0
        ada5p3        ONLINE       0     0     0
        ada6p3        ONLINE       0     0     0

What is going on here? I've not seen this 'insufficient replicas' error before, and there should be enough replicas, as it is a Z3 raid. Any suggestions?

The raid functions as expected and is able to "zfs send" backups, etc. This is on stock FreeBSD 14.3-RELEASE with recent updates.

Gpart seems happy too:
Code:
# gpart show
=>        40  7814037088  ada1  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        40  7814037088  ada2  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        40  7814037088  ada3  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        40  7814037088  ada4  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        40  7814037088  ada5  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        40  7814037088  ada6  GPT  (3.6T)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)

=>        34  7814037101  ada0  GPT  (3.6T)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048    16777216     2  freebsd-swap  (8.0G)
    16779264  7797256192     3  freebsd-zfs  (3.6T)
  7814035456        1679        - free -  (840K)
 
# Add it to the pool
zpool replace zroot ada0p3 ada0p3
you are not supposed to remove the disk you want to zpool-replace(8):
Code:
DESCRIPTION
     Replaces device with new-device.  This is equivalent to attaching
     new-device, waiting for it to resilver, and then detaching device.  Any
     in progress scrub will be cancelled.
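To make the distinction between the two forms concrete, here is a sketch (the spare target ada7p3 is a made-up name for illustration):

```shell
# Old disk still attached and readable: two-argument form.
# ZFS copies the data from ada0p3 to ada7p3, then detaches ada0p3.
zpool replace zroot ada0p3 ada7p3

# Old disk already pulled or dead: one-argument form.
# ZFS rebuilds the named device from parity on the remaining disks.
zpool replace zroot ada0p3
```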

edit (sorry, was a bit in a hurry earlier):

if you already replaced the disk or it faulted, you don't specify [new-device], only [device]. The replacement provider then needs to have the same 'name', i.e. in your case it has to be named ada0p3. That's why it's usually best to use gpt-labels, so you can make sure the name matches. (zfs actually doesn't care about the name representation, e.g. when assembling a pool at boot, but for that command it needs the same name.)
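For illustration, a label-based variant on the new disk might look like this (the label name zdisk0 is made up; adjust to your own scheme):

```shell
# Give partition 3 on the new disk a GPT label, e.g. 'zdisk0'
gpart modify -i 3 -l zdisk0 ada0
# Replace by label, so the pool member name no longer depends on ada numbering
zpool replace zroot ada0p3 gpt/zdisk0
```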

I'm not sure if you can issue the correct zpool replace zroot ada0p3 while that faulty replace job is lingering. You could also try to zpool offline that ada0p3 to abort the replace first.
 
Thanks for that hint - so it turns out that:

Code:
zpool offline zroot ada0p3
zpool remove zroot ada0p3
zpool replace zroot ada0p3

gets things back on track (without a reboot or other complexity). And I am guessing this is a special case (or bug) that only hits when 'not using gpt-labels' AND 'the name is identical', as I've been using zpool replace old new with labels routinely.
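Once the replace is accepted, progress can be watched like this (zpool wait is available in recent OpenZFS, i.e. FreeBSD 13 and later):

```shell
# Check resilver progress
zpool status zroot
# Block until the resilver finishes (handy in scripts)
zpool wait -t resilver zroot
```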

Is there, by the way, a clean/easy way to switch to gpt-labels without having to rebuild/scrub? Or is that risky surgery?
 
Is there, by the way, a clean/easy way to switch to gpt-labels without having to rebuild/scrub? Or is that risky surgery?
If you are using partitions, as you are: in the past, I have added the labels after ZFS was already using the disks. ZFS will continue using the same partitions, because it doesn't search for disks by name but by enumerating and probing all the disks it finds. The output of "zpool status" may, however, continue to show the old names.
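If you do want zpool status to display the labels, one commonly used approach is to re-import the pool via the label device nodes. Note this only works directly for a non-root pool (the pool name tank and label zdisk1 below are made up; a root pool like zroot cannot be exported from the running system, so it would need a rescue environment):

```shell
# Add a label to the live partition first (repeat per disk)
gpart modify -i 3 -l zdisk1 ada1
# Then re-import the pool, telling ZFS to look under /dev/gpt
zpool export tank
zpool import -d /dev/gpt tank
```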

(For people not using partitions, and giving the whole disk to ZFS: without a partition table, there is no place to store labels.)
 
(For people not using partitions, and giving the whole disk to ZFS: without a partition table, there is no place to store labels.)
we would recommend against doing this, and always putting a partition table on a disk. we have never heard a good argument for "dangerously dedicated" disks (on a multi-terabyte disk the amount of space used by a partition table is approximately nil), and we've seen it break in weird ways.
 
All without error; but then zpool status shows:
Code:
 zpool status
  pool: zroot
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
    continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Tue Oct 21 09:42:50 2025
    2.83T / 11.7T scanned at 4.62G/s, 290G / 11.7T issued at 474M/s
    0B resilvered, 2.43% done, no estimated completion time
config:

    NAME              STATE     READ WRITE CKSUM
    zroot             DEGRADED     0     0     0
      raidz3-0        DEGRADED     0     0     0
        replacing-0   UNAVAIL      0   309     0  insufficient replicas
          ada0p3/old  OFFLINE      0     0     0
          ada0p3      REMOVED      0     0     0
        ada1p3        ONLINE       0     0     0
        ada2p3        ONLINE       0     0     0
        ada3p3        ONLINE       0     0     0
        ada4p3        ONLINE       0     0     0
        ada5p3        ONLINE       0     0     0
        ada6p3        ONLINE       0     0     0

What is going on here? I've not seen this 'insufficient replicas' error before, and there should be enough replicas, as it is a Z3 raid. Any suggestions?
The 'insufficient replicas' error is technically correct here, because it concerns the 'replacing-0' vdev, not the entire 'raidz3-0' vdev. The raidz itself is still operational and only degraded at this point.
What actually went wrong is another question I cannot answer right away; I can only say that no imminent damage is reflected.
 
And I am guessing this is a special case (or bug)
no. You are simply not supposed to remove the drive before issuing a zpool replace with *two* provider names. That way zfs can simply resilver the data from the still-present drive to the new drive.
If you rip out the drive (or it fails), you specify only the new drive name (zfs figures out that you want to replace the missing one) and zfs has to rebuild that drive from the parity data on all other drives - i.e. it becomes a much more complex and *a lot* slower task. (raidz resilvers can take several days or weeks on busy pools with large providers...)
 