Solved zpool missing after increasing disk size (AWS)

dvl@

FreshPorts runs in an AWS instance. Earlier today, I modified two storage devices from 200GB to 250GB. The host contains two zpools: zroot and data01.

The zroot update went fine. The data01 zpool just disappeared. The drive is still there, but the zpool cannot be seen. I'm sure this can be recovered, but I don't know how.

Details I have collected:


Code:
[15:27 aws-1 dan ~] % zpool status data01
cannot open 'data01': no such pool

[15:37 aws-1 dan ~] % sudo zpool import
   pool: data01
     id: 17238602793760673894
  state: UNAVAIL
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
    devices and try again.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

    data01      UNAVAIL  insufficient replicas
      nda2p1    UNAVAIL  cannot open

[15:38 aws-1 dan ~] % ls -l /dev/nda2p1
crw-r-----  1 root  operator  0x66 2025.08.25 15:22 /dev/nda2p1
[15:39 aws-1 dan ~] % gpart show nda2p1
gpart: No such geom: nda2p1.
[15:39 aws-1 dan ~] % gpart show nda2
=>       40  524287920  nda2  GPT  (250G)
         40  524287920     1  freebsd-zfs  (250G)

[15:39 aws-1 dan ~] %

Code:
[15:30 aws-1 dan ~] % gpart show
=>       40  524287920  nda0  GPT  (250G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048  524285912     2  freebsd-zfs  (250G)

=>      40  16777136  nda1  GPT  (8.0G)
        40  16777136     1  freebsd-swap  (8.0G)

=>       40  524287920  nda2  GPT  (250G)
         40  524287920     1  freebsd-zfs  (250G)

[15:31 aws-1 dan ~] % zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot   250G  36.2G   213G        -         -    28%    14%  1.00x    ONLINE  -

From /var/log/daily.log I know there is a data01:

Code:
Backup of boot partition content:
nda0p1

Disk status:
Filesystem Size Used Avail Capacity Mounted on
zroot/ROOT/default 174G 17G 157G 10% /
devfs 1.0K 0B 1.0K 0% /dev
data01/jails 33G 96K 33G 0% /jails
zroot/tmp 157G 313K 157G 0% /tmp
zroot/var/mail 157G 24K 157G 0% /var/mail
zroot/usr/home 171G 14G 157G 8% /usr/home
zroot/usr/src 157G 23K 157G 0% /usr/src
zroot/var/crash 157G 23K 157G 0% /var/crash
zroot/mkjail 157G 23K 157G 0% /mkjail
zroot/usr/ports 157G 23K 157G 0% /usr/ports
zroot/freebsd_releases 160G 3.3G 157G 2% /var/db/mkjail
zroot/var/audit 157G 23K 157G 0% /var/audit
zroot/var/tmp 157G 278K 157G 0% /var/tmp
zroot/var/log 157G 61M 157G 0% /var/log
data01/jails/ingress01 44G 11G 33G 25% /jails/ingress01
data01/jails/nginx01 37G 4.6G 33G 12% /jails/nginx01
zroot/mkjail 157G 23K 157G 0% /mkjail
data01/rsyncer 40G 7.1G 33G 18% /usr/home/rsyncer/backups
data01/jails/ingress01/usr/src 33G 96K 33G 0% /jails/ingress01/usr/src
data01/freshports/ingress01/var/db/freshports 34G 1.7G 33G 5% /jails/ingress01/var/db/freshports
data01/mkjail/14.2-RELEASE 34G 852M 33G 2% /mkjail/14.2-RELEASE
data01/freshports/ingress01/var/db/ingress 33G 180K 33G 0% /jails/ingress01/var/db/ingress
data01/freshports/ingress01/var/db/freshports/cache 33G 96K 33G 0% /jails/ingress01/var/db/freshports/cache
data01/freshports/ingress01/var/db/freshports/message-queues 37G 4.4G 33G 12% /jails/ingress01/var/db/freshports/message-queues
data01/freshports/ingress01/var/db/ingress/message-queues 33G 1.2M 33G 0% /jails/ingress01/var/db/ingress/message-queues
data01/freshports/ingress01/var/db/ingress/repos 42G 9.2G 33G 22% /jails/ingress01/var/db/ingress/repos
data01/freshports/ingress01/var/db/freshports/cache/spooling 33G 360K 33G 0% /jails/ingress01/var/db/freshports/cache/spooling
data01/freshports/ingress01/var/db/freshports/cache/html 33G 204K 33G 0% /jails/ingress01/var/db/freshports/cache/html
devfs 1.0K 0B 1.0K 0% /jails/ingress01/dev
data01/freshports/jailed/ingress01/jails 33G 104K 33G 0% /jails/ingress01/jails
data01/freshports/jailed/ingress01/mkjail 35G 1.8G 33G 5% /jails/ingress01/var/db/mkjail
data01/freshports/jailed/ingress01/jails/freshports 117G 85G 33G 72% /jails/ingress01/jails/freshports
data01/freshports/jailed/ingress01/mkjail/14.2-RELEASE 34G 852M 33G 2% /jails/ingress01/var/db/mkjail/14.2-RELEASE
devfs 1.0K 0B 1.0K 0% /jails/ingress01/jails/freshports/dev
/jails/ingress01/var/db/freshports/cache/html 33G 204K 33G 0% /jails/nginx01/var/db/freshports/cache/html
devfs 1.0K 0B 1.0K 0% /jails/nginx01/dev
data01/freshports/nginx01/var/db/freshports/cache/daily 33G 133M 33G 0% /jails/nginx01/var/db/freshports/cache/daily
data01/freshports/nginx01/var/db/freshports/cache/news 33G 11M 33G 0% /jails/nginx01/var/db/freshports/cache/news
data01/freshports/nginx01/var/db/freshports/cache/commits 59G 26G 33G 45% /jails/nginx01/var/db/freshports/cache/commits
data01/freshports/nginx01/var/db/freshports/cache/spooling 33G 128K 33G 0% /jails/nginx01/var/db/freshports/cache/spooling
data01/freshports/nginx01/var/db/freshports/cache/packages 33G 40M 33G 0% /jails/nginx01/var/db/freshports/cache/packages
data01/freshports/nginx01/var/db/freshports/cache/ports 39G 6.6G 33G 17% /jails/nginx01/var/db/freshports/cache/ports
data01/freshports/nginx01/var/db/freshports/cache/general 33G 8.0M 33G 0% /jails/nginx01/var/db/freshports/cache/general
data01/freshports/nginx01/var/db/freshports/cache/pages 33G 96K 33G 0% /jails/nginx01/var/db/freshports/cache/pages
data01/freshports/nginx01/var/db/freshports/cache/categories 33G 29M 33G 0% /jails/nginx01/var/db/freshports/cache/categories
 
Try to import it by device ID using
Code:
zpool import -d /dev/nda2p1

Then, if the import is successful, check whether "autoexpand" is set on data01:
Code:
zpool get autoexpand data01
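
If autoexpand turns out to be off, the extra space will not show up on its own. A sketch of how it could be claimed once the pool is imported (these are not commands run in this thread):
Code:
# enable automatic expansion for future resizes
zpool set autoexpand=on data01
# expand the existing vdev to the full size of the now-larger partition
zpool online -e data01 nda2p1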
 
Code:
[16:29 aws-1 dan /var/backups] % sudo zpool import -d /dev/nda2p1
   pool: data01
     id: 17238602793760673894
  state: UNAVAIL
status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
    devices and try again.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-3C
 config:

    data01      UNAVAIL  insufficient replicas
      nda2p1    UNAVAIL  cannot open
 
I've created a new volume vol-0a9652ed611542c31 from a snapshot taken at 2025/08/25 05:40 GMT-4.

It is not attached to the instance.

edit: 2025-08-29 : this volume was not required and has since been deleted.
 
rwp on IRC led us to the cause: the devices were renumbered, and the host also contained two single-partition drives, one for swap and one for the data01 zpool.

In short, swap was mounted over the data01 zpool:

Code:
[17:09 aws-1 dan ~] % swapinfo -h
Device              Size     Used    Avail Capacity
/dev/nda2p1         250G       0B     250G     0%
[17:10 aws-1 dan ~] % cat /etc/fstab
/dev/nvd2p1 none swap sw         0 0

Note how swap is 250G, which is the size of the drive for data01.

Code:
[17:17 aws-1 dan ~] % gpart show
=>       40  524287920  nda0  GPT  (250G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048  524285912     2  freebsd-zfs  (250G)

=>      40  16777136  nda1  GPT  (8.0G)
        40  16777136     1  freebsd-swap  (8.0G)

=>       40  524287920  nda2  GPT  (250G)
         40  524287920     1  freebsd-zfs  (250G)

[17:22 aws-1 dan ~] %
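
A label-based fstab entry would also have kept swap pinned to the intended partition regardless of device renumbering. A sketch, assuming the 8G swap partition is index 1 on nda1, using a label name I made up:
Code:
# turn off swap, put a GPT label on the swap partition, then point fstab at the label
swapoff /dev/nda1p1
gpart modify -i 1 -l swap0 nda1
# /etc/fstab entry:
#   /dev/gpt/swap0  none  swap  sw  0 0
swapon /dev/gpt/swap0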
 
As kevens pointed out, "i wonder if we could add some guardrails that would've prevented this"
 
Example:
I've created a freebsd-zfs partition on an NVMe SSD attached via a USB adapter.
Code:
# gpart add -t freebsd-zfs -a 1M -l <unique label> -s 3661G da0
And created the pool as follows.
Code:
# zpool create -R /mnt <pool name> /dev/gpt/<unique label>

After transferring everything from my previous SSD in my notebook, and fixing up anything that referenced the old SSD's labels to use the new SSD's labels, I swapped the old and new SSDs (the new one now sits in my notebook's NVMe slot, recognized as nda0*), and now I'm working on the new, larger SSD. I'm not bothered by GEOM provider name changes (/dev/da0* to /dev/nda0*).
 
Swap was turned off:

Code:
[18:17 aws-1 dan ~] % sudo swapoff -a   
swapoff: removing /dev/nvd2p1 as swap device

[18:17 aws-1 dan ~] % swapinfo
Device          1K-blocks     Used    Avail Capacity

[18:17 aws-1 dan ~] % sudo zpool import         
   pool: data01
     id: 17238602793760673894
  state: ONLINE
status: Some supported features are not enabled on the pool.
    (Note that they may be intentionally disabled if the
    'compatibility' property is set.)
 action: The pool can be imported using its name or numeric identifier, though
    some features will not be available without an explicit 'zpool upgrade'.
 config:

    data01      ONLINE
      nda2p1    ONLINE

[18:18 aws-1 dan ~] % zpool list
NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
zroot   250G  36.2G   213G        -         -    28%    14%  1.00x    ONLINE  -

[18:18 aws-1 dan ~] % sudo zpool import data01

[18:18 aws-1 dan ~] % zpool list             
NAME     SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data01   200G   161G  38.9G        -       50G    85%    80%  1.00x    ONLINE  -
zroot    250G  36.2G   213G        -         -    28%    14%  1.00x    ONLINE  -
[18:18 aws-1 dan ~] %
 
With the missing pool imported, let's scrub:

Code:
[18:19 aws-1 dan ~] % sudo zpool scrub data01
[18:19 aws-1 dan ~] % zpool status data01   
  pool: data01
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub in progress since Mon Aug 25 18:19:39 2025
    7.17G / 161G scanned at 188M/s, 3.13G / 161G issued at 82.2M/s
    0B repaired, 1.95% done, 00:32:42 to go
config:

    NAME        STATE     READ WRITE CKSUM
    data01      ONLINE       0     0     0
      nda2p1    ONLINE       0     0     0

errors: No known data errors
[18:20 aws-1 dan ~] %
 
Code:
[18:35 aws-1 dan ~] % zpool status data01
  pool: data01
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
    The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
    the pool may no longer be accessible by software that does not support
    the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:31:09 with 0 errors on Mon Aug 25 18:50:48 2025
config:

    NAME        STATE     READ WRITE CKSUM
    data01      ONLINE       0     0     0
      nda2p1    ONLINE       0     0     0

errors: No known data errors
[19:04 aws-1 dan ~] %
 
I started the webserver jail, and the website came up.

Code:
[19:05 aws-1 dan ~] % sudo service jail start nginx01     
Starting jails: nginx01.

Then a shutdown -r now. Now waiting for Nagios to go back to green.
 
As kevens pointed out, "i wonder if we could add some guardrails that would've prevented this"

Linux does act on this, by marking block devices that are meant for swap. If the marker is missing, the kernel will not swap to that block device.

This would be trivial to do in FreeBSD. Except that existing systems would be cut off from their existing swap until somebody marks the devices or turns this mechanism off.
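
For comparison, that marker on Linux is the mkswap signature; swapon refuses a device that does not carry it. Roughly (Linux commands on a hypothetical /dev/vdb1, shown only to illustrate the mechanism):
Code:
# write the swap signature, confirm it, then enable swapping
mkswap /dev/vdb1
blkid /dev/vdb1     # should report TYPE="swap"
swapon /dev/vdb1    # fails if the signature is missing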
 
Something else we'd kind of batted around was the idea of refusing to use a device for swap if it has valid UFS/ZFS metadata/magic, but there's a usability hiccup there: there's a risk of false positives, and you might need a transitional step to convert a now-discarded filesystem partition into swap.
 
I also thought about a flag to swapon, and a corresponding option in fstab, to only do the actual swap if a signature is present on the device. That would at least protect against device mixups on systems installed from here on.
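
Purely to illustrate the idea (the option name below is invented; no such fstab option exists in FreeBSD today):
Code:
# hypothetical: "reqsig" would tell swapon to refuse the device
# unless a swap signature had previously been written to it
/dev/gpt/swap0  none  swap  sw,reqsig  0 0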
 
In the ZFS case, the tool you're looking for is zpool labelclear $DEVICE ;)
 
zpool labelclear is very dangerous if used incorrectly.

  • Data loss: The most significant danger is that zpool labelclear will make all data on the disk inaccessible. While it doesn't zero out the entire disk (it only erases the ZFS labels), without the labels, ZFS has no way to recognize the disk as part of a pool and cannot access the data on it. You will lose access to all the data on the device.
  • Pool corruption: Running this command on a device that is still an active part of a running ZFS pool can lead to pool degradation or even destruction. ZFS has built-in safeguards to prevent this from happening (it will refuse to run on an active device unless you use the -f force option), but you should never attempt this on a device that is part of a pool you care about.
  • Accidental misuse: Because it's a powerful and destructive command, it's crucial to be absolutely sure you are running it on the correct device. A typo in the device path could lead to catastrophic data loss on the wrong disk.
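
Before clearing anything, the ZFS labels on a device can at least be inspected read-only with zdb; a sketch using the device path from this thread:
Code:
# print any ZFS labels found on the partition -- read-only, changes nothing
zdb -l /dev/nda2p1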
 
Overnight, I had recurring dreams/hallucinations of how using device labels (e.g. gpt/zfs0) instead of partition names (e.g. /dev/nda2p1) would have avoided this. It kept going through my head (I was dealing with a fever) most of the night.

Today I found I have done this in the past: https://dan.langille.org/2019/10/15/going-from-partition-to-label-in-zpool-status/

The above solution deals with a mirror. The procedure might not translate well to a single-drive zpool (remember, this is an AWS host, so it's not a physical drive).
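
For a single-drive pool, a rough sketch of the same idea (untested here, and the label name is made up) would be to export the pool, label the partition, and re-import by label:
Code:
zpool export data01
# give the existing freebsd-zfs partition (index 1 on nda2) a GPT label
gpart modify -i 1 -l data01a nda2
# import again, telling zpool to look only at the label-based device nodes
zpool import -d /dev/gpt data01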
 