ZFS zdb always fails with "Device not configured" on 13.0-RELEASE host

I just ran into a situation where deduplication might be beneficial on one of the datasets on my workstation. So I wanted to check the predicted effectiveness with zdb but it completely fails on this host:
Code:
# zpool status
  pool: zroot
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            nda1p4  ONLINE       0     0     0
            nda2p4  ONLINE       0     0     0

errors: No known data errors
# zfs list | grep navi
zroot/usr/home/sko/hyundai-naviupdates           77.3G   173G     74.7G  /usr/home/sko/hyundai-naviupdates
# zdb -S zroot/usr/home/sko/hyundai-naviupdates
failed to hold dataset 'zroot/usr/home/sko/hyundai-naviupdates': Device not configured
zdb: can't open 'zroot/usr/home/sko/hyundai-naviupdates': Device not configured
# zdb -C zroot
zdb: can't open 'zroot': Device not configured

This host is running 13.0-RELEASE. On several other hosts (all running 12.x-RELEASE versions) zdb is working as expected, however on this host it acts as if the pool doesn't even exist...
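
One thing that could be checked at this point is whether zdb is simply reading a different or stale cache file. The following is a minimal sketch, not a confirmed diagnosis: it assumes the pool may be recorded in either /etc/zfs/zpool.cache or /boot/zfs/zpool.cache (both paths are assumptions about this host), and find_working_cachefile is a hypothetical helper name. It relies only on zdb's -U option to select an alternate cache file.

```sh
#!/bin/sh
# ZDB can be overridden (e.g. for testing); defaults to the real zdb binary.
: "${ZDB:=zdb}"

# Usage: find_working_cachefile POOL CACHEFILE...
# Prints the first readable cache file with which `zdb -C POOL` succeeds.
# Returns 1 if none of the candidates works.
find_working_cachefile() {
    pool=$1
    shift
    for cache in "$@"; do
        # Skip candidates that do not exist or are not readable.
        [ -r "$cache" ] || continue
        if "$ZDB" -U "$cache" -C "$pool" >/dev/null 2>&1; then
            echo "$cache"
            return 0
        fi
    done
    return 1
}

# Example call (assumed paths, not run automatically):
#   find_working_cachefile zroot /etc/zfs/zpool.cache /boot/zfs/zpool.cache
```

If no cache file works, `zdb -e -C zroot` bypasses the cache entirely and reads the configuration from the devices, which would help separate a cache problem from a device problem.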

I couldn't find anything related/useful about this error message from zdb apart from cases with corrupted/deleted pools.
Are there any known regressions with zdb on 13.0-RELEASE and/or the new ZoL codebase it is using (IIRC)? If I zfs send|recv the dataset to other hosts (running 12.2-RELEASE and illumos), I can run zdb -S on it perfectly fine, so I suspect this problem is related to 13.0-RELEASE and/or its zfs version/variant...
 
This pool was created when installing the system from scratch with 13.0-RELEASE last year, and the pool configuration hasn't been touched since then... (zpool.cache dates to June 15 2021)
Some of the pools on other systems have been used and abused for years and over several versions; some were even migrated between illumos and FreeBSD and none of them shows any erroneous behaviour with zdb.

To narrow down whether this is a problem with only this pool or something else, I created a new pool on another SSD I had lying around:
Code:
# zpool create test /dev/ada0
# zpool status test
  pool: test
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        test        ONLINE       0     0     0
          ada0      ONLINE       0     0     0

errors: No known data errors
# zdb -C test
zdb: can't open 'test': Device not configured

Then I rebooted the system with a 12.2-RELEASE memstick image and imported both pools readonly (because of zfsonlinux properties and features unknown to zfs on 12.2, it wouldn't let me import them r/w anyways), and on both pools I still got "Device not configured" from zdb.
I then deleted the test pool; created a new one on that disk labeled 'test2' and lo and behold - zdb was working on that pool but still not on 'zroot'.

At that point I would have pointed at openzfs 2.0 on 13.0-RELEASE as the culprit, because both pools created by it weren't accessible for zdb. BUT - after rebooting to the installed system (13.0-R) to confirm if only test2 is accessible for zdb:
Code:
# zdb -C zroot

MOS Configuration:
        version: 5000
        name: 'zroot'
        state: 0
        txg: 10918768
        pool_guid: 6705216084785628768
        errata: 0
        hostname: ''
        com.delphix:has_per_vdev_zaps
        vdev_children: 1
        vdev_tree:
            type: 'root'
            id: 0
            guid: 6705216084785628768
            create_txg: 4
            children[0]:
                type: 'mirror'
                id: 0
                guid: 17911238872688934156
                metaslab_array: 256
                metaslab_shift: 33
                ashift: 12
                asize: 995693690880
                is_log: 0
                create_txg: 4
                com.delphix:vdev_zap_top: 129
                children[0]:
                    type: 'disk'
                    id: 0
                    guid: 11240046095922330961
                    path: '/dev/nda0p4'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 130
                children[1]:
                    type: 'disk'
                    id: 1
                    guid: 13669314474416363495
                    path: '/dev/nda1p4'
                    whole_disk: 1
                    create_txg: 4
                    com.delphix:vdev_zap_leaf: 131
        features_for_read:
            com.delphix:hole_birth
            com.delphix:embedded_data

I have no clue what might have changed between importing (RO!!) on 12.2 and rebooting to 13.0, but now zdb is working as expected again on that pool and I can't reproduce the error on either 13.0 or 12.2.
I've previously rebooted the system twice (first to confirm the problem persists, then again after bringing pkgs up to date) without any effect on zdb being unable to access zroot.

My best guess would be that the RO-import followed by a "normal" import cleared some stale metadata - but as I have never really dealt with the inner workings of zfs apart from a sysadmin's perspective, I'm way out of my territory with that...
If anyone who knows the inner workings of zfs finds the time and leisure to make an educated guess about what most likely happened, please go ahead. I'd really like to know how to solve this (without random testing/rebooting) should I ever run into this problem again.
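
As a starting point for diagnosing this without rebooting, one could check whether the device paths recorded in the pool configuration still exist, since the zpool status output above lists nda1p4/nda2p4 while the later zdb -C output records /dev/nda0p4 and /dev/nda1p4. This is only a sketch under the assumption that a path mismatch is relevant here; extract_vdev_paths and report_missing_devices are hypothetical helper names that parse text in the format shown above.

```sh
#!/bin/sh
# Print every vdev path found in zdb configuration output read from stdin.
# Matches lines of the form:   path: '/dev/nda0p4'
extract_vdev_paths() {
    awk -F"'" '/^[[:space:]]*path:/ { print $2 }'
}

# Read device paths from stdin and report those that do not exist.
report_missing_devices() {
    while IFS= read -r dev; do
        [ -e "$dev" ] || echo "missing: $dev"
    done
}

# Example pipeline (not run automatically). `zdb -l` reads the label
# directly from a device, so it does not depend on the cache file:
#   zdb -l /dev/nda1p4 | extract_vdev_paths | sort -u | report_missing_devices
```

Any path reported as missing would mean the recorded configuration points at a device node that is not present under that name.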
 
The guids in "Cached configuration" and "MOS Configuration" are identical. And as said: that pool was set up last year when freshly installing 13.0-RELEASE onto that system with a new pair of NVMe disks, and the pool configuration hasn't been touched since; the system only got its occasional pkg upgrade and freebsd-update.
The only "non-default" configuration that comes to mind is the 12.2-RELEASE loader that's still in place to circumvent this bug with the ASUS B560M board in that system.
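
The guid comparison mentioned above can be scripted so it is repeatable the next time this happens. A minimal sketch, assuming two configuration dumps have been saved to files (for example from zdb -C and zdb -l); pool_guid and same_pool_guid are hypothetical helper names, and the file names in the usage comment are placeholders.

```sh
#!/bin/sh
# Print the first pool_guid value found in a configuration dump on stdin.
pool_guid() {
    awk '$1 == "pool_guid:" { print $2; exit }'
}

# Usage: same_pool_guid FILE_A FILE_B
# Succeeds only if both dumps contain a pool_guid and the values match.
same_pool_guid() {
    a=$(pool_guid < "$1")
    b=$(pool_guid < "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example (not run automatically; file names are placeholders):
#   zdb -C zroot       > /tmp/cached.txt
#   zdb -l /dev/nda1p4 > /tmp/label.txt
#   same_pool_guid /tmp/cached.txt /tmp/label.txt && echo "guids match"
```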
 