Solved Remove (once and for all) an unavailable zpool

giorgiov

New Member


Messages: 13

Hi all,
I have a test pool that I know is gone for good, since I reformatted the drive and created another zpool on it. Is there a way to remove it from the list of supposedly available pools?

Things I have already tried:
- zpool export -f
- zpool destroy -f
- reboot

Code:
   pool: usb-test
     id: 6403289418777496352
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
    devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

    usb-test              UNAVAIL  insufficient replicas
     15491483894671884218  UNAVAIL  cannot open
Thank you in advance.
 

kpa

Beastie's Twin

Reaction score: 1,820
Messages: 6,318

Export won't work if there are leftover labels on the disks and none of the devices listed in those labels are available. In that case you have to use zpool labelclear -f <device>.
 

Crest

Active Member

Reaction score: 64
Messages: 211

Just keep in mind that it's dangerous to run zpool labelclear on random devices if they may be in use.
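
A rough pre-check along these lines can be sketched in shell. The helper name device_in_status is made up for illustration, and the status text is a hand-written sample in the usual zpool status layout, not output from this thread. It only compares names in the first column, so it will miss vdevs that a pool lists under a GPT label (gpt/...), and it says nothing about pools that are currently exported.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given device, or one of its
# partitions/slices (ada0p1, ada4s1a, ...), appears in the first column
# of `zpool status` text read from stdin.
device_in_status() {
    dev=${1#/dev/}    # accept both "ada0" and "/dev/ada0"
    awk -v d="$dev" '
        $1 == d { found = 1 }
        index($1, d) == 1 && substr($1, length(d) + 1) ~ /^[ps][0-9]/ { found = 1 }
        END { exit !found }
    '
}

# Hand-written sample (an assumption, for illustration only).
sample_status='  pool: hannibal
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        hannibal    ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p1  ONLINE       0     0     0
            ada1p1  ONLINE       0     0     0'

for d in ada0 /dev/ada1p1 ada2; do
    if printf '%s\n' "$sample_status" | device_in_status "$d"; then
        echo "$d: referenced by an imported pool, leave it alone"
    else
        echo "$d: not found in zpool status"
    fi
done
```

On a live system the input would come from the real command, for example zpool status | device_in_status ada0.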
 

giorgiov

New Member


Messages: 13

labelclear -f does not work because it doesn’t find the device. Or should I attach the device (which is now associated with another pool) and use that device’s label? But then, what happens to the new pool?
 

kpa

Beastie's Twin

Reaction score: 1,820
Messages: 6,318

You have to use the device as it appears on your system, for example if the disk with the leftover labels is da0 you would do this:

# zpool labelclear -f da0

You can't use the pool name or the device name from the zpool import output because the pool is not imported, for obvious reasons.
 

giorgiov

New Member


Messages: 13

There must be a way out of this. This dead pool is useless to me, yet it keeps me from creating another pool with the same name.
 

kpa

Beastie's Twin

Reaction score: 1,820
Messages: 6,318

You should run zdb -l <device> on all disks in your system to see where the offending label resides. If it's not found on a disk, try every partition of that disk as well.
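
As a minimal sketch of that search: the two helpers below are hypothetical names, and the label excerpt is a hand-written sample in the format zdb -l prints. label_names pulls the pool names out of the label text, and scan_labels runs zdb -l on each device you pass it. Which devices to pass is up to you; on FreeBSD, sysctl -n kern.disks prints the disk names, and the partitions show up under /dev.

```shell
#!/bin/sh
# Hypothetical helper: print the distinct pool names found in `zdb -l`
# output read from stdin. Only lines that start with "name:" (after
# indentation) count, so "hostname:" lines are ignored.
label_names() {
    awk -F"'" '/^[ \t]*name:/ { print $2 }' | sort -u
}

# Hypothetical helper: run `zdb -l` on every device given and report the
# pool names its labels carry. ZDB may be overridden for a dry run.
scan_labels() {
    for dev in "$@"; do
        names=$(${ZDB:-zdb} -l "$dev" 2>/dev/null | label_names | tr '\n' ' ' | sed 's/ $//')
        echo "$dev: ${names:-no readable label}"
    done
}

# Demo on a hand-written label excerpt (sample data, not real output).
label_names <<'EOF'
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 5000
    name: 'oldpool'
    hostname: 'host.local'
EOF
```

Real use would look like scan_labels /dev/da0 /dev/da0p1, run as root.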
 

usdmatt

Daemon

Reaction score: 544
Messages: 1,459

A lot of comments here about clearing the disk...
Clearly he has already used the disk as part of another pool so clearing labels is impossible. Also, the system is not reading the pool details from disk as the pool only has one device, and that is unavailable. I can only assume it's getting the pool info from /boot/zfs/zpool.cache.

You could probably delete that if you have no other pools on the system (or are not booting off ZFS, in which case you should be able to just re-import the other pools). If booting off ZFS I wouldn't like to risk deleting the cache file, so you may be a bit stuck. (Although does FreeBSD boot without the cache file now or is it still required?).

In fact I'm not certain the cache file is required at all any more? When ZFS loads it should import any pool that has a hostid matching the current host.
 

giorgiov

New Member


Messages: 13

Mmh, that’s interesting. Here’s the output of zdb:
Code:
[root@file-server-01 /usr/home/admin]# zdb -l /dev/da0p1
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
    version: 5000
    name: 'usb-backup'
    state: 1
    txg: 3423
    pool_guid: 6403289418777496352
    hostid: 4266313884
    hostname: 'file-server-01.local'
    top_guid: 15491483894671884218
    guid: 15491483894671884218
    vdev_children: 1
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 15491483894671884218
        path: '/dev/gpt/zfs-usb-bkp-a'
        phys_path: '/dev/gpt/zfs-usb-bkp-a'
        whole_disk: 1
        metaslab_array: 34
        metaslab_shift: 35
        ashift: 12
        asize: 4000745783296
        is_log: 0
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
And that matches GUID-wise with the import output. However, two things stand out to me:
- There’s nothing under /dev/gpt/zfs-usb-bkp-a
- zpool labelclear -f /dev/da0p1 errors out with "Unable to open /dev/da0p1"
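
The GUID comparison can also be done mechanically. This is only a sketch: the helper names are made up, and the two text snippets are trimmed copies of the outputs above. The values are compared as strings because these GUIDs are unsigned 64-bit numbers and may not fit in shell arithmetic.

```shell
#!/bin/sh
# Hypothetical helpers; each reads command output from stdin.
import_pool_id()  { awk '$1 == "id:" { print $2; exit }'; }
label_pool_guid() { awk '$1 == "pool_guid:" { print $2; exit }'; }

# Trimmed excerpts of the zpool import and zdb -l output shown above.
import_text='   pool: usb-test
     id: 6403289418777496352
  state: UNAVAIL'
label_text="    name: 'usb-backup'
    pool_guid: 6403289418777496352
    top_guid: 15491483894671884218"

id=$(printf '%s\n' "$import_text" | import_pool_id)
guid=$(printf '%s\n' "$label_text" | label_pool_guid)

# String comparison on purpose: no integer overflow to worry about.
if [ "$id" = "$guid" ]; then
    echo "label belongs to the pool shown by zpool import ($id)"
else
    echo "label is from a different pool"
fi
```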
 

usdmatt

Daemon

Reaction score: 544
Messages: 1,459

Hmm, I thought you had already created a new, functioning pool on the same disk?
You can try and manually wipe the label with dd but I'd be wary of doing that if the disk is currently part of a new working pool. (Although the fact that the system lists it but "cannot open" the device suggests to me it may be expecting the pool to exist due to the cache file)
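
For anyone going the dd route, here is a sketch of where the labels sit. It assumes the usual ZFS layout of four 256 KiB labels, two at the start of the device and two at the end, with the device size rounded down to a multiple of 256 KiB first. The helper name, the 1 GiB size, and the DEVICE placeholder are all made up for illustration, and the dd lines are only printed, never run.

```shell
#!/bin/sh
LABEL_SIZE=262144   # assumed ZFS label size: 256 KiB

# Hypothetical helper: for a device of the given size in bytes, print one
# line per label: "<label> <byte offset> <offset in 256 KiB blocks>".
label_offsets() {
    aligned_blocks=$(( $1 / LABEL_SIZE ))   # size rounded down to whole labels
    for n in 0 1 2 3; do
        case $n in
            0|1) blk=$n ;;                               # front of the device
            *)   blk=$(( aligned_blocks - 4 + n )) ;;    # last two label slots
        esac
        echo "$n $(( blk * LABEL_SIZE )) $blk"
    done
}

# Demo for a made-up 1 GiB device. DEVICE is a placeholder; nothing is
# written anywhere, the commands are just echoed.
label_offsets 1073741824 | while read -r n off blk; do
    echo "label $n at byte $off: dd if=/dev/zero of=DEVICE bs=$LABEL_SIZE seek=$blk count=1"
done
```

The real size of a provider on FreeBSD is the Mediasize value in gpart list or diskinfo output. Double-check the target before running anything like this, for the reasons given above.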
 

giorgiov

New Member


Messages: 13

Yes, I had created one and then destroyed it during the various tests. So, no, it is not part of a pool right now. I’ll try this ASAP.
 

giorgiov

New Member


Messages: 13

Zeroing out the device fixed the issue. Thanks to all that helped out.
 

alestrix

New Member


Messages: 2

Hi!

I know this is an old thread, but it seems to be the go-to thread when searching for this topic on the Interwebs.
The commands here helped me get rid of a "zombie" test pool that was no longer available. However, I got more than I bargained for. In my case, the single partition of a drive was in use by a pool, but the drive itself had remnants of an unavailable pool:

Pool "murdock" unavailable, label can be found on ada0 (and ada1):
Code:
 nas4free: /proc# zpool import
   pool: murdock
     id: 16803376907535956312
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing
        devices and try again.
   see: http://illumos.org/msg/ZFS-8000-3C
 config:

        murdock                 UNAVAIL  insufficient replicas
          12889769627032609724  UNAVAIL  cannot open
          6904202065765623145   UNAVAIL  cannot open
 nas4free: /proc# zdb -l /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
    version: 5000
    name: 'murdock'
    state: 1
    txg: 6687893
    pool_guid: 16803376907535956312
    hostid: 3343067376
    hostname: 'nas4free.local'
    top_guid: 12889769627032609724
    guid: 12889769627032609724
    vdev_children: 2
    vdev_tree:
        type: 'disk'
        id: 0
        guid: 12889769627032609724
        path: '/dev/gpt/se3t_1p2'
        phys_path: '/dev/gpt/se3t_1p2'
        whole_disk: 1
        metaslab_array: 37
        metaslab_shift: 32
        ashift: 12
        asize: 800588300288
        is_log: 0
        DTL: 375
        create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
ada0p1 (and ada1p1) are in use by pool "hannibal":
Code:
 nas4free: /proc# gpart list ada0
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 5860533134
first: 34
entries: 128
scheme: GPT
Providers:
1. Name: ada0p1
   Mediasize: 3000592940544 (2.7T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e2
   rawuuid: 14752c4a-b1f1-11e3-b466-38eaa7a49c0d
   rawtype: 516e7cba-6ecf-11d6-8ff8-00022d09712b
   label: se3t_1p1
   length: 3000592940544
   offset: 24576
   type: freebsd-zfs
   index: 1
   end: 5860533134
   start: 48
Consumers:
1. Name: ada0
   Mediasize: 3000592982016 (2.7T)
   Sectorsize: 512
   Stripesize: 4096
   Stripeoffset: 0
   Mode: r1w1e3
My assumption was that the labelclear command would clear the label info only on the device/geometry given as parameter. The error message when starting the command without "-f" option seems to confirm this assumption (i.e. only the label of the unavailable pool "murdock" will be removed):
Code:
[root@nas4free /proc]# zpool export hannibal
[root@nas4free /proc]# zpool labelclear /dev/ada0
labelclear operation failed.
        Vdev /dev/ada0 is a member of the exported pool "murdock".
        Use "zpool labelclear -f /dev/ada0" to force the removal of label
        information.
[root@nas4free /proc]# zpool labelclear -f /dev/ada0
[root@nas4free /proc]# zpool labelclear /dev/ada1
labelclear operation failed.
        Vdev /dev/ada1 is a member of the exported pool "murdock".
        Use "zpool labelclear -f /dev/ada1" to force the removal of label
        information.
[root@nas4free /proc]# zpool labelclear -f /dev/ada1
[root@nas4free /proc]# zdb -l /dev/ada0
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
[root@nas4free /proc]# zdb -l /dev/ada1
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3
Ok, time to re-import the "hannibal" pool from ada0p1/ada1p1 again:
Code:
[root@nas4free /proc]# zpool import hannibal
cannot import 'hannibal': no such pool available
[root@nas4free /proc]# zdb -l /dev/ada
ada0     ada1     ada2     ada2p1   ada3     ada3p1   ada4     ada4s1   ada4s1a  ada4s2
[root@nas4free /proc]# gpart show ada0
gpart: No such geom: ada0.
[root@nas4free /proc]# gpart show /dev/ada0
gpart: No such geom: /dev/ada0.
[root@nas4free /proc]# gpart list ada0
gpart: No such geom: ada0.
[root@nas4free /proc]# gpart list /dev/ada0
gpart: No such geom: /dev/ada0.
[root@nas4free /proc]# fdisk /dev/ada0
******* Working on device /dev/ada0 *******
parameters extracted from in-core disklabel are:
cylinders=5814021 heads=16 sectors/track=63 (1008 blks/cyl)

Figures below won't work with BIOS for partitions not in cyl 1
parameters to be used for BIOS calculations are:
cylinders=5814021 heads=16 sectors/track=63 (1008 blks/cyl)

fdisk: invalid fdisk partition table found
Media sector size is 512
Warning: BIOS sector numbering starts with sector 1
Information from DOS bootblock is:
The data for partition 1 is:
sysid 165 (0xa5),(FreeBSD/NetBSD/386BSD)
    start 63, size 1565565057 (764436 Meg), flag 80 (active)
        beg: cyl 0/ head 1/ sector 1;
        end: cyl 755/ head 15/ sector 63
The data for partition 2 is:
<UNUSED>
The data for partition 3 is:
<UNUSED>
The data for partition 4 is:
<UNUSED>
[root@nas4free /]# gpart list -a | grep name
Geom name: ada2
Geom name: ada3
Geom name: ada4
Geom name: ada4s1
[root@nas4free /]# gpart show -r
=>        34  5860533101  ada2  GPT  (2.7T)
          34        4062        - free -  (2.0M)
        4096  5859373056     1  516e7cba-6ecf-11d6-8ff8-00022d09712b  (2.7T)
  5859377152     1155983        - free -  (564M)

=>        34  5860533101  ada3  GPT  (2.7T)
          34        4062        - free -  (2.0M)
        4096  5859373056     1  516e7cba-6ecf-11d6-8ff8-00022d09712b  (2.7T)
  5859377152     1155983        - free -  (564M)

=>       63  117231345  ada4  MBR  (56G)
         63    1654632     1  165  [active]  (808M)
    1654695         63        - free -  (32K)
    1654758  115576587     2  165  (55G)
  117231345         63        - free -  (32K)

=>      0  1654632  ada4s1  BSD  (808M)
        0     8129          - free -  (4.0M)
     8129  1638400       1  7  (800M)
  1646529     8103          - free -  (4.0M)
Oops! Seems like "labelclear" did not only clear the label, but it wiped all GPT info from the device, taking the partitions and the production pool residing on them with it.

I know I could have done a few things smarter (like testing the process first on only one of the two drives that formed my mirrored pool "hannibal"; I could then have re-created the mirror), but the fact that the labelclear command takes its clearing job this seriously is absolutely not clear from the very "light" (to put it mildly) info in the manpages. I'm glad I had a backup.
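
One possible way to see how a forced labelclear on the whole disk could reach the partition table is to compare byte ranges. This is a sketch under a loud assumption: that labelclear zeroes the full 256 KiB label regions at the whole-disk offsets, which nothing in this thread confirms. The sector size, first usable LBA, and partition offset are taken from the gpart list output above; the helper name is made up.

```shell
#!/bin/sh
# Succeed if the half-open byte ranges [a1,a2) and [b1,b2) overlap.
# Arguments: a1 a2 b1 b2
ranges_overlap() {
    [ "$1" -lt "$4" ] && [ "$3" -lt "$2" ]
}

LABEL_SIZE=262144     # ASSUMPTION: 256 KiB per ZFS label, fully zeroed
SECTOR=512            # "Sectorsize: 512" from gpart list ada0
FIRST_LBA=34          # "first: 34": the primary GPT sits below this LBA
PART_OFFSET=24576     # "offset: 24576" of ada0p1

front_start=0
front_end=$(( 2 * LABEL_SIZE ))       # whole-disk labels 0 and 1
gpt_start=$SECTOR                     # GPT header is in LBA 1
gpt_end=$(( FIRST_LBA * SECTOR ))     # entries end before the first usable LBA
part_front_end=$(( PART_OFFSET + 2 * LABEL_SIZE ))   # ada0p1's own labels 0 and 1

if ranges_overlap "$front_start" "$front_end" "$gpt_start" "$gpt_end"; then
    echo "whole-disk labels 0/1 cover the primary GPT"
fi
if ranges_overlap "$front_start" "$front_end" "$PART_OFFSET" "$part_front_end"; then
    echo "whole-disk labels 0/1 also cover the start of the labels on ada0p1"
fi
```

If the assumption holds, both messages print, which would match the lost partition table and the pool that could no longer be found.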

This post is for others like me who stumble over zombie pools on used devices in the hopes they do not make the same mistake as I did.

Cheers
Alex
 

ShelLuser

Son of Beastie

Reaction score: 1,800
Messages: 3,600

alestrix said:
Oops! Seems like "labelclear" did not only clear the label, but it wiped all GPT info from the device, taking the partitions and the production pool residing on them with it.
It doesn't.

I get the impression that you're not using FreeBSD but a derivative, FreeNAS perhaps? Because labelclear does not behave this way on FreeBSD. So either you made a mistake somewhere yourself or... you used a derivative and assumed that it worked the same as FreeBSD while it didn't.

This is also one of the reasons most derivatives are not supported on these forums: they don't necessarily behave in the same way as FreeBSD does. FreeNAS for example is based on FreeBSD 12-CURRENT which is basically a developer snapshot which provides no guarantees at all that it will actually run. In fact: its use is even discouraged because of the snapshot aspect; this means that there could be bugs in the OS which could theoretically damage your system or worse.

But as mentioned: supported versions of FreeBSD do not behave in this way.

(edit)

Re-reading the post I now see what happened: a mistake on your end. See, you clearly showed us that ada0p1 was where the ZFS pool was located (at least parts of it). Yet the commands you used were:

[root@nas4free /proc]# zpool labelclear /dev/ada0
[root@nas4free /proc]# zpool labelclear -f /dev/ada0
[root@nas4free /proc]# zpool labelclear /dev/ada1
labelclear operation failed.
Vdev /dev/ada1 is a member of the exported pool "murdock".
Use "zpool labelclear -f /dev/ada1" to force the removal of label
information.
[root@nas4free /proc]# zpool labelclear -f /dev/ada1
If you had used # zpool labelclear ada0p1 then there wouldn't have been any problem. Also: this behavior is fully by design and not some kind of glitch.

See: a ZFS pool can be built on different kinds of storage media. You can use partitions on a HD (such as ada0p1) or you can use an entire disk. However, if you decide to use the entire disk, that disk won't have any other valid data on it. It won't have a boot record, it won't have a partition table, it won't contain anything except data related to a ZFS pool.

With that in mind the result of the command, especially with the -f (force) flag is somewhat to be expected.

That said, as I mentioned before, FreeBSD doesn't behave in this way: this command would not fully destroy your partitioning scheme.
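
A small guard for the whole-disk versus partition mix-up could look roughly like this. It is a sketch: child_nodes is a made-up helper that works purely on device names, and the listing is the tab-completion output quoted earlier in the thread (taken after the labels had been cleared, which is why ada0 shows no partitions there).

```shell
#!/bin/sh
# Hypothetical helper: given device names on stdin (whitespace separated),
# print those that are partitions or slices of the named disk.
child_nodes() {
    disk=${1#/dev/}    # accept both "ada2" and "/dev/ada2"
    tr -s ' \t' '\n\n' | while read -r name; do
        case $name in
            "$disk"[ps][0-9]*) echo "$name" ;;
        esac
    done
}

# Device listing as quoted earlier in the thread.
listing='ada0     ada1     ada2     ada2p1   ada3     ada3p1   ada4     ada4s1   ada4s1a  ada4s2'

for d in ada0 ada2 ada4; do
    kids=$(printf '%s\n' "$listing" | child_nodes "$d" | tr '\n' ' ' | sed 's/ $//')
    if [ -n "$kids" ]; then
        echo "$d has partitions ($kids): a whole-disk labelclear needs a second look"
    else
        echo "$d has no partition nodes"
    fi
done
```

On a live system the names would come from ls /dev | child_nodes ada0.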
 