[Solved] Geli-protected raidz1: weird status

Hello Community,

Last year I had a drive failure. I replaced the disk and, as far as I could tell, everything was OK. Yesterday I tried to update to FreeBSD 13, but after the update I was not able to import my storage pool.

Code:
freebsd# geli attach -k /root/geli.key /dev/ada2
geli: Cannot read metadata from /dev/ada2: Invalid argument.
geli: There was an error with at least one provider.
freebsd# geli attach -j -k /root/geli.key /dev/ada2
geli: Cannot read metadata from /root/geli.key: Inappropriate file type or format.
geli: Cannot read metadata from /dev/ada2: Invalid argument.
geli: There was an error with at least one provider.
freebsd# geli attach -j - -k /root/geli.key /dev/ada2
geli: Cannot read metadata from /dev/ada2: Invalid argument.
geli: There was an error with at least one provider.
freebsd# gpart show ada2
gpart: No such geom: ada2.

I rolled back to 12.2 and luckily my storage pool mounted successfully, but zpool list -v shows a weird status.

What I don't understand is how my pool actually works at the moment.

Code:
╰─ zpool list -v
NAME             SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
Storage         10.9T  7.60T  3.27T        -         -    10%    69%  1.00x  ONLINE  -
  raidz1        10.9T  7.60T  3.27T        -         -    10%  69.9%
    ada2            -      -      -        -         -      -      -
    ada3p2.eli      -      -      -        -         -      -      -
    ada1p2.eli      -      -      -        -         -      -      -
bootpool  1.98G   835M  1.17G        -         -    46%    41%  1.00x  ONLINE  -
  ada0p3  1.98G   835M  1.17G        -         -    46%  41.1%
zroot          228G   173G  55.2G        -         -    69%    75%  1.00x  ONLINE  -
  ada0p5.eli   228G   173G  55.2G        -         -    69%  75.8%

It looks like ada2 is in the raidz1, but geli was not able to encrypt the drive, so gpart is not able to see the partition table.
Code:
╰─ sudo gpart show ada2
gpart: No such geom: ada2.

╰─ sudo gpart show ada1
=>        34  7814037101  ada1  GPT  (3.6T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5K)

 BSD  sharky@freebsd  ~                                                                                                                                                                                                                                                                           9015
╰─ sudo gpart show ada3
=>        34  7814037101  ada3  GPT  (3.6T)
          34          94        - free -  (47K)
         128     4194304     1  freebsd-swap  (2.0G)
     4194432  7809842696     2  freebsd-zfs  (3.6T)
  7814037128           7        - free -  (3.5K)

Attaching still fails:
Code:
freebsd# geli  attach -k /root/geli.key /dev/ada2
geli: Cannot read metadata from /dev/ada2: Invalid argument.
geli: There was an error with at least one provider.

How should I deal with this situation?

Should I remove /dev/ada2 from the raidz1, wipe it, and start from scratch?
 
Just for documentation:

I took the drive offline in the pool:

zpool offline Storage ada2

# Destroy the partition table
gpart destroy ada2

# Create new partitions
gpart create -s gpt ada2
gpart add -a 4k -t freebsd-swap -s 2G ada2
gpart add -a 4k -t freebsd-zfs ada2

# I tried geli init ......, but ZFS was kind enough to tell me that there is a backup of the metadata and that I should use it.
# Awesome :)
geli restore /var/backups/ada2p2.eli /dev/ada2p2

# attach geli
geli attach -p -k /root/geli.key /dev/ada2p2

# replace disk
zpool replace -f Storage 8129401093822745805 ada2p2.eli

Now it is resilvering. I think I'm back on track!
 
Apparently, ada2 was part of the pool without any encryption: It's listed as ada2 instead of ada2p2.eli (like the other two). gpart found no partitions on ada2, because there were no partitions. ZFS will happily use a raw block device without any partition tables.

What you probably did when you added ada2 to the pool originally, was: you created partitions and set up encryption with geli, but then accidentally added ada2 to the pool instead of ada2p2.eli. ZFS then happily adopted ada2 as a raw block device, destroying your partition table and encryption layer in the process. It worked, but not the way you thought it did ;)

Lucky for you, raidz1 can lose one drive without dying, so you were able to take the accidentally unencrypted ada2 out of your raidz1 vdev (risky, but possible), re-partition it, get the encryption back on track and add the drive as ada2p2.eli, like you had originally wanted.
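
A quick way to spot this kind of mix-up early is to scan the members shown by zpool list -v for names that do not end in .eli. Below is a minimal sketch, assuming every data member of the pool is supposed to be a geli provider. The function name unencrypted_members is made up for illustration, and it only parses text on stdin, so it never touches the pool itself.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool list -v` output on stdin and prints the
# member names of the given pool that do NOT end in .eli.
unencrypted_members() {
    awk -v pool="$1" '
        # Lines without leading whitespace (header, pool names) start a new
        # section; remember whether it is the pool we were asked about.
        /^[^ \t]/ { in_pool = ($1 == pool); next }
        # Indented lines are vdevs. Skip grouping vdevs such as raidz1 or
        # mirror and every member that already carries the .eli suffix.
        NF && in_pool &&
        $1 !~ /^(raidz|mirror|spare|log|cache|special|dedup)/ &&
        $1 !~ /\.eli$/ { print $1 }
    '
}
```

With the output posted above, `zpool list -v | unencrypted_members Storage` would print ada2. An empty result means every member of that pool sits on top of geli.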

A tip for the future: in your setup, adding another drive could shift the system's drive numbering (turning ada3 into ada4, etc.), which could badly confuse your pool and require manual intervention. You can avoid this by referring to your pool drives by partition label instead of device name. GPT labels are assigned with gpart (the -l option of gpart add or gpart modify), not with geli: geli label is just an alias for geli init. A labelled partition shows up as /dev/gpt/somelabel, and you can use that name in your zpool. This makes your pool immune to changes in device numbering. (But I don't know how to change this for an existing pool.)
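
To make the label idea concrete, here is a rough sketch for a freshly created partition. It only prints the commands instead of running them, because geli init is destructive. The helper name label_plan and the example label are made up, the geli init flags are simply the ones used elsewhere in this thread, and the existing-pool case is not covered.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands for creating a
# GPT-labelled ZFS partition and putting geli on top of the label.
# Arguments: disk (e.g. ada2), label (made-up name), geli key file.
label_plan() {
    disk=$1
    label=$2
    key=$3
    # -l sets the GPT label; the partition then appears as /dev/gpt/<label>.
    echo "gpart add -a 4k -t freebsd-zfs -l $label $disk"
    # Initialise and attach geli through the label, not through adaXpY.
    echo "geli init -s 4096 -P -K $key -l 256 /dev/gpt/$label"
    echo "geli attach -p -k $key /dev/gpt/$label"
    # Last line: the provider name to hand to zpool create or zpool replace.
    echo "gpt/$label.eli"
}
```

For example, `label_plan ada2 storage-d2 /root/geli.key` prints three commands followed by gpt/storage-d2.eli. Review the commands before running any of them by hand.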
 
Another disk died, so I tried to use my steps above.

In the end it worked pretty quickly, but my assumption was that the metadata backup would always be there.

It was not.

I had to initialize geli from scratch:

```
root@freebsd:~ # geli init -s 4096 -P -K geli_Storage.key -l 256 /dev/vtbd2p2

Metadata backup for provider /dev/vtbd2p2 can be found in /var/backups/vtbd2p2.eli
and can be restored with the following command:

# geli restore /var/backups/vtbd2p2.eli /dev/vtbd2p2
```

After this command the geli provider was initialized and I could use it.
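
Since the backup cannot be taken for granted, it may be worth checking ahead of time that one exists for every encrypted member. This is a small sketch, assuming the backups are named <provider>.eli inside one directory, as in the messages above. The function name check_geli_backups is made up. The suggested geli backup command is only printed, never executed.

```shell
#!/bin/sh
# Hypothetical helper: for each provider name given (e.g. ada2p2), report
# whether <dir>/<provider>.eli exists. Returns non-zero if any are missing.
check_geli_backups() {
    dir=$1
    shift
    missing=0
    for prov in "$@"; do
        if [ -f "$dir/$prov.eli" ]; then
            echo "ok      $prov"
        else
            # geli backup copies the provider metadata into a file.
            echo "missing $prov (consider: geli backup /dev/$prov $dir/$prov.eli)"
            missing=1
        fi
    done
    return "$missing"
}
```

Usage would be something like `check_geli_backups /var/backups ada1p2 ada2p2 ada3p2`. Keeping a copy of those files somewhere off the machine is probably a good idea too, since /var/backups lives on the system disk.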
 