ZFS: Testing replacing a failed zroot disk

I've followed a bunch of advice from different threads on here and tried to test replacing a zroot disk, in case I ever end up in that situation.

I'm testing this on a FreeBSD VM running on vm-bhyve with uefi bootloader and uefivars enabled.
VM has 3 zvol disks. All 3 disks are created at the time of VM creation.

Code:
loader="uefi"
cpu=8
memory=8G
network0_type="virtio-net"
network0_switch="public"
disk0_type="virtio-blk"
disk0_name="disk0"
disk0_dev="sparse-zvol"
disk1_name="disk1"
disk1_dev="sparse-zvol"
disk1_type="virtio-blk"
disk2_name="disk2"
disk2_dev="sparse-zvol"
disk2_type="virtio-blk"
graphics="yes"
xhci_mouse="yes"
graphics_res="1600x900"
zfs_zvol_opts="volblocksize=128k"
uefi_vars="yes"

I installed FreeBSD on two disks in a mirror.

Then I copied the GPT

Code:
gpart backup vtbd0 | gpart restore -F vtbd2

The table on the new disk did look a bit different

Code:
root@:~ # gpart show -l
=>      40  41942960  vtbd0  GPT  (20G)
        40    532480      1  efiboot0  (260M)
    532520      2008         - free -  (1.0M)
    534528  41406464      2  zfs0  (20G)
  41940992      2008         - free -  (1.0M)

=>      40  41942960  vtbd1  GPT  (20G)
        40    532480      1  efiboot1  (260M)
    532520      2008         - free -  (1.0M)
    534528  41406464      2  zfs1  (20G)
  41940992      2008         - free -  (1.0M)

=>      34  41942973  vtbd2  GPT  (20G)
        34         6         - free -  (3.0K)
        40    532480      1  efiboot2  (260M)
    532520      2008         - free -  (1.0M)
    534528  41406464      2  zfs2  (20G)
  41940992      2015         - free -  (1.0M)

Edited the labels

Code:
gpart modify -i 1 -l efiboot2 vtbd2
gpart modify -i 2 -l zfs2 vtbd2

So then I did
Code:
dd if=/dev/vtbd0p1 of=/dev/vtbd2p1

Edited fstab

from
Code:
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/gpt/efiboot0               /boot/efi       msdosfs rw              2       2

to
Code:
# Device                Mountpoint      FStype  Options         Dump    Pass#
/dev/gpt/efiboot2               /boot/efi       msdosfs rw              2       2

Then attached the partition to zroot

Code:
zpool attach zroot /dev/vtbd0p2 /dev/vtbd2p2

zpool status also looked weird

Code:
root@:~ # zpool status
  pool: zroot
 state: ONLINE
status: One or more devices are configured to use a non-native block size.
        Expect reduced performance.
action: Replace affected devices with devices that support the
        configured block size, or migrate data to a properly configured
        pool.
  scan: resilvered 935M in 00:00:29 with 0 errors on Fri Jul 26 09:38:23 2024
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            vtbd0p2  ONLINE       0     0     0
            vtbd2p2  ONLINE       0     0     0  block size: 4096B configured, 16384B native

The second time around, zpool status looked fine

Code:
root@:~ # zpool status
  pool: zroot
 state: ONLINE
  scan: resilvered 934M in 00:00:20 with 0 errors on Fri Jul 26 10:13:55 2024
config:

        NAME         STATE     READ WRITE CKSUM
        zroot        ONLINE       0     0     0
          mirror-0   ONLINE       0     0     0
            vtbd0p2  ONLINE       0     0     0
            vtbd1p2  ONLINE       0     0     0
            vtbd2p2  ONLINE       0     0     0


This worked. I could boot into FreeBSD.






However, I could not get it to work using the below commands.


Code:
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/zfsloader  -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi/efi/freebsd/loader.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi/efi/boot/bootx64.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi.4th -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/loader.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/gptboot -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/zfsboot -p /boot/gptzfsboot -i 1 vtbd2

Some of those commands said the file is too large, so I guess either gpart restore did something wrong or I'm not supposed to use those files.

When I installed FreeBSD I chose GPT (UEFI). So which files are the correct ones to copy over with gpart bootcode? I did read that /boot/pmbr will load gptzfsboot, but I can't confirm that's the one because it didn't work for me.


When I did `gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd2` and edited fstab, the bootloader started but the partition could not be mounted. It said invalid signature in boot block.

I can see all of the datasets but I can not write to fstab file.


I also did "dd if=/boot/gptzfsboot of=/dev/vtbd2p1" and that didn't work either.


 
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/zfsloader -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi/efi/freebsd/loader.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi/efi/boot/bootx64.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/efi.4th -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/loader.efi -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/gptboot -p /boot/gptzfsboot -i 1 vtbd2
gpart bootcode -b /boot/zfsboot -p /boot/gptzfsboot -i 1 vtbd2

All of these commands are wrong and would lead to some disasters. Why did you run these commands while your system was booting well?

Never use gpart bootcode for EFI loaders. The only file that belongs after "-b" is /boot/pmbr. Both pmbr and gptzfsboot are for BIOS booting only. If you boot via EFI, changing these loaders has no effect.
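
To make the BIOS/UEFI split concrete, here is a minimal sketch that picks the update path from the boot method. It assumes the `machdep.bootmethod` sysctl (present on FreeBSD amd64, as far as I know) reports `UEFI` or `BIOS`. The function name `loader_update_hint` is made up for illustration, and it only prints a hint; it changes nothing on disk.

```sh
#!/bin/sh
# Hypothetical helper: print which boot-code update applies to a boot method.
# The argument is expected to be the value of `sysctl -n machdep.bootmethod`.
loader_update_hint() {
    case "$1" in
        UEFI)
            echo "UEFI: copy /boot/loader.efi into the mounted EFI system partition; do not use gpart bootcode"
            ;;
        BIOS)
            echo "BIOS: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i <freebsd-boot index> <disk>"
            ;;
        *)
            echo "unknown boot method: $1" >&2
            return 1
            ;;
    esac
}

# On FreeBSD one would call: loader_update_hint "$(sysctl -n machdep.bootmethod)"
loader_update_hint UEFI
```

Note that in the BIOS case the `-i` index should point at a freebsd-boot partition, not at the efi partition.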

If you want to know how to update boot loaders, see here: https://forums.freebsd.org/threads/update-of-the-bootcodes-for-a-gpt-scheme-x64-architecture.80163/
 
Why did you run these commands while your system was booting well?

:) This is just a VM for practice, so that when a real disk dies on the real machine I don't create a disaster, as you said. This is the way my real UEFI machines are installed, so I used the same setup for this VM.

I basically read about 10 Threads on here and most of them mentioned that command while also talking about zroot. Then it didn't seem right that I would need a mbr partition so I tried to copy some of the other ones. I couldn't figure out which ones.

Now that I've read that link, it would seem that I just need to mount the EFI partition on the new disk and copy /boot/loader.efi to that partition as a file.

So if I add a new Disk, use gpart backup and restore, is that part correct?
It's just that after I ran it, the partition table looked the same but there was that extra bit of free space. The rest starts on sector 40 and is identical.
But what about the
Code:
        34         6         - free -  (3.0K)
part above? Why does it show this free space when the other two disks do not?
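
On the 34 versus 40 question: on a disk with 512-byte sectors, a GPT with the standard 128 entries occupies LBA 0 to 33 at the front and the last 33 sectors at the back, so sector 34 is the earliest a partition could start. The sketch below redoes that arithmetic for a 20G disk and reproduces the `34  41942973` line shown for vtbd2. That the installer-created tables start at 40 because their usable area is rounded to a 4K boundary is my assumption, not something confirmed in this thread. The function name `gpt_usable_range` is hypothetical.

```sh
#!/bin/sh
# Print first usable LBA, last usable LBA and usable sector count for a GPT
# with 128 entries of 128 bytes each, on a disk with 512-byte sectors.
gpt_usable_range() {
    total=$1                                  # total sectors on the disk
    entry_sectors=$((128 * 128 / 512))        # 32 sectors of partition entries
    first=$((1 + 1 + entry_sectors))          # protective MBR + header + entries
    last=$((total - 1 - 1 - entry_sectors))   # last LBA, minus backup header and backup entries
    echo "$first $last $((last - first + 1))"
}

# 20 GiB disk: 20 * 1024^3 / 512 = 41943040 sectors
gpt_usable_range 41943040   # prints: 34 41943006 41942973
```

Either way, the partitions themselves start at sector 40 on all three disks, so the extra 6 sectors are simply unused space.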

I'll go test copying the .efi bootloader. Thanks a lot, I appreciate it.
 
Yeah that didn't work
Code:
root@:~ # mount -t msdosfs /dev/vtbd2p1 /mnt
mount_msdosfs: /dev/vtbd2p1: Invalid argument

OK, I did not format the filesystem, I just copied the GPT. That might explain it.
 
I have it working now, thanks again.

I did this part.
Code:
newfs_msdos -F16 /dev/partition-name
mount -t msdosfs /dev/partition-name /mnt
mkdir -p /mnt/efi/boot
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
umount /mnt
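
For anyone repeating this, here is a minimal sketch of the same steps wrapped in a dry-run helper that also installs the loader under `efi/freebsd/loader.efi`, since the linked thread suggests updating both paths. The function name `populate_esp`, the `RUN` switch and the `/dev/gpt/efiboot2` argument are my own illustration, not something from the linked guide. By default it only prints the commands.

```sh
#!/bin/sh
# Hypothetical helper: print (or, with RUN=1, execute) the steps that put the
# UEFI loader on a new disk's EFI system partition under both loader paths.
populate_esp() {
    part=$1              # e.g. /dev/gpt/efiboot2
    mnt=${2:-/mnt}       # temporary mount point
    # Echo the command unless RUN=1 is set in the environment.
    run() { if [ "${RUN:-0}" = 1 ]; then "$@"; else echo "$*"; fi; }

    run newfs_msdos -F16 "$part"          # same FAT type as used in this thread
    run mount -t msdosfs "$part" "$mnt"
    run mkdir -p "$mnt/efi/boot" "$mnt/efi/freebsd"
    run cp /boot/loader.efi "$mnt/efi/boot/bootx64.efi"
    run cp /boot/loader.efi "$mnt/efi/freebsd/loader.efi"
    run umount "$mnt"
}

# Dry run: prints the six commands without touching any disk.
populate_esp /dev/gpt/efiboot2
```

newfs_msdos destroys whatever is on the target partition, so it is worth checking the printed device name before running with RUN=1.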

But in that thread there's also this part:
Very important note: if you started with FreeBSD 14, the installer has activated a EFI boot entry that refers to this loader: /efi/freebsd/loader.efi. It means the system won't boot anymore on the default /efi/boot/bootx64.efi. So, update the file bootx64.efi will have no effect. I think it's safer to update both bootcodes.
Code:
cp /boot/loader.efi /boot/efi/efi/freebsd/loader.efi
I didn't do that part and the OS booted fine.

Code:
root@:~ # efibootmgr -v
Boot to FW : false
BootCurrent: 0008
Timeout    : 0 seconds
BootOrder  : 0008, 0000, 0001, 0002, 0003, 0004, 0005, 0006, 0007
+Boot0008* FreeBSD HD(1,GPT,2aed37f9-4b59-11ef-84c0-589cfc057827,0x28,0x82000)/File(\efi\freebsd\loader.efi)
                      vtbd0p1:/efi/freebsd/loader.efi (null)
 Boot0000* UiApp Fv(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
 Boot0001* UEFI BHYVE SATA DVD ROM BHYVE-AB57-BA0A-4AE9 PciRoot(0x0)/Pci(0x3,0x0)/Sata(0x0,0xffff,0x0)
 Boot0002* UEFI Misc Device PciRoot(0x0)/Pci(0x4,0x0)
 Boot0003* UEFI Misc Device 2 PciRoot(0x0)/Pci(0x5,0x0)
 Boot0004* UEFI Misc Device 3 PciRoot(0x0)/Pci(0x6,0x0)
 Boot0005* UEFI PXEv4 (MAC:589CFC057827) PciRoot(0x0)/Pci(0x7,0x0)/MAC(589cfc057827,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
 Boot0006* UEFI PXEv6 (MAC:589CFC057827) PciRoot(0x0)/Pci(0x7,0x0)/MAC(589cfc057827,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000)
 Boot0007* EFI Internal Shell Fv(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)

I just destroyed and recreated the first two disks, and then efibootmgr showed that the current boot entry was different. It works, but I can't work out what that means.

Code:
root@:~ # efibootmgr -v
Boot to FW : false
BootCurrent: 0004
Timeout    : 0 seconds
BootOrder  : 0008, 0000, 0002, 0003, 0004, 0005, 0006, 0007
 Boot0008* FreeBSD HD(1,GPT,2aed37f9-4b59-11ef-84c0-589cfc057827,0x28,0x82000)/File(\efi\freebsd\loader.efi)
 Boot0000* UiApp Fv(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
 Boot0002* UEFI Misc Device PciRoot(0x0)/Pci(0x4,0x0)
 Boot0003* UEFI Misc Device 2 PciRoot(0x0)/Pci(0x5,0x0)
+Boot0004* UEFI Misc Device 3 PciRoot(0x0)/Pci(0x6,0x0)
 Boot0005* UEFI PXEv4 (MAC:589CFC057827) PciRoot(0x0)/Pci(0x7,0x0)/MAC(589cfc057827,0x1)/IPv4(0.0.0.0,0x0,DHCP,0.0.0.0,0.0.0.0,0.0.0.0)
 Boot0006* UEFI PXEv6 (MAC:589CFC057827) PciRoot(0x0)/Pci(0x7,0x0)/MAC(589cfc057827,0x1)/IPv6(0000:0000:0000:0000:0000:0000:0000:0000,0x0,Static,0000:0000:0000:0000:0000:0000:0000:0000,0x40,0000:0000:0000:0000:0000:0000:0000:0000)
 Boot0007* EFI Internal Shell Fv(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
 
You type too fast. You should try to better understand before.

+Boot0008* FreeBSD HD(1,GPT,2aed37f9-4b59-11ef-84c0-589cfc057827,0x28,0x82000)/File(\efi\freebsd\loader.efi) vtbd0p1:/efi/freebsd/loader.efi (null)
Here, we see clearly that the current boot is vtbd0p1:/efi/freebsd/loader.efi. But you changed vtbd2p1 not vtbd0p1. So, this had no effect on the booting process.

+Boot0004* UEFI Misc Device 3 PciRoot(0x0)/Pci(0x6,0x0)
There, it uses the default path of an EFI loader, which is /efi/boot/bootx64.efi on an amd64 system.
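
To read this off quickly: in the two listings in this thread, the entry prefixed with `+` is the one matching BootCurrent. A small sketch that pulls that entry out of `efibootmgr -v` output follows. The function name `current_boot_entry` is hypothetical, and the sample input is copied from the output posted above.

```sh
#!/bin/sh
# Read `efibootmgr -v` output on stdin and print the name of the entry whose
# line starts with '+', i.e. the entry used for the current boot.
current_boot_entry() {
    sed -n 's/^+\(Boot[0-9A-Fa-f]*\).*/\1/p'
}

# Sample taken from the second listing in this thread.
current_boot_entry <<'EOF'
BootCurrent: 0004
BootOrder  : 0008, 0000, 0002, 0003, 0004, 0005, 0006, 0007
 Boot0008* FreeBSD HD(1,GPT,2aed37f9-4b59-11ef-84c0-589cfc057827,0x28,0x82000)/File(\efi\freebsd\loader.efi)
+Boot0004* UEFI Misc Device 3 PciRoot(0x0)/Pci(0x6,0x0)
EOF
```

On a live system the equivalent would be `efibootmgr -v | current_boot_entry`.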
 
You type too fast. You should try to better understand before.
True :)

Here, we see clearly that the current boot is vtbd0p1:/efi/freebsd/loader.efi. But you changed vtbd2p1 not vtbd0p1. So, this had no effect on the booting process.
That's what I don't understand. I edited fstab to say /dev/gpt/efiboot2 and I labeled the new disk's partition efiboot2, but it still showed booting from disk0. How come this had no effect on the boot process? Only when I destroyed the disk with partition efiboot0 did it show Boot0004.

I thought whatever is in fstab is what it boots from?
I feel like I destroyed something I wasn't supposed to, because I copied /boot/loader.efi to /mnt/efi/boot/bootx64.efi on disk0 again and now I get Boot0002.

So I guess that's what the guy was talking about when he said copy the file to both locations. He said to also copy /boot/loader.efi into /boot/efi/efi/freebsd/loader.efi
I guess that's the default bootloader location and I didn't copy it to that directory. I will try that and see.

I deleted /boot/efi/efi/boot/bootx64.efi and put the loader in /boot/efi/efi/freebsd/loader.efi, and I still get Boot0002. So I don't understand what Boot0008 is and why it is even there as an entry.
 
It's the variables stored in your EFI firmware that decide which loader actually boots, not fstab.

You can change boot order, create new boot var and manipulate them with efibootmgr(8).
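
As a rough illustration of that, the sketch below prints the kind of efibootmgr command that would register a loader on the new disk's mounted ESP as an active boot entry. The helper name `print_new_boot_entry`, the label `FreeBSD-disk2` and the mount path are placeholders. The flags (`-c` create, `-a` activate, `-L` label, `-l` loader) are as I understand them from efibootmgr(8), so check the man page before running the printed command.

```sh
#!/bin/sh
# Hypothetical helper: print the efibootmgr command that would register a
# loader file on a mounted ESP as a new, active boot entry. Nothing is run.
print_new_boot_entry() {
    loader=$1    # path to the loader file on the mounted ESP
    label=$2     # free-form label shown in the efibootmgr listing
    echo "efibootmgr -c -a -L \"$label\" -l \"$loader\""
}

# Assumes the new disk's ESP is mounted on /mnt (placeholder path).
print_new_boot_entry /mnt/efi/boot/bootx64.efi FreeBSD-disk2
```

The boot order can then be adjusted with `efibootmgr -o`, passing a comma-separated list of entry numbers.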
Ohh, haha, thanks. I forgot about that; yeah, I needed that explanation. I even turned on EFI variables so I could use them.

This now feels completed in my head :)
 