[Solved] Upgrading bootloader on ZFS mirror

jbo@

Developer
I'm upgrading a system from 13.2-RELEASE to 14.0-RELEASE.
The host has root-on-ZFS using a two-disk mirror:
Code:
p_stor_01# zpool status zroot
  pool: zroot
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        ada7p3  ONLINE       0     0     0
        ada8p3  ONLINE       0     0     0

errors: No known data errors
As such, my game plan is to first mount the ESP of ada7, copy over the new bootloader, then mount the ESP of ada8 and do the same.

Everything worked out as expected for ada7. However, I am unable to mount the ESP of ada8:
Code:
p_stor_01# mount -t msdosfs /dev/ada7p1 /boot/efi
p_stor_01# umount /boot/efi
p_stor_01# mount -t msdosfs /dev/ada8p1 /boot/efi
mount_msdosfs: /dev/ada8p1: Invalid argument

Here's gpart show of both disks:
Code:
p_stor_01# gpart show ada7
=>       40  500118112  ada7  GPT  (238G)
         40     532480     1  efi  (260M)
     532520       2008        - free -  (1.0M)
     534528   16777216     2  freebsd-swap  (8.0G)
   17311744  482805760     3  freebsd-zfs  (230G)
  500117504        648        - free -  (324K)
Code:
p_stor_01# gpart show ada8
=>       40  500118112  ada8  GPT  (238G)
         40     532480     1  efi  (260M)
     532520       2008        - free -  (1.0M)
     534528   16777216     2  freebsd-swap  (8.0G)
   17311744  482805760     3  freebsd-zfs  (230G)
  500117504        648        - free -  (324K)

Could somebody explain to me what I'm missing / doing wrong here?
 
Wasn't there an issue some time back? The installer would create a mirrored system just fine. It did create the freebsd-boot on the second drive (if it existed), but not the bootcode for the efi partition?
 
Only the ESP from the first disk is mounted. The other disks should each have a valid copy of it, so that if the first disk goes bad or isn't recognized, you can remove it and boot from any other disk in the mirror.

P.S. I will check whether there was a bug some time ago regarding copying the .efi file to all mirrored disks.
 
Wasn't there an issue some time back? The installer would create a mirrored system just fine. It did create the freebsd-boot on the second drive (if it existed), but not the bootcode for the efi partition?
Would you happen to have more info on this?
The system was initial clean-installed on a 13.x-RELEASE (not sure which minor version).
 
I've just checked: 13.2-RELEASE creates the secondary ESP without formatting it as FAT32 or copying the EFI file to it.

DEBUG: zfs_create_diskpart: gpart add -a 4k -l efiboot0 -t efi -s 260M "da0"
DEBUG: zfs_create_diskpart: retval=0 <output below>
da0p1 added
DEBUG: zfs_create_diskpart: configuring ESP at [/dev/gpt/efiboot0]
DEBUG: zfs_create_diskpart: newfs_msdos "/dev/gpt/efiboot0"
DEBUG: zfs_create_diskpart: retval=0 <output below>
/dev/gpt/efiboot0: 532288 sectors in 16634 FAT16 clusters (16384 bytes/cluster)
BytesPerSec=512 SecPerClust=32 ResSectors=1 FATs=2 RootDirEnts=512 Media=0xf0 FATsecs=65 SecPerTrack=63 Heads=255 HiddenSecs=0 HugeSectors=532480
DEBUG: zfs_create_diskpart: printf "$FSTAB_FMT" "/dev/gpt/efiboot0" "/boot/efi" "msdosfs" "rw" "2" "2" >> "/tmp/bsdinstall_etc/fstab"
DEBUG: zfs_create_diskpart: retval=0 <no output>
And then for efiboot1 on the secondary disk, the filesystem and the BOOTX64.efi are missing:
DEBUG: zfs_create_diskpart: gpart add -a 4k -l efiboot1 -t efi -s 260M "da1"
DEBUG: zfs_create_diskpart: retval=0 <output below>
da1p1 added
DEBUG: zfs_create_diskpart: gpart add -a 1m -l swap1 -t freebsd-swap -s 2147483648b "da1"
DEBUG: zfs_create_diskpart: retval=0 <output below>
da1p2 added
DEBUG: zfs_create_diskpart: zpool labelclear -f "/dev/da1p2"
DEBUG: zfs_create_diskpart: retval=255 <output below>
failed to clear label for /dev/da1p2
 
Yes, it's the same, but don't use dd to create the ESP as suggested in the PR. Use newfs_msdos to format the partition and then copy the loader.efi as explained in the wiki page.

p.s.
I also checked 14.0-RELEASE; it doesn't create it there either. Maybe there's some reason not to create it on all disks during installation.
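For the system in the opening post, the manual fix for the second disk would look something like this. This is a sketch, not tested here; the device name ada8p1 is taken from the gpart output above, so double-check it on your own system before running anything as root:

```shell
# Format the empty ESP on the second disk as FAT32 -- verify the device first!
newfs_msdos -F 32 /dev/ada8p1

# Mount it and copy the loader to both the FreeBSD-specific path
# and the UEFI fallback path, mirroring the layout on ada7p1.
mount -t msdosfs /dev/ada8p1 /boot/efi
mkdir -p /boot/efi/efi/freebsd /boot/efi/efi/boot
cp /boot/loader.efi /boot/efi/efi/freebsd/loader.efi
cp /boot/loader.efi /boot/efi/efi/boot/BOOTX64.efi
umount /boot/efi
```

After that, both ESPs carry the same loader, so either disk of the mirror can boot the machine.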
 
Yes, it's the same, but don't use dd to create the ESP as suggested in the PR. Use newfs_msdos to format the partition and then copy the loader.efi as explained in the wiki page.

p.s.
I also checked 14.0-RELEASE; it doesn't create it there either. Maybe there's some reason not to create it on all disks during installation.
Yes and yes! I prefer newfs_msdos over dd here, too. ;)
 
I think I found why only the first disk gets the FAT32 format.

There's a check in the zfs_create_diskpart() function for a null string on $efibootpart at line 869 of freebsd-src/usr.sbin/bsdinstall/scripts/zfsboot. When zfs_create_diskpart() is called for all disks from line 1193, it skips the rest of the disks because $efibootpart was already set to efibootpart="/dev/gpt/efiboot$index" on the first call. Setting $efibootpart back to null after the FAT32 format, or changing the if test so it runs for all disks, might fix this.
 
Setting $efibootpart back to null after the FAT32 format, or changing the if test so it runs for all disks, might fix this.
bsdinstall should know how many drives are used for mirror(s)/RAIDs and which drives are included in them.
So simply repeating the formatting and copying for each included drive seems to be what's needed.
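As an untested sketch, that repetition could be a simple loop over the ESP labels the installer creates (this assumes the installer's /dev/gpt/efiboot* labels exist and /mnt is free as a scratch mount point):

```shell
# Untested sketch: format every efibootN partition and copy the loader
# to the UEFI fallback path, so every disk of the mirror is bootable.
for esp in /dev/gpt/efiboot*; do
    newfs_msdos -F 32 "$esp"
    mount -t msdosfs "$esp" /mnt
    mkdir -p /mnt/efi/boot
    cp /boot/loader.efi /mnt/efi/boot/BOOTX64.efi
    umount /mnt
done
```

Only the first ESP would additionally get an fstab entry, as discussed below.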
 
Yes, the script calls zfs_create_diskpart() with the disk and index number, so it can create an ESP on every disk and format it as FAT32, but you don't want all the ESP partitions mounted in your fstab. So it may look like this:

Code:
            if [ -z "$efibootpart" ]; then
                efibootpart="/dev/gpt/efiboot$index"
                f_dprintf "$funcname: configuring ESP at [%s]" \
                          "${efibootpart}"

                f_eval_catch $funcname newfs_msdos "$NEWFS_ESP" \
                             "$efibootpart" \
                             || return $FAILURE
                f_eval_catch $funcname printf "$PRINTF_FSTAB" \
                             $efibootpart /boot/efi msdosfs \
                             rw 2 2 "$BSDINSTALL_TMPETC/fstab" \
                             || return $FAILURE
            else
                # Secondary ESPs: format only, no fstab entry
                efibootpart="/dev/gpt/efiboot$index"
                f_dprintf "$funcname: configuring ESP at [%s]" \
                          "${efibootpart}"

                f_eval_catch $funcname newfs_msdos "$NEWFS_ESP" \
                             "$efibootpart" \
                             || return $FAILURE
            fi

But then in bootconfig you need to mount each ESP partition one by one to copy the loader.efi into them via update_uefi_bootentry().
The formatting of all ESPs and the copying of the .efi was done before, and was removed in this commit:
 
Interesting... This is most likely indeed the issue I'm experiencing. I'm fairly sure that the host in question was initially cleanly/freshly installed with 13.2-RELEASE.

My ESPs have both a /boot/efi/efi/freebsd/loader.efi as well as a /boot/efi/efi/boot/BOOTX64.efi. Do I just populate both from /boot/loader.efi (i.e. same content)?

Is this some UEFI compatibility shenanigans?
 
Interesting... This is most likely indeed the issue I'm experiencing. I'm fairly sure that the host in question was initially cleanly/freshly installed with 13.2-RELEASE.

My ESPs have both a /boot/efi/efi/freebsd/loader.efi as well as a /boot/efi/efi/boot/BOOTX64.efi. Do I just populate both from /boot/loader.efi (i.e. same content)?

Is this some UEFI compatibility shenanigans?
Some UEFI setup routines rely on the name BOOTX64.efi; in that case you normally rename loader1.efi accordingly.
Other UEFI setup routines allow selecting a loader file of arbitrary name; in that case you can just copy loader1.efi.

Maybe you have the former case and just "played around" until it worked? I can hardly imagine that both files are involved in the boot process. Probably only BOOTX64.efi, if I had to guess.
 
Maybe you have the former case and just "played around" until it worked? I can hardly imagine that both files are involved in the boot process. Probably only BOOTX64.efi, if I had to guess.
The host in question was 100% a clean, fresh install of FreeBSD 13.x-RELEASE (most likely 13.2). I didn't dick around with it at all. It served as a production server until now.
 
On UEFI you can create arbitrary boot entries, which can all start different executables located on the ESP, i.e. the FreeBSD UEFI kernel loader loader.efi(8) (aka bootloader). See efibootmgr(8). If you don't have boot entries or none of them is set active, the EFI firmware tries to execute the fallback /efi/boot/bootx64.efi (on amd64) instead.

AFAIK the FreeBSD installer creates a boot entry for your installed system and points to /efi/freebsd/loader.efi as the executable to start. And yes, /efi/freebsd/loader.efi and /efi/boot/bootx64.efi are the same file /boot/loader.efi copied to two places. bootx64.efi as a fallback.

Side note: I don't set up /efi/freebsd/loader.efi and specific boot entries on my machines. I always go with simple cp -a /boot/loader.efi /boot/efi/efi/boot/bootx64.efi and be done with it.
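For completeness, registering such a boot entry by hand looks roughly like this (see efibootmgr(8); the label and the loader path are examples, and the ESP must be mounted):

```shell
# Create a new UEFI boot entry pointing at the FreeBSD loader on the
# mounted ESP, and activate it (label and path are examples).
efibootmgr -c -a -l /boot/efi/efi/freebsd/loader.efi -L "FreeBSD-14"

# Verify the new entry and the boot order.
efibootmgr -v
```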
 
You can see what loader is actually booting with efibootmgr -v.

The upgrade to 14.0 is somewhat strange on this point. It sometimes makes a new entry that uses /efi/freebsd/loader.efi, so this becomes the new loader that actually boots.

On several VirtualBox VM upgrades, it didn't make a new boot entry, but it did on a bare-metal machine. And it messed something up: now efibootmgr -v gets stuck in an endless loop when I use it on that machine.

A new install of 14.0-RELEASE will create /efi/freebsd/loader.efi and place it first in the boot order.
 
Other UEFI setup routines allow selecting a loader file of arbitrary name. In that case you can just copy loader1.efi.
For this type of configuration, registering the loader with the UEFI boot manager using efibootmgr and NON-BROKEN SUPPORT BY THE UEFI FIRMWARE are mandatory. If I recall correctly, I've heard that some (usually old) UEFI firmware is broken and cannot boot FreeBSD this way.
 
Just a thought (untested), but possibly mirroring the ESP with a GEOM mirror helps?
Usually the ESP is only read by the UEFI firmware, and the firmware doesn't care about mirrors on the GEOM layer.
One thing to be aware of would be firmware updates that use the ESP as temporary storage. I've never used GEOM mirror, so I don't know what happens in such a case, other than that the firmware image used for the upgrade is written from the FreeBSD environment.
 
Why they decided to write the EFI boot loader non-redundantly to just one drive is beyond my understanding.
A little loop, is that so difficult?
And BOOTx64.efi alone is sufficient, so why the other variants?
And furthermore, why not also copy the CSM booter, as it only costs half a megabyte of space?
*confused*
 
And BOOTx64.efi alone is sufficient, so why the other variants?
Basically for multi-boot. Blindly overwriting the UEFI default loader, such as EFI/BOOT/BOOTx64.efi for amd64, can cause a coexisting OS to fail to boot.
Overwriting the default one SHALL be intentional on the admin's part.
What installers such as bsdinstall can blindly overwrite is the OS-specific one, and this loader must be registered with the UEFI boot manager. On FreeBSD, this is done by efibootmgr.
 
On FreeBSD, this is done by efibootmgr.
Ah great, that helped me a lot understanding the matter.

In case the UEFI loader has to be installed on a system without UEFI, I am not sure whether efibootmgr is available there.
In this case, I think the only way is to create the EFI partition with BOOTx64.efi, like I do in the SkunkCloner, to make sure the cloned drive can boot on either a traditional or a UEFI system.

If the loader is being created on a UEFI system, the efibootmgr manpage says it is sufficient to have the EFI partition mounted; an fstab entry seems not necessary. Thus this can be done easily in a loop walking all mirror drives.

But what I do not yet understand is the EFI path.
Apparently it is not really necessary to use efi/efi. The source I looked at when creating the boot-stuff writer loop in my SkunkCloner used efi0/efi. Maybe the different paths are something multi-boot related, too?

Btw, maybe it's also not necessary to use FAT32 for that small, diskette-sized EFI partition; in my tests things worked fine with FAT12.
 
Just a thought (untested), but possibly, mirror ESP with geom mirror helps?
My understanding of GEOM mirrors is that the mirror is at the device level.
You can create a mirror with ada0 and ada1, not ada0p1 and ada1p1.
All the examples I've seen mirror the whole device and then create partitions on the mirror, which creates them on the individual devices.
Now, I think one could create a GEOM mirror, create partitions on it, and then create a single zpool on a GEOM mirror partition, but that would be different from a ZFS mirror vdev.

I think the answer, as others have said, is that the installer needs to be more aware of mirror devices and format/do the right thing on all the boot partitions (EFI or BIOS).
 
Btw, maybe it's also not necessary to use FAT32 for that small diskette-sized EFI partition, in my tests stuff worked fine with FAT12.

Yes, on both my laptop and desktop a 1MB FAT12 partition containing only /efi/boot/bootx64.efi works. I am not sure if this violates the UEFI standard, but I do not want to use a 260MB+ partition for a ~900KB binary. I mean, storage is cheap these days, but c'mon...
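An untested sketch of such a minimal ESP, assuming a fresh efi partition on ada0 (device, size and label are examples; as said above, this may not be strictly conformant with the UEFI spec, even though it booted fine in my tests):

```shell
# Untested sketch: a tiny FAT12 ESP holding only the fallback loader.
gpart add -t efi -s 2M -l tinyesp ada0     # small efi partition (example)
newfs_msdos -F 12 /dev/gpt/tinyesp         # FAT12 instead of FAT32
mount -t msdosfs /dev/gpt/tinyesp /mnt
mkdir -p /mnt/efi/boot
cp /boot/loader.efi /mnt/efi/boot/bootx64.efi
umount /mnt
```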
 