TL;dr: I'm researching how to boot and recover an rm -rf'ed system from a snapshot (hypothetically!). What are the germane loader.conf entries I need to enter by hand at the loader prompt? I'm giving it:
Code:
currdev=zfs:tank/ROOT/default@snapshot:
vfs.root.mountfrom=zfs:tank/ROOT/default@snapshot
But that seems insufficient.
Long version:
Mostly because of an idea that struck me, rather than any real necessity, I am trying to construct a demonstration of how to resurrect a thoroughly damaged FreeBSD-on-ZFS installation, with the only prior knowledge being the name of a recent, valid, recursive snapshot of the pool. Sure, I have a thumb drive I could do that with, but the idea that struck me is somewhat more awesome if I can get it to work.
That idea is to boot the kernel from a snapshot of the normal zroot/ROOT/default filesystem, while also setting vfs.root.mountfrom to that snapshot. The end effect I'm hoping for is to be able to boot into single-user mode using that snapshot as a read-only root filesystem. Once there, since zfs and zpool live in /sbin, which is typically in the .../ROOT/default filesystem, I should be able to do a zfs rollback -r on my filesystems, provided that I boot from an OLDER snapshot than the one I'm rolling back to -- that is, my single-user shell can't be rooted in a snapshot that would be destroyed by the rollback.
To my surprise, this works. Sort of. A test case seems to be flawless from everything I know to examine. But a real-world follow-up doesn't work. Here's what I have:
Code:
MacUnix : /root# zfs list -rt snap macunix/ROOT/default
NAME USED AVAIL REFER MOUNTPOINT
macunix/ROOT/default@empty 96K - 112K -
macunix/ROOT/default@20240614-000500-MACU 684K - 205M -
macunix/ROOT/default@20240615-000500-MACU 336K - 205M -
macunix/ROOT/default@20240616-000500-MACU 320K - 205M -
macunix/ROOT/default@20240617-000500-MACU 320K - 205M -
macunix/ROOT/default@hold-my-beer-and-watch-this 508K - 205M -
My intent is to boot from the daily snapshot taken at 00:05:00 on Jun 17 2024, and once there, roll back all filesystems to the safety snapshot @hold-my-beer-and-watch-this.
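Once I'm in that single-user shell, the recovery itself should be nothing more than ordinary zfs commands. A rough, untested sketch of what I plan to run (dataset names from the listing above):

```shell
# Rooted read-only in the @20240617 snapshot, roll the live datasets
# back to the safety snapshot. -r also destroys any snapshots newer
# than the target, which is why my shell must not be rooted in one
# of those.
zfs rollback -r macunix/ROOT/default@hold-my-beer-and-watch-this
zfs rollback -r macunix/usr@hold-my-beer-and-watch-this
# ...repeat for the rest of the pool's datasets, then:
reboot
```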
So I reboot. Escape to loader prompt, then:
Code:
OK set currdev=zfs:macunix/ROOT/default@20240617-000500-MACU:
OK ls
/
d home
d jail
d usr
d tmp
d var
d bin
d boot
d dev
d etc
d lib
d libexec
d media
d mnt
d net
d proc
d rescue
d root
d sbin
entropy
COPYRIGHT
.profile
.cshrc
OK set vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
OK boot
The system boots into single-user mode with errors that I pretty much expect and can live with:
Code:
mount: macunix/ROOT/default@20240617-000500-MACU: Operation not supported
Mounting root filesystem rw failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2024-06-17T13:40:34.207315-07:00 - init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:
Cannot read termcap database;
using dumb terminal settings.
: \t />
I view this as first-stage success: I have loaded a kernel from a snapshot, and booted into single-user mode with that snapshot as my read-only root filesystem. df confirms that I'm running from a mounted snapshot:
Code:
: \t /> df -h .
Filesystem Size Used Avail Capacity Mounted on
macunix/ROOT/default@20240617-000500-MACU 871G 205G 871G 0% /
Now what I want to do is prove that I can blow away my live root filesystem, while keeping the snapshots, and still be able to boot back into this read-only root using snapshot @20240617-000500-MACU.
So I set about rm-ing the live filesystem:
Code:
: \t /> mount -t zfs macunix/ROOT/default
: \t /> chflags -R noschg /mnt/
: \t /> rm -rf /mnt/.??* /mnt/*
: \t /> ls -a /mnt
./ ../
: \t /> mkdir /mnt/usr
: \t /> mount -t zfs macunix/usr /mnt/usr
: \t /> chflags -R noschg /mnt/usr
: \t /> rm -rf /mnt/usr/.??* /mnt/usr/*
: \t /> ls -a /mnt/usr
./ ../
Now I have no /boot directory in my root filesystem (or anything else!), but I still have all the snapshots that I started with:
Code:
: \t /> zfs list -rt snap macunix/ROOT/default
NAME USED AVAIL REFER MOUNTPOINT
macunix/ROOT/default@empty 96K - 112K -
macunix/ROOT/default@20240614-000500-MACU 684K - 205M -
macunix/ROOT/default@20240615-000500-MACU 336K - 205M -
macunix/ROOT/default@20240616-000500-MACU 320K - 205M -
macunix/ROOT/default@20240617-000500-MACU 320K - 205M -
macunix/ROOT/default@hold-my-beer-and-watch-this 508K - 205M -
: \t /> zfs list -d1 -rt snap macunix/usr
NAME USED AVAIL REFER MOUNTPOINT
macunix/usr@empty 80K - 96K -
macunix/usr@20240614-000500-MACU 1.84M - 331M -
macunix/usr@20240615-000500-MACU 1000K - 331M -
macunix/usr@20240616-000500-MACU 904K - 331M -
macunix/usr@20240617-000500-MACU 904K - 331M -
macunix/usr@hold-my-beer-and-watch-this 2.12M - 331M -
So my interpretation of the status at this point is that I have demonstrated how to load a kernel from a snapshot, and how to boot to single-user mode using a read-only root filesystem taken from that same snapshot. "mount" and "df -h" both confirm this:
Code:
: \t /> mount
macunix/ROOT/default@20240617-000500-MACU on / (zfs, local, noatime, read-only, nfsv4acls)
devfs of /dev/ (devfs)
macunix/ROOT/default on /mnt (zfs, local, nfsv4acls)
macunix/usr on /mnt/usr (zfs, local, nfsv4acls)
: \t /> df -h
Filesystem Size Used Avail Capacity Mounted on
macunix/ROOT/default@20240617-000500-MACU 871G 205G 871G 0% /
devfs 1.0K 0B 1.0K 0% /dev
macunix/ROOT/default 871G 96K 871G 0% /mnt
macunix/usr 871G 96K 871G 0% /mnt/usr
And I also know that the live filesystems on my former / and /usr filesystems are wiped:
Code:
: \t /> ls -al /mnt /mnt/usr
/mnt:
total 10
drwxr-xr-x 3 root wheel 3 Jun 17 13:50 ./
drwxr-xr-x 22 root wheel 26 Jun 14 10:42 ../
drwxr-xr-x 2 root wheel 2 Jun 17 13:51 usr/
/mnt/usr:
total 1
drwxr-xr-x 2 root wheel 2 Jun 17 13:50 ./
drwxr-xr-x 3 root wheel 3 Jun 14 10:42 ../
So now I want to repeat the boot process. I didn't use the live filesystem the first time; I set currdev and vfs.root.mountfrom to point to the snapshot of macunix/ROOT/default. Let's repeat that same boot dialog again:
Code:
: \t /> reboot
Consoles: EFI console
Reading loader env vars from /efi/freebsd/loader.env
Setting currdev to disk0p1:
FreeBSD/amd64 EFI loader, Revision 1.1
Command line arguments: loader.efi
Image base: 0x81d1c000
EFI version: 2.40
EFI Firmware: Apple (rev 1.00)
Console: efi (0)
Load Path: \EFI\BOOT\BOOTX64.efi
Load Device: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)
BootCurrent: 0080
BootOrder: 0080[*]
BootInfo Path: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)/\E
FI\BOOT\BOOTX64.efi
Ignoring Boot0080: Only one DP found
Trying ESP: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)
Setting currdev to disk0p1:
Trying: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(2,GPT,(different partID),0x400022,0x20000000)
Setting currdev to disk0p2:
Trying: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(3,GPT,(another different partID),0x2400022,0x72306D6D)
setting currdev to zfs:macunix/ROOT/default:
Failed to find bootable partition
ERROR: cannot open /boot/lua/loader.lua: no such file or directory.
Type '?' for a list of commands, 'help' for more detailed help.
OK
I totally get 100% of all that (I think). Since there is no longer any /boot directory, no /boot/loader.conf, etc., loader.efi's only choice is to look for the ZFS filesystem referenced in the pool's bootfs property. When it couldn't find a /boot directory there to load the next-stage loader.lua, or any other config details, it was out of options.
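(For reference, that fallback dataset is visible from any working shell via the pool's bootfs property -- and as far as I can tell it can only name a live dataset, not a snapshot:)

```shell
# Show which dataset loader.efi falls back to:
zpool get bootfs macunix
# It could later be repointed at a different boot environment, e.g.:
# zpool set bootfs=macunix/ROOT/default macunix
```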
As I've pondered this today, the thought has occurred to me that since /boot is no longer there for the first-stage (?) loader, none of the defaults in /boot/defaults/loader.conf get set either, and it seems quite likely that those missing defaults are what is preventing my "real-world" example from succeeding when I try to boot from a valid snapshot but an empty live bootfs filesystem. But what puzzles me most is that the kernel boots fine; the failure is the inability to mount the root filesystem, even though it should be mounting the same read-only filesystem that was successfully mounted in the initial test.
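If that theory holds, one thing I haven't tried yet is doing by hand what loader.conf normally does: explicitly loading the kernel and the zfs module (zfs_load="YES" in a stock root-on-ZFS install) before booting. Untested, and assuming the load command resolves paths relative to currdev:

```
OK set currdev=zfs:macunix/ROOT/default@20240617-000500-MACU:
OK load /boot/kernel/kernel
OK load /boot/kernel/zfs.ko
OK set vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
OK boot -s
```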
Unlike the first time, when I specified those options to load the kernel and the root filesystem from a snapshot, this time the root mount fails. Instead, I get the classic mountroot> prompt:
Code:
OK set currdev=zfs:macunix/ROOT/default@20240617-000500-MACU:
OK set vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
OK boot
...
Loader variables:
vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
Manual root filesystem specification:
<fstype>:<device> [options]
Mount <device> using filesystem <fstype>
and with the specified (optional) option list.
eg. ufs:/dev/da0s1a
zfs:zroot/ROOT/default
cd9660:/dev/cd0 ro
(which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)
? List valid disk boot devices
. Yield 1 second (for background tasks)
<empty line> Abort manual input
mountroot>
Where even a last-ditch attempt fails:
Code:
mountroot> zfs:macunix/ROOT/default@20240617-000500-MACU ro
Trying to mount root from zfs:macunix/ROOT/default@20240617-000500-MACU [ro]...
Mounting from zfs:macunix/ROOT/default@20240617-000500-MACU failed with error 2: unknown file system.
So, in addition to:
Code:
currdev=zfs:tank/ROOT/default@snapshot:
vfs.root.mountfrom=zfs:tank/ROOT/default@snapshot
what incantations do I need at the loader prompt to successfully boot into a single-user, read-only root environment based on snapshot tank/ROOT/default@snapshot? Something, perhaps, to get loader.lua to re-read the files in loader_conf_files and loader_conf_dirs relative to the currdev snapshot, before launching the /boot/kernel/kernel found in the snapshot?
Better still, is there a way to re-load the loader_conf settings and then launch the boot menu, so that I can choose from the various kernels that might be present in the snapshot I'm booting from?