Booting from a ZFS snapshot into a single-user read-only root shell

TL;DR: I'm researching how to boot and recover an rm -rf'ed system from a snapshot (hypothetically!). Which loader.conf entries do I need to enter by hand at the loader prompt? So far I'm giving it

Code:
currdev=zfs:tank/ROOT/default@snapshot:
vfs.root.mountfrom=zfs:tank/ROOT/default@snapshot

But that seems insufficient.

Long version:

Mostly because of an idea that struck me, rather than any real necessity, I am trying to construct a demonstration of how to resurrect a thoroughly damaged FreeBSD on ZFS installation, with the only prior knowledge being the name of a recent, valid, recursive snapshot of the pool. Sure, I have a thumbdrive I could do that with, but the idea that struck me is somewhat more awesome if I can get it to work.

That idea is to boot the kernel from a snapshot of the normal zroot/ROOT/default filesystem, while also setting vfs.root.mountfrom to that snapshot. The end effect I'm hoping for is to be able to boot into single-user mode using that snapshot as a read-only root filesystem. Once there, since zfs and zpool live in /sbin, which is typically in the .../ROOT/default filesystem, I should be able to do a zfs rollback -r on my file systems, provided that I boot from an OLDER snapshot than the one I'm rolling back to. That is, my single-user shell can't be rooted in a snapshot that would be destroyed by the rollback.
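
To make that ordering constraint concrete, here is a minimal sh sketch of the check I have in mind. The function name snap_is_older is my own invention, and it assumes the snapshot names arrive on stdin oldest first (for example from zfs list sorted by creation). It sticks to /bin/sh builtins, since /usr may not be mounted in single-user mode.

```shell
# snap_is_older BOOT TARGET   (hypothetical helper, not part of FreeBSD)
# Reads snapshot names on stdin, one per line, oldest first.
# Succeeds only if BOOT appears strictly before TARGET in that list.
snap_is_older() {
    boot=$1
    target=$2
    seen_boot=0
    while IFS= read -r name; do
        if [ "$name" = "$target" ]; then
            # TARGET reached: safe only if BOOT was already seen.
            if [ "$seen_boot" -eq 1 ]; then return 0; else return 1; fi
        fi
        if [ "$name" = "$boot" ]; then
            seen_boot=1
        fi
    done
    # TARGET never appeared in the list.
    return 1
}
```

Intended use, assuming the dataset names from my own pool: zfs list -H -d 1 -t snapshot -o name -s creation macunix/ROOT/default | snap_is_older "$boot" "$target" && echo "rollback will not destroy my root".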

To my surprise, this works. Sort of. A test case seems to be flawless from everything I know to examine. But a real-world follow-up doesn't work. Here's what I have:

Code:
MacUnix : /root# zfs list -rt snap macunix/ROOT/default
NAME                                               USED  AVAIL  REFER  MOUNTPOINT
macunix/ROOT/default@empty                          96K      -   112K  -
macunix/ROOT/default@20240614-000500-MACU          684K      -   205M  -
macunix/ROOT/default@20240615-000500-MACU          336K      -   205M  -
macunix/ROOT/default@20240616-000500-MACU          320K      -   205M  -
macunix/ROOT/default@20240617-000500-MACU          320K      -   205M  -
macunix/ROOT/default@hold-my-beer-and-watch-this   508K      -   205M  -

My intent is to boot from the daily snapshot taken at 00:05:00 am on Jun 17 2024, and once there, roll back all filesystems to the safety snapshot @hold-my-beer-and-watch-this.
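
For the rollback step itself, this is roughly what I expect to run, sketched as a dry run. Note that the -r in zfs rollback -r destroys snapshots newer than the target; it does not recurse into child datasets, so each dataset needs its own rollback. rollback_plan is a hypothetical name, and the sketch assumes every dataset fed to it actually has a snapshot of the given name.

```shell
# rollback_plan SNAPNAME   (hypothetical helper, not part of FreeBSD)
# Reads dataset names on stdin and prints one "zfs rollback -r" command
# per dataset. It only prints; nothing is rolled back by this function.
rollback_plan() {
    while IFS= read -r ds; do
        if [ -n "$ds" ]; then
            printf 'zfs rollback -r %s@%s\n' "$ds" "$1"
        fi
    done
    return 0
}
```

Intended use: zfs list -H -o name -r -t filesystem macunix | rollback_plan hold-my-beer-and-watch-this, then review the printed commands before piping them to sh.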

So I reboot. Escape to loader prompt, then:

Code:
OK set currdev=zfs:macunix/ROOT/default@20240617-000500-MACU:
OK ls
/
 d  home
 d  jail
 d  usr
 d  tmp
 d  var
 d  bin
 d  boot
 d  dev
 d  etc
 d  lib
 d  libexec
 d  media
 d  mnt
 d  net
 d  proc
 d  rescue
 d  root
 d  sbin
    entropy
    COPYRIGHT
    .profile
    .cshrc
OK set vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
OK boot

The system boots into single-user mode with errors that I pretty much expect and can live with:

Code:
mount: macunix/ROOT/default@20240617-000500-MACU: Operation not supported
Mounting root filesystem rw failed, startup aborted
ERROR: ABORTING BOOT (sending SIGTERM to parent)!
2024-06-17T13:40:34.207315-07:00 - init 1 - - /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:
Cannot read termcap database;
using dumb terminal settings.
: \t />

I view this as first-stage success: I have loaded a kernel from a snapshot, and booted into single-user mode with that snapshot as my read-only root filesystem. df confirms that I'm running from a mounted snapshot:

Code:
: \t /> df -h .
Filesystem                                      Size    Used    Avail Capacity  Mounted on
macunix/ROOT/default@20240617-000500-MACU       871G    205G     871G     0%    /
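
The same check can be scripted rather than eyeballed. This is a small sketch that assumes fstab-style input (device, mount point, type, ...) such as mount -p prints; root_is_snapshot is a name I made up, and it uses only sh builtins so it can run from the snapshot root.

```shell
# root_is_snapshot   (hypothetical helper, not part of FreeBSD)
# Reads fstab-style lines on stdin and succeeds if the entry mounted
# on / is a ZFS dataset whose name contains "@", i.e. a snapshot.
root_is_snapshot() {
    while read -r dev mnt fstype _rest; do
        if [ "$mnt" = "/" ] && [ "$fstype" = "zfs" ]; then
            case $dev in
                *@*) return 0 ;;
            esac
        fi
    done
    return 1
}
```

Intended use: mount -p | root_is_snapshot && echo "root is a snapshot".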

Now what I want to do is prove that I can blow away my live root filesystem, while keeping the snapshots, and still be able to boot back into this read-only root using snapshot @20240617-000500-MACU.

So I set about rm-ing the live filesystem:

Code:
: \t /> mount -t zfs macunix/ROOT/default /mnt
: \t /> chflags -R noschg /mnt/
: \t /> rm -rf /mnt/.??* /mnt/*
: \t /> ls -a /mnt
./      ../
: \t /> mkdir /mnt/usr
: \t /> mount -t zfs macunix/usr /mnt/usr
: \t /> chflags -R noschg /mnt/usr
: \t /> rm -rf /mnt/usr/.??* /mnt/usr/*
: \t /> ls -a /mnt/usr
./      ../

Now I have no /boot directory in my root filesystem (or anything else!), but I still have all the snapshots that I started with:

Code:
: \t /> zfs list -rt snap macunix/ROOT/default
NAME                                               USED  AVAIL  REFER  MOUNTPOINT
macunix/ROOT/default@empty                          96K      -   112K  -
macunix/ROOT/default@20240614-000500-MACU          684K      -   205M  -
macunix/ROOT/default@20240615-000500-MACU          336K      -   205M  -
macunix/ROOT/default@20240616-000500-MACU          320K      -   205M  -
macunix/ROOT/default@20240617-000500-MACU          320K      -   205M  -
macunix/ROOT/default@hold-my-beer-and-watch-this   508K      -   205M  -
: \t /> zfs list -d1 -rt snap macunix/usr
NAME                                      USED  AVAIL  REFER  MOUNTPOINT
macunix/usr@empty                          80K      -    96K  -
macunix/usr@20240614-000500-MACU         1.84M      -   331M  -
macunix/usr@20240615-000500-MACU         1000K      -   331M  -
macunix/usr@20240616-000500-MACU          904K      -   331M  -
macunix/usr@20240617-000500-MACU          904K      -   331M  -
macunix/usr@hold-my-beer-and-watch-this  2.12M      -   331M  -

So my interpretation of the status at this point is that I have demonstrated how to load a kernel from a snapshot, and how to boot to single-user mode using a read-only root filesystem taken from the same snapshot. "mount" and "df -h" both confirm this:

Code:
: \t /> mount
macunix/ROOT/default@20240617-000500-MACU on / (zfs, local, noatime, read-only, nfsv4acls)
devfs on /dev (devfs)
macunix/ROOT/default on /mnt (zfs, local, nfsv4acls)
macunix/usr on /mnt/usr (zfs, local, nfsv4acls)
: \t /> df -h
Filesystem                                      Size    Used    Avail Capacity  Mounted on
macunix/ROOT/default@20240617-000500-MACU       871G    205G     871G     0%    /
devfs                                           1.0K      0B     1.0K     0%    /dev
macunix/ROOT/default                            871G     96K     871G     0%    /mnt
macunix/usr                                     871G     96K     871G     0%    /mnt/usr

And I also know that the live filesystems on my former / and /usr filesystems are wiped:

Code:
: \t /> ls -al /mnt /mnt/usr
/mnt:
total 10
drwxr-xr-x   3 root wheel  3 Jun 17 13:50 ./
drwxr-xr-x  22 root wheel 26 Jun 14 10:42 ../
drwxr-xr-x   2 root wheel  2 Jun 17 13:51 usr/

/mnt/usr:
total 1
drwxr-xr-x  2 root wheel 2 Jun 17 13:50 ./
drwxr-xr-x  3 root wheel 3 Jun 14 10:42 ../

So now I want to repeat the boot process. I didn't use the live filesystem the first time; I set currdev and vfs.root.mountfrom to point to the snapshot of macunix/ROOT/default. Let's repeat that same boot dialog:

Code:
: \t /> reboot

Consoles: EFI console
    Reading loader env vars from /efi/freebsd/loader.env
Setting currdev to disk0p1:
FreeBSD/amd64 EFI loader, Revision 1.1

   Command line arguments: loader.efi
   Image base: 0x81d1c000
   EFI version: 2.40
   EFI Firmware: Apple (rev 1.00)
   Console: efi (0)
   Load Path: \EFI\BOOT\BOOTX64.efi
   Load Device: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)
   BootCurrent: 0080
   BootOrder: 0080[*]
   BootInfo Path: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)/\EFI\BOOT\BOOTX64.efi
Ignoring Boot0080: Only one DP found
Trying ESP: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(1,GPT,A3BBE450-26BB-4892-8654-E9E5AAA8C285,0x22,0x400000)
Setting currdev to disk0p1:
Trying: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(2,GPT,(different partID),0x400022,0x20000000)
Setting currdev to disk0p2:
Trying: PciRoot(0x0)/Pci(0x17,0x0)/Sata(0x2,0x0,0x0)/HD(3,GPT,(another different partID),0x2400022,0x72306D6D)
setting currdev to zfs:macunix/ROOT/default:
Failed to find bootable partition
ERROR: cannot open /boot/lua/loader.lua: no such file or directory.

Type '?' for a list of commands, 'help' for more detailed help.
OK

I totally get 100% of all that (I think). Since there is no longer any /boot directory, no /boot/loader.conf etc., loader.efi's only choice is to look in the ZFS filesystem referenced by the pool's bootfs property. When it couldn't find a /boot directory there to load the next-stage loader.lua, or any other config details, it was out of options.

As I've pondered this today, the thought has occurred to me that since /boot is no longer there for the first-stage (?) loader, none of the defaults in /boot/defaults/loader.conf are set either, and it seems quite likely that one or more of those missing defaults is what prevents my "real-world" example from succeeding when I try to boot from a valid snapshot with an empty live bootfs filesystem. What puzzles me most is that the kernel boots fine, yet the root filesystem fails to mount, even though it should be the same read-only filesystem that was successfully mounted in the initial test.

Unlike when I specified those options the first time, to load the kernel and the root filesystem from a snapshot, this time the root mount fails. Instead, I get the classic mountroot> prompt:

Code:
OK set currdev=zfs:macunix/ROOT/default@20240617-000500-MACU:
OK set vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU
OK boot
...
Loader variables:
  vfs.root.mountfrom=zfs:macunix/ROOT/default@20240617-000500-MACU

Manual root filesystem specification:
  <fstype>:<device> [options]
      Mount <device> using filesystem <fstype>
      and with the specified (optional) option list.

    eg. ufs:/dev/da0s1a
        zfs:zroot/ROOT/default
        cd9660:/dev/cd0 ro
          (which is equivalent to: mount -t cd9660 -o ro /dev/cd0 /)

?               List valid disk boot devices
.               Yield 1 second (for background tasks)
<empty line>    Abort manual input

mountroot>

Where even a last-ditch attempt fails:

Code:
mountroot> zfs:macunix/ROOT/default@20240617-000500-MACU ro
Trying to mount root from zfs:macunix/ROOT/default@20240617-000500-MACU [ro]...
Mounting from zfs:macunix/ROOT/default@20240617-000500-MACU failed with error 2: unknown file system.
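
My working guess, which I have not verified: "unknown file system" suggests the kernel simply has no ZFS support loaded. In the first test the loader had already read /boot/loader.conf from the live filesystem (presumably including zfs_load="YES") before I escaped to the prompt; this time nothing was read, so nothing asked for zfs.ko. The helper below only prints the loader commands I intend to try, so I can keep them on paper. print_loader_cmds is a made-up name, and the assumption that loading zfs.ko by full path is sufficient, with the loader resolving any dependencies, is untested.

```shell
# print_loader_cmds SNAPSHOT   (hypothetical helper, not part of FreeBSD)
# SNAPSHOT is a full snapshot name such as pool/ROOT/default@snap.
# Prints the commands to type at the loader "OK" prompt; runs nothing.
print_loader_cmds() {
    snap=$1
    printf '%s\n' \
        'unload' \
        "set currdev=zfs:${snap}:" \
        'load /boot/kernel/kernel' \
        'load /boot/kernel/zfs.ko' \
        "set vfs.root.mountfrom=zfs:${snap}" \
        'boot -s'
}
```

For example, print_loader_cmds macunix/ROOT/default@20240617-000500-MACU gives me the six lines to type for my test machine.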

So, in addition to:

Code:
currdev=zfs:tank/ROOT/default@snapshot:
vfs.root.mountfrom=zfs:tank/ROOT/default@snapshot

what incantations do I need at the loader prompt to successfully boot into a single-user read-only root environment based on snapshot tank/ROOT/default@snapshot? Something perhaps to get loader.lua to re-read the files in loader_conf_files and loader_conf_dirs relative to the currdev snapshot, before launching the /boot/kernel/kernel found in the snapshot?

Better still, is there a way to re-load the loader_conf settings, and then launch the boot menu, so that I can choose from the various kernels that might be present in the snapshot I'm booting from?
 