9.0/amd64 GELI-ZFS boot fails to find init on pool/root

Hello,

I've set up a lovely GELI-ZFS system using the live USB stick. I've already written data to pool/, and have gone through the setup of a pool/root/ filesystem onto which an entire 9.0/amd64 build has been placed. Though the initial setup involved command-line geli attach commands followed by a zpool import pool, I'd now like to be able to boot the system properly from a USB stick that can be detached after a successful boot. I've had this setup working in the past, but a few things are different now. The kernel tells me that it is unable to find init, after successfully prompting me to unlock the GELI partitions that are part of the original ZFS pool.

I had set pool/root/ to be the main mount point (as I've populated pool/other*/ already), by setting vfs.root.mountfrom to zfs:pool/root within the /boot/loader.conf on the USB drive. And I had propagated zpool.cache from the running system back to the USB's /boot/zfs/ directory. I specified verbose booting, and it seems that after prompting me to unlock each individual drive with the GELI password, the system immediately attempts to boot from zfs:pool/root. It doesn't indicate whether it has successfully imported pool/.
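For reference, this is roughly the sequence I used to propagate the cache, assuming the USB stick is mounted at /mnt/usb (that mount point is hypothetical; adjust to taste):

```shell
# From the running (imported) system: copy the current cache file
# to the USB stick's /boot/zfs/, so the loader sees the same pool state.
cp /boot/zfs/zpool.cache /mnt/usb/boot/zfs/zpool.cache
```

An alternative, if supported by your ZFS version, is to point the pool's cachefile property directly at the USB location with zpool set cachefile=..., but a plain copy after the last import/export is the simpler approach.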

Whereas I originally formed the ZFS pool/ from device nodes named under /dev/gpt/, upon booting the FreeBSD kernel asks me to unlock device entries by their raw names such as /dev/daNp1. It only asks me to unlock each device once (I think there was a bug report in the past where a user was prompted multiple times for the same device: once for the raw device, once for the entry in /dev/label/, and once for the entry in /dev/gpt/). But I'm wondering if this renaming of the device component (even though it's essentially the same device) may be confusing the boot process and preventing a seamless import of the ZFS pool by name. Specifying zfs:pool seems more common; zfs:pool/filesystem is not mentioned as often in user forums. I'm hoping it is possible to ask FreeBSD to look for the root filesystem within a hierarchy. Maybe zfs:pool is intended to be used here, even when the root is designated to live on pool/root/. The underlying system is RAID-Z2, which I believe disallows setting the zpool bootfs property.
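To check which provider names the pool members are actually imported under, and how the kernel is translating labels, something like the following should show the picture (da0 is a hypothetical member disk):

```shell
# Which device names do the pool members currently use (gpt/*, label/*, or raw daNpM)?
zpool status pool

# What label-to-provider mappings does GEOM currently know about?
glabel status

# Show the GPT partition labels on one member disk.
gpart show -l da0
```

If zpool status reports raw daNpM.eli names rather than gpt/*.eli names, that would confirm the pool was last imported under the raw names, which may be what gets baked into zpool.cache.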

I'm wondering if I may have been "too late" in propagating the information from the running system's /boot/zfs/zpool.cache over to the USB drive. For example, if that file reflects which filesystems are considered currently mounted (and thus which pool names are available), then an out-of-date zpool.cache may prevent a seamless import. Also, whether I zfs set mountpoint=legacy pool/root (or mountpoint=/) at a particular time may or may not be information that is propagated into zpool.cache (it may instead be stored in the pool metadata). It's unclear at what point I should change the mountpoint within the recommended zpool export pool && zpool import pool sequence, especially when the intended mount point is / itself (which makes unmounting problematic, because running processes hold files open on the pool). So I don't know whether I should keep mountpoint=legacy (and update /etc/fstab accordingly), or whether that is moot because successfully mounting the root filesystem would have implied / was mounted anyway. Maybe mountpoint=/ is needed, or maybe it is exactly what's confusing the system. In any case, the boot code evidently cannot determine my intentions here. I'd like to know what I can do to have the system obtain the proper root device. Maybe the startup script itself could first ensure that all available ZFS pools are imported (with a secondary prompt offering a zpool import -f if the requested root pool is not yet loaded).
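For what it's worth, the legacy-mountpoint approach I'm considering would look something like this (a sketch, assuming the root dataset is pool/root and that fstab-driven mounting is acceptable):

```shell
# Mark the root dataset as legacy so ZFS never tries to auto-mount it;
# the kernel mounts it via vfs.root.mountfrom, and fstab handles the rest.
zfs set mountpoint=legacy pool/root

# Corresponding line inside /etc/fstab on pool/root:
#   pool/root   /   zfs   rw   0   0
```

With mountpoint=legacy, the auto-mount timing question goes away for the root dataset itself, since nothing but the boot path ever mounts it.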

I'm very pleased at how I could zpool import pool from the live USB and find that the system automatically mounted everything under / (including pool/root/usr and other submounts, which were automatically placed at their relative points without needing individual zfs set mountpoint directives). So the mount logic is clearly thorough. But in general it's unclear what the proper steps are in the zpool import pool && zpool export pool sequence when the intended mount point is / itself, or may otherwise conflict with the running system. Ideally I'd like a lightweight zpool import pool in which the mount points are not processed (only the pool itself is configured in the kernel). That would let me follow it with a zpool export without worrying about modifications to the running system (for example, once / is remounted, it is difficult to unmount while files are in use). And so this is a side feature request: a lightweight zpool import that doesn't process mount points.
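It's possible something close to this already exists; depending on the ZFS version, zpool(8) documents an import flag that skips mounting, and an altroot option that at least keeps mounts away from the live system. A sketch (worth verifying against your zpool(8) before relying on it):

```shell
# Import the pool without mounting any of its datasets (if -N is supported),
# do cache/property maintenance, then export cleanly.
zpool import -N pool
zpool export pool

# Alternatively, import under an altroot so all mount points land below /mnt
# instead of shadowing the running system's /.
zpool import -R /mnt pool
```

If -N is available on 9.0, it would seem to be exactly the lightweight import requested above.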

But, back to my main issue, I'm wondering if the device renaming is confusing the construction of the ZFS pool. It's good that the FreeBSD kernel is smart enough to not prompt me more than once for the same device. But maybe it can keep track of the preferred label under which the device is to be unlocked. That can help administrators recall details about which drives are attached, and can also help in the case where renumbering may occur. So this is another minor feature request. I'd like to be asked to unlock the device by its gpt/* entry, or its label/* entry, or its raw entry in /dev (whichever I prefer).

I've also specifically not added any GELI directives to /boot/loader.conf, which I include below. It's unclear whether any are needed, because I've specifically flagged each GELI device to prompt for a password at boot (and the unlocking of each device does succeed). It seems, basically, that the next step of attempting to import pool/ or its pool/root/ is not taking place successfully. I also wonder if the state left by the last import/export command may be preventing an automatic import of pool/. A clean import/export is not as straightforward when mounting of the pool happens automatically.
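The per-device boot prompting I mean is the GELI BOOT flag; it can be inspected and set like this (da0p1 is a hypothetical member partition):

```shell
# Inspect the GELI metadata on a member; the flags line should include BOOT.
geli dump da0p1

# Set the BOOT flag if it is missing, so the system prompts for the
# passphrase during boot rather than requiring a manual geli attach.
geli configure -b da0p1
```

Since the passphrase prompts do appear and succeed, the BOOT flags are presumably already in place, and the failure lies somewhere after attach, at pool import time.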

If the community has any suggestions about a next step I would appreciate feedback. I hope also that the minor tweaks and feature requests may be helpful.

Code:
boot_verbose="-v"
geom_eli_load="YES"
geom_journal_load="YES"
geom_label_load="YES"
geom_mirror_load="YES"
geom_stripe_load="YES"
geom_part_gpt_load="YES"
geom_shsec_load="YES"
geom_uzip_load="YES"
linprocfs_load="YES"
linsysfs_load="YES"
verbose_loading="YES"
vm.kmem_size=12G
zfs_load="YES"
vfs.root.mountfrom="zfs:pool/root"
 