ZFS Boot hangs mid-boot after adding new drive

I'm booting off two ada drives in a ZFS mirror, both with the boot loader installed, using the bootpool/zroot layout created by the 10.0 installer, which references the disks by gptid. I also have 8 da drives on an LSI controller in a data pool.

I installed a Samsung 950 Pro NVMe drive. The BIOS sees it fine, and the FreeBSD boot process also picks it up, reporting "BIOS drive M: is disk10", but then the spinner stops and the boot hangs.

Here's a picture of the failed boot
https://goo.gl/photos/Ts7wan6xFXhTMq3Q8

And a normal successful boot
https://goo.gl/photos/GJ2xhPAZ4hhB3BPu7

Feel free to stop reading at this point if you know what this is ;)

My first suspicion was that the new drive shuffled my drive order; however, bootpool uses gptid, and as long as one of the two primary disks is in place it usually boots (I've previously pulled them individually to test).

As I understand it, you usually can't get past boot2 if the kernel isn't found, so bootpool must be mounted and the kernel loading, which suggests the error is within zfsloader. Could anyone help with what zfsloader is doing at this point that might make it hang?

I have various entries like the following in loader.conf setting up GELI providers on what should be ada0/ada1:
Code:
geli_ada0p4_keyfile0_load="YES"
geli_ada0p4_keyfile0_type="ada0p4:geli_keyfile0"
geli_ada0p4_keyfile0_name="/boot/encryption.key"
Would that cause a freeze if the new drive has taken ada0 and it's completely unconfigured?
 
Dkelbley, I was able to boot off a non-ZFS boot loader, so at this point I believe it's a bug in zfsloader. I haven't followed this up yet though.
 
I have various entries [...] setting up geli's on what should be ada0/ada1[...]
Would that cause a freeze if the new drive has taken ada0 and it's completely unconfigured?
I don't believe so. As I understand it, all partitions are scanned for GELI containers that are configured to be attached at boot (created or configured with the -b flag -- see the geli(8) man page). Any variables in loader.conf(5) related to GELI containers that are found are then applied while the container is being attached.
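For reference, marking a container for attachment at boot looks something like this (ada0p4 and the key path are just stand-ins borrowed from your loader.conf, not a statement about your actual setup):
Code:
# set the BOOT flag when creating the container
geli init -b -K /boot/encryption.key ada0p4

# or add the flag to an existing container
geli configure -b ada0p4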

My first suspicion was that the new drive moved my drive order around, however bootpool uses gptid and as long as one of the two primary disks is in place it usually boots (I've previously pulled them individually to test).
I take it this means you have your ZFS boot pool on both drives. Did you remember to set them up as a ZFS mirror too? Are you using the same keys for the GELI containers hosting your mirrored ZFS root pool? Do you have entries for both in loader.conf(5)?
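If the second half of the mirror is missing its entries, you'd want something like the following alongside the ada0p4 lines you posted (assuming the mirror partner is ada1p4 and shares the same key file; adjust if not):
Code:
geli_ada1p4_keyfile0_load="YES"
geli_ada1p4_keyfile0_type="ada1p4:geli_keyfile0"
geli_ada1p4_keyfile0_name="/boot/encryption.key"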

A workaround might be to take a step back from (mostly) full disk encryption and only use the GELI container for whatever data you want to protect (I'm imagining it might be a database or similar). The GELI container could then be attached after the root partition is mounted (configured in rc.conf(5)) rather than before.
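Roughly, the rc.conf(5) side of that would look like the sketch below; the provider name da8p1 and the key path are purely hypothetical placeholders for whatever holds your data:
Code:
# rc.d/geli attaches these providers at boot, after the root filesystem is mounted
geli_devices="da8p1"
geli_da8p1_flags="-k /root/data.key"
You'd then mount the decrypted filesystem (or import the pool) from that provider once it's attached, instead of relying on zfsloader to handle the decryption before the kernel is up.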
 