Solved [GELI] getting rid of bootpool?

zirias@

Developer
I installed my server with 11.0-RC1, and back then, the installer created a separate (unencrypted) bootpool for ZFS root on GELI. Now, I know this hasn't been needed for a long time. Are there any instructions around on how to migrate an existing installation to get rid of the bootpool? Anything to pay specific attention to? Or is this something better left untouched?

The machine has 4 disks all partitioned like this:
Code:
=>        40  7814037088  ada0  GPT  (3.6T)
          40        1600     1  efi  (800K)
        1640         408        - free -  (204K)
        2048     4194304     2  freebsd-zfs  (2.0G)
     4196352     4194304     3  freebsd-swap  (2.0G)
     8390656  7805644800     4  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)
The bootpool is mirrored over all the second partitions.
 
It's better to not touch something that's not broken ;)
 
Well, it's not broken, but not ideal either, e.g. because of this:
Code:
# ls -ld /boot
lrwxr-xr-x  1 root  wheel  13 Aug 29  2016 /boot -> bootpool/boot

So, if possible, I'd prefer to move boot into my main encrypted pool where it nowadays belongs.

Of course, if not possible, or too risky, I'll leave it the way it is.

edit: found another interesting thing, the installer back then also created a keyfile, adding lines like this to /boot/loader.conf for all 4 disks:
Code:
geli_ada0p4_keyfile0_load="YES"
geli_ada0p4_keyfile0_type="ada0p4:geli_keyfile0"
geli_ada0p4_keyfile0_name="/boot/encryption.key"
I assume there wouldn't be another place to store this, so I'd probably have to reconfigure GELI to use *only* a passphrase as well?
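To make the "for all 4 disks" part concrete, here is a minimal sketch that prints the full keyfile stanza as it would appear in /boot/loader.conf. The provider names ada1p4 to ada3p4 are an assumption (only ada0p4 is shown above), and the helper name `keyfile_stanza` is made up for illustration.

```shell
#!/bin/sh
# Print the loader.conf keyfile lines for each GELI provider given as argument.
# Only prints text; it does not touch /boot/loader.conf.
keyfile_stanza() (
    for prov in "$@"; do
        printf 'geli_%s_keyfile0_load="YES"\n' "$prov"
        printf 'geli_%s_keyfile0_type="%s:geli_keyfile0"\n' "$prov" "$prov"
        printf 'geli_%s_keyfile0_name="/boot/encryption.key"\n' "$prov"
    done
)

# Assumed provider names for the four disks.
keyfile_stanza ada0p4 ada1p4 ada2p4 ada3p4
```

These twelve lines are the ones that would have to go once the providers are switched to a passphrase only.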
 
Why not try it out:
  1. copy the stuff from your existing boot pool over to that other big ZFS pool.
  2. zpool set bootfs=pool/name pool
  3. gpart modify -i 2 -t freebsd-swap adaN (once for each of ada0 to ada3; gpart takes a single geom)
  4. reboot
  5. enjoy your new freedom ;)
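The steps above can be sketched as a dry run that only prints the commands instead of executing them, which seems prudent on a machine that must not become unbootable. The pool name zroot and the dataset zroot/ROOT/default are taken from later in this thread; the function name `plan_bootpool_removal` is hypothetical, and step 1 (copying /boot) is left out because it depends on the layout.

```shell
#!/bin/sh
# Dry run: print the commands for steps 2 and 3, do not run them.
# Usage: plan_bootpool_removal POOL BOOTFS DISK...
plan_bootpool_removal() (
    pool=$1
    bootfs=$2
    shift 2
    # Step 2: tell the loader which dataset to boot from.
    echo "zpool set bootfs=$bootfs $pool"
    # Step 3: retype partition 2 on every disk, one gpart call per disk.
    for disk in "$@"; do
        echo "gpart modify -i 2 -t freebsd-swap $disk"
    done
)

plan_bootpool_removal zroot zroot/ROOT/default ada0 ada1 ada2 ada3
```

Review the printed commands, then run them by hand one at a time.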
 
Why not try it? Well, because I'm looking for advice first. A mistake rendering the system unbootable would mean a LOT of problems. Happened to me once, and to recover, I had to modify a FreeBSD installer memstick to use the serial console. On COM2 that is (strange server board). Oh, I forgot to mention this server does virtually *everything* for me at home, including a virtual machine as router/firewall for DSL. So, back then, I had to download the memstick image on my phone and transfer it to my notebook with adb, so I could mount and modify it there. Uhhh...

So, asking here specifically about things I'll have to pay attention to when doing this change :)
 
Since a firewall/packet filter in a VM is (common, but still) malpractice, you should redesign your network setup & reinstall that Über-server anyway. Thus it won't hurt. P.S.: with freedom I meant the freedom to live w/o bits captured in silicon for a week or so.
 
The VM has exclusive (PCI level) access to the NICs, so this is the second best solution short of a separate box for firewalling. And the latter just isn't an option for me at home. No, this is not "malpractice", even if you wouldn't do it in an enterprise.
 
Anyways, I wonder what others do/did? Just keep the old 11-style layout? Full reinstall? Nobody attempted some kind of migration? :-/
 
I installed my server with 11.0-RC1, and back then, the installer created a separate (unencrypted) bootpool for ZFS root on GELI. Now, I know this hasn't been needed for a long time.
Isn't it? I'd argue that this depends on the way the system was set up. It also depends on how you expect your server to boot.

See, there's another (possible) issue here but please keep in mind that I don't work with GELI on a regular basis, so I could be overlooking something here.

The thing is... if you expect your server to boot on its own then you'll need a small unencrypted part which provides the boot process with the right key(s) in order to actually access the main system. Of course, as a side note, this also makes the whole encryption process pretty much useless because it's no longer actually protecting something.

Think about it: if "bad guys" steal your server they can simply turn the whole thing on and the system will boot, full access. If "bad guys" manage to gain remote entry to your system then... they're working with a system that has full access to the underlying disk because the system itself ensures that everything gets decrypted, otherwise the system itself wouldn't work anymore.

So the issue I wonder about here is... can you still set up that small unencrypted boot section on a pool which has already been encrypted? And without disrupting anything? I personally doubt it because you're basically moving from one system to the other.

IMO the best idea here is to leave things as they are and keep this thing in mind for the next time you have to upgrade the whole server.
 
The thing is... if you expect your server to boot on its own then you'll need a small unencrypted part which provides the boot process with the right key(s) in order to actually access the main system. Of course, as a side note, this also makes the whole encryption process pretty much useless because it's no longer actually protecting something.
Yep, there's a little misunderstanding here. I DO have a passphrase not stored anywhere. In an enterprise environment, you'd probably have some "key server" providing the encryption keys for automatic boot. At home, I'm fine with having to enter a passphrase on a serial console to boot that server ;)
 
After realizing boot environments just don't work with this layout:

I now tried to take steps for getting rid of the bootpool. So far I did the following:
Code:
cd /
rm boot
(cd bootpool; tar cplf - boot) | tar xf -   # copy whole /boot, already has 13rc2 as "kernel", 12.2 as "kernel.old"
geli setkey -n 1 ada0p4                     # for all 4 disks, to set a passphrase without a keyfile
geli configure -b -g ada0p4                 # for all 4 disks, so loader hopefully boots directly from them
zpool set bootfs=zroot/ROOT/default zroot   # to initialize boot environments
bectl create 13rc2
bectl mount 13rc2
cd /usr/src
make DESTDIR=<13rc2-be> installworld
make DESTDIR=<13rc2-be> BATCH_DELETE_OLD_FILES=yes delete-old
vim <13rc2-be>/boot/loader.conf             # remove everything related to geli keyfiles AND vfs.root.mountfrom
cp /bootpool/boot.config <13rc2-be>
bectl umount 13rc2
bectl activate -t 13rc2

Now I'm a bit nervous whether the reboot will get me to where I want... did I miss anything?
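The manual loader.conf cleanup from the listing above could also be done with a small filter, which makes it easier to diff the result before committing to it. This is only a sketch: the function name `strip_bootpool_settings` is made up, and the pattern assumes the keyfile lines look exactly like the ones quoted earlier in this thread.

```shell
#!/bin/sh
# Print the given loader.conf without GELI keyfile lines and without
# vfs.root.mountfrom. The input file is not modified.
strip_bootpool_settings() {
    grep -v -E '^(geli_[A-Za-z0-9]+_keyfile[0-9]+_(load|type|name)|vfs\.root\.mountfrom)=' "$1"
}

# Example (paths are placeholders, adjust to the mounted boot environment):
#   strip_bootpool_settings /boot/loader.conf > /tmp/loader.conf.new
#   diff -u /boot/loader.conf /tmp/loader.conf.new
```

Writing to a new file and diffing keeps the original around in case the pattern catches more than intended.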
 
With 13-rc2's $SRCTOP/UPDATING open for reference, also go through vim <13rc2-be>/{{boot/loader,etc/{rc,sysctl}}.conf,etc/fstab,usr/local/etc/anacrontab}; did I forget other vital configuration files?
 
Well, this should all be minor stuff… what I'm worried about is whether the boot environments will now work and I have a simple path back to 12.2 in case anything goes wrong.

It's all related to being unsure about the steps required to migrate from this old scheme of the FreeBSD 11 installer with the separate bootpool :eek:
 
I really don't want to nag anyone personally – but I also wonder why I couldn't find a description of a migration process for my scenario. The bootpool was necessary with FreeBSD 11 because the bootloader couldn't decrypt GELI back then, but it seems it also renders boot environments defunct, so getting rid of it must be a common requirement?

Anyways, I now also created a memstick from my source tree, wrote it to a USB key and modified it to use the serial console for boot, so I hopefully have a way for "disaster recovery" – will just try to get this working later today.
 
Getting closer!

Well, I found out that geli configure -b -g ada0p4 was wrong. With both flags set, the bootloader only asks for a passphrase, but doesn't attempt to decrypt. Fixed it with geli configure -B -g ada0p4.
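For reference, a minimal sketch of applying that fix to all four disks. It only prints the commands. Per geli(8), -b sets the BOOT flag, -B clears it, and -g sets GELIBOOT so the loader itself decrypts the provider; the provider names other than ada0p4 are assumed, and `plan_geli_flags` is a made-up helper name.

```shell
#!/bin/sh
# Dry run: print the geli call that clears BOOT (-B) and sets GELIBOOT (-g)
# for every provider passed as argument.
plan_geli_flags() (
    for prov in "$@"; do
        echo "geli configure -B -g $prov"
    done
)

plan_geli_flags ada0p4 ada1p4 ada2p4 ada3p4
```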

But then, it still picked my bootpool for booting. I won't delete it until everything works. At least, now I had zroot also available in loader, and with a lot of trial and error on the loader prompt, I finally managed to boot into my 13.0-RC2 boot environment 🥳

Now, a few minor things to fix, but the machine is running, this is great.
 
I wouldn't be shy to ask some wizard personally. S/he can just ignore your request; else you can get some input that might be useful & helpful. E.g. Argentum is online?
 
Well, I think I am on track now! My builder jail with poudriere, still on 12.2, is busily building packages for 13.0, yay :)

As for the migration away from a separate bootpool, with the notes I already took here in this thread, I should be able to write a little howto later.
 
I wouldn't be shy to ask some wizard personally. S/he can just ignore your request; else you can get some input that might be useful & helpful. E.g. Argentum is online?
My handle was mentioned here, so I think I should reply as well.

My little retrospective advice here is that it seems to be a good idea not to use the default 'zroot' name for new pools when installing. That gives some extra freedom when operating with zfs send and zfs receive. In that kind of situation I have usually taken another disk and sent the whole pool there. It is easy to move the whole pool from one location to another. Then I do zpool set bootfs=... after that.
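A rough sketch of that approach, printed as a dry run rather than executed. The target pool name newpool and the snapshot name migrate are placeholders, and `plan_pool_move` is a hypothetical helper; it assumes the target pool already exists on the other disk.

```shell
#!/bin/sh
# Dry run: print the commands to replicate a whole pool and set bootfs on the copy.
# Usage: plan_pool_move SRC_POOL DST_POOL SNAPSHOT BOOTFS_PATH_WITHIN_POOL
plan_pool_move() (
    src=$1
    dst=$2
    snap=$3
    bootfs=$4
    # Recursive snapshot of every dataset in the source pool.
    echo "zfs snapshot -r $src@$snap"
    # Replication stream; -u keeps the received datasets unmounted.
    echo "zfs send -R $src@$snap | zfs receive -u -F $dst"
    # Point the loader at the boot dataset in the new pool.
    echo "zpool set bootfs=$dst/$bootfs $dst"
)

plan_pool_move zroot newpool migrate ROOT/default
```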
 
For the next reboot, I'd like to test operation without the bootpool, but I don't want to delete it yet.

Is my assumption correct that changing the partition type with gpart modify can be easily reverted without losing any content?
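As far as I know, gpart modify -t only rewrites the type field in the partition table entry and leaves the partition contents alone, so the change should be reversible. A cautious sketch that saves the table first and prints the change plus its revert; the backup path under /root and the helper name `plan_retype` are made up for illustration.

```shell
#!/bin/sh
# Dry run: print commands to back up the partition table, change one
# partition's type, and change it back again.
# Usage: plan_retype DISK INDEX OLD_TYPE NEW_TYPE
plan_retype() (
    disk=$1
    index=$2
    old_type=$3
    new_type=$4
    echo "gpart backup $disk > /root/$disk.gpt"
    echo "gpart modify -i $index -t $new_type $disk"
    echo "gpart modify -i $index -t $old_type $disk"
)

plan_retype ada0 2 freebsd-zfs linux-data
```

The last printed line is the revert; the saved table is a second safety net via gpart restore.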
 
Ok, I changed all partitions containing the bootpool to linux-data. Interestingly, zpool import bootpool still finds the pool :-/
Now I just hope the bootloader won't find it any more ;)
 
Running into a dead end now. No matter what partition type I set, the bootloader finds the pool :eek:

So, I removed the partitions, noting exact location and size. Then, the bootloader just fails, not even offering to decrypt my main pool (like it does when the bootpool is there). WTF?
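Noting location and size can be automated: a minimal sketch that reads gpart show output and prints the gpart add command that would re-create a given partition at exactly the same place. The helper name `recreate_cmd` is hypothetical; the sample input is the ada0 table from the first post.

```shell
#!/bin/sh
# Read `gpart show` output on stdin and print the `gpart add` command that
# re-creates partition INDEX on DISK with the same start and size (in sectors).
# Usage: gpart show ada0 | recreate_cmd ada0 2
recreate_cmd() (
    awk -v disk="$1" -v idx="$2" '
        # Partition rows look like: start size index type (human size)
        NF >= 4 && $3 == idx {
            printf "gpart add -b %s -s %s -t %s -i %s %s\n", $1, $2, $4, $3, disk
        }'
)

recreate_cmd ada0 2 <<'EOF'
=>        40  7814037088  ada0  GPT  (3.6T)
          40        1600     1  efi  (800K)
        1640         408        - free -  (204K)
        2048     4194304     2  freebsd-zfs  (2.0G)
     4196352     4194304     3  freebsd-swap  (2.0G)
     8390656  7805644800     4  freebsd-zfs  (3.6T)
  7814035456        1672        - free -  (836K)
EOF
# prints: gpart add -b 2048 -s 4194304 -t freebsd-zfs -i 2 ada0
```

Re-creating the entry with the same start and size should make the old pool data visible again, since deleting a partition only removes the table entry.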

As a side note, I also hit a roadblock upgrading my routing/firewall VM: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254343
 
Is there still appropriate bootcode on that device? And maybe you need to re-create your /boot/zfs/zpool.cache? Sorry, I'm just guessing, never had that same issue.
PS: [PR]254343[/PR] = PR 254343
 