Solved Not enough space to write bootcode after root zpool upgrade

`Orum · Oct 7, 2016

I recently upgraded one of my storage servers from 10.3 -> 11.0, and then went to perform the zpool upgrade to enable the new hashing methods available on the root pool. This, after enabling the features of course, offers the friendly reminder that:

Code:

If you boot from pool 'zroot', don't forget to update boot code.
Assuming you use GPT partitioning and da0 is your boot disk
the following command will do it:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

So of course I run the command to update the first disk in the pool: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 diskid/DISK-S1XWJ90Z203029

And then my heart sinks as I read the output: "gpart: /dev/diskid/DISK-S1XWJ90Z203029p1: not enough space". IIRC, way back when zfs root first appeared as an option I followed a guide which only created a 64KB partition for the boot loader. However, now /boot/gptzfsboot is 87K, and obviously won't fit, so I'm in a bit of a pickle.

My questions are:

1. Is the only way to correct this to copy everything off the pool, wipe and recreate the pool from scratch, and copy everything back? I'd really like to avoid that if at all possible, but I don't see another way.
2. Will the system even boot right now if I reboot it?
3. What size should a boot loader partition be to be future-proof against this problem?

Edit:
Also, there's no room between the partitions:

Code:

# gpart show diskid/DISK-S1XWJ90Z203029
=>        34  2930277101  diskid/DISK-S1XWJ90Z203029  GPT  (1.4T)
          34         128                           1  freebsd-boot  (64K)
         162  2930276973                           2  freebsd-zfs  (1.4T)
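
The numbers in the output above account for the error. As a minimal sketch of the arithmetic, assuming 512-byte sectors (consistent with 128 sectors showing as 64K) and reading "87K" as 87 * 1024 bytes; the function name `bootcode_fits` is made up for illustration:

```shell
#!/bin/sh
# Sketch: does a bootcode file of a given size fit in a freebsd-boot partition?
# Assumes 512-byte sectors unless a third argument says otherwise.
bootcode_fits() {
    code_bytes=$1            # size of the bootcode file in bytes
    part_sectors=$2          # partition length in sectors, from `gpart show`
    sector_bytes=${3:-512}   # assumed sector size
    part_bytes=$((part_sectors * sector_bytes))
    if [ "$code_bytes" -le "$part_bytes" ]; then
        echo "fits: $code_bytes <= $part_bytes bytes"
    else
        echo "too small: $code_bytes > $part_bytes bytes"
    fi
}

# Values from this thread: 87K bootcode, 128-sector (64K) partition.
bootcode_fits $((87 * 1024)) 128
# A 512K partition is 1024 sectors of 512 bytes.
bootcode_fits $((87 * 1024)) 1024
```

On a real system the first argument could come from `stat -f %z /boot/gptzfsboot` rather than a hard-coded value.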

SirDice · Oct 7, 2016

1. I'm afraid so.
2. Probably not, the bootcode may have been partially overwritten now.
3. Mine's 512K, which should be more than enough. Don't make it too large though.

Code:

This partition must be larger than the bootstrap
     code (either /boot/gptboot for UFS or /boot/gptzfsboot for ZFS), but
     smaller than 545 kB since the first-stage loader will load the entire
     partition into memory during boot, regardless of how much data it
     actually contains.

From gpart(8)
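
The two limits in that excerpt can be turned into a quick sanity check for a proposed partition size. This is only a sketch: `check_boot_size` is a hypothetical helper, and treating "545 kB" as 545 * 1024 bytes is an assumption about how the man page counts.

```shell
#!/bin/sh
# Upper bound quoted from gpart(8); reading "545 kB" as 545 * 1024 bytes
# is an assumption.
MAX_BYTES=$((545 * 1024))

# Prints "too small", "too large" or "ok" for a proposed partition size.
# Follows the man page wording literally: strictly larger than the
# bootcode, strictly smaller than the upper bound.
check_boot_size() {
    code_bytes=$1   # size of the bootcode file in bytes
    part_bytes=$2   # proposed partition size in bytes
    if [ "$part_bytes" -le "$code_bytes" ]; then
        echo "too small"
    elif [ "$part_bytes" -ge "$MAX_BYTES" ]; then
        echo "too large"
    else
        echo "ok"
    fi
}

code=$((87 * 1024))   # the 87K gptzfsboot mentioned in this thread
for kib in 64 128 512 1024; do
    echo "${kib}K: $(check_boot_size "$code" $((kib * 1024)))"
done
```

With the sizes from this thread, 64K is rejected and 512K passes under either reading of "545 kB".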

generic · Oct 7, 2016

1. I guess you don't have to remove the pool. What I'd do is:
add a new disk, create GPT partitions on it with a large enough boot partition (I use 512K), attach it to the pool (zpool attach) so the data migrates, then mirror it back to the old drive and you're good to go.

SirDice · Oct 7, 2016

That does require an 'extra' disk but it would indeed be a good solution.

generic · Oct 7, 2016

Another thought,

if you have any spare space, why not remove the freebsd-boot partition, recreate it and write the boot code again?

I've just tested it, and it seems to work:

Code:

root@b0ink:~ # gpart show
=>      40  52428720  ada0  GPT  (25G)
        40      1024     1  freebsd-boot  (512K)
      1064       984        - free -  (492K)
      2048   8388608     2  freebsd-swap  (4.0G)
   8390656  44038104     3  freebsd-zfs  (21G)

Code:

root@b0ink:~ # gpart delete -i 1 ada0
ada0p1 deleted

Code:

root@b0ink:~ # gpart show
=>      40  52428720  ada0  GPT  (25G)
        40      2008        - free -  (1.0M)
      2048   8388608     2  freebsd-swap  (4.0G)
   8390656  44038104     3  freebsd-zfs  (21G)

Code:

root@b0ink:~ # gpart add -b 40 -s 512K -t freebsd-boot ada0
ada0p1 added
root@b0ink:~ # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
partcode written to ada0p1
bootcode written to ada0
root@b0ink:~ #

System booted back up

wblock@ · Oct 7, 2016

Another option: boot partitions are not required to be first on the disk. So it should be possible to delete swap partitions and recreate them just a tiny bit smaller, or use some free space at the end of the disk. Make your boot partition 512K, no bigger, but no smaller. The small bit of wasted space will never be noticed, and you won't have to bother with it again unless something really weird happens.

`Orum · Oct 7, 2016

SirDice said:
I'm afraid so.

Probably not, the bootcode may have been partially overwritten now.

Mine's 512K, which should be more than enough. Don't make it too large though.

Actually, I ended up testing #2: while I was making a second backup of the data before doing anything else, the machine locked up (lots of WRITE_FPDMA_QUEUED spam to the console, though I think this is due to the enclosure or the eSATA cable being bad). The machine did boot back up without issue, so I think gpart is smart enough to check that it has room available before writing anything. Also, thanks for the tips regarding the new partition size, I'll make mine 512K as well.

generic said:
1. I guess you don't have to remove pool, what I'd do is:
add new disk, create gpt partitions large enough (I use 512K) then add it to the pool, migrate data (zpool attach) then mirror it back to old drive and you're good to go.

The problem with this is the existing disks are in a raid-z1 configuration (I was only showing the first of 3 disks in the pool), so I can't attach to the pool.

generic said:
Another thought,

if you have any spare space, why not remove the freebsd-boot partition, recreate it and write the boot code again?

The problem here is that I have no spare space on the disks at all: what you see in the gpart show output in the original post is every partition on the disk, with nothing free.

wblock@ said:
Another option: boot partitions are not required to be first on the disk. So it should be possible to delete swap partitions and recreate them just a tiny bit smaller, or use some free space at the end of the disk. Make your boot partition 512K, no bigger, but no smaller. The small bit of wasted space will never be noticed, and you won't have to bother with it again unless something really weird happens.

Ah, if only I had created some swap! I didn't because this machine only has two pools (zroot of 3x 1.5 TB / 1.36 TiB disks in raid-z1, and another pool of 3x 3 TB / 2.73 TiB also in raid-z1) for an effective storage space of 9 TB / 8.19 TiB, and I figured the 16 GB of RAM in it was more than sufficient for that and my load. As I haven't had any issues running out of RAM, my only regret is not making a 512 KB boot partition.

Thinking about it more, it's probably not the worst thing to remake the entire pool, as ZFS has changed quite a bit since its debut on FreeBSD. Most notably I've noticed bsdinstall (judging from this script) now creates a different tree structure from the days of old, as lz4 has replaced the need for toggling compression on/off for various directories.

There is one last thing though that seems strange to me. In copying everything off to another disk to have a second backup before I wipe the pool, I've been using rsync to a newly created pool on another disk. However, I'm seeing huge amounts of fragmentation on it when looking at it in zpool (keep in mind the backup is still in progress):

Code:

# zpool list zroot zbackup
NAME      SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
zbackup  2.72T  1.36T  1.36T         -    31%    49%  1.00x  ONLINE  -
zroot    4.06T  3.34T   738G         -    12%    82%  1.00x  ONLINE  -

Of course, this being a fresh pool, I'd expect to see almost no fragmentation when copying everything over, so what is going on? The rsync command I'm using (minus a bunch of --excludes, and yes, I realize the --delete is unnecessary in this case) is: rsync -aHAXSv --numeric-ids --delete / /zbackup/

Edit: I did use the "newer" directory tree structure that a fresh zfsroot install would use now, so maybe that is causing it?

Edit: Ah, this thread seems to shed some light on the issue. Guess it's expected behavior.

karolyi · Oct 14, 2016

`Orum said:
I recently upgraded one of my storage servers from 10.3 -> 11.0, and then went to perform the zpool upgrade to enable the new hashing methods available on the root pool. This, after enabling the features of course, offers the friendly reminder that:

Code:

If you boot from pool 'zroot', don't forget to update boot code.
Assuming you use GPT partitioning and da0 is your boot disk
the following command will do it:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

So of course I run the command to update the first disk in the pool: gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 diskid/DISK-S1XWJ90Z203029

And then my heart sinks as I read the output: "gpart: /dev/diskid/DISK-S1XWJ90Z203029p1: not enough space". IIRC, way back when zfs root first appeared as an option I followed a guide which only created a 64KB partition for the boot loader. However, now /boot/gptzfsboot is 87K, and obviously won't fit, so I'm in a bit of a pickle.
...

Same here, and since my swap partition is in the zpool, I'm unable to resize that and put the gptzfsboot there.

Hence, I'll need to completely back up my server and reinstall it with an enlarged boot partition.

what I'll probably do:
- create a google cloud computing instance with a huge disk for taking my backup via NFS
- export my zfs recursive snapshot, sending there while encoding it with openssl
- boot with mfsbsd 11 images (zfs root)
- receive and extract the backup to the newly partitioned disks.
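
The snapshot, encrypt and restore steps in that list might look roughly like the following. It is a dry-run sketch that only prints commands. The pool name, snapshot name, backup path and key file are placeholders, the cipher choice is an assumption, and creating the new pool on the repartitioned disks (from the mfsBSD environment) is not shown.

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# All names passed in below are hypothetical placeholders.

# Recursive snapshot, then a replication stream encrypted with openssl.
plan_backup() {
    pool=$1; snap=$2; target=$3; keyfile=$4
    echo "zfs snapshot -r ${pool}@${snap}"
    echo "zfs send -R ${pool}@${snap} | openssl enc -aes-256-cbc -salt -pass file:${keyfile} -out ${target}"
}

# Decrypt the stream and receive it into the freshly created pool.
plan_restore() {
    pool=$1; target=$2; keyfile=$3
    echo "openssl enc -d -aes-256-cbc -pass file:${keyfile} -in ${target} | zfs receive -F ${pool}"
}

plan_backup zroot migrate /mnt/backup/zroot.zfs.enc /root/backup.key
plan_restore zroot /mnt/backup/zroot.zfs.enc /root/backup.key
```

Printing the plan first makes it easy to review device and path names before anything touches the pool.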

been there, done that, so I know it works... but it's a huge pain in the ass, and having problems like these convinces me to move into the cloud, dissecting my monolith server.

the bad thing is, it's already paid for a year. I guess I'll have to live with that.

marvel · Nov 4, 2016

Thanks generic, that was a life saver! Ran into the same problem upgrading to FreeBSD 11.

Deleted boot (64K) and swap then recreated boot with 512K and added remaining space as swap again. Wrote bootcode, rebooted, fingers crossed and it worked
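
For anyone repeating this, the sequence described above could be sketched as below. It is a dry run that only prints the commands. The disk name and partition indexes are placeholders, and it assumes the swap partition sits directly after the old 64K boot partition and is referenced by its plain device name (not a label or an encrypted .eli device).

```shell
#!/bin/sh
# Dry run: prints the commands instead of executing them.
# Disk name and indexes are hypothetical; check `gpart show` first.
plan_boot_resize() {
    disk=$1       # e.g. ada0
    boot_idx=$2   # index of the old 64K freebsd-boot partition
    swap_idx=$3   # index of the swap partition directly after it
    echo "swapinfo"
    echo "swapoff /dev/${disk}p${swap_idx}"
    echo "gpart delete -i ${swap_idx} ${disk}"
    echo "gpart delete -i ${boot_idx} ${disk}"
    # Recreate boot at 512K, reusing the old index.
    echo "gpart add -i ${boot_idx} -s 512K -t freebsd-boot ${disk}"
    # No -s: gpart sizes the partition from the free space it finds.
    echo "gpart add -i ${swap_idx} -t freebsd-swap ${disk}"
    echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i ${boot_idx} ${disk}"
    echo "swapon /dev/${disk}p${swap_idx}"
}

plan_boot_resize ada0 1 2
```

Reusing the original indexes keeps device names such as ada0p2 stable, so an existing /etc/fstab swap entry should still match; running `gpart show` after each step confirms the layout.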

generic · Nov 5, 2016

I'm glad I could help

andros · Jul 2, 2018

Huge thank you generic and wblock@ !
Following your instructions and ideas, and with the useful swap commands (swapinfo, swapoff, swapon), I was able to extend the boot partition on a live system, and then rebooted only to be sure everything works like it should.
