Can't write boot code after zpool upgrade

The upgrade to 9.2-RELEASE went smoothly. After the upgrade, I ran zpool status:

Code:
# zpool status rpool
  pool: rpool
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support feature
        flags.
  scan: scrub repaired 0 in 0h4m with 0 errors on Sat Sep 28 03:05:19 2013
config:

        NAME           STATE     READ WRITE CKSUM
        rpool          ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            gpt/disk0  ONLINE       0     0     0
            gpt/disk1  ONLINE       0     0     0

errors: No known data errors

OK, time to run zpool upgrade rpool:

Code:
# zpool upgrade rpool
This system supports ZFS pool feature flags.

Successfully upgraded 'rpool' from version 28 to feature flags.
Enabled the following features on 'rpool':
  async_destroy
  empty_bpobj
  lz4_compress

If you boot from pool 'rpool', don't forget to update boot code.
Assuming you use GPT partitioning and da0 is your boot disk
the following command will do it:

        gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0

Right, what boot disks do I have again? I thought they were aacd0 and aacd1. gpart list should tell me that.

Code:
# gpart list
gpart: Cannot get GEOM tree: Illegal byte sequence

Now I'm in serious trouble. I suspect that the system will not be able to reboot, so I'm certainly not rebooting now. gpart has become useless with the error "Illegal byte sequence". What does it mean?
 
I used the freebsd-update route to upgrade and lost my ZFS pool and can't load the kernel - I end up at the BTX loader "OK" prompt. Glad it's my test system.
 
FreeBSD-9.2-RELEASE bootstrap loader problem

tzoi516 said:
I used the freebsd-update route to upgrade and lost my ZFS pool and can't load the kernel - I end up at the BTX loader "OK" prompt. Glad it's my test system.

I also used freebsd-update on FreeBSD-8.4-RELEASE-amd64 and ended up at the "OK" prompt, but I didn't lose the ZFS pool. The bootstrap loader just seems to have problems finding the kernel. lsdev shows no devices.

I presume it's a glitch in the bootstrap mechanism of FreeBSD-9.2-RELEASE triggered by my hardware (HP ProLiant MicroServer Gen8 G2020T), as even the FreeBSD-9.2-RELEASE-amd64-memstick.img won't boot on the machine. The memstick images of FreeBSD-8.4-RELEASE-amd64 and FreeBSD-10.0-ALPHA4-amd64 boot with no problems, though.

So I'll do a fresh install with FreeBSD-10.0-ALPHA4 and restore my data from backup.
 
c_s said:
So I'll do a fresh install with FreeBSD-10.0-ALPHA4 and restore my data from backup.
I would suggest using FreeBSD-9.1 instead. It's supported until April 2014 whereas FreeBSD-10.0 isn't supported at all. At least not until it's released and even then it will only be supported until 10.1 comes out.
 
SirDice said:
I would suggest using FreeBSD-9.1 instead. It's supported until April 2014 whereas FreeBSD-10.0 isn't supported at all. At least not until it's released and even then it will only be supported until 10.1 comes out.

Thank you for your suggestion. FreeBSD-9.1-RELEASE-amd64-memstick.img does boot, so I guess the installed system will boot, too. I will follow up with a post in a few days when I know for sure.

Normally, I would be happy to help track down the code change that introduced this problem (if it's not a fault of the install media I created or of my upgrade attempt), but at the moment I don't have the time, especially as I would first have to learn how to build FreeBSD from source.
 
As to the disk issue, that one got me too when I started out with FreeBSD. I studied dmesg to no avail, since my disks weren't mentioned. I haven't tried this on a system running in single-user mode yet, but I wonder what this command will do for you: sysctl kern.disks.

I stumbled across this by accident and it's quite handy to identify your hardware, in comparison to dmesg:

Code:
smtp2:/home/peter $ sysctl kern.disks
kern.disks: cd0 vtbd0
smtp2:/home/peter $ grep vtbd0 /var/run/dmesg.boot
smtp2:/home/peter $ grep cd0 /var/run/dmesg.boot
cd0 at ata1 bus 0 scbus1 target 0 lun 0
cd0: <QEMU QEMU DVD-ROM 1.3.> Removable CD-ROM SCSI-0 device
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: cd present [407156 x 2048 byte records]
Hope this can help too.
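
Building on that, here is a rough sketch that splits the kern.disks list into one name per line and then asks gpart for each disk's partition table. split_disks is just a helper name I made up; sysctl -n and gpart show are the stock FreeBSD commands. The loop is guarded so that nothing happens on a system without the kern.disks sysctl, and if gpart itself is broken (as in the first post) it will simply report the failure for each disk.

```shell
#!/bin/sh
# Split a kern.disks value ("cd0 vtbd0") into one device name per line.
# split_disks is a hypothetical helper name, not a FreeBSD tool.
split_disks() {
    for d in $1; do
        printf '%s\n' "$d"
    done
}

# Only query the hardware when the kern.disks sysctl exists (i.e. on FreeBSD).
if disks=$(sysctl -n kern.disks 2>/dev/null); then
    for d in $(split_disks "$disks"); do
        echo "== $d =="
        # gpart show fails for devices without a partition table (e.g. cd0).
        gpart show "$d" || echo "gpart show failed for $d"
    done
fi
```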
 
c_s said:
FreeBSD-9.1-RELEASE-amd64-memstick.img does boot, so I guess the installed system will boot, too. I will follow up with a post in a few days when I know for sure.
The installation went smoothly.
 
c_s said:
I also used freebsd-update on FreeBSD-8.4-RELEASE-amd64 and ended up at the "OK" prompt, but I didn't lose the ZFS pool. The bootstrap loader just seems to have problems finding the kernel. lsdev shows no devices.

I presume it's a glitch in the bootstrap mechanism of FreeBSD-9.2-RELEASE triggered by my hardware (HP ProLiant MicroServer Gen8 G2020T), as even the FreeBSD-9.2-RELEASE-amd64-memstick.img won't boot on the machine. The memstick images of FreeBSD-8.4-RELEASE-amd64 and FreeBSD-10.0-ALPHA4-amd64 boot with no problems, though.

So I'll do a fresh install with FreeBSD-10.0-ALPHA4 and restore my data from backup.

I wish I'd seen this earlier. I spent 3 hours today trying to fix a production server on which I had run freebsd-update to upgrade from 9.1 to 9.2, and it looks like exactly the same issue as yours, down to lsdev showing nothing. Mine was also an HP ProLiant MicroServer, and I found it extremely fussy about memory sticks: I only got it to boot from a FreeBSD 9.2 stick after writing 4 different sticks.

In the end I "solved" the issue by booting from a 9.2 stick, mounting the ZFS system, and running freebsd-update rollback, which fortunately resolved the issue.
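
For anyone in the same spot, the rollback from live media can be sketched roughly like this. It is not a transcript of what I typed: the pool name rpool, the /mnt mount point and the helper names are assumptions, and a root dataset with a legacy mountpoint would need an extra mount step that is not shown. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of a rollback from live media. POOL and ALTROOT are assumptions;
# substitute the real pool name and mount point.
POOL=${POOL:-rpool}
ALTROOT=${ALTROOT:-/mnt}

# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

rollback_from_live_media() {
    # Import the pool with its mountpoints relocated under $ALTROOT.
    run zpool import -f -R "$ALTROOT" "$POOL"
    # Point freebsd-update at the installed system and at that system's
    # own work directory, which holds the rollback data.
    run freebsd-update -b "$ALTROOT" -d "$ALTROOT/var/db/freebsd-update" rollback
}
```

Calling rollback_from_live_media with the defaults prints the two commands so they can be checked before running them with DRY_RUN=0.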

As it's a production machine, I can't really go back and experiment with it to see what went wrong, which is a real shame, as I only recently started to use freebsd-update and had hoped to use it more regularly.
 
I think I hit the same issue with a 9.1->9.2 upgrade. When I try to finish the ZFS root upgrade, this happens (I already knew the dev name):


Code:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
gpart: table 'ada1' is corrupt: Operation not permitted

Funnily enough, gpart list halfway works, but lists "state: CORRUPT" for ada1.
 
gpart recover seemed to fix my issue. Now I'm not sure if it's the same issue or not.

Code:
# gpart recover ada1
ada1 recovered
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
bootcode written to ada1
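
To check for this before writing boot code, a small filter over the gpart list output can pick out the affected disks. This is only a sketch: corrupt_geoms is a made-up helper name, and it assumes the "Geom name:" and "state:" lines look the way they do in my output.

```shell
#!/bin/sh
# Print the name of every geom that gpart list reports as "state: CORRUPT".
# Reads gpart list output on stdin; corrupt_geoms is a hypothetical helper.
corrupt_geoms() {
    awk '
        /^Geom name:/            { geom = $3 }   # remember the current geom
        /^[ \t]*state: CORRUPT/  { print geom }  # report it if marked corrupt
    '
}

# Usage on FreeBSD:
#   gpart list | corrupt_geoms
```

Any disk it prints would need gpart recover before gpart bootcode will accept it.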
 
That means something overwrote the backup GPT at the end of the disk. In turn, whatever was written there has now been overwritten by the recovered backup GPT. What was there? Possibly ZFS was given the whole disk instead of a partition, or a GEOM class such as glabel(8) was used on the disk and needed to write its metadata there.
 
I ran zpool scrub on the pool, hoping it would trigger the overwrite again if it found corrupt uberblocks at the end of the disk, but nope, gpart's still happy.

ZFS also doesn't seem to be overflowing out of the partition; asize < mediasize. I'm thinking this was either a fluke, or zpool upgrade damaged the partition table. Either way, things look ok.

Code:
# gpart list
Geom name: ada1
...
Providers:
...
2. Name: ada1p2
   Mediasize: 128033144320 (119G)
   ...
   length: 128033144320
   ...
   type: freebsd-zfs

Code:
# zdb
zroot:
    version: 28
    name: 'zroot'
    state: 0
    txg: 722386
    ...
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: ...
        children[0]:
            ...
            asize: 128028246016
            ...
 