Solved: 13.0-RELEASE-p8 boots no more

Hello,

I just did the FreeBSD 13.0-RELEASE-p8 update today.
After the reboot I no longer have access (connection timed out).
I was able to get in through a rescue console; the service doesn't offer KVM over IP.
/var/run/dmesg.boot shows nothing new (it is still dated from the previous reboot in January):
Code:
FreeBSD 13.0-RELEASE-p7 #0: Mon Jan 31 18:24:03 UTC 2022

As 13.0-RELEASE-p8 includes an updated /boot/kernel/zfs.ko, I was expecting a problem there, but the absence of any feedback in /var/run/dmesg.boot makes me think of a hardware problem.
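From the rescue console I can at least confirm whether the p8 files actually landed on disk; a rough check (paths assume the pool mounted read-only under /mnt, as done below):
Code:
ls -l /mnt/boot/kernel/kernel /mnt/boot/kernel/zfs.ko
strings /mnt/boot/kernel/kernel | grep RELEASE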

What do you think? Where should I investigate?
 
UEFI? (Is that a relevant question?)

Update from which version?

Packages from latest or quarterly?

pkg -vv | grep -e url -e enabled

UFS or ZFS?

tunefs -p /

I'll try from 13.0-RELEASE-p6 (and add the result to this post) …

 
Thanks for the help ;)
The update was from FreeBSD 13.0-RELEASE-p7 to FreeBSD 13.0-RELEASE-p8.
The file system is ZFS.

The rescue console is stuck at FreeBSD 11.2-RELEASE, so I can only mount the ZFS pool read-only...

sudo zpool import -o altroot=/mnt -o cachefile=/var/tmp/zpool.cache -d /dev/gpt zroot -f

Code:
This pool uses the following feature(s) not supported by this system:
    com.delphix:spacemap_v2 (Space maps representing large segments are more efficient.)
    org.zfsonlinux:userobj_accounting (User/Group object accounting.)
    com.delphix:log_spacemap (Log metaslab changes on a single spacemap and flush them periodically.)
    org.zfsonlinux:project_quota (space/object accounting based on project ID.)
All unsupported features are only required for writing to the pool.
The pool can be imported using '-o readonly=on'.
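A read-only import along these lines should work (a sketch based on the command above, with readonly=on added as the error message suggests):
Code:
# import read-only, since 11.2 lacks the newer pool features
sudo zpool import -f -o readonly=on -o altroot=/mnt -o cachefile=/var/tmp/zpool.cache -d /dev/gpt zroot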
 
Was this system originally installed as 13, or upgraded from 12 or older?
If it was upgraded, and the root pool was upgraded afterwards, you may need to update the boot code.
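For a BIOS/GPT system that's typically something like this (a sketch; adjust the disk name and the freebsd-boot partition index to your layout):
Code:
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0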
 
Packages are from quarterly:

cat /mnt/etc/pkg/FreeBSD.conf

Code:
FreeBSD: {
  url: "pkg+http://pkg.FreeBSD.org/${ABI}/quarterly",
  mirror_type: "srv",
  signature_type: "fingerprints",
  fingerprints: "/usr/share/keys/pkg",
  enabled: yes
}
 
OK, found it... I need:

sudo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
 
Weird, this doesn't change a thing...

The system is not UEFI:
sysctl machdep.bootmethod
Code:
machdep.bootmethod: BIOS

It is still not possible to log in, and there is no update in /var/run/dmesg.boot or /var/log/messages.

Am I missing something?
 
gpart show

Code:
=>       40  250069600  ada0  GPT  (119G)
         40       1024     1  freebsd-boot  (512K)
       1064   16777216     2  freebsd-swap  (8.0G)
   16778280  233291360     3  freebsd-zfs  (111G)

=>       40  250069600  diskid/DISK-170792800877  GPT  (119G)
         40       1024                         1  freebsd-boot  (512K)
       1064   16777216                         2  freebsd-swap  (8.0G)
   16778280  233291360                         3  freebsd-zfs  (111G)
 
cat /mnt/etc/fstab
Code:
# Device                      Mountpoint              FStype  Options         Dump    Pass#
/dev/gpt/swap                 none                    swap    sw              0       0

zpool status
Code:
  pool: zroot
 state: ONLINE
  scan: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    zroot       ONLINE       0     0     0
      gpt/data  ONLINE       0     0     0

errors: No known data errors


cat /mnt/boot/loader.conf
Code:
zfs_load="YES"
vfs.root.mountfrom="zfs:zroot/root"
kern.vty=vt
autoboot_delay="1"


sudo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
Code:
partcode written to ada0p1
bootcode written to ada0
 
zpool get all zroot
Code:
NAME   PROPERTY                                       VALUE                                          SOURCE
zroot  size                                           111G                                           -
zroot  capacity                                       44%                                            -
zroot  altroot                                        /mnt                                           local
zroot  health                                         ONLINE                                         -
zroot  guid                                           15217591573817377210                           default
zroot  version                                        -                                              default
zroot  bootfs                                         zroot/root                                     local
zroot  delegation                                     on                                             default
zroot  autoreplace                                    off                                            default
zroot  cachefile                                      /var/tmp/zpool.cache                           local
zroot  failmode                                       wait                                           default
zroot  listsnapshots                                  off                                            default
zroot  autoexpand                                     off                                            default
zroot  dedupditto                                     0                                              default
zroot  dedupratio                                     1.00x                                          -
zroot  free                                           61.9G                                          -
zroot  allocated                                      49.1G                                          -
zroot  readonly                                       on                                             -
zroot  comment                                        -                                              default
zroot  expandsize                                     -                                              -
zroot  freeing                                        0                                              default
zroot  fragmentation                                  0%                                             -
zroot  leaked                                         0                                              default
zroot  bootsize                                       -                                              default
zroot  checkpoint                                     -                                              -
zroot  feature@async_destroy                          enabled                                        local
zroot  feature@empty_bpobj                            active                                         local
zroot  feature@lz4_compress                           active                                         local
zroot  feature@multi_vdev_crash_dump                  enabled                                        local
zroot  feature@spacemap_histogram                     active                                         local
zroot  feature@enabled_txg                            active                                         local
zroot  feature@hole_birth                             active                                         local
zroot  feature@extensible_dataset                     active                                         local
zroot  feature@embedded_data                          active                                         local
zroot  feature@bookmarks                              enabled                                        local
zroot  feature@filesystem_limits                      enabled                                        local
zroot  feature@large_blocks                           enabled                                        local
zroot  feature@sha512                                 enabled                                        local
zroot  feature@skein                                  enabled                                        local
zroot  feature@device_removal                         enabled                                        local
zroot  feature@obsolete_counts                        enabled                                        local
zroot  feature@zpool_checkpoint                       enabled                                        local
zroot  unsupported@com.datto:bookmark_v2              inactive                                       local
zroot  unsupported@org.freebsd:zstd_compress          inactive                                       local
zroot  unsupported@org.openzfs:draid                  inactive                                       local
zroot  unsupported@com.delphix:redacted_datasets      inactive                                       local
zroot  unsupported@com.delphix:redaction_bookmarks    inactive                                       local
zroot  unsupported@org.zfsonlinux:large_dnode         inactive                                       local
zroot  unsupported@com.datto:encryption               inactive                                       local
zroot  unsupported@com.delphix:bookmark_written       inactive                                       local
zroot  unsupported@org.openzfs:device_rebuild         inactive                                       local
zroot  unsupported@com.delphix:spacemap_v2            readonly                                       local
zroot  unsupported@org.zfsonlinux:userobj_accounting  readonly                                       local
zroot  unsupported@com.datto:resilver_defer           inactive                                       local
zroot  unsupported@com.delphix:livelist               inactive                                       local
zroot  unsupported@org.zfsonlinux:allocation_classes  inactive                                       local
zroot  unsupported@com.delphix:log_spacemap           readonly                                       local
zroot  unsupported@org.zfsonlinux:project_quota       readonly                                       local
 
Is this some sort of public cloud provider that doesn't provide console access? You need to see what's happening during the boot so you can investigate; otherwise it's a guessing game.
If it is a cloud, maybe it's worth copying the root pool disk(s) to your private computer and testing in a VM to see what that boot is actually doing.

What does "rescue console" mean? Can't you use something newer than 11.2 there?
 
It's mainly a cloud provider, but this one is a dedicated host. The rescue console is an OS on another drive that boots the machine and attaches the server's drive as an external drive, so you get `rescue` access through ssh.
The most up-to-date FreeBSD version they provide is, sadly, 11.2.
Good idea for the VM.
Yes, it's guesswork here, as it fails at boot..
 
I'm not entirely sure if this is relevant, but the one thing that looks odd above is:
Code:
zroot  bootfs                                         zroot/root                                     local
My recently updated system has:
Code:
[strand.343] $ uname -a
FreeBSD strand.my.domain 13.0-RELEASE-p8 FreeBSD 13.0-RELEASE-p8 #0: Tue Mar 15 09:36:28 UTC 2022     root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
[strand.344] $ zpool get all zroot | grep bootfs
zroot  bootfs                         zroot/ROOT/default             local
Perhaps somebody who is an expert on boot environments might comment...
 
So it's a physical machine where, due to the lack of remote management, its physical disk is removed and moved to another physical machine which has the access? That's a bit painful. But yeah, if that's all you've got, it's worth a shot to dd that disk from that machine to your computer ( dd if=/dev/ada0 bs=4k | gzip -9 | ssh user@yourlocalmachine dd of=disk.raw.gz ). Assuming that rescue system can reach the internet/your computer.

There was really no reason to mess with the bootfs parameter. If your /boot is on the dataset zroot/root, then it's set properly.
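Spelled out, something like this (a sketch; user@yourlocalmachine is a placeholder, and the qemu command is just one way to boot a raw image):
Code:
# on the rescue system: stream a compressed raw image of the disk
dd if=/dev/ada0 bs=4k | gzip -9 | ssh user@yourlocalmachine 'dd of=disk.raw.gz'
# on your computer: unpack it and boot it in a VM to watch the loader
gunzip disk.raw.gz
qemu-system-x86_64 -m 2048 -drive file=disk.raw,format=raw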
 
The zroot/ROOT/default path is the beadm way, I think.
Maybe I messed things up when creating the mountpoints, though this was working fine previously..
What I did at the time:
zpool set bootfs=zroot/root zroot
zfs set mountpoint=/ zroot/root
zfs set mountpoint=/zroot zroot
zfs set mountpoint=/tmp zroot/tmp
zfs set mountpoint=/usr zroot/usr
zfs set mountpoint=/var zroot/var
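For what it's worth, these settings can be double-checked from the rescue system with:
Code:
zpool get bootfs zroot
zfs get -r mountpoint zroot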
 
It has nothing to do with beadm itself. The loader will try to find /boot under the dataset that bootfs points to. So if you messed that up, it could be that it's stuck in the loader there. Can you share the output of zfs list so we can see what you've got there?

But if you have zroot/root as / and /boot is there, then it's OK. On that rescue system, can you go to /mnt/boot (I saw you imported the pool to /mnt) and run df -m . to see which dataset it is?
 
So it's a physical machine where, due to the lack of remote management, its physical disk is removed and moved to another physical machine which has the access? That's a bit painful. But yeah, if that's all you've got, it's worth a shot to dd that disk from that machine to your computer ( dd if=/dev/ada0 bs=4k | gzip -9 | ssh user@yourlocalmachine dd of=disk.raw.gz ). Assuming that rescue system can reach the internet/your computer.

There was really no reason to mess with the bootfs parameter. If your /boot is on the dataset zroot/root, then it's set properly.
Not exactly; it's a random network drive connected to the machine, and it acts as the boot device/OS.
I couldn't have messed with the bootfs parameter, as the zpool is read-only due to the 11.2-RELEASE ZFS differences.
I will try that; I'll have to set up a VM first.
 
It has nothing to do with beadm itself. The loader will try to find /boot under the dataset that bootfs points to. So if you messed that up, it could be that it's stuck in the loader there. Can you share the output of zfs list so we can see what you've got there?

But if you have zroot/root as / and /boot is there, then it's OK. On that rescue system, can you go to /mnt/boot (I saw you imported the pool to /mnt) and run df -m . to see which dataset it is?
I was responding to gpw928 about the bootfs path and beadm thing ;)

ls -latrh /mnt/boot | tail -5
Code:
drwxr-xr-x   2 root  wheel   809B Mar 16 10:19 kernel
drwxr-xr-x   2 root  wheel   808B Mar 16 10:19 kernel.old
drwxr-xr-x  19 root  wheel    25B Mar 16 10:35 ..
drwxr-xr-x  15 root  wheel    68B Mar 16 10:35 .
-rw-------   1 root  wheel   4.0K Mar 16 10:35 entropy


zfs list
Code:
NAME                                              USED  AVAIL  REFER  MOUNTPOINT
zroot                                            49.1G  58.4G    88K  /mnt/zroot
zroot/root                                        353M  58.4G   353M  /mnt
zroot/tmp                                         144K  58.4G   144K  /mnt/tmp
zroot/usr                                        43.9G  58.4G   971M  /mnt/usr
zroot/usr/home                                   4.98G  58.4G  4.98G  /mnt/usr/home
zroot/usr/local                                   876M  58.4G   875M  /mnt/usr/local
zroot/usr/local/etc                              1.23M  58.4G  1.23M  /mnt/usr/local/etc
zroot/usr/ports                                   875M  58.4G   875M  /mnt/usr/ports
zroot/usr/ports/distfiles                         100K  58.4G   100K  /mnt/usr/ports/distfiles
zroot/usr/ports/packages                           88K  58.4G    88K  /mnt/usr/ports/packages
zroot/usr/src                                      96K  58.4G    96K  /mnt/usr/src
zroot/var                                        4.78G  58.4G  1.43G  /mnt/var
zroot/var/crash                                    88K  58.4G    88K  /mnt/var/crash
zroot/var/db                                     3.34G  58.4G  3.31G  /mnt/var/db
zroot/var/db/pkg                                 37.4M  58.4G  37.4M  /mnt/var/db/pkg
zroot/var/db/ports                                 88K  58.4G    88K  /mnt/var/db/ports
zroot/var/db/sup                                   88K  58.4G    88K  /mnt/var/db/sup
zroot/var/empty                                    88K  58.4G    88K  /mnt/var/empty
zroot/var/log                                    5.23M  58.4G  5.23M  /mnt/var/log
zroot/var/mail                                    112K  58.4G   112K  /mnt/var/mail
zroot/var/run                                     164K  58.4G   164K  /mnt/var/run
zroot/var/spool                                  1.23M  58.4G  1.23M  /mnt/var/spool
zroot/var/tmp                                      88K  58.4G    88K  /mnt/var/tmp
/mnt/boot % df -m .
Code:
Filesystem 1M-blocks Used Avail Capacity  Mounted on
zroot/root     60167  352 59814     1%    /mnt
 
From your zfs list output, having bootfs on zroot/root is expected.
For the sake of attempting to recover the machine, you could try booting the old kernel. If you do
Code:
cd /mnt/boot
mv kernel kernel.new
mv kernel.old kernel
and then reboot. If it doesn't work, you can always roll it back in the rescue system.
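And the rollback, for reference (same paths, run from the rescue system):
Code:
# undo the kernel swap if the old kernel doesn't boot either
cd /mnt/boot
mv kernel kernel.old
mv kernel.new kernel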
 
That's a good idea! I was thinking maybe I'm not using the right boot files...
I mean, I used
sudo gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
which pulled the files from the 11.2-RELEASE rescue OS.
I have now done
sudo gpart bootcode -b /mnt/boot/pmbr -p /mnt/boot/gptzfsboot -i 1 ada0
The server is restarting, fingers crossed...
 