My laptop fails to mount ZFS on root today

The boot process fails to mount ZFS on root and drops to a mountroot> prompt.

The line before the ZFS error on screen says:

Code:
sysctl_unregister_oid: failed(22) to unregister sysctl(tmpfs)

I have tried booting another boot environment, but there do not appear to be any others. There should be two previous known-good boot environments, but they are not listed for some reason.

Before shutting down the laptop, I made a sysctl change to try to fix the awful sound volume in FreeBSD, following the advice in this post: https://forums.freebsd.org/threads/low-sound-volume.49620/
I entered the following command, which changed the stored value from 45 to 1.
Code:
sysctl hw.snd.vpc_0db=1
It made the problem worse, so I shut the laptop down with the poweroff command to see whether that made a difference. That is when I found this problem.

I am writing this post on the same laptop, booted into Windows 10 from the same SSD, which shows no faults. Sound is OK in Windows too. I need some help to mount ZFS without destroying it. What do I do?

I have found these posts...

Do I need to make a FreeBSD memstick to boot without installing, and try to mount ZFS from that? How do I clear the tmpfs problem from my SSD?
 
Attachment: zfs-error.png (photo of the error on my laptop screen)

I booted off a 13.2-RELEASE memstick and managed to mount my ZFS pool OK.
I reconfigured the memstick to have a read-write file system and got SSH running in both directions to another machine running 13.2p9.
I used zfs send to copy the last known-good hourly snapshot to a USB drive on the other machine. Sadly, that drive was a shingled hard drive, and it started to have problems with ZFS halfway through the transfer. I got some datasets across, but I haven't needed those copies yet.
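
For the record, the rescue went roughly like this; the pool name zroot, the snapshot name, and the remote user/host are assumptions for illustration:

Code:
# from the memstick shell: import the pool under an altroot
zpool import -f -R /mnt zroot
# stream the last known-good hourly snapshot to the other machine
zfs send zroot/ROOT/default@hourly-latest | \
    ssh user@otherhost "zfs receive -u tank/laptop-rescue"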

I rebooted my laptop and selected option 3 (Escape to loader prompt).
At the OK prompt I entered the following commands, one line at a time:

Code:
unload
load /boot/kernel/kernel
load /boot/kernel/opensolaris.ko
load /boot/kernel/zfs.ko
boot

I edited my config files to comment out tmpfs and rebooted. This got rid of the tmpfs sysctl error, but still no joy on a successful boot: it dropped back to the mountroot> prompt.
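
The tmpfs change was along these lines, assuming tmpfs was being loaded from /boot/loader.conf:

Code:
# /boot/loader.conf -- commented out to stop loading tmpfs at boot
#tmpfs_load="YES"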

I rebooted, chose option 3 again, and repeated the successful manual boot method above. I found a command using '?' that resynchronised my boot environments. Although I have the last three hourly, seven daily, four weekly and three monthly snapshots, my last good boot environment was at 13.2p7. On the next reboot I selected option 8 (Boot Environments) and found my three boot environments. I selected '13.2p7' and it booted multi-user fine.

I created another two boot environments in succession, activated the last one, and rebooted again. After reinstalling all of the pkg updates, 661 of them, I rebooted again to find my system sitting at the mountroot> prompt once more.
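
The boot environment steps were along these lines (the BE names are examples):

Code:
bectl create 13.2p9-a
bectl create 13.2p9-b
bectl activate 13.2p9-b
shutdown -r now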

I remembered that, earlier on the day of the trouble, I had found that my packages had not been updating since I used a proxy server at another site last November. I unset the proxy environment variables and did a pkg update/upgrade in the morning, but I did not reboot, as I was running a job that still had an hour or so to complete.
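
pkg fetches through libfetch, which honours the HTTP_PROXY environment variable, so clearing the proxy was roughly:

Code:
unset HTTP_PROXY http_proxy
pkg update
pkg upgrade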

I realised then that a pkg update that came in on that last day must be the cause of the problem. I will strip my config back to the bare minimum by commenting out as much as possible, take a BE snapshot each time, do a pkg upgrade, then reboot. I will re-enable one line at a time until I find the conflict.

After that, it will be a case of locking pkg versions and retesting. Perhaps I could use zfs diff to identify which files trigger the unsuccessful ZFS root mount by comparing snapshots.
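
Version locking would be done with pkg lock; the package name here is purely hypothetical:

Code:
pkg lock -y somepackage   # hold this package at its current version
pkg lock -l               # list currently locked packages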

An alternative is just installing 14.0 on another SSD in another machine, using zfs send to move my data across, and testing for problems. If it is OK, I can install 14.0 on the troublesome laptop and zfs send my data back.
 
I disabled everything except ZFS in /boot/loader.conf
and everything except basic IP networking in /etc/rc.conf.
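
Roughly speaking, what survived the stripping was no more than this; the hostname and interface name are placeholders:

Code:
# /boot/loader.conf
zfs_load="YES"

# /etc/rc.conf
hostname="laptop"
ifconfig_em0="DHCP"
zfs_enable="YES"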

Rebooting to the 13.2p9 userland and 13.2p8 kernel, all is OK.
However, if I install all of the pkg upgrades afterwards, the system cannot find ZFS again.

I tried to do a zfs diff, but I need to think about this some more as it didn't work as expected.

Code:
$ zfs snapshot zroot/ROOT/default@problem
$ zfs snapshot zroot/ROOT/pre-3rd@noproblem
$ zfs diff zroot/ROOT/default@problem zroot/ROOT/pre-3rd@noproblem
Cannot diff an unmounted snapshot: operation not applicable to datasets of this type

Here 'default' is the boot environment that has had the pkg upgrade and no longer mounts, and
'pre-3rd' is the boot environment that worked prior to the pkg upgrade, taken before my third test.
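
In hindsight the failure makes sense: zfs diff compares two snapshots of the same dataset (or a snapshot against its live dataset), so diffing across two different boot environments is not supported. Within one BE, the supported forms would look like this (snapshot names assumed):

Code:
# two snapshots of the same dataset
zfs diff zroot/ROOT/default@noproblem zroot/ROOT/default@problem
# or a snapshot against the live filesystem
zfs diff zroot/ROOT/default@noproblem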

I am thinking I should zfs send them both to another machine where they can be mounted safely: snapshot the 'noproblem' one on the destination, file-copy the contents of 'problem' over the top, snapshot that, then do a zfs diff between the two snapshots to find the changes. My hunch is that the OpenZFS binaries will be different.
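
As a sketch, with the pool, dataset and host names assumed:

Code:
# send the known-good BE to the helper machine
zfs send zroot/ROOT/pre-3rd@noproblem | ssh user@helper "zfs receive tank/compare"
# then, on the helper, after copying the 'problem' files over the top:
#   zfs snapshot tank/compare@after-copy
#   zfs diff tank/compare@noproblem tank/compare@after-copy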
 
Rebooting to 13.2p9 userland and 13.2p8 kernel, all is OK.

Create a boot environment of the OK state.

Then (sketched as commands after the list):
  1. create a new environment
  2. use bectl to mount it at /tmp/up
  3. pkg -r /tmp/up upgrade
  4. unmount the environment
  5. activate the environment
  6. save a record of what was upgraded
  7. restart the computer.
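
A command sketch of the list above; the BE name, mountpoint and record file are examples:

Code:
bectl create 13.2p9-up                          # new boot environment
mkdir -p /tmp/up
bectl mount 13.2p9-up /tmp/up                   # mount it
pkg -r /tmp/up upgrade                          # upgrade only inside the new BE
pkg -r /tmp/up query '%n-%v' > ~/upgraded.txt   # record what is now installed
bectl umount 13.2p9-up
bectl activate 13.2p9-up
shutdown -r now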
 