Solved Changing ZFS checksum on boot device: bad idea

Hello,

I read several times that changing the ZFS checksum to sha512 could improve performance.
So, on a FreeBSD 11 computer installed with a ZFS boot device, I did "zfs set checksum=sha512 zroot" and everything seemed fine.

Until I rebooted.

When I rebooted, I had errors like:
Code:
ZFS: i/o error - all block copies unavailable
and the machine couldn't start.

Then I found:
Booting off of pools utilizing SHA-512/256 is NOT yet supported.
on this page: https://www.freebsd.org/cgi/man.cgi...opos=0&manpath=FreeBSD+11.0-RELEASE+and+Ports

Oops...


So, I booted from a live CD, imported my pool, changed the checksum back to the default value with "zfs set checksum=on zroot", then exported the pool, but I still can't boot.
Now I get this message:
Code:
Mounting from zfs:zroot/ROOT/default failed with error 2: unknown file system

mountroot>

And if at the prompt I type "zfs:zroot", I get the same "unknown file system" message.

I've also tried disabling the checksum completely with "zfs set checksum=off zroot", just in case, but the problem persists.

How can I solve this problem? All my data is still there and the only thing I changed was the checksum (no update of FreeBSD or installed ports).
 
When you write data to ZFS, it uses whichever checksum method is currently enabled, and the type of checksum is stored with the data. If you then change the checksum type, all data already written stays as it is (obviously, if you had terabytes of data, it would be hugely problematic to rewrite all of it). Simply changing checksum/compress/etc. settings back is not the same as never setting them in the first place.

As mentioned in the man page, as soon as you set sha512, the relevant feature becomes active on the pool (see "zpool get all poolname"). The bootcode is probably refusing to load the pool because it doesn't support this feature. Hopefully you can send/recv the modified filesystems from a live CD and destroy the originals to get the active flag to clear.

This feature becomes active once a checksum property
has been set to sha512, and will return to being enabled once
all filesystems that have ever had their checksum set to sha512
are destroyed.
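
To confirm this from the live CD, before and after any fix, something like the sketch below may help. It assumes the pool is named zroot as in this thread and that the feature shows up as the pool property feature@sha512; the helper name sha512_blocks_boot is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when the given feature state is
# "active", i.e. the state described in the man page excerpt above.
sha512_blocks_boot() {
    [ "$1" = "active" ]
}

# Only query the pool when the zpool utility exists (e.g. on the live CD).
if command -v zpool >/dev/null 2>&1; then
    # -H: no header line, -o value: print only the property value.
    state=$(zpool get -H -o value feature@sha512 zroot 2>/dev/null) || state=""
    if sha512_blocks_boot "$state"; then
        echo "feature@sha512 is active: affected filesystems must be recreated"
    else
        echo "feature@sha512 state: ${state:-unknown}"
    fi
fi
```

If the state reads "enabled" again after the filesystems have been recreated, the flag has cleared.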
 
Hello,

Thanks a lot usdmatt for your help: problem solved!

Yes, I had read that the new checksum would only be used for new data or modified files. I just missed the most important part: no sha512 (SHA-512/256) on a boot device :(
But apart from new log files, my data didn't really change, so I don't understand why setting the checksum to sha512 and then reverting or disabling it didn't solve the problem.


Anyway, in case it helps someone who makes the same mistake, here are the commands I used to fix the problem:
Code:
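# From the live CD: import the pool under an alternate root
zpool import -f -o altroot=/tmp zroot
# Take a recursive snapshot and store the full stream, compressed, on another server
zfs snapshot -r zroot@last
zfs send -R zroot@last | ssh root@192.168.0.123 "gzip -1 > /tmp/zroot.gz"
zfs destroy -r zroot@last
# Receive the stream back over the existing datasets (-F) without mounting them (-u)
ssh root@192.168.0.123 "cat /tmp/zroot.gz" | gunzip | zfs receive -Fu zroot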
In short: I'm storing a snapshot as a compressed file on another server, then I'm getting it back.
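
Since the receive step overwrites the pool contents, it may be worth checking the stored file before going any further. A minimal sketch, meant to run on the machine that holds the file; the helper name stream_file_ok is hypothetical, and "gzip -t" only verifies the compressed container, not that the ZFS stream inside it is valid.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the file is non-empty and gzip
# can read it end to end without errors.
stream_file_ok() {
    [ -s "$1" ] && gzip -t "$1" 2>/dev/null
}

# Example usage (path taken from the send command above):
#   stream_file_ok /tmp/zroot.gz && echo "stream file looks intact"
```

This is only a sanity check against a truncated or empty transfer; it does not replace keeping a real backup.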

I'm not sure whether it was strictly necessary, or whether I only needed it because I had fiddled too much with zfs/zpool commands, but I also had to run:
Code:
zpool set bootfs=zroot/ROOT/default zroot
Otherwise I couldn't boot ("Can't find /boot/zfsloader" and "Can't find /boot/kernel/kernel" errors).
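
For later readers: one way to see whether bootfs needs setting at all is to look at the property first. A rough sketch, assuming the pool name zroot and the boot environment zroot/ROOT/default from this thread; the helper name bootfs_is_set is hypothetical, and it assumes zpool prints "-" for a property that has no value.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given bootfs value names a dataset,
# i.e. it is neither empty nor the "-" placeholder for an unset property.
bootfs_is_set() {
    [ -n "$1" ] && [ "$1" != "-" ]
}

# Only query the pool when the zpool utility exists (e.g. on the live CD).
if command -v zpool >/dev/null 2>&1; then
    current=$(zpool get -H -o value bootfs zroot 2>/dev/null) || current=""
    if bootfs_is_set "$current"; then
        echo "bootfs is already set to: $current"
    else
        echo "bootfs is unset; consider: zpool set bootfs=zroot/ROOT/default zroot"
    fi
fi
```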
 