Upgrade 11.2 ZFS: i/o error - all block copies unavailable

During an upgrade from 11.1 to 11.2, upon first reboot, I get the following error:

Code:
ZFS: i/o error - all block copies unavailable

/boot/kernel/kernel text=0x1547d28 ZFS: i/o error - all block copies unavailable

elf64_loadimage: read failed
can't load 'kernel'

I have tried the following suggestions found online, and none of them worked.

1 - Reinstall the boot loader code after booting the memstick image and choosing the live CD option
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

2 - Copy the boot directory over again to trigger "self-healing", after booting the memstick image and choosing the live CD option

zpool import -f -R /mnt zroot
mv /mnt/boot /mnt/boot.orig
mkdir /mnt/boot
cd /mnt/boot.orig && cp -R * /mnt/boot

From this post (https://forums.freebsd.org/threads/...m-zroot-after-applying-p25.54422/#post-308876) it appears that if the zpool used for the ZFS root is striped across vdevs whose devices have different partition structures, you are basically screwed and the only way out is to rebuild the pool?

I don't have enough spare capacity to copy the data off, rebuild the pool and restore. So my questions are (please note I am neither a FreeBSD nor a ZFS expert, so apologies if my questions are conceptually confused):

1) Is this really the only way?
2) Could I remove the 2nd vdev that has the raw disk devices and try to force all the data to be written to the first vdev? Then I could re-partition the 2 raw disks and add them back as a similarly formatted vdev. I have read from several low-quality sources that it's not possible to remove a vdev. Can I remove all the disks in a vdev instead and then add them back to the vdev? If space is an issue and not all data can be copied over to the first vdev, which is what I suspect, can I remove the raw devices from the 2nd mirror vdev one at a time, re-partition them and add them back in one at a time to the 2nd mirrored vdev? Would any of these approaches actually solve the problem?

To clarify:

My zpool has two mirror vdevs. The first has two disks, each partitioned with a boot partition, a swap partition and a partition dedicated to ZFS (vdev 1). The 2nd vdev is also a mirror, but its 2 disks were added to the vdev whole, without any partitioning.
 
Don't remove anything, don't install anything. Not yet at least. Not until we figure out what you really need to do. If you do the wrong thing your data will be toast.

Please boot from a rescue disk (the install media will do) and post the output from gpart show so we have a better understanding of what we're dealing with.
 
Also, while you're booting from said rescue disk try dropping down to the boot console (press escape) and run lsdev. Does it detect your disks (and your ZFS pool(s)) at all?

Don't do anything else, just run boot afterwards to continue booting and follow up on SirDice's question.

(edit) PS: are you running a regular version or did you customize stuff (like building your own kernel and/or base system)?
 
Output from lsdev, not booting from livecd but from the "rescue" console I am dumped to.

freebsd.jpg


I can run "zpool import -R /mnt zroot" at the rescue prompt and all looks good.

zpool status
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        zroot         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            ada0p3    ONLINE       0     0     0
            gpt/zfs1  ONLINE       0     0     0
          mirror-1    ONLINE       0     0     0
            ada2      ONLINE       0     0     0
            ada3      ONLINE       0     0     0

errors: No known data errors
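
For anyone comparing their own pool against this one: the status output above is where the layout mismatch shows up, since mirror-0 is built from partitions (ada0p3, gpt/zfs1) while mirror-1 is built from whole disks (ada2, ada3). Below is a minimal sketch of a helper that flags this from zpool status text. It assumes device names follow the patterns seen in this thread (a trailing pN or a gpt/ prefix means a partition, anything else is treated as a whole disk), and classify_members is a made-up name, not a FreeBSD tool.

```sh
#!/bin/sh
# classify_members (hypothetical helper): read `zpool status` text on stdin
# and print each leaf device with a guess at whether it is a partition or a
# whole disk. The guess is based on the device NAME only (assumption).
classify_members() {
    awk '
        # Remember the pool name so its own row can be skipped later.
        $1 == "pool:" { pool = $2; next }

        # Device rows have 5 fields: NAME STATE READ WRITE CKSUM,
        # with a numeric READ counter (this also skips the header row).
        NF == 5 && $3 ~ /^[0-9]+$/ {
            name = $1
            # Skip the pool row and grouping rows such as mirror-0.
            if (name == pool) next
            if (name ~ /^(mirror|raidz[0-9]*|spare|replacing)-[0-9]+$/) next

            if (name ~ /^gpt\// || name ~ /p[0-9]+$/)
                kind = "partition"
            else
                kind = "whole-disk"
            print name, kind
        }
    '
}

# Intended use on a live system (not run here):
#   zpool status zroot | classify_members
```

Fed the status output from this post, it should list ada0p3 and gpt/zfs1 as partition and ada2 and ada3 as whole-disk.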

The output of "gpart show" from livecd is

root@:~ # gpart show
=>        34  3907029101  ada0  GPT  (1.8T)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  3902832640     3  freebsd-zfs  (1.8T)
  3907028992         143        - free -  (72K)

=>        34  3907029101  diskid/DISK-Z4Z09HQB  GPT  (1.8T)
          34           6                        - free -  (3.0K)
          40        1024                     1  freebsd-boot  (512K)
        1064         984                        - free -  (492K)
        2048     4194304                     2  freebsd-swap  (2.0G)
     4196352  3902832640                     3  freebsd-zfs  (1.8T)
  3907028992         143                        - free -  (72K)

=>        34  5860533101  ada1  GPT  (2.7T)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064         984        - free -  (492K)
        2048     4194304     2  freebsd-swap  (2.0G)
     4196352  5856335872     3  freebsd-zfs  (2.7T)
  5860532224         911        - free -  (456K)

=>        34  5860533101  diskid/DISK-WD-WMC4N0L26AUL  GPT  (2.7T)
          34           6                               - free -  (3.0K)
          40        1024                            1  freebsd-boot  (512K)
        1064         984                               - free -  (492K)
        2048     4194304                            2  freebsd-swap  (2.0G)
     4196352  5856335872                            3  freebsd-zfs  (2.7T)
  5860532224         911                               - free -  (456K)

=>       1  15633407  da0  MBR  (7.5G)
         1      1600    1  !239  (800K)
      1601   1505616    2  freebsd  [active]  (735M)
   1507217  14126191       - free -  (6.7G)

=>       0   1505616  da0s2  BSD  (735M)
         0        16         - free -  (8.0K)
        16   1505600      1  freebsd-ufs  (735M)

=>       1  15633407  diskid/DISK-4C530001090929106062  MBR  (7.5G)
         1      1600                                 1  !239  (800K)
      1601   1505616                                 2  freebsd  [active]  (735M)
   1507217  14126191                                    - free -  (6.7G)

=>       0   1505616  diskid/DISK-4C530001090929106062s2  BSD  (735M)
         0        16                                      - free -  (8.0K)
        16   1505600                                   1  freebsd-ufs  (735M)

I am running stock standard FreeBSD. I don't know enough to try anything too fancy yet :)

Thanks for the help.
 
attempting to remove mirror-1 with
zpool remove zroot mirror-1

results in

cannot remove mirror-1: root pool can not have removed devices, because GRUB does not understand them

It's not looking good.
 
Quote: "attempting to remove mirror-1 with zpool remove zroot mirror-1"

Why would you want to do that?

I'm also a little surprised about the error message, because I was pretty sure that zpool wouldn't know anything about GRUB (GRUB isn't part of the FreeBSD base system, after all), but I checked and traced the error message back to /usr/src/cddl/contrib/opensolaris/lib/libzfs/common/libzfs_zpool.c. You learn something new every day :)

Anyway.... This was about booting. I'd leave the pool for now until you can fully boot again.

The command you used in the OP (gpart bootcode) was the right one, however. You ran that from the rescue media, I assume? That means you'd have used the bootcode from the rescue media, and that code could be different from the one on your actual pool.

Therefore my suggestion would be to boot using a rescue system, mount your pool, and then use the boot code as it is installed on your system. So (for example): /mnt/boot/gptzfsboot. That would ensure that your versions will never mismatch.
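
To make that concrete, here is a rough sketch that only prints the gpart bootcode commands rather than running them, so they can be checked first. It assumes the pool is imported with an altroot of /mnt (as in the OP), that the freebsd-boot partition is index 1 (as the gpart show output suggests), and that pmbr is taken from the pool as well; bootcode_cmds is a hypothetical helper name, and the disk names are whatever your rescue environment reports.

```sh
#!/bin/sh
# bootcode_cmds (hypothetical helper): print one `gpart bootcode` command per
# disk, pointing at the boot code stored on the mounted pool rather than the
# copy shipped on the rescue media. Nothing is executed.
#   $1      altroot where the pool is mounted (assumption: /mnt)
#   $2...   disks carrying a freebsd-boot partition at index 1 (assumption)
bootcode_cmds() {
    altroot=$1
    shift
    for disk in "$@"; do
        printf 'gpart bootcode -b %s/boot/pmbr -p %s/boot/gptzfsboot -i 1 %s\n' \
            "$altroot" "$altroot" "$disk"
    done
}

# Example (prints two commands, one for each disk name given):
#   bootcode_cmds /mnt ada0 ada1
```

Review the printed lines against your own gpart show output before running any of them by hand.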

(edit)

Almost forgot: the screenshot you shared tells us that the bootloader does recognize your system, which is good. Are you by any chance using a custom kernel, and if so, are you sure you built all the modules needed for proper ZFS support?

Worst case scenario would be to boot your system from the live CD using the kernel from the live CD. After that you'd have full access again and can restore whatever it is that's broken. Here's how you could do that:
  1. Boot from the CD and drop down to the boot prompt as before (when you used lsdev).
  2. Then use the following commands:
       unload
       load /boot/kernel/kernel
       load /boot/kernel/opensolaris.ko
       load /boot/kernel/zfs.ko
       set currdev="disk0p3"
       set vfs.root.mountfrom="zfs:zroot"
Careful with those set commands: do not add any spaces or anything else, just type them exactly as I've shown here. After all this you can boot your system using either boot (boot normally) or boot -s if you want to be careful and boot in single user mode. Keep in mind that you'd probably need to manually mount the rest of your file systems other than the root if you use single user mode.

This should allow you to at least boot your system and access your stuff. And this would also be the perfect environment to try and fix your boot code; if you run the gpart bootcode command here you'd be sure that it would use the right bootcode.

Hope this can help!
 
Thanks ShelLuser - Your post has helped improve my understanding of the FreeBSD boot process ;)

I had already proceeded on a different course of action to recover and had to wait for the resilver to finish before confirming if it worked.
TL;DR: I have a bootable system!

Longer version:

Here is what I did from the live CD environment:

1) zpool offline zroot ada3
2) gpart create -s GPT ada3 (could probably just have done step 3 instead)
3) gpart add -t freebsd-boot ... gpart add -t freebsd-swap ... gpart add -t freebsd-zfs (to mimic the existing partition layout on the physical disk in the first mirror vdev. I don't think this was completely necessary; just a GPT partition table with 1 partition may have done the trick.)
4) zpool detach zroot ada3
5) zpool attach zroot ada2 ada3 -> wait for rebuild to finish
6) repeat steps 1 to 5 for ada2
7) cross fingers and reboot
8) crack open a can of the best!
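
Step 5 depends on waiting for the resilver to finish before starting on the second disk. A small sketch of one way to check that, assuming zpool status reports "resilver in progress" on its scan line while the rebuild is running; resilver_running is a made-up helper name, and the 60-second polling interval in the comment is arbitrary.

```sh
#!/bin/sh
# resilver_running (hypothetical helper): read `zpool status` text on stdin
# and exit 0 if it reports a resilver in progress, non-zero otherwise.
# Assumption: the scan line contains the phrase "resilver in progress".
resilver_running() {
    grep -q 'resilver in progress'
}

# Intended use on a live system (not run here): poll until the rebuild is
# done, then move on to the next disk.
#   while zpool status zroot | resilver_running; do sleep 60; done
```

Checking zpool status by eye works just as well; the point is only not to offline the second disk while the first is still rebuilding.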

I am not sure how GRUB plays a role in the boot process. I am pretty sure I just used the BSD boot loader. But it was a while ago and I was still learning, so maybe I followed a guide that used GRUB at some point.

What I do know about GRUB, now, is that it needs to use a block list to find core.img, its later-stage bootcode image, and it looks like it needs a partition table to do that. I am not sure why GRUB even features here, though :) Maybe a vestigial piece of code.


Hope this helps someone.
 