Solved ZFS: i/o error - all block copies unavailable ... Invalid format

dvl@

Developer
The system is running FreeBSD 10.2. I run freebsd-update on a regular basis.

Of note: mfsBSD is 10.2 release. System is slightly newer... Is that relevant (see below)?

The screen shot: https://twitter.com/DLangille/status/701412183561871360

While in mfsBSD and running various things, I managed to put one of the drives into a gmirror, where it should not have been. After fixing that, the box would not boot.

I tried mfsBSD to see if I could import the zpool. Yes, I could:

Code:
root@mfsbsd:~ # zpool import -N system
root@mfsbsd:~ # zpool status
  pool: system
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
    attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
    using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 1.88T in 51h0m with 0 errors on Sun Feb 21 06:44:49 2016
config:

    NAME                                                          STATE     READ WRITE CKSUM
    system                                                        ONLINE       0     0     0
     raidz2-0                                                    ONLINE       0     0     0
       gptid/1f65b3e4-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/21f2788d-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/2484e4ef-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20Z2U658VGSp3  ONLINE       0     0     0
       gptid/29b08a29-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/2c5b024e-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/cf46c81a-d6ba-11e5-8298-00259082215a                ONLINE       0     0     0
       gptid/31e93100-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/329ad7d7-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
       gptid/3412b7e6-efc8-11e2-92b7-00259082215a                ONLINE       0     0     5

errors: No known data errors
root@mfsbsd:~ #

I ran zpool clear and rebooted. Still the same issue as in the screen shot.

I tried zpool set cachefile=/boot/zfs/zpool.cache system but:

Code:
root@mfsbsd:~ # zpool import -R /mnt system
root@mfsbsd:~ # mkdir /mnt/dev
root@mfsbsd:~ # mount -t devfs devfs /mnt/dev
root@mfsbsd:~ # mount -t zfs system/rootfs /mnt
root@mfsbsd:~ # chroot /mnt
root@mfsbsd:/ # zpool set cachefile=/boot/zfs/zpool.cache system
internal error: failed to initialize ZFS library
root@mfsbsd:/ #

Julian pointed out that there is no /dev/zfs in the chroot.

EvilPete said that might be devfs rulesets.

Ideas?
 
I ran this again, and this time the chroot contained a /dev/zfs.

I managed to get the set cachefile to work, but it didn't allow booting.

Code:
root@mfsbsd:~ # zpool import -R /mnt system
root@mfsbsd:~ # mkdir /mnt/dev
root@mfsbsd:~ # mount -t zfs system/rootfs /mnt
root@mfsbsd:~ # ls /mnt/dev/
root@mfsbsd:~ # mount -t devfs devfs /mnt/dev
root@mfsbsd:~ # chroot /mnt
root@mfsbsd:/ # ls /dev/zfs
/dev/zfs
root@mfsbsd:/ # zpool set cachefile=/boot/zfs/zpool.cache system
root@mfsbsd:/ # exit
exit
root@mfsbsd:~ # zpool status
  pool: system
state: ONLINE
  scan: resilvered 72K in 0h0m with 0 errors on Sun Feb 21 15:44:32 2016
config:

    NAME                                                          STATE     READ WRITE CKSUM
    system                                                        ONLINE       0     0     0
      raidz2-0                                                    ONLINE       0     0     0
        gptid/1f65b3e4-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/21f2788d-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/2484e4ef-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        diskid/DISK-%20%20%20%20%20%20%20%20%20%20%20Z2U658VGSp3  ONLINE       0     0     0
        gptid/29b08a29-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/2c5b024e-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/cf46c81a-d6ba-11e5-8298-00259082215a                ONLINE       0     0     0
        gptid/31e93100-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/329ad7d7-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0
        gptid/3412b7e6-efc8-11e2-92b7-00259082215a                ONLINE       0     0     0

errors: No known data errors
root@mfsbsd:~ #
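
Comparing the two transcripts, the difference appears to be the mount order: in the first attempt devfs was mounted on /mnt/dev before system/rootfs was mounted on /mnt, so the root dataset covered the devfs mount and the chroot had no /dev/zfs. A minimal sketch of the order that worked, wrapped in a dry-run helper so it only prints the commands by default. The `run` helper, the `DRY_RUN` variable and the `chroot_prep` function are hypothetical names, not part of any FreeBSD tool, and the sketch assumes the root dataset already contains an empty /dev directory.

```shell
#!/bin/sh
# Dry-run wrapper: prints each command unless DRY_RUN=0 (hypothetical helper).
DRY_RUN="${DRY_RUN:-1}"
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount order taken from the second transcript above.
chroot_prep() {
    pool="$1"; rootfs="$2"; alt="$3"
    run zpool import -R "$alt" "$pool"
    run mount -t zfs "$rootfs" "$alt"       # root dataset first
    run mount -t devfs devfs "$alt/dev"     # devfs last, so it stays visible
}

chroot_prep system system/rootfs /mnt
```

With the default dry-run setting this prints the three commands in order; checking for /dev/zfs inside the chroot, as done above, confirms the result on a real system.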
 
I also tried removing one of the HDDs to see if the message would change. It did not.
 
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 XXX has been run against each drive.

I did this with the files provided with mfsBSD and with the files from my local 10.2 system. Neither changed the booting problem.
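
For reference, running the bootcode command against each drive can be scripted. This is a minimal sketch that only prints the commands. The disk names da0 through da2 are placeholders (the thread does not list the real device names), the `bootcode_cmds` function is a made-up name, and it assumes partition index 1 is the freebsd-boot partition on every drive, as in the command above.

```shell
#!/bin/sh
# Print one gpart bootcode command per disk; nothing is written to disk.
# Pipe the output to sh only after checking the device names.
bootcode_cmds() {
    for disk in "$@"; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    done
}

# da0..da2 are placeholder names; substitute the pool's real member disks.
bootcode_cmds da0 da1 da2
```

Printing first and executing second keeps a typo in a device name from overwriting the wrong partition.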
 
This is one of many reasons why I don't use ZFS on boot devices....
Your issue is that your boot device (USB) isn't called /dev/da1. Given that you have several USB storage devices connected, judging by the information you've given earlier, you need to edit /etc/fstab on your boot device to use either da0 or da2. I'm not sure how well USB3 works during boot, but you're probably better off for now using an old USB2 memory stick or card reader.
//Danne
 
There are no USB devices in the system, unless it's booting from mfsBSD.
 
My bad. Anyway, it can't find the path, so either the USB3(?) ports aren't supported/enabled during boot or something else shuffles the /dev/da* devices during boot.
//Danne
 
Current status is:

Code:
ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS object directory
ZFS: can't find root filesystem
gptzfsboot: failed to mount default pool system

FreeBSD/x86 boot
ZFS: i/o error - all block copies unavailable
ZFS: can't find dataset u
Default: system/<0x0>:
boot:

transcript at https://gist.github.com/dlangille/f08e68d70af05b00ee1f
 
I am now thinking about creating another pair of boot drives, just to get the system booted.

I'm thinking: after boot, mount the other array over this one... bad idea?
 
Solved.

Tonight, while booting into mfsBSD to do a zpool set cachefile, I noticed that only three drives were listed during boot. I looked in the BIOS but didn't find what I was looking for; I did find it in the LSI card settings. While debugging the original problem, I had changed the Boot Support setting on the LSI card from BIOS & OS to OS Only. It did not help, but the change was never reverted.

I set it back to BIOS & OS, rebooted the server, and it's back.

Cheers. :)
 
I ran into this today. It was a 10.1-RELEASE machine, with ZFS on root (zroot). It's a machine where 4x4TB disks on a HW RAID controller form the 14.5TB block device which is given to ZFS.

After upgrading it to 10.3-RELEASE, it wouldn't boot, failing with "zfs i/o error - all block copies unavailable".

I fixed it by doing the following (using the live file system of a 10.3-RELEASE CDROM):

# mkdir /tmp/mnt
# zpool import -R /tmp/mnt -f zroot
# cd /tmp/mnt
# mv boot boot.orig
# mkdir boot
# cd boot.orig
# cp -Rp * /tmp/mnt/boot
# cd /
# zpool export zroot
# reboot


This is cool and all, but why did this happen in the first place? I followed normal procedure:

# freebsd-update -r 10.3-RELEASE upgrade
# freebsd-update install
# reboot
# <broken>
 
Did you forget to write the new bootblocks after upgrading to a different release?
Code:
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad0
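
The `-i 1` above assumes the freebsd-boot partition is the first one on the disk. A small sketch for checking that from `gpart show` output before writing anything. The `boot_index` function name and the sample partition table are illustrative only and are not taken from the thread.

```shell
#!/bin/sh
# Read `gpart show <disk>` output on stdin and print the index of the
# freebsd-boot partition (columns: start, size, index, type, ...).
boot_index() {
    awk '$4 == "freebsd-boot" { print $3; exit }'
}

# Illustrative sample only; on a real system use: gpart show ad0 | boot_index
boot_index <<'EOF'
=>       34  976773101  ad0  GPT  (466G)
         34       1024    1  freebsd-boot  (512K)
       1058    8388608    2  freebsd-swap  (4.0G)
    8389666  968383469    3  freebsd-zfs  (462G)
EOF
```

If the function prints nothing, the disk has no freebsd-boot partition and the gptzfsboot command does not apply to it as written.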
 

I didn't do this, indeed. Is this in the documentation somewhere?

Besides that, did you see my fix? This did not involve writing the bootblocks.
 
Well, after a 12.1->12.2 upgrade, I've got the same problem.
But booting from mfsbsd and installing the correct bootcode did not help. I can import my pool from mfsbsd and do not see any errors, but I cannot boot from it!
 
What is the boot method, legacy BIOS or UEFI?
Or, if you prefer, what are the exact messages you get at boot?

On an already started system you can see it with: sysctl machdep.bootmethod
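
The answer matters because the two boot paths use different boot components: under legacy BIOS the `gpart bootcode` command shown earlier in the thread applies, while under UEFI the firmware loads a loader from the EFI system partition instead. A rough sketch of branching on that sysctl; the `bootcode_hint` function and the wording of its messages are made up for illustration.

```shell
#!/bin/sh
# Map a machdep.bootmethod value ("BIOS" or "UEFI") to a reminder of which
# boot component is in play. The wording of the hints is illustrative.
bootcode_hint() {
    case "$1" in
        BIOS) echo "legacy BIOS: pmbr + gptzfsboot (gpart bootcode)" ;;
        UEFI) echo "UEFI: loader on the EFI system partition" ;;
        *)    echo "unknown boot method: $1" ;;
    esac
}

# On a running FreeBSD system: bootcode_hint "$(sysctl -n machdep.bootmethod)"
bootcode_hint BIOS
```

Note that a rescue ISO or mfsBSD reports how the rescue environment itself was booted, which may differ from how the installed system boots.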
 
tarkhil Did you figure out your situation? I have the same.
I built 12.2-releng on my 12.1 system, installed the kernel and rebooted (which worked), then after "make installworld" in single user, the system fails to boot in this way. I have an all-ZFS system on a hardware RAID, which is a single disk from FreeBSD's point of view. It uses GPT, and my system is configured for UEFI, though I'm not 100% sure the disk is set up that way. I know the FreeBSD-12.2 ISO I'm booting boots that way.

Does anyone have any idea how to resolve this? I tried checking and scrubbing the ZFS pool from the ISO boot, and it seems fine. I reinstalled the bootcode (though, hmm, not necessarily for UEFI? not sure), and still have the same problem.
 