ZFS mirror boot: cannot boot from 2nd disk

I installed 8.1-RELEASE onto two identical drives in a mirror, following this guide: Installing FreeBSD Root on ZFS (Mirror) using GPT.

I can boot from either disk by changing the boot priority in the BIOS, so long as both disks are connected.

I can disconnect drive #2 and boot from drive #1 just fine, whether drive #1 is plugged into sata0 or sata1.

However when I disconnect drive #1 and try to boot from drive #2, on either sata0 or sata1, I get errors:

Code:
error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
error 1 lba 32
error 1 lba 1
No ZFS pools located, can't boot

It looks like a hardware problem, but then why does disk #2 work when disk #1 is connected?

(Note, if both are plugged in, everything works fine; zpool status is clean, zpool scrub is clean, SMART status is good after long self-test.)
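The health checks mentioned above were along these lines (a sketch; the pool name "tank" and device node /dev/ad4 are placeholders for the actual names):

```shell
# sketch of the health checks described above; "tank" and /dev/ad4
# are placeholders, not the actual pool/disk names from this system
zpool status -v tank          # mirror state, should report no errors
zpool scrub tank              # kick off a scrub, re-check status after
smartctl -t long /dev/ad4     # start a long SMART self-test...
smartctl -a /dev/ad4          # ...and read back the results afterwards
```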
 
Update: I swapped the sata ports for the two drives, and it made no difference. I can still boot from either one through the BIOS boot priority, but only so long as both are connected.
 
If this is reproducible, it looks like a serious issue to me.

(UPDATE)

I just reproduced it on a VM running 8.1-RELEASE, with exactly the same results.

George
 
I just installed an 8.1 FreeBSD system and hit exactly the same problem. The first thing I tried was pulling one of the disks, starting with disk #1. Same result. There is no way I can ship this machine off to the colo like this.
Trying to recompile with the patch suggested.
 
Patch for testing

Patch source:
http://people.freebsd.org/~mm/patches/zfs/head-zfsimpl.c.patch

To recompile with the patch:
Code:
cd /usr/src
patch -p0 < /path_to/head-zfsimpl.c.patch
cd /usr/src/sys/boot/zfs
make clean
make depend
make
cd /usr/src/sys/boot/i386/zfsloader
make clean
make depend
make
make install
cd /usr/src/sys/boot/i386/gptzfsboot
make clean
make depend
make
make install
gpart bootcode -p /boot/gptzfsboot -i 1 ad4
gpart bootcode -p /boot/gptzfsboot -i 1 ad6

Repeat the last two lines for every disk in your ZFS mirror or raidz, assuming the first partition on each disk is of type freebsd-boot, and replace the disk names "ad4", "ad6", etc. with your own GEOM disk names.

To view GEOM disk names, use:
gpart show
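With more than two disks, the last gpart lines above can be repeated in a loop (a sketch; "ad4" and "ad6" are example names, use the ones `gpart show` reports for your system):

```shell
# hypothetical loop over the mirror members; replace ad4/ad6 with
# the GEOM names from `gpart show`, and confirm first that the
# first partition on each disk really is freebsd-boot
for disk in ad4 ad6; do
    gpart bootcode -p /boot/gptzfsboot -i 1 "$disk"
done
```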
 
The bootloaders are 32-bit only, I think, so this should work on amd64 too. I know that in FreeBSD 8.0-RELEASE, when it was necessary to build a ZFS-aware loader manually, only an i386 binary was built.

Only once the kernel loads do things switch to 64-bit.
 
Just wondered, how would this work if the ZFS pool has multiple vdevs and the root filesystem is spread over all of these.

Say you have a two-disk mirror ZFS pool, which apparently works, then add another mirror vdev and recompile/reinstall the system -- files would now be striped across all four disks. Will the zfsloader handle such a scenario?

PS: To answer my own questions, with current code:

Code:
test# zpool add storage mirror label/disk4 label/disk5
cannot add to 'storage': root pool can not have multiple vdevs or separate logs

Apparently this is not going to work.
 
Thanks for the patch. I confirm it fixes the issue on amd64 (simulating a disk failure under virtual machine install). I ran the patch on RELENG_8_1 src.

Prior to compiling zfsloader, I had to make the /usr/src/sys/boot/i386/btx dependency.
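In case anyone else hits the same missing-file error, building that dependency first looked roughly like this (a sketch against the 8.x source tree layout):

```shell
# build the btx dependency before the zfsloader step in the
# instructions above
cd /usr/src/sys/boot/i386/btx
make clean
make depend
make
```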
 
danbi said:
Just wondered, how would this work if the ZFS pool has multiple vdevs and the root filesystem is spread over all of these.

Say you have a two-disk mirror ZFS pool, which apparently works, then add another mirror vdev and recompile/reinstall the system -- files would now be striped across all four disks. Will the zfsloader handle such a scenario?

PS: To answer my own questions, with current code:

Code:
test# zpool add storage mirror label/disk4 label/disk5
cannot add to 'storage': root pool can not have multiple vdevs or separate logs

Apparently this is not going to work.

Actually it does:

http://lists.freebsd.org/pipermail/freebsd-stable/2010-October/059411.html
 
It works indeed, for now, by disabling the bootfs property, adding more vdevs and re-enabling it again.

That poor server has suffered a dozen re-installs :)
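For the record, the workaround goes roughly like this (a sketch; the pool name "storage" matches the earlier example, but the root dataset name is an assumption):

```shell
# clear the bootfs property so `zpool add` stops refusing
zpool set bootfs= storage
# attach the additional mirror vdev
zpool add storage mirror label/disk4 label/disk5
# restore bootfs; "storage/root" is an assumed root dataset name
zpool set bootfs=storage/root storage
```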
 
I am experiencing a similar problem. How do you apply this patch? I mean how do I get a terminal when I can't even boot in?!

thanks,
atwinix
 
atwinix said:
I am experiencing a similar problem. How do you apply this patch? I mean how do I get a terminal when I can't even boot in?!

thanks,
atwinix
Hmmm. Your second disk just died? Have you tried every combination of likely boot drive in your BIOS first? If the second disk has truly died, you might have to build a new system the way you want to, and import your pools into that.
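That last-resort path might look like this (a sketch; "storage" is a placeholder pool name):

```shell
# from a fresh install (or a live CD), pull the old pool into the
# new system; "storage" is a placeholder for the actual pool name
zpool import              # list importable pools on the attached disks
zpool import -f storage   # force-import if the pool was never exported
```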
 
How come this bug is still not listed as errata on 8.1-release? Makes me question how many other major bugs are known but not reported as errata.
 
mm@: I've been using this patch on several production servers for quite a while. Tests in VMware and on those production servers (before deployment) show that the bug is fixed.
I can't see any negative side effects from this patch.

Using 8.1-RELEASE w/ patches, one i386 and one amd64.
 
I am a victim of this bug too, and I have to say it should have been patched into 8.1-RELEASE. I've lost a lot of confidence in the OS now that this has hit me on stable code.

The primary HDD died, and I cannot boot off the 2nd HDD.

So what's the recovery path for this scenario?
 
chrcol said:
I am a victim of this bug and have to say it should have been patched into 8.1R, lost a lot of confidence now in the OS that this affected me on stable code.

I'm afraid the Release Engineering team is asleep at the wheel. I emailed them about this bug back in August with no response.

Then there's all the other major bugs in 8.1-release. Try booting off a ZFS mirror when another unrelated ZFS pool is corrupted. Can't do it.

The e1000 Ethernet driver is buggy (it's fixed in 8-STABLE).

Then there's the resource leaks somewhere that cause my file server to silently disappear off the network, with no error messages anywhere. Kinda like kern/144330 did in 8.0-REL. That's a totally unacceptable failure mode for a so-called "server-class" OS.

Then there was the time I ran 'telnet foobar' (non-existent address) and locked up the whole thing (console was dead, though networking was still sorta running). Required hard reset to fix.

Sigh... please forgive the rant.
 
No worries.

I did try the patch and instructions (since I can boot into single-user mode), but it fails with a missing file error similar to what someone else got. A shame the devs couldn't test that procedure before posting it.

So right now I'm csupping down the 8.2 code and hoping I can compile and install it without it freaking out. I have only sshd running, as it's a bit hard to do all this over KVM.
 