Solved Tried to replace da0 in RAIDZ2, now server won't boot. (Ended up rebuilding OS install.)

jrronimo

New Member

Reaction score: 2
Messages: 16

Hello! I have a lot of background info on my setup here, but the short version is:

FreeBSD 12.2 server with a 5-disk RAIDZ2 pool. I just tried to replace drive da0, and now my server won't boot.

I'm sure I set this all up wrong, but I hope someone can help me recover.

Earlier, I was trying to make sure each drive had boot info on it and did
Code:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 daX
for each drive (replace daX with da1-4).
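
For anyone repeating this step, the per-drive command can be wrapped in a small loop. This is only a dry-run sketch: the disk list and the partition index 1 come from the post, and it assumes partition 1 really is the freebsd-boot partition on every disk (worth confirming with gpart show). The DISKS variable and the bootcode_cmds function name are made up for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print the gpart bootcode command for each listed disk.
# DISKS and the partition index (-i 1) are assumptions taken from the post;
# confirm the freebsd-boot partition index with `gpart show` before running.
DISKS="${DISKS:-da1 da2 da3 da4}"

bootcode_cmds() {
    for disk in $DISKS; do
        echo "gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $disk"
    done
}

# Prints the commands only; nothing is written to any disk here.
bootcode_cmds
```

Printing the commands first makes it easy to review them before piping the output to sh as root.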

While the server was running fine, I did:
Code:
# zpool offline zjrr da0p3

Then I did a
Code:
shutdown -p now
and swapped the disk. (I've tried hotswapping a few times, but had pretty disastrous results.)

Now I'm seeing:
Code:
ZFS: i/o error - all block copies unavailable
Invalid format
ZFS: i/o error - all block copies unavailable
Invalid format
ZFS: i/o error - all block copies unavailable

Can't find /boot/kernel/kernel

FreeBSD/x86 boot
Default: zjrr/ROOT/default:/boot/kernel/kernel
boot:
ZFS: i/o error - all block copies unavailable

Can't find /boot/kernel/kernel

FreeBSD/x86 boot
Default: zjrr/ROOT/default:/boot/kernel/kernel
boot:

and I have no idea how to recover from here. I (clearly incorrectly, hah) thought that the boot information would be on each of the other drives.

I'd greatly appreciate any help that anyone can offer!
 

covacat

Well-Known Member

Reaction score: 209
Messages: 448

gptzfsboot may not understand z2 redundancy
also default should be /boot/loader ?
you may try to boot from external media and resilver the pool
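
A rough sketch of what "boot from external media and resilver" might look like from a live environment. Everything beyond the pool name zjrr and the da0p3 partition is an assumption: that da1 is a healthy member whose partition layout can be copied, that the new disk appears as da0 again, and that partition 1 is the freebsd-boot partition. The script only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch: print a possible replace-and-resilver sequence.
# Assumptions (not from the thread): da1 is a healthy pool member whose
# partition table can be copied, and the new disk shows up as da0.
POOL="${POOL:-zjrr}"
GOOD="${GOOD:-da1}"
NEW="${NEW:-da0}"

replace_plan() {
    cat <<EOF
zpool import -f -R /mnt $POOL
gpart backup $GOOD | gpart restore -F $NEW
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 $NEW
zpool replace $POOL ${NEW}p3
zpool status $POOL
EOF
}

# Prints the plan only. Note that gpart restore -F destroys any existing
# partition table on the target disk, so double-check the device name.
replace_plan
```

The resilver starts on its own after zpool replace; zpool status shows its progress.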
 
OP
jrronimo

New Member

Reaction score: 2
Messages: 16

Thanks for the tips!

Try putting back the old drive and check if the system boots normally.
I did try putting da0 back in and still had no joy.

After thinking about how to troubleshoot it, I decided to rebuild the box. It's just a personal machine; nothing critical. I installed a pair of 250 GB SSDs in a RAID1, reinstalled the OS, and imported the pool with an altroot (zpool import -f -o altroot=/somelocation zjrr), and all my data is safe. I still need to replace the drive and resilver, but this was a long-overdue rebuild. Next time I'll learn the proper way to repair this... I suspect it would have meant booting a live USB stick and rebuilding the boot partition on all the drives.


Which commands did you use and what results did it produce?

When I tried hotplugging, I had run zpool offline on daX, then just plugged a new drive into the RAID controller (an LSI 9211-8i with the firmware for using it as just a plain controller). The system just kind of... locked up, I believe. I never got to the point of trying any commands.
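
Before pulling a drive in a hotplug attempt like the one described above, it may help to confirm that ZFS actually reports the device as OFFLINE. Here is a minimal sketch of such a check; the is_offline helper is hypothetical, and it assumes the usual zpool status layout where the device name is the first column and its state the second.

```shell
#!/bin/sh
# is_offline DEVICE
# Reads `zpool status` text on stdin and succeeds only if DEVICE is listed
# with state OFFLINE. Assumes the name is column 1 and the state column 2.
is_offline() {
    awk -v dev="$1" '$1 == dev && $2 == "OFFLINE" { found = 1 }
                     END { exit !found }'
}

# Example usage on the server (not run here):
#   zpool status zjrr | is_offline da0p3 && echo "da0p3 is offline"
# After inserting the new disk, the bus can be rescanned and listed with:
#   camcontrol rescan all
#   camcontrol devlist
```

This does not explain the lockup, but it rules out pulling a disk that the pool still considers active.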
 