FreeBSD-10 zfs replace failed zroot disk

Can anyone point me to a single doc that shows how to replace a failed zroot mirror disk in a system installed via bsdinstall? It seems to me that if ZFS is now point-and-click, so to speak, we ought to have a good recovery document/how-to somewhere for what will surely become a common occurrence... Thanks!
 
man zpool will do the trick. There is a short section explaining how replace works. It's very easy, the command is just zpool replace <pool> <old> <new>. The only difficult part is how to identify the old and new disk. Personally, I use GPT partitioning with logical names of all partitions, so I just used /dev/gpt/my_favorite_disk_name. If ZFS can't find a disk (for example because it has failed), it creates a made-up name for the disk, which can be found in the output of zpool status. That's a long string of numbers, which you can cut and paste into the replace command.
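For example, a sketch with a made-up pool name and GUID (take the real ones from the zpool status output):
Code:
# zpool status tank                      # the missing disk shows up as a long numeric name
# zpool replace tank 1234567890123456789 /dev/gpt/my_favorite_disk_name
# zpool status tank                      # watch the resilver progress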
 
ralphbsz: I am specifically referring to a zpool that is zroot - in this case, two mirrored ZFS drives that have the OS and everything else in one pool. Your response certainly works for any other pool, but in the case of zroot it is different, because there is bootcode involved that exists outside the pool; various .ko kernel modules have to get loaded, etc. I believe the way the mirrored pool works is that it boots initially off one of the disks (even though one is wise to have bootcode on both), and then the zroot pool is basically imported or mounted. And what happens if the remaining disk loses its non-ZFS bootcode? Also, what about the zpool cache?

I have a datacenter setting up this system for me and they thankfully want to understand disk replacement should one of these drives go bye-bye. My assumption is they will be using the bsdinstall method of setting up the zfs pool...

I think we really need a good flowchart for this stuff given it is now integral to bsdinstall...
 
I've seen quite a few emails on the doc mailing list regarding ZFS. My take is the documentation is playing catch-up, as the ZFS install was still marked experimental last I checked. I'm sure once everything is done that will change. Normally for things like this I just look at Oracle's ZFS documentation ( http://docs.oracle.com/cd/E19253-01/819-5461/ ) because it is very detailed.
 
Hm, no answer yet. I don't know the answer either, since my zpools are not on the root disk. On the other hand, it seems there are really two orthogonal issues here: (a) how do you replace a disk in a zpool (and that answer was discussed above), and (b) how do you make a disk bootable, independent of the root file system on it. I have not yet seen a concise guide for that second question. I'll have to migrate my root disk (which uses 4 or 5 UFS file systems) to new hardware in a few weeks; maybe I'll write up exactly what steps I took.
 
ralphbsz said:
man zpool will do the trick. There is a short section explaining how replace works. It's very easy, the command is just zpool replace <pool> <old> <new>.

No, no, no, no, and no! Do *not* use "replace" with mirror vdevs. You'll cause yourself no end of issues.

The proper way to replace a failed (or failing) drive in a mirror vdev:
Code:
# zpool attach <poolname> <old drive> <new drive>
<wait for resilver to complete>
# zpool detach <poolname> <old drive>
Doing it this way, you never break the mirror, you never lose redundancy (if <old drive> is not completely dead, anyway), and you don't end up in weird configurations.

"zpool replace" should only be used with raidz vdevs.

If the old drive is mostly dead, then attach the new drive to the working drive in the mirror, wait for the resilver to complete, and then detach the dead drive.

If the old drive is completely dead, then you can "zpool offline" it, attach the new drive to the other side of the mirror, wait for the resilver to complete, and then detach the dead drive.
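For example, a sketch of that last case, assuming ada0 is the surviving half of the mirror, ada1 the dead drive, and ada2 the replacement:
Code:
# zpool offline zroot ada1
# zpool attach zroot ada0 ada2
<wait for resilver to complete>
# zpool detach zroot ada1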
 
roy2098 said:
ralphbsz: I am specifically referring to a zpool that is zroot - in this case, two mirrored ZFS drives that have the OS and everything else in one pool. Your response certainly works for any other pool, but in the case of zroot it is different, because there is bootcode involved that exists outside the pool; various .ko kernel modules have to get loaded, etc. I believe the way the mirrored pool works is that it boots initially off one of the disks (even though one is wise to have bootcode on both), and then the zroot pool is basically imported or mounted. And what happens if the remaining disk loses its non-ZFS bootcode? Also, what about the zpool cache?

After you do the attach/detach process to get the new drive added to the mirror, you just run the following gpart(8) command to add the bootcode to it (if you are using GPT partitioning and created a separate freebsd-boot partition like you are supposed to):
Code:
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 <diskdevice>
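Putting the two steps together, a rough sketch for a GPT-partitioned zroot mirror, assuming ada0 is the healthy disk, ada1 is the blank replacement, and the freebsd-zfs partition is index 3 as in the usual bsdinstall layout (adjust device names and partition numbers to match your system):
Code:
# gpart backup ada0 | gpart restore -F ada1       # copy the partition layout (GPT labels are not copied)
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
# zpool attach zroot ada0p3 ada1p3                # attach the freebsd-zfs partition, not the whole disk
<wait for resilver to complete>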
 
Slightly OT:

phoenix said:
No, no, no, no, and no! Do *not* use "replace" with mirror vdevs. You'll cause yourself no end of issues.
If the old drive is really dead (for example disconnected, sitting on my lab bench, and can't be put into the machine because it prevents the motherboard from booting): Is zpool replace OK then?

I understand the difference with a partially working drive, but I fail to see the difference for a completely dead one.
 
If it's part of a mirror, then no!

Just # zpool detach <poolname> <drive>. That's it, that's all. No replace required.

Replace should only be used with raidz vdevs.
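In short, a sketch of the two cases with hypothetical pool and device names:
Code:
# zpool detach zroot ada1p3       # mirror vdev: just drop the dead member (add a new one with zpool attach)
# zpool replace tank da3 da7      # raidz vdev: swap the dead disk for the new one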
 
robert1307 said:
phoenix, thank you for the "zpool attach" hint. I've lost a lot of hair trying to follow "zpool replace" guides. There are a lot of tutorials on the internet specifying this "dead" method, not only for raidz vdevs but also for a ZFS root mirror, e.g. http://earth2baz.net/2014/04/10/freebsdzfs/.

The replace operation is only for raidz vdevs; that's maybe not emphasized enough in the various guides and documentation. The mirror vdevs are less widely used because people (including the ones writing the guides, who should know better) are too lazy to delve into the details and assume that raidz is the be-all-end-all solution for everything.
 
I've done zpool detach zroot [dead-drive], then zpool attach zroot [working-drive] ada1

After resilvering was finished, I wasn't able to do:

gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

There is simply no such geom (ada1):

gpart show
Code:
=>        34  1953525101  ada0  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064    33554432     2  freebsd-swap  (16G)
    33555496  1919969632     3  freebsd-zfs  (916G)
  1953525128           7        - free -  (3.5K)

So I've performed:
Code:
sysctl kern.geom.debugflags=0x10
gpart add -b 40 -l gptboot1 -s 512K -t freebsd-boot ada1
gpart add -s 16G -l swap1 -t freebsd-swap ada1
gpart add -t freebsd-zfs -l zfs1 ada1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

After that ada1 geom appeared in the list:

gpart show
Code:
=>        34  1953525101  ada0  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064    33554432     2  freebsd-swap  (16G)
    33555496  1919969632     3  freebsd-zfs  (916G)
  1953525128           7        - free -  (3.5K)

=>        34  1953525101  ada1  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064    33554432     2  freebsd-swap  (16G)
    33555496  1919969632     3  freebsd-zfs  (916G)
  1953525128           7        - free -  (3.5K)

However, after reboot, geom ada1 disappeared again, and I'm getting errors like "no /dev/gpt/swap1".

zpool status is ok:

zpool status
Code:
  pool: zroot
 state: ONLINE
  scan: none requested
config:

        NAME                                            STATE     READ WRITE CKSUM
        zroot                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146  ONLINE       0     0     0
            ada1                                        ONLINE       0     0     0

errors: No known data errors

gpart show
Code:
=>        34  1953525101  ada0  GPT  (932G)
          34           6        - free -  (3.0K)
          40        1024     1  freebsd-boot  (512K)
        1064    33554432     2  freebsd-swap  (16G)
    33555496  1919969632     3  freebsd-zfs  (916G)
  1953525128           7        - free -  (3.5K)

Am I missing something?
 
Well, you attached the whole ada1 disk to the mirror instead of the partition ada1p3. You not only overwrote the GPT partition table but also made ada1 non-bootable. Detach the ada1 disk from the mirror and use zpool labelclear -f ada1 to clear the ZFS metadata on it. Then recreate the partitioning and bootcode as you did before, but use zpool attach zroot ada0p3 ada1p3 (you may need to use gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146 in place of ada0p3) to attach the partition to the mirror.
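A sketch of that sequence, reusing the commands from earlier in the thread (the gpart create line assumes the disk comes back with no partition table after the labelclear; skip it if gpart show still lists ada1):
Code:
# zpool detach zroot ada1
# zpool labelclear -f ada1
# gpart create -s gpt ada1
# gpart add -b 40 -s 512K -t freebsd-boot -l gptboot1 ada1
# gpart add -s 16G -t freebsd-swap -l swap1 ada1
# gpart add -t freebsd-zfs -l zfs1 ada1
# gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
# zpool attach zroot ada0p3 ada1p3    # or gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146 in place of ada0p3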

ZFS and the ZFS utilities do exactly what you tell them to do instead of doing any interpretation of the arguments. When you tell zpool(8) to attach a whole disk to a mirror, it does exactly that and doesn't ask whether you're sure you know what you're doing.
 
You made my day, thanks!

Last question - is it possible to rename gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146 to ada0p3 in the mirror somehow?
 
robert1307 said:
You made my day, thanks!

Last question - is it possible to rename gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146 to ada0p3 in the mirror somehow?

I have done something like that before. First make sure there's no resilver in progress, then detach the gptid/* device from the mirror and re-attach it using the ada* name, in your case zpool attach zroot ada1p3 ada0p3.
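A sketch of that, using the names from this thread:
Code:
# zpool status zroot            # confirm no resilver is in progress
# zpool detach zroot gptid/b5caacfd-fd60-11e3-a1d9-49be4d48d146
# zpool attach zroot ada1p3 ada0p3
<wait for resilver to complete>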

To avoid future problems with the gptid/* names, you should set kern.geom.label.gptid.enable to 0. On the running system you can do:

Code:
sysctl kern.geom.label.gptid.enable=0

Also place the same line in /boot/loader.conf so the setting is always 0:

Code:
kern.geom.label.gptid.enable=0
 