Root on ZFS using GPT - Replacing Failed Drive

I have found several good instructional pages on how to install the root file system on ZFS using GPT; however, I have not found any that walk through setting up the replacement drive.

With Root on ZFS, several steps must be followed to get the replacement set up correctly.

Has anybody documented the best practices for replacing a failed drive when using Root on ZFS using GPT?
 
When installing ZFS on root, you will generally follow steps like these:

Code:
gpart create -s gpt adaX
gpart add -t freebsd-boot -s 128k adaX
gpart add -t freebsd-swap -s 4G adaX
gpart add -t freebsd-zfs adaX
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 adaX

These steps create the GPT partition table, add three partitions (one for boot, one for swap, and one for ZFS), then write bootcode to the first partition (the freebsd-boot one). It's generally best practice to configure all disks in a root pool identically. This is especially important for the bootcode: if you only add a boot partition and bootcode to disk 1, your machine will obviously fail to boot if that disk fails, even though you may have redundancy in the pool.
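One way to guarantee the replacement's layout matches the surviving disks is gpart's backup/restore pair. A sketch, assuming ada0 is a healthy pool member and ada1 is the replacement (device names are assumptions; substitute your own):

```shell
# Copy the partition scheme from the healthy disk to the new one.
# -F destroys any existing partition table on the target; -l also copies labels.
gpart backup ada0 | gpart restore -lF ada1

# The bootcode lives outside the partition table, so install it separately.
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

# Verify the new layout mirrors the healthy disk.
gpart show ada1
```

This avoids typing the sizes by hand, at the cost of also copying labels you may want to rename afterwards with gpart modify.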

Because all the root disks should be configured the same, when replacing a disk you simply perform the same partitioning steps you used on the original disks. It's then just a case of running zpool replace pool adaApX adaBpX to replace the missing disk in the pool. (If the new device has exactly the same name as the old one, you don't need to specify it twice.)
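Put concretely, assuming a pool named zroot whose failed member was ada1p3 and a replacement disk that appears under the same device name (pool and device names here are assumptions for illustration):

```shell
# Confirm which vdev is FAULTED/UNAVAIL before touching anything.
zpool status zroot

# Same device name as the old member, so it only needs to be given once.
zpool replace zroot ada1p3

# Watch the resilver run to completion.
zpool status zroot
```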

There used to be advice to run zpool offline pool adaApX before the replace, but if the device has actually failed I'm not sure how much difference, if any, that makes.
 
Thanks @usdmatt.

I had similar steps figured out, but was not sure whether gpart bootcode ... was enough to get the boot partition configured correctly. I thought I might also need to clone the boot partition using dd(1) and reset the UUID first. I am pretty new to this.

One suggestion, where my script differs from yours: add the -l switch to the gpart add lines:
Code:
gpart create -s gpt adaX
gpart add -t freebsd-boot -s 128k -l gptboot<n> adaX
gpart add -t freebsd-swap -s 4G -l swap<n> adaX
gpart add -t freebsd-zfs -l zfs<n> adaX
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 adaX
Where:
- n is a unique number, 0 or 1.
- It must be different from the good drive's labels.
- This conforms with the nomenclature used by the FreeBSD installer.
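With labels in place, the pool can reference the stable gpt/ names instead of adaX device nodes, which survive disks being renumbered. A sketch, assuming a pool called zroot, a failed vdev that appeared as gpt/zfs0, and ada1 as the replacement (all names are assumptions):

```shell
# Label the new ZFS partition with a suffix not used by the surviving drive.
gpart add -t freebsd-zfs -l zfs2 ada1

# Replace by label: old vdev name first, new label second.
zpool replace zroot gpt/zfs0 gpt/zfs2
```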

Also, specifying the size of the ZFS partition explicitly may be a good idea, to keep flexibility when adding new drives in the future. If the size is left blank, the partition takes all remaining space, which may be more than a future replacement drive offers. I saw a post recommending constraining the ZFS partition size slightly at initial configuration to allow for slightly smaller replacement drives later, which seemed logical.
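A sketch of that size constraint, assuming a disk of roughly 1 TB; the exact figure is an assumption for illustration, the point being to leave a margin of unused space at the end:

```shell
# Instead of letting the ZFS partition absorb all remaining space,
# cap it slightly below the disk capacity so a marginally smaller
# replacement drive can still hold an identical partition later.
gpart add -t freebsd-zfs -s 920G -l zfs0 ada0
```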

Regarding the zpool offline command: I believe it makes sense if the failed drive is still connected. If the drive has been physically removed, the command is redundant and will just produce a benign warning message.
 
Yeah, I usually use GPT labels, but I didn't bother in this post as I wanted to keep it as simple as possible.
I can't remember the last time I had drives of the same nominal size that actually differed, but I have no problem with people being cautious and making the partition a bit smaller.
 