Replacing all disks in a ZFS mirror without taking it offline.

I couldn't find this described anywhere in a single post and have pieced it together from documentation (mostly the FreeBSD Handbook and zpool man page).

I have an existing pool (zdata) that is a mirror made up of two physical disks. I would like to upgrade this pool by replacing both drives with larger ones, without taking the pool offline at any point during the process. I think I can do so as follows:

My existing mirror is made up of da0 and da1 (raw devices, no partitioning), of 2TB each.

I wish to replace these with two new devices of 5TB each (da2 and da3).

1) Add a new disk to the existing pool, creating a 3-way mirror. (1)

Code:
zpool attach zdata da0 da2

This triggers a resilver(3). So we wait several hours and come back when it is done.
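Rather than guessing at timing, the resilver can be monitored directly. A sketch, using the pool name from the post (the `zpool wait` subcommand assumes a newer OpenZFS, e.g. FreeBSD 13+):

```shell
# The "scan:" line shows percent done and an estimated completion time:
zpool status zdata

# On newer OpenZFS, this blocks until the resilver finishes:
zpool wait -t resilver zdata
```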

2) Split the pool by moving an old disk to a new pool as backup and export it. (2)

Code:
zpool split zdata olddata da0
zpool export olddata

You can now safely remove the da0 disk (assuming you have hot-swap drives).
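If the split-off disk is ever needed as a backup, the exported pool can be brought back. A sketch, assuming the pool name from the post; importing read-only keeps the point-in-time copy unmodified:

```shell
# Reconnect the old disk, then import the split-off pool read-only
# so the backup copy is not altered:
zpool import -o readonly=on olddata

# When finished, export it again before pulling the disk:
zpool export olddata
```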

3) Add the other new disk to once again have a 3-way mirror.

Code:
zpool attach zdata da1 da3

And again wait for the resilver to complete before removing the last remaining old disk from the pool.

Code:
zpool split zdata olddata2 da1
zpool export olddata2

It is now safe to remove the other old drive.

4) Expand the new pool to take up the new expanded available space.

Code:
zpool online -e zdata da2 da3
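An alternative to the explicit `online -e`, assuming the OpenZFS `autoexpand` pool property: if it is enabled before the smaller disks leave the mirror, the pool grows into the new space on its own.

```shell
# Set before (or during) the swap; expansion then happens automatically
# once only the larger disks remain:
zpool set autoexpand=on zdata

# Verify the new capacity:
zpool list zdata
```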


Did I miss anything?

(Sept 1: updated to reflect the actual process I used, following SirDice's advice.)

Footnotes:

(1) I don't have enough ports to have all 4 disks attached at the same time. If I did, I would have either made a 7TB pool of two mirrors, or put all 4 disks in one mirror and then removed the old disks.

(2) If you don't have room to add a new drive before removing the old ones, you could split the existing pool first, export the new pool, and remove that disk to free a port for the first new drive.

(3) In my case zpool estimated 2 hours and 11 minutes to resilver but it actually took about 4 hours and 40 minutes. However, the system was running (and in-use) at the time.
 
You could just:
* create a one drive newpool
* send/recv the whole oldpool to newpool
* offline and remove oldpool
* attach 2nd device to make newpool a mirror (resilver)

Main benefits: fewer pool actions, and fewer chances to accidentally add a device (extending the pool) rather than attach one (mirroring). It is also less convoluted to understand, with no split command you need to get just right. You could also choose how much of your snapshot history to carry forward to the new pool.
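The steps above might look like this as a sketch, reusing the pool and device names from the original post; the snapshot name @migrate is hypothetical:

```shell
# Single-disk pool on the first new drive:
zpool create newpool da2

# Recursive snapshot of everything, sent as a full replication stream:
zfs snapshot -r zdata@migrate
zfs send -R zdata@migrate | zfs receive -F newpool

# Retire the old pool, freeing its ports (it remains importable as a backup):
zpool export zdata

# Mirror the new pool onto the second new drive (triggers a resilver):
zpool attach newpool da2 da3
```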
 
I'm sure I've done that, but I can't remember exactly how. Unfortunately, mine wasn't carefully pre-planned, but had to be done in response to a drive dying: I used that as an opportunity to buy two new (larger) drives and enlarge the file system.

Ah, just found the notes: Super simple. Connect the new drive. Replace one old drive with a new drive, using zpool replace. Wait for resilvering to finish. Remove that old drive. Repeat with the second pair (replace from old2 -> new2). Finally give the command to enlarge the file system (which I can't find in my notes right now). Done. What's wrong with this simple solution?
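That sequence, sketched with the device names from the original post; the final enlarge step is presumably the `zpool online -e` the OP used:

```shell
# Swap the first old disk for a new one; the pool resilvers onto da2:
zpool replace zdata da0 da2
# ...wait for the resilver, then pull da0 and connect da3...

# Swap the second old disk:
zpool replace zdata da1 da3
# ...wait for the resilver again...

# Grow the pool into the new space (as in the OP):
zpool online -e zdata da2 da3
```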
 
Nothing is wrong with that approach if you want to upgrade to a bigger pool. Perfectly valid and simple!

Stories of add/attach whoopsies are out there, and they are a big "ZFS is dangerous" FUD talking point. "Replace" is more semantically distinct, while "add" and "attach" are close together in meaning, which is why I think being very careful around those verbs is good practice.

I was just offering up a slightly more flexible approach for consideration, with the added benefit of avoiding attach actions. You will also get a "fresher" (less fragmented) layout with the send/recv. The downside of the send/recv approach is that there is a period (during the resilver) when your live copy has no redundancy, though you do have a recent offline (and redundant) copy. That should be considered a drawback, but perhaps a tolerable one depending on your risk tolerance and usage.
 
I may have missed it, but the original steps seem to have a moment when there's no redundancy. Assuming A and B are the original disks, and C and D the new ones:
  • A - B - C ; new disk added to mirror
  • B - C ; old A disk removed
  • B - C - D ; other new disk added
  • C - D ; last old disk removed.
 
You could just:
* create a one drive newpool
* send/recv the whole oldpool to newpool
* offline and remove oldpool
* attach 2nd device to make newpool a mirror (resilver)

Main benefits: fewer pool actions, and fewer chances to accidentally add a device (extending the pool) rather than attach one (mirroring). It is also less convoluted to understand, with no split command you need to get just right. You could also choose how much of your snapshot history to carry forward to the new pool.

If I had thought of it and had the spare ports, I would have created an entirely new mirror pool, done a send/recv, and had a brief outage to put the new pool on the old mount point (after exporting the old pool). Done the way I did it, the pool never went offline and always consisted of at least 2 drives.

I may have missed it, but the original steps seem to have a moment when there's no redundancy. Assuming A and B are the original disks, and C and D the new ones:
  • A - B - C ; new disk added to mirror
  • B - C ; old A disk removed
  • B - C - D ; other new disk added
  • C - D ; last old disk removed.

Thanks, I think I cleared up my list to follow that process.
 
I may have missed it, but the original steps seem to have a moment when there's no redundancy. Assuming A and B are the original disks, and C and D the new ones:
  • A - B - C ; new disk added to mirror
  • B - C ; old A disk removed
  • B - C - D ; other new disk added
  • C - D ; last old disk removed.
The original steps have redundancy the whole way; it's all mirrors, so it starts 2-way, then goes 3-, 2-, 3-, 2-way through the steps you list. My send/recv alternative is the one that lacks live redundancy for a bit. (After the send/recv, the original mirror AB is offlined to free a physical slot/port while the CD resilver is in progress.)
 
The original steps have redundancy the whole way; it's all mirrors, so it starts 2-way, then goes 3-, 2-, 3-, 2-way through the steps you list. My send/recv alternative is the one that lacks live redundancy for a bit. (After the send/recv, the original mirror AB is offlined to free a physical slot/port while the CD resilver is in progress.)

Actually, in my original sequence, I think I took both old disks out before I added the 2nd new disk (I fixed that in the OP to follow SirDice's advice).
 
Nothing is wrong with that approach if you want to upgrade to a bigger pool. Perfectly valid and simple!

Stories of add/attach whoopsies are out there, and they are a big "ZFS is dangerous" FUD talking point. "Replace" is more semantically distinct, while "add" and "attach" are close together in meaning, which is why I think being very careful around those verbs is good practice.

If I used "replace", what is the effect on the old disk? Can I still mount it as a separate pool to access the data?

One of the reasons I did the attach/split was so that the old disks act as a point-in-time backup in case the new drives are bad or I do something horribly wrong and can't recover. In a production environment, I would have a proper full backup system to protect against my own stupidity. On my home server, I haven't got the capacity to back up all the data involved (I'm relying on the mirror).
 
After the replace is done, the old disk is not in that pool any more. It is in no pool, and you can do with it as you please: add it to an existing pool for more capacity, attach it to an existing pool for more redundancy, or create a new pool. Or, as happened in my case, throw it in the trash because it was getting too many errors (actually, I saved the head and platters for a mobile that I'm going to hang somewhere).

But I buy your "old disk becomes a separate pool that is a backup" argument too. By the way, you seem to be aware that you really need a proper backup strategy. Remember there are only two kinds of people: those who religiously perform backups, and those who haven't lost their data yet.
 
Not sure if you need to take the pool offline for this, but maybe it is better.

Code:
 zpool replace [-f] <pool> <device> [new-device]
 
But I buy your "old disk becomes a separate pool that is a backup" argument too. By the way, you seem to be aware that you really need a proper backup strategy. Remember there are only two kinds of people: those who religiously perform backups, and those who haven't lost their data yet.

I am a firm believer in backup. At home my backup(s) are: locally attached USB drive on each computer that gets nightly backups (using native backup tools). NAS (WD MyCloud) that gets weekly backups of each system. Very important files go into encfs encrypted files which are synced across two cloud providers and between the various desktops. I am, however, still lacking full offsite backup.

The only thing that lacks backup is the media library, which is why it gets the mirrored zfs instead. My NAS is not big enough to back up the desktops, server, and media library.
 