Deploying Multiple Systems ==> Drives, Filesystems, Imaging, Etc.

Phishfry · May 14, 2022

Can I add one point here. I think when configuring and planning your ZFS Pool it is very important to think about the consumers.

How fast is your network? You might expect your ZFS pool to be able to come near the saturation point of your network in terms of throughput. So plan your RAID-Z vdevs according to the output speed you desire.

Many people come here saying their ZFS pool is so slow. That is why I mentioned NVMe and the ZIL.
They can make hard disk pools faster. Enterprise class drives are desirable.

Phishfry · May 14, 2022

ralphbsz said:
But one warning: If one of the boot drives fails, you will probably have to be physically present, to convince the BIOS to actually boot.

This is handled by using EFI. It presents the gmirror option to boot. So it can boot with single drive or mirror.
I have tested this personally.
When you install FreeBSD it seems to write an entry to the UEFI BIOS regarding the boot drive. (Maybe it uses a UUID.)
So instead of a drive label EFI looks like this: UEFI Disk
I assume efibootmgr is at play.

See: Install UEFI FreeBSD on gmirror (forums.freebsd.org)

I do agree you will need to pull the dead drive to boot. The SATA Controller is not going to like a dead drive at boot.
That part is hard to test without a dead drive.

Dave-D · May 14, 2022

mer said:
ralphbsz That's what I like the most about this forum (speaking generally). Sharing ideas about how to do something. Everyone "knows" their way is the "best" but listening to others expands the knowledge base. Maybe my next system I use your ideas because they are a better solution.

Exactly.

"Iron sharpens iron."

Dave-D · May 14, 2022

ralphbsz said:
On the other hand: With 6 ports on the motherboard, I really don't think that more disk drive ports will be needed. In particular with modern disk capacity.

USEFUL TOOL: RAID CAPACITY CALCULATOR

RAID Capacity Calculator (wintelguy.com): evaluates capacity of different RAID types and configurations.

ralphbsz · May 15, 2022

Let's use a simple starting point, then people can make holes in it.

Drives 1 and 2: Make them into a pool using mirroring. That pool has ~2TB capacity, and can handle failure of any 1 drive. Use them as the boot disks, and use that pool for two purposes: First, as the root pool, where you install the OS. You will have lots of free space in there, use that for (temporary) backups. To make sure that backups don't fill the root file system, write the backups as a non-root user, and use quotas to make sure not too much space is used for backups.

Drives 3 through 6: Make them into a data pool, using RAID-Z2. That pool has ~4TB capacity, and can handle two disk failures. Use that for the ~300GB of user data.
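As a rough check on those capacity figures, here is a minimal sketch of the arithmetic. It assumes six identical 2TB drives (the per-drive size is inferred from the ~2TB mirror figure above) and ignores ZFS metadata, padding, and TB/TiB rounding. The function names are made up for illustration and are not ZFS commands.

```shell
#!/bin/sh
# Rough usable-capacity arithmetic for the proposed layout.
# Assumption: all drives are the same size; overhead is ignored.

# Mirror: usable space equals one member, however many copies exist.
mirror_capacity() {   # args: disks size_tb
    echo "$2"
}

# RAID-Z: usable space is (disks - parity) members.
raidz_capacity() {    # args: disks parity size_tb
    echo $(( ($1 - $2) * $3 ))
}

echo "boot mirror, 2 x 2TB:  $(mirror_capacity 2 2) TB"
echo "data RAID-Z2, 4 x 2TB: $(raidz_capacity 4 2 2) TB"
```

Real pools report somewhat less than these numbers, so treat them as upper bounds when sizing.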

Will performance be adequate? Probably. I have no idea what your performance needs are, but most people's needs are amazingly low.

One thing you really need to do: Monitor disk health. Run smartd, and look at the output. Ideally set up an e-mail system that warns you if any disk has unusual metrics in there. Set up regular ZFS scrubbing. Again, set up e-mails that warn you if zpools are not in perfect shape, or scrubbing finds problems.

mer · May 15, 2022

Assuming ZFS throughout.
The mirror:
Create a dataset for the local backups to further separate root/OS from the backed up data. It also would let you zfs send/receive to offline storage.
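A minimal sketch of that idea, assuming the root pool is named zroot and that 500G is an acceptable ceiling for local backups (both are assumptions, as is the dataset name). The script only prints the commands unless RUN=1 is set, so it can be read and adjusted before anything touches a pool.

```shell
#!/bin/sh
# Dry run by default: commands are printed, not executed.
# Set RUN=1 to execute on a real system.
# Assumptions: pool name, dataset name, quota size.
POOL="${POOL:-zroot}"
BACKUP_DS="${BACKUP_DS:-$POOL/backup}"
QUOTA="${QUOTA:-500G}"

run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi
}

# A separate dataset with a quota keeps backups from filling the root pool.
run zfs create -o quota="$QUOTA" -o compression=lz4 "$BACKUP_DS"

# A snapshot gives a consistent point that can later be sent to offline storage.
run zfs snapshot "$BACKUP_DS@offline-1"
```

A dataset quota caps the backups regardless of which user writes them, which complements the per-user quota idea mentioned earlier in the thread.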

Disk health. Plenty of good knobs to turn in /etc/periodic.conf to run things automatically. You can also have periodic output go to files instead of emails, but you must remember to look at them.
Scrubbing: a regular interval is good. It can help catch issues before they become a big problem. A "rule of thumb" is every 3 months or so, but a lot depends on the quality of the drives and the rest of the system. Some say that with consumer grade drives you want to scrub every month, with high end enterprise drives maybe every 6 months.
Scrubs run over the amount of data, so if there's nothing on the disk, they complete quickly. If the disks are almost full, they take a while. That's important if the scrub is running during "real work", because performance can drop.
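For reference, a sketch of what such knobs might look like in /etc/periodic.conf. The variable names are the ones I believe ship in /etc/defaults/periodic.conf and with the sysutils/smartmontools port; check them against your release before relying on them. The device list and the threshold value are assumptions.

```shell
# /etc/periodic.conf fragment (plain sh variable assignments)
daily_status_zfs_enable="YES"            # add pool health to the daily report
daily_scrub_zfs_enable="YES"             # let periodic(8) start scrubs
daily_scrub_zfs_default_threshold="35"   # days between scrubs; ~90 matches the 3-month rule of thumb
daily_status_smart_devices="/dev/ada0 /dev/ada1"   # needs smartmontools; device list is an assumption
daily_output="root"                      # mail to root; a file path here logs to a file instead
```

If the output goes to a file rather than mail, someone still has to read it, as noted above.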

Dave-D · May 16, 2022

ralphbsz said:
Let's use a simple starting point, then people can make holes in it.

Agreed. However, I propose that we start with solving the worst case scenario,
and then move on to find the best way to prevent that scenario.

Worst Case Scenario

Total drive failure.
System totally down.
Must get back up within hours.
No time to rebuild everything.

The main reason for all these schemes is to prevent this from happening in the first place.
But let's say it does happen.
For whatever reason ----> All my drives go away.

Now what?

1. First - restore the system (Cold Metal Restore)

I'm still wondering about cold metal restore with FreeBSD and still haven't heard any good and solid answers.

Formerly, I used Clonezilla to make an image of the (linux or windows) hard drive and stored that image on an external hard drive.
I could restore that image (either the whole drive or any partition) to a hard drive (only had to be same size or larger than the original hard drive.)
Could be up and running again within 15 minutes. Clonezilla does UFS, but it will not do ZFS.

What can be done with ZFS to completely restore the whole operating system and everything on that drive?
Let's say I need to be up and running again in one to three hours.

2. Second - restore the data

Needs little discussion at this point. Use whatever backups you have to restore data.

ralphbsz · May 16, 2022

mer said:
Assuming ZFS throughout.
The mirror:
Create a dataset for the local backups to further separate root/OS from the backed up data. It also would let you zfs send/receive to offline storage.

Good idea.

Dave-D said:
Agreed. However, I propose that we start with solving the worst case scenario,
and then move on to find the best way to prevent that scenario.

To begin with: with RAID, that worst case scenario becomes astronomically unlikely. The likely failure scenario is that one disk fails (either completely, or has a single error, or develops many errors), and the other disks serve the data. You need to have a plan for noticing when this happens, identifying the failed disk, and replacing it. ZFS will then handle rebuilding data onto the new disk. The overall reliability of the system depends crucially on how quickly you execute the plan, since the risk of total failure comes from a second (or third) disk failing before you had time to repair the first failure.

The worst case scenario is complete loss of a pool, because its fault tolerance has been overwhelmed by too many disks failing before you had time to repair the first failure.

In the case of the system pool, you are proposing to use a cold metal restore from an image copy. In a nutshell, what you are doing there is having one extra (external) disk in the mirrored RAID group for the system pool, but that extra disk is only updated rarely and manually. This is theoretically possible, and I happen to do the same thing with my system right now (my root disk is non-redundant, using UFS). But this brings up difficult questions. In the worst case, the first thing you have to do is to obtain spare empty disks. How are you going to do this? Ideally, you should have a spare on site. Maybe the spare *IS* the external disk with the copy of the root pool: it is in an external enclosure, updated once in a while, and spends the rest of its time on the shelf?

One of the things you have to plan and test (before releasing the system) is the procedure for recovering from a total disk failure. With ZFS, I do not know what commands would have to be executed. You should run through this once, and take extensive notes on what exactly needs to be done. And then store the notes in such a fashion that you can get to them even when the system is down.

In the case of the data pool, a total failure is much less likely (since we designed it to be 2-fault tolerant). It is so unlikely that other failure modes are now more important, such as user error (rm * for example). The idea of restoring from backups (which is likely to take a long time) is reasonable here. Again, this has to be documented and tested.

Dave-D · May 16, 2022

ralphbsz said:
In the case of the system pool, you are proposing to use a cold metal restore from an image copy. In a nutshell, what you are doing there is having one extra (external) disk in the mirrored RAID group for the system pool, but that extra disk is only updated rarely and manually. This is theoretically possible, and I happen to do the same thing with my system right now (my root disk is non-redundant, using UFS). But this brings up difficult questions. In the worst case, the first thing you have to do is to obtain spare empty disks. How are you going to do this? Ideally, you should have a spare on site. Maybe the spare *IS* the external disk with the copy of the root pool: it is in an external enclosure, updated once in a while, and spends the rest of its time on the shelf?

This is what I used to do with Linux:
1. Build a new system, keep a document for each step of the process.
2. Track everything in a spreadsheet, have the step numbers, which steps were completed on the initial build.
3. Insert steps for cloning, and note which completed steps that clone included.
4. Clone when system completed, again, keeping notes.

If that computer failed, I could bring back the server in less than 20 minutes. I never worried about the state of log files, but it worked well, for years.
We're talking low-budget small-business Linux servers.

I could also bring back a clone of the data drive, mainly for the folder structure, then bring the data in by tape or a backup stored on external hard drive.

I would also use my clone and restore scheme for building similar client machines.
Bring back a clone of a "template" client, create record in my spreadsheet, then add/remove a minimum of software on the client,
updating my spreadsheet as necessary, then again, clone at the end of that computer's build.

Jose · May 16, 2022

Dave-D said:
Agreed. However, I propose that we start with solving the worst case scenario,
and then move on to find the best way to prevent that scenario.

Worst Case Scenario

Total drive failure.
System totally down.
Must get back up within hours.
No time to rebuild everything.

This is why I used a GEOM mirror for my boot drives. The two drives are completely identical down to the boot sector. I use BIOS booting, and I have them set up as primary and secondary boot devices (doesn't matter which is which, they're identical.) Should the current boot drive fail completely, the system will automatically boot from the second drive.

Handling the case where the current boot drive has errors is more complicated. You'll have to detect the errors (see ralphbsz's excellent suggestions regarding SMART monitoring), and manually swap to the backup drive, probably by turning the system off and physically replacing the bad drive.

I didn't know much about ZFS when I came up with this setup. It's likely you can accomplish something similar with a RAID1 zpool and UEFI booting.

Dave-D said:
For whatever reason ----> All my drives go away.

Sure; fire, earthquake, etc.

Dave-D said:
Now what?

1. First - restore the system (Cold Metal Restore)

I'm still wondering about cold metal restore with FreeBSD and still haven't heard any good and solid answers.

Formerly, I used Clonezilla to make an image of the (linux or windows) hard drive and stored that image on an external hard drive.
I could restore that image (either the whole drive or any partition) to a hard drive (only had to be same size or larger than the original hard drive.)
Could be up and running again within 15 minutes. Clonezilla does UFS, but it will not do ZFS.

What can be done with ZFS to completely restore the whole operating system and everything on that drive?
Let's say I need to be up and running again in one to three hours.

You're looking for ZFS snapshots if you're using a root zpool. Personally I believe setting up the system should be scriptable. There should not be any changes, besides configuration files, from a base FreeBSD install + whatever custom packages I've built. No, I haven't written this script yet.
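To make the snapshot idea concrete, here is a sketch of the data-stream half of a cold metal restore: a recursive snapshot of the root pool saved as a compressed stream on an external disk, and the matching receive. The pool name, snapshot name, and mount point are assumptions. It deliberately prints the commands instead of running them, and it does not cover partitioning the new disk, installing boot code, or setting the pool's bootfs property, all of which a real restore also needs.

```shell
#!/bin/sh
# Prints a backup/restore plan; nothing is executed.
# Assumptions: pool name, snapshot name, external disk mount point.
SRC_POOL="${SRC_POOL:-zroot}"
SNAP="${SNAP:-baremetal-1}"
IMAGE="${IMAGE:-/mnt/external/zroot-$SNAP.zfs.gz}"

plan_backup() {
    # -r snapshots every dataset in the pool; -R sends them all with properties.
    echo "zfs snapshot -r $SRC_POOL@$SNAP"
    echo "zfs send -R $SRC_POOL@$SNAP | gzip > $IMAGE"
}

plan_restore() {
    # Assumes a rescue boot, a partitioned disk with boot code, and an
    # empty pool of the same name already created on it.
    # -F overwrites the empty pool's datasets; -u avoids mounting over the rescue system.
    echo "gunzip -c $IMAGE | zfs receive -F -u $SRC_POOL"
}

plan_backup
plan_restore
```

Whatever procedure you settle on, run it once against a spare disk and keep the notes somewhere reachable when the server is down, as suggested earlier in the thread.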

Dave-D said:
2. Second - restore the data

Needs little discussion at this point. Use whatever backups you have to restore data.

You'll need off-site backups. I'm interested in Tarsnap.

Dave-D · May 16, 2022

ralphbsz said:
In the case of the data pool, a total failure is much less likely (since we designed it to be 2-fault tolerant). It is so unlikely that other failure modes are now more important, such as user error (rm * for example). The idea of restoring from backups (which is likely to take a long time) is reasonable here. Again, this has to be documented and tested.

Agree completely. Needs to be figured out, tested and documented.
I'm currently doing test builds, so now is a good time for me to do that, if possible.
I'm willing to share info. so all can benefit, and maybe take these ideas further.


Dave-D · May 16, 2022

Jose said:
You'll need off-site backups. I'm interested in Tarsnap.

I just bought a book about Tarsnap.
"Tarsnap Mastery: Online Backups for the Truly Paranoid"
by Michael W. Lucas
Pretty sure that was from Ebay for around $20 new.

Jose · May 16, 2022

Dave-D said:
I just bought a book about Tarsnap.
"Tarsnap Mastery: Online Backups for the Truly Paranoid"
by Michael W. Lucas
Pretty sure that was from Ebay for around $20 new.

Buy them direct from Lucas and disintermediate.

Edit: Coz it's hard to convey tone over text. I bought the only Lucas book I own through Amazon. I didn't know better. All I can say in my defense is that I followed an affiliate link from Freshports, 'cause I find that site so useful so often.

Erichans · May 16, 2022

ralphbsz said:
[...] You need to have a plan for noticing when this happens, identifying the failed disk, and replacing it. ZFS will then handle rebuilding data onto the new disk. The overall reliability of the system depends crucially on how quickly you execute the plan, since the risk of total failure comes from a second (or third) disk failing before you had time to repair the first failure.

When a disk is starting to fail but is not completely "unresponsive", it can be very useful to have both the failing disk and the replacement disk in the system during resilvering. In general resilvering is a time-consuming and stressful activity.* Keeping the to-be-replaced disk in the system together with the new disk can speed up the resilvering process and offers better overall IO performance of the pool during resilvering, compared with disconnecting the to-be-replaced disk and exchanging it with the new disk before resilvering. See also: Replacing a failing drive in a ZFS zpool

To accommodate this replacement procedure the ideal situation would be that you have an extra physical disk location (tray/bay etc.) available that is also equipped with an appropriate interface.
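A sketch of that procedure with both disks attached, assuming a pool named tank and GPT labels for the old and new disks (all three names are hypothetical). It prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default; set RUN=1 to execute.
# Assumptions: pool name and both device labels are placeholders.
POOL="${POOL:-tank}"
OLD="${OLD:-gpt/L3:OLDSERIAL}"
NEW="${NEW:-gpt/L8:NEWSERIAL}"

run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

run zpool status -x                       # confirm which device is reporting errors
run zpool replace "$POOL" "$OLD" "$NEW"   # old disk stays attached and can still serve reads
run zpool status "$POOL"                  # watch the resilver; the old device is detached when it completes
```

Only pull the old disk after the status output shows the resilver has finished and the old device is no longer listed.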

___
* for the disks in the pool and perhaps also for the sysadmin

ralphbsz · May 16, 2022

Jose said:
This is why I used a GEOM mirror for my boot drives. ..
I didn't know much about ZFS when I came up with this setup. It's likely you can accomplish something similar with a RAID1 zpool and UEFI booting.

You can definitely do it with ZFS. A friend of mine does. But: I don't know how to do it (since I'm still running non-redundant UFS for my boot drive at home).

Agree with your suggestion that a failing drive may be helpful during resilvering. But we have to underline MAY be, it is not guaranteed. Good example: A drive on which 99.9% of all the IOs succeed, and the remaining 0.1% fail cleanly with fast error returns: very helpful. Bad example: A drive on which 90% of the IOs succeed, but the remaining 10% take a 5-minute timeout, and then cause a SATA error so severe it crashes the motherboard and requires a reboot. The second drive is not helpful.

Dave-D · May 17, 2022

Erichans said:
it can be very useful to have both the failing disk and the replacement disk in the system during resilvering. In general resilvering is a time-consuming and stressful activity.

I read that you can lose another disk during this stressful process.

So let's say we have RAIDZ2 - we can lose 2 disks. But it takes a min. of 4 disks.
We just lost one, or it's nearly dead. So we can afford to lose 1 more.
We start the resilver process, and now 1 more dies.
Now we're down to nothing.
Or, what if the admin makes a mistake, we're down to nothing.
You almost need a RAIDZ3 for peace of mind. So now I'm talking 5 disks, minimum.

Unless you go with mirrors. They are much easier (less stressful on the system) to resilver.
With a 4-way mirror, you can lose 3 disks.
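The reasoning above can be written down as simple arithmetic. This is a sketch only: vdev_tolerance is a made-up helper, and it reads "down to nothing" as "no redundancy left" (the vdev still serves data, but one more failure loses it), which is an interpretation rather than something stated in the post.

```shell
#!/bin/sh
# How many member disks a single vdev can lose before data is lost.
# vdev_tolerance is a hypothetical helper for illustration.
vdev_tolerance() {   # args: type disks
    case "$1" in
        mirror) echo $(( $2 - 1 )) ;;   # any one surviving copy is enough
        raidz1) echo 1 ;;
        raidz2) echo 2 ;;
        raidz3) echo 3 ;;
        *) echo "unknown vdev type: $1" >&2; return 1 ;;
    esac
}

# Remaining margin after two disks have already failed.
for spec in "mirror 4" "raidz2 4" "raidz3 5"; do
    set -- $spec
    margin=$(( $(vdev_tolerance "$1" "$2") - 2 ))
    echo "$1 with $2 disks, 2 failed: can still lose $margin more"
done
```

With these inputs the RAIDZ2 case comes out at zero margin, while the 4-way mirror and RAIDZ3 each have one failure in hand.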

So far I've gathered the following:

RAIDz
(somehow) better for data integrity (not sure how / or if true)
harder on the system to resilver
main benefit is space, while mirror is performance (at the expense of space)
lower performance than mirrors
higher system resource use than mirrors
cannot add to an existing RAIDz.

MIRRORS
easier on the system to resilver
main benefit is performance, while RAIDz is space (at the expense of performance.)
higher performance than RAIDz
lower system resource use than RAIDz
*can* add to an existing mirror.
*can* remove drives from an existing mirror.
*can* expand space in an existing mirror.
Overall more flexible when expanding a system.
And another one - if I lose everything but 1 drive, the whole system is on that 1 drive, which I assume I can access as 1 drive (?).

I think RAIDZx has a lot more of the "cool" factor.
But plain old mirrors may have a lot to offer for the small business type of server.
I can do a LOT of business inside 2TB worth of space.

my 2cents worth...

Dave-D · May 17, 2022

Jose said:
This is why I used a GEOM mirror for my boot drives.

I'm assuming this is the same basic idea as mirrors on ZFS?

See my last post, mirrors are starting to look like a lot better option for my use case.

I'm thinking:

1 x 3-disk mirror for O/S + LOCAL_BACKUP = 2TB total space
1 x 3-disk mirror for DATA = 2TB total space

-OR-

1 x 2-disk mirror for O/S + LOCAL_BACKUP = 2TB total space
1 x 4-disk mirror for DATA = 2TB total space

THEN

In the future, if I need more space, I can swap larger drives into any existing mirror. Once I've replaced all drives in that mirror, the mirror will automatically expand to the size of the larger drives.
Flexible. Simple. Easy. Nice.
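A sketch of that upgrade path, assuming a pool named tank and placeholder labels for the old and new disks. One caveat worth knowing: as far as I know the growth is only automatic when the pool's autoexpand property is on, and it is off by default. The script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Dry run by default; set RUN=1 to execute.
# Assumptions: pool name and all device labels are placeholders.
POOL="${POOL:-tank}"

run() { if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "+ $*"; fi; }

run zpool set autoexpand=on "$POOL"   # claim the extra space once every member is larger

# Replace one member at a time and let each resilver finish before the next.
run zpool replace "$POOL" gpt/old-a gpt/big-a
run zpool replace "$POOL" gpt/old-b gpt/big-b

run zpool list "$POOL"                # SIZE should grow after the last resilver completes
```

Replacing one member at a time keeps the mirror redundant throughout the upgrade.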

Not so with RAIDz. Other than adding another pool.

cy@ · May 17, 2022

Dave-D said:
Is there no problem with running UFS on operating system drive, then ZFS on the other 5 drives?

Dave

I do this on all my systems. Not that I planned it that way but I installed the first one about 25 years ago and the rest were dump | restore clones of the first one. The ZFS partitions simply evolved over time. Long story short, you will have no problems.

However the UFS buffer cache and the ZFS ARC will compete for RAM. But since the system slices are rarely referenced (in my case) the UFS buffer cache remains small. The only time I see it grow, causing the ZFS ARC to shrink, is during installworld/installkernel.

Typically this is not what people tend to do but keeping the O/S on UFS does allow a person to clone the system using dump piped to restore (dump | ssh | restore) to another server (booted off ISO or my rescue drive) or simply clone my rescue drive to a machine, change a few settings in rc.conf and fstab and then boot. One may be able to do this with zfs send/receive but I haven't needed to try that.

As to how I chose to clone using dump piped to restore, I used to do this with Solaris UFS and Tru64 UFS back in the day. I also booted Solaris off UFS, using ZFS for data and booted Tru64 from its UFS while using AdvFS for data. It's not a new concept, just something I've done all my career.

Dave-D · May 17, 2022

cy@ said:
I do this on all my systems. Not that I planned it that way but I installed the first one about 25 years ago and the rest were dump | restore clones of the first one. The ZFS partitions simply evolved over time. Long story short, you will have no problems.

However the UFS buffer cache and the ZFS ARC will compete for RAM. But since the system slices are rarely referenced (in my case) the UFS buffer cache remains small. The only time I see it grow, causing the ZFS ARC to shrink, is during installworld/installkernel.

Typically this is not what people tend to do but keeping the O/S on UFS does allow a person to clone the system using dump piped to restore (dump | ssh | restore) to another server (booted off ISO or my rescue drive) or simply clone my rescue drive to a machine, change a few settings in rc.conf and fstab and then boot. One may be able to do this with zfs send/receive but I haven't needed to try that.

As to how I chose to clone using dump piped to restore, I used to do this with Solaris UFS and Tru64 UFS back in the day. I also booted Solaris off UFS, using ZFS for data and booted Tru64 from its UFS while using AdvFS for data. It's not a new concept, just something I've done all my career.

Are you aware of clonezilla? I've used it for years with linux & windoz.
Might be a lot easier than what you're doing??
It supports the UFS file system, but not ZFS.
clonezilla.org

gpw928 · May 17, 2022

Never underestimate the danger of replacing the wrong drive when one (or more) spindles fail in a RAID set.

I have seen it happen. Happily I was just a spectator.

I'm paranoid in dealing with this situation. It's why the IBM procedures for RAID maintenance walk the engineer through a process that eventually lights a bulb on the broken drive. However your drives probably won't have lights, and if you have multiple sites, you may have to rely on hired help.

So you need well defined procedures to defend against bad outcomes. GPT labels that encode disk location and serial number are usually part of the defense. This is done at system build time.

On the matter of using a UFS root, I too did that for ages because at the beginning ZFS had no boot option. Back then I actually had space allocated on the root mirror for two completely separate bootable root file systems. And I used them for upgrades because I had to have a fallback if something went wrong.

I have since switched to ZFS root. Boot environments were the reason, and they are great. You just have a different set of procedures to test and document in order to build your systems and recover from problems.

You don't need clonezilla if you know what you are doing. You do need to understand some first principles. The process to put a bootable ZFS root on a naked disk is well understood (and I'm happy to send you a well tested script).

Dave-D · May 17, 2022

gpw928 said:
The process to put a bootable ZFS root on a naked disk is well understood (and I'm happy to send you a well tested script).

Yes, please.

gpw928 said:
Never underestimate the danger of replacing the wrong drive when one (or more) spindles fail in a RAID set.

Exactly what I'm talking about.
I lose 1 drive.
Admin messes up by replacing the wrong drive.
Now we're sitting on pins and needles.
One more issue and it's game over.

gpw928 said:
GPT labels that encode disk location and serial number are usually part of the defense.

Exactly.
Any idea what the character limit is on those labels?
Do you label the GPT partitions or is there a master label for the disk?
Do you use gparted to label the disk or some other method?

THANK YOU for your help!

Dave-D · May 17, 2022

gpw928 said:
Never underestimate the danger of replacing the wrong drive when one (or more) spindles fail in a RAID set.

Would it be impossible to "get into trouble" by pulling the wrong drive while using mirrors?
If I did pull the wrong drive - any single drive would have a complete set of data on it and doesn't "need" any other drive to "reconstruct" the data like RAIDz would?
As long as I had 1 complete and working drive from the mirror, could I not reconstruct the whole mirror from that one drive?

Sorry about peppering you with Q's.

But then, you've got me thinking...

I really appreciate all your help.

gpw928 · May 17, 2022

The plan to install a ZFS mirror'd root on a pair of naked disks is here. It's been used several times, and is configured for "Stage 2". You probably want to examine and set ZROOTSRC, ZROOTDST, SWAP, DEV0, and DEV1 appropriately for your needs. Pay attention to the comments around "zpool get bootfs". You have to have a running FreeBSD system to execute it.

Dave-D said:
Any idea what the character limit is on those labels?
Do you label the GPT partitions or is there a master label for the disk?
Do you use gparted to label the disk or some other method?

Partition labels are limited to 15 characters.
You have to label partitions, not whole disks.
You use gpart(8) to apply the labels (have a look at the script above).
Root disks need multiple partitions, configured in a variety of ways.
My root disks, created by the script above, are partitioned like this, which is based on the layout the FreeBSD 13.0 installer uses:

Code:

[sherman.143] $ gpart show ada0
=>       40  781422688  ada0  GPT  (373G)
         40       1024     1  freebsd-boot  (512K)
       1064        984        - free -  (492K)
       2048   33554432     2  freebsd-swap  (16G)
   33556480  180355072     3  freebsd-zfs  (86G)
  213911552   25165824     4  freebsd-zfs  (12G)
  239077376  134217728     5  freebsd-zfs  (64G)
  373295104  408127488     6  freebsd-ufs  (195G)
  781422592        136        - free -  (68K)

Boot is not mirror'd, but the boot partition on each disk needs to be identical.
Swap is a GEOM mirror on partition 2:

Code:

[sherman.149] $ gmirror status
       Name    Status  Components
mirror/swap  COMPLETE  ada0p2 (ACTIVE)
                       ada1p2 (ACTIVE)

[sherman.150] $ grep swap /etc/fstab
#/dev/mirror/swap      none        swap    sw        0    0
# With ".eli" appended to the swap device, swapon(8) will set up GELI encrypt.
/dev/mirror/swap.eli      none        swap    sw        0    0

[sherman.151] $ swapinfo
Device          1K-blocks     Used    Avail Capacity
/dev/mirror/swap.eli  16777212        0 16777212     0%

The root is a ZFS mirror on partition 3:

Code:

[sherman.152] $ zpool status  zroot
  pool: zroot
 state: ONLINE
  scan: scrub repaired 0B in 00:02:52 with 0 errors on Wed Apr 13 14:13:59 2022
config:

    NAME                      STATE     READ WRITE CKSUM
    zroot                     ONLINE       0     0     0
      mirror-0                ONLINE       0     0     0
        gpt/236009L240AGN:p3  ONLINE       0     0     0
        gpt/410008H400VGN:p3  ONLINE       0     0     0

Partition 4 is a SLOG (ZFS mirror) for the tank -- only appropriate if you have "enterprise class" SSDs.
Partition 5 is an L2ARC (ZFS stripe) for the tank.
Partition 6 is unused (over-provisioning for the SSDs).
Data (tank) disks are generally created with one large partition, that partition is labeled, and then used to create the RAID set.
My tank is labeled like this with stack position and serial number encoded in the label:

Code:

[sherman.155] $ zpool status
  pool: tank
 state: ONLINE
  scan: scrub repaired 0B in 05:06:26 with 0 errors on Thu Apr 14 08:08:14 2022
config:

    NAME                      STATE     READ WRITE CKSUM
    tank                      ONLINE       0     0     0
      mirror-0                ONLINE       0     0     0
        gpt/L1:ZC1564PG       ONLINE       0     0     0
        gpt/L6:WMC1T1408153   ONLINE       0     0     0
      mirror-1                ONLINE       0     0     0
        gpt/L0:ZC135AE5       ONLINE       0     0     0
        gpt/L5:WMC1T2195505   ONLINE       0     0     0
      mirror-2                ONLINE       0     0     0
        gpt/L4:ZC12LHRD       ONLINE       0     0     0
        gpt/L3:WCC4N5CVZ6V4   ONLINE       0     0     0
      mirror-3                ONLINE       0     0     0
        gpt/L2:ZC1AKXQM       ONLINE       0     0     0
        gpt/L7:WE23ZTX9       ONLINE       0     0     0
    logs    
      mirror-4                ONLINE       0     0     0
        gpt/236009L240AGN:p4  ONLINE       0     0     0
        gpt/410008H400VGN:p4  ONLINE       0     0     0
    cache
      gpt/236009L240AGN:p5    ONLINE       0     0     0
      gpt/410008H400VGN:p5    ONLINE       0     0     0

This is how I created the tank (it could be improved, but shows what's needed):

Code:

LABELS="L6:WMC1T1408153
L5:WMC1T2195505
L3:WCC4N5CVZ6V4
L1:ZC1564PG
L0:ZC135AE5
L2:ZC1AKXQM
L4:ZC12LHRD
L7:WE23ZTX9"

n=0
for label in $LABELS
do
    gpart destroy -F /dev/da$n
    gpart create -s gpt /dev/da$n
    gpart add -t freebsd-zfs -l "$label" /dev/da$n
    n=$(($n+1))
done

# I'm pairing these manually, old with new for mirror reliability (not speed :-)
M0="/dev/gpt/L1:ZC1564PG" 
M1="/dev/gpt/L6:WMC1T1408153"
M2="/dev/gpt/L0:ZC135AE5"
M3="/dev/gpt/L5:WMC1T2195505"
M4="/dev/gpt/L4:ZC12LHRD"
M5="/dev/gpt/L3:WCC4N5CVZ6V4"
M6="/dev/gpt/L2:ZC1AKXQM"
M7="/dev/gpt/L7:WE23ZTX9"

# Create the new tank as 4 x 2x3TB mirrors
eval zpool create tank \
    mirror $M0 $M1 \
    mirror $M2 $M3 \
    mirror $M4 $M5 \
    mirror $M6 $M7
zfs set compression=lz4 tank
zpool status

gpw928 · May 17, 2022

Dave-D said:
Would it be impossible to "get into trouble" by pulling the wrong drive while using mirrors?
If I did pull the wrong drive - any single drive would have a complete set of data on it and doesn't "need" any other drive to "reconstruct" the data like RAIDz would?
As long as I had 1 complete and working drive from the mirror, could I not reconstruct the whole mirror from that one drive?

That would depend on whether your drives were "hot swappable".
If you pull the wrong "hot swap" drive from the system, it can get corrupted when you pull it.
I would always choose to shut down a system before pulling a drive, if I had the option (you tend not to have the option in large enterprises).
That way you can compare the serial number on the label to what you expect to see, and get some wiggle room to backtrack.

[Often, when drives fail, the disk (and its label) will disappear from the status displays. But if all the others can be seen, then the missing one may be deduced.]
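The serial-number check can be made mechanical. This sketch assumes the tank label convention shown earlier in the thread (stack position, a colon, then the serial, as in L3:WCC4N5CVZ6V4); the helper names are hypothetical, and the second argument stands for the serial read off the sticker of the drive in hand.

```shell
#!/bin/sh
# Split a "position:serial" GPT label and compare it with the drive in hand.
# label_bay, label_serial and serial_matches are hypothetical helpers.
label_bay()    { echo "${1%%:*}"; }   # text before the first colon
label_serial() { echo "${1#*:}"; }    # text after the first colon

serial_matches() {   # args: label serial_on_sticker
    [ "$(label_serial "$1")" = "$2" ]
}

label="L3:WCC4N5CVZ6V4"
echo "bay=$(label_bay "$label") serial=$(label_serial "$label")"

if serial_matches "$label" "WCC4N5CVZ6V4"; then
    echo "serial matches the label: this is the expected drive"
else
    echo "STOP: serial does not match the label"
fi
```

The root disk labels in the same post use a different convention (serial first, then partition), so they would need their own parsing.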

cy@ · May 17, 2022

Dave-D said:
Are you aware of clonezilla? I've used it for years with linux & windoz.
Might be a lot easier than what you're doing??
It supports the UFS file system, but not ZFS.
clonezilla.org

I've used clonezilla on Linux at $JOB. But my approach predates clonezilla by about 10-15 years. I conceived it when I switched from MVS (IBM mainframe) to UNIX (Solaris, HP/UX, DG-UX, OSF/1), when patching would keep a server down for 1-3 hours instead of mere minutes. What I did was pretty much what Sun did when they implemented UFS boot environments. I shared the approach with Sun in 1995.

On the mainframe we'd patch the inactive disk, then reboot the inactive disk during a change window. Patching would take many weeks of research, planning, and implementation while the reboot took less than 30 minutes. When I started work on UNIX in 1992 the first thing that came to mind was, how backwards the UNIX patching and system install process was.

Basically it's:

Boot from ISO, slice and newfs filesystems,

cd /a && ssh server dump 0f - / | restore xf -
cd /a/var && do the same as above for /var.
Change /a/etc/rc.conf & /a/etc/hosts with new IP addresses, and fixup /etc/fstab as necessary. Reboot.

As I said, this was developed on Solaris 2.3 at the time (1995), long before clonezilla was a thing. And, works on Tru64 and all BSD variants. If it ain't broke, don't fix it.