Deploying Multiple Systems ==> Drives, Filesystems, Imaging, Etc.

Hi. I'm new to FreeBSD, but learning.

Want to make sure I'm heading in the right direction.

Currently doing test builds on servers and clients.

Here is what I'm considering, any feedback is greatly appreciated.

SERVERS:
For 3 small businesses.
Each server = 6 hard drives,
1 boot/operating system drive, (UFS or ZFS, not yet sure)
3 mirrored "Data" drives. (ZFS)
2 mirrored "Local Backup" drives. (ZFS)
(each server has a SCSI tape drive for long term storage.)

SERVER QUESTIONS:
1.) Does this seem like a reasonable drive arrangement?
2.) Re: the boot/operating system drive, if I go with UFS I can use Clonezilla to image it, for quick bare-metal restores.
I've yet to work with ZFS, and I'm not sure if I can clone/image a whole system the way I can with Clonezilla.
Clonezilla does not support ZFS.
What's the best choice for this drive - ZFS or UFS, and why?

CLIENTS:
I'll be deploying the same basic client, with minor differences, to 20+ workstations, possibly more.
Would like to store the setups (images) in some way for rapid deployment.
1 (single) or 2 (mirrored) drives containing everything. (ZFS)

CLIENT QUESTIONS:
1.) What's the better file system for clients, ZFS or UFS?
UFS I can image with Clonezilla.
I've yet to work with ZFS, and I'm not sure if I can clone/image a whole system the way I can with Clonezilla.
Clonezilla does not support ZFS.
What's the best choice for client drives - ZFS or UFS, and why?

Any and all feedback greatly appreciated!
Sharing of similar use-cases greatly appreciated, if possible!!

Thanks,
Dave
 
ZFS vs UFS
They are both good. It is a personal choice. You should try both and make your own decision. Neither would be a mistake so you cannot go too wrong.

No need to clone with third party software, all the tools are there in FreeBSD. But better still, if you have to install afresh, then install afresh!
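For instance, a whole ZFS system can be copied with a recursive snapshot plus zfs send/receive; roughly like this (pool names zroot/newpool are just examples, and boot blocks still have to be written to the new disk separately):
Code:
zfs snapshot -r zroot@clone                          # snapshot the whole pool recursively
zfs send -R zroot@clone | zfs receive -F newpool     # replicate datasets and properties to the target pool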
 
ZFS vs UFS
They are both good. It is a personal choice. You should try both and make your own decision. Neither would be a mistake so you cannot go too wrong.

No need to clone with third party software, all the tools are there in FreeBSD. But better still, if you have to install afresh, then install afresh!

Is there no problem with running UFS on the operating system drive, then ZFS on the other 5 drives?

Dave
 
ZFS vs UFS
They are both good. It is a personal choice. You should try both and make your own decision. Neither would be a mistake so you cannot go too wrong.
This; personally I've not yet moved off UFS (with/without hardware RAID) but ZFS has a lot of good stuff in it (including snapshots, boot environments etc.).

Every time I start looking at ZFS, though, there seems to be someone popping up in the forums with a ZFS issue and I get a bit nervous. But I think that is true of any technology - the majority of people have no issues so they don't have anything to say, so the bulk of any "noise" comes from the small portion of users who do encounter an issue.

As Geezer says, though, it's probably best to try both and see how you go.
 
Is there no problem with running UFS on the operating system drive, then ZFS on the other 5 drives?
No problem at all. You could even run a UFS gmirror for the OS drive and ZFS for the tank.

For me it was an easier learning experience to separate the OS from the data.
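Roughly like this, if it helps (an untested sketch; ada0-ada4 are example device names, and the gmirror is normally set up at install time, before any data is on the disks):
Code:
gmirror label -v gm0 /dev/ada0 /dev/ada1            # UFS OS pair as a gmirror
echo 'geom_mirror_load="YES"' >> /boot/loader.conf  # load the mirror module at boot
zpool create tank mirror ada2 ada3 ada4             # e.g. the three "Data" drives as a 3-way ZFS mirror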
 
ZFS has a great unique feature called boot environments. Look into that before making up your mind.
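If you want a feel for it, the workflow with bectl(8) is roughly this (the environment name is arbitrary):
Code:
bectl list                   # show existing boot environments
bectl create pre-upgrade     # checkpoint the current system
# ...run freebsd-update / pkg upgrade...
bectl activate pre-upgrade   # if the upgrade misbehaves, boot the old environment on next reboot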

But... ZFS has problems if your disks get too full. So keep that firmly in mind when sizing systems. Don't let them go past about 80%.

Also, I would never build any "business" system without some sort of redundancy on all the disks.

With 40 years in IT, I have seen a *lot* of failed disks. You don't want one dead disk to ruin your day.

It's easy enough to clone a running ZFS system, provided you can hot swap the target disk of the clone in and out of the system.
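One way to do it (a sketch from memory; zroot, ada0p3 and ada2p3 are placeholders): attach the target disk as an extra mirror member, let it resilver, then split it off as its own pool. Remember boot blocks if the clone disk needs to be bootable.
Code:
zpool attach zroot ada0p3 ada2p3   # mirror the pool onto the hot-swapped target disk
zpool status zroot                 # wait for resilvering to finish
zpool split zroot zclone ada2p3    # break the copy off as a standalone, importable pool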

I'd be concerned about the SCSI tape drives. I threw out all my tapes and tape drives a long time ago. However, we'd need to examine your data capacities before making any sensible suggestions regarding the backup strategy.
 
ZFS has a great unique feature called boot environments. Look into that before making up your mind.

But... ZFS has problems if your disks get too full. So keep that firmly in mind when sizing systems. Don't let them go past about 80%.

Also, I would never build any "business" system without some sort of redundancy on all the disks.

With 40 years in IT, I have seen a *lot* of failed disks. You don't want one dead disk to ruin your day.

It's easy enough to clone a running ZFS system, provided you can hot swap the target disk of the clone in and out of the system.

I'd be concerned about the SCSI tape drives. I threw out all my tapes and tape drives a long time ago. However, we'd need to examine your data capacities before making any sensible suggestions regarding the backup strategy.

RE: ZFS Sizes:
Set a ZFS quota at around 75%?
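Something like this, maybe? (I'm guessing at the details; "tank" and the sizes are only examples.)
Code:
zfs set quota=1.5T tank                                           # cap usage at ~75% of a 2 TB pool
zfs create -o reservation=200G -o mountpoint=none tank/reserved   # or keep some space permanently free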

These are all small businesses, so if a server goes down for a day it's not good,
but not catastrophic. Still, best not to go there at all, if possible.

Redundancy on ALL disks:
Server motherboard only has 6 ports.

Could I add a card?
Do you know of any non-SCSI SATA cards so I can add another drive?
(I've heard that ZFS does NOT like hardware SCSI. Even though I could disable it,
best not to have a card with SCSI, if possible)?

Otherwise:
What if I change my drive arrangement . . .
Go with 2 mirrored operating system drives (ZFS),
and reduce the "Local Backups" drives to one drive? (ZFS)
If Local Backups goes down, the data is still on the 3 mirrored "Data" drives.
Local Backups will be for misc. stuff, depending on how it gets used;
non-catastrophic if it goes away.
BUT - if the operating system goes away, it's end of story until restored.


BACKUPS:

Tape drives:
HP Ultrium 1760 LTO-4 SCSI
800GB/1.6TB
Mainly for incremental backups during the week (non-attended).
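I was picturing something roughly like this for the unattended runs (guessing here; /dev/nsa0 is the non-rewinding tape device, the dataset and snapshot names are made up):
Code:
zfs snapshot tank/data@sun
zfs send tank/data@sun > /dev/nsa0           # weekly full stream to tape
zfs snapshot tank/data@mon
zfs send -i @sun tank/data@mon > /dev/nsa0   # nightly incremental against the last full
mt -f /dev/nsa0 rewoffl                      # rewind and eject when the run is done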

Current data size: Approx. 300GB on each server.

May also use portable drive enclosures (for rotating off-site backups)
Possibly some cloud-based.

Not sure yet, I'm new to this, will have to figure it out as we go.

Dave
 
I'd mirror the root, and take the risk on the backups disk. Or move the backups to an external USB disk (easier to rotate).

For the price of an LTO tape drive you could buy a bunch of 500 GB USB disks for off-site rotation.
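Rotation with ZFS is then fairly painless; a rough sketch ("offsite1" and da4 stand in for whichever USB disk is plugged in this week):
Code:
zpool create offsite1 /dev/da4                             # one-time setup per rotation disk
zfs snapshot -r tank@weekly
zfs send -R tank@weekly | zfs receive -F offsite1/backup   # copy the data pool onto the USB disk
zpool export offsite1                                      # now safe to unplug and take off site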

Study the USB spec very carefully. Understand 3.1, 3.2, Gen 1, Gen2, and make sure you match accordingly.

Is this a single customer here, or many unrelated? What's the geographic topology? Are there network connections between hosts? Back to base? How much bandwidth? [I'm trying to figure out if backups might be run over a network connection to remote sites.]
 
+1 to NVME, and you can get PCIe cards for M.2, including fancy (too fancy?) ones that will let you fit two M.2 NVMEs and do RAID 1 for you. Or two M.2 NVMEs (and maybe 4?) with no RAID. Not sure if there's any performance penalty for that sort of set-up, though.
 
I'd mirror the root, and take the risk on the backups disk. Or move the backups to an external USB disk (easier to rotate).

I'm planning on building a backup routine using:
External HDs in quick-swap drive enclosures
Internal SCSI Tape Drive
Cloud (encrypted)
 
You don't need dedicated OS disks for the root filesystem. The root filesystem is likely small enough that it can be done with small partitions on a disk that is otherwise used for data. How about this: two disks are partitioned, with a small partition for root (mirrored) plus a large partition for backup (also mirrored). The other four disks are for data (mirrored or, better, RAID-Z2).
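As a rough sketch (sizes, labels and device names are only illustrative, and boot code / swap are omitted):
Code:
gpart create -s gpt ada0                        # on each of the two "system" disks; repeat for ada1
gpart add -t freebsd-boot -s 512k ada0
gpart add -t freebsd-zfs -s 100g -l os0 ada0    # small root partition
gpart add -t freebsd-zfs -l backup0 ada0        # rest of the disk for backups
zpool create zroot  mirror gpt/os0 gpt/os1
zpool create backup mirror gpt/backup0 gpt/backup1
zpool create tank   raidz2 ada2 ada3 ada4 ada5  # the other four disks for data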
 
You don't need dedicated OS disks for the root filesystem. The root filesystem is likely small enough that it can be done with small partitions on a disk that is otherwise used for data. How about this: two disks are partitioned, with a small partition for root (mirrored) plus a large partition for backup (also mirrored). The other four disks are for data (mirrored or, better, RAID-Z2).

You might be onto something.
I'm not doing the dedicated OS disks for size reasons, but rather to keep the OS separate from the data; I've heard this is a better/cleaner/easier way to go, if you can do it.
The "Local Backup" drive will possibly serve (really not sure yet, remains to be seen) for database dumps, system images, ZFS snapshots/clones, etc.
Your idea of combining the OS and Local Backup drives is brilliant.
But I would have to do the install inside of a partition, and I haven't been able to figure out how to do that. I really like your idea, though.

Something to think about:
RAIDZ2 takes 4 drives and can lose up to 2.
Mirror with 4 drives can lose up to 3.
Also, I assume that the surviving mirrored drive has all the data structures intact. With RAIDZ2, the surviving 2 drives have my data scattered across both of them, not a direct "look at your original data and work with it directly" type of situation.

Or, what about 3 mirrored drives for "OS+Local Backups" + 3 mirrored drives for "Data"?

Which, in your opinion, would be better?

Thank you for your idea!
Dave
 
Just my opinions:
I like the idea of separate OS and Data drives. Makes upgrading easier. You can have smaller OS devices, say 250G is plenty, and have bigger Data drives. I acknowledge ralphbsz's solution as a valid idea, but since we're talking opinions, I like separate.

ZFS, ZFS mirror for the boot devices. Why? Boot Environments. Best way to do upgrades, simply because rollback is "reboot, stop in boot loader, choose a working BE, boot" Mirrors because honestly, the OS device is mostly read only after boot (assumes configuration has already been done). Log files are about the biggest thing that is read/write.
Why Mirror boot device? Protection against the boot device physically failing. You may have to be physically present to boot from the other one in the pair, but it should come up and let you replace the failed device.

Mirror with more than 2 vs RAID configurations. Mirrors are only as big as the smallest device. The RAIDs give you various multipliers. Performance: Mirrors read complete at the fastest device, writes complete at the slowest. RAIDs are in between.
I have no input on the optimal RAID configuration for 4 or more drives.
 
RAIDZ2 takes 4 drives and can lose up to 2.
Mirror with 4 drives can lose up to 3.
Assuming the 4 drives are all the same size:
  • RAID-Z2 with 4 drives leaves you with two disks worth of available disk space for your data.
  • A mirror with 4 drives of which you can afford to lose 3 is a 4-way mirror.
    That leaves you with one disk worth of available disk space for your data.
Comparing (short version) 4 disks and both leaving 2 disks worth for your data:
  • A pool consisting of one vdev with RAID-Z2: any 2 disks can fail and you still have all your data*
  • A pool consisting of two vdev-s, 2 mirrors of 2 disks:
    - any one disk can fail and you still have all your data*;
    - 2 disks can fail if they are not part of the same vdev; you still have all your data*.
Losing one vdev in a pool means losing the whole pool**. More info: 4 drives coming - raid-z1, or what; also the thread containing my earlier message: pool layout & references
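In zpool terms the two layouts being compared look like this (alternatives, not both; disk names are just examples):
Code:
zpool create tank raidz2 da0 da1 da2 da3          # one RAID-Z2 vdev: any two disks may fail
zpool create tank mirror da0 da1 mirror da2 da3   # two mirror vdevs: at most one failure per vdev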

Please take under consideration that of the maximum available disk space for your data, the utilization should—preferably—max out at ca. 80%.

Edit:
* italic texts added
** losing the pool = losing your data in that pool. There is no (UFS) fsck(8) equivalent for ZFS: you'll have to resort to backups.
 
Just my opinions:
I like the idea of separate OS and Data drives. Makes upgrading easier. You can have smaller OS devices, say 250G is plenty, and have bigger Data drives. I acknowledge ralphbsz's solution as a valid idea, but since we're talking opinions, I like separate.

ZFS, ZFS mirror for the boot devices. Why? Boot Environments. Best way to do upgrades, simply because rollback is "reboot, stop in boot loader, choose a working BE, boot" Mirrors because honestly, the OS device is mostly read only after boot (assumes configuration has already been done). Log files are about the biggest thing that is read/write.
Why Mirror boot device? Protection against the boot device physically failing. You may have to be physically present to boot from the other one in the pair, but it should come up and let you replace the failed device.

Mirror with more than 2 vs RAID configurations. Mirrors are only as big as the smallest device. The RAIDs give you various multipliers. Performance: Mirrors read complete at the fastest device, writes complete at the slowest. RAIDs are in between.
I have no input on the optimal RAID configuration for 4 or more drives.

How about these two ideas:

OPTION 1:

6 Drives total, as follows:

VDEV #1 = 2 drives @ 500GB ea. (mirrored) = 500GB total size, (optionally) partition to use outer 250GB (for speed) (x.8 = 200GB avail.) ==> [OS]
(can lose 1 drive)
VDEV #2 = 4 drives @ 2TB ea. (RAIDZ2) = 4TB total size, 1 partition for 3TB (x.8 = 2.4TB avail.) [DATA], 1 partition for 1TB (x.8 = 800GB avail.) [LOCAL_BACKUP] ==> [DATA + LOCAL_BACKUP]
(can lose 2 drives)

(??) (not sure I like that setup, I only end up with 2.4TB of Data, and the 3TB Data partition is spread across the drives in the pool (since each drive is only 2TB in total size) - not sure if that might be an issue.)

OPTION 2:

6 Drives total, as follows:

VDEV #1 = 2 drives @ 2TB ea. (Mirrored) = 2TB total size, 1 partition for 250GB (x.8=200GB avail.) [OS], 1 partition for 1750GB (x.8=1400GB avail.) [LOCAL_BACKUP] ==> [OS + LOCAL_BACKUP]
(can lose 1 drive)
VDEV #2 = 4 drives @ 2TB ea. (RAIDZ2) = 4TB total size, 1 partition for 4TB (x.8 = 3.2TB avail.) [DATA] ==> [DATA]
(can lose 2 drives)

**HOWEVER**
re: Erichans ... Does the 4TB in my OPTION 2, VDEV #2 come from one VDEV= 4TB available, or two mirrored vdev's = 4TB avail?
The capacity of 4TB is a little confusing. I don't want to risk losing any 2 drives *AS LONG AS* they are not in the same VDEV; that sounds scary to me.


Thoughts....?
 
I like the idea of separate OS and Data drives. Makes upgrading easier. You can have smaller OS devices, say 250G is plenty, and have bigger Data drives. I acknowledge ralphbsz's solution as a valid idea, but since we're talking opinions, I like separate.

ZFS, ZFS mirror for the boot devices. Why? Boot Environments. Best way to do upgrades, simply because rollback is "reboot, stop in boot loader, choose a working BE, boot" Mirrors because honestly, the OS device is mostly read only after boot (assumes configuration has already been done). Log files are about the biggest thing that is read/write.
Why Mirror boot device? Protection against the boot device physically failing. You may have to be physically present to boot from the other one in the pair, but it should come up and let you replace the failed device.
I agree 100% with everything said above.

In addition, with the tank physically separate from the root, you can export the tank (applications and data) and optionally send it anywhere you like. You can do anything you want to the OS and the media it lives on, knowing that you cannot impact the tank, including replacement of the media or re-provisioning a new physical system (in another location, if you want), optionally bring the tank back, and import it into the system. These options greatly enhance your capacity to deal with adversity. And, if this is thought out ahead of your system builds, so that you test the processes when you initially deploy the systems, it will set you up for a much easier life with maintenance. So, keep the root separate from applications and data, and either document or automate provisioning of the root.
I'd also recommend the same approach for the applications in the tank.
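The mechanics of that are pleasantly simple ("tank" as above):
Code:
zpool export tank    # cleanly detach applications and data from the old box
# ...move the disks, rebuild or replace the OS/root...
zpool import         # with no argument: list pools visible on the attached disks
zpool import tank    # bring it back; the datasets mount as before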

My thoughts on the layout options are that you have to figure out your own level of risk aversion. I run a tank of four striped 2-spindle mirrors, so I can afford to lose one drive in each mirror. That configuration gives me the best performance possible, with enough redundancy to make me feel comfortable. But I'm physically present with the system, it's on a UPS, and I have several brand new 4 TB drives ready to deploy at a moment's notice.
 
I agree 100% with everything said above.

In addition, with the tank physically separate from the root, you can export the tank (applications and data) and optionally send it anywhere you like. You can do anything you want to the OS and the media it lives on, knowing that you cannot impact the tank, including replacement of the media or re-provisioning a new physical system (in another location, if you want), optionally bring the tank back, and import it into the system. These options greatly enhance your capacity to deal with adversity. And, if this is thought out ahead of your system builds, so that you test the processes when you initially deploy the systems, it will set you up for a much easier life with maintenance. So, keep the root separate from applications and data, and either document or automate provisioning of the root.
I'd also recommend the same approach for the applications in the tank.

My thoughts on the layout options are that you have to figure out your own level of risk aversion. I run a tank of four striped 2-spindle mirrors, so I can afford to lose one drive in each mirror. That configuration gives me the best performance possible, with enough redundancy to make me feel comfortable. But I'm physically present with the system, it's on a UPS, and I have several brand new 4 TB drives ready to deploy at a moment's notice.

I agree with you completely.

It just makes sense.

A couple questions:

1.) I've heard about keeping the applications separate from the operating system, but it sounded like there was no clean or simple way to do that short of major surgery by a tech with greater knowledge and ability than what I currently possess. Can you offer comments on this and possibly point to where I could get more information, if at all possible?

2.) What do you think about the OS and the LOCAL_BACKUP being in 2 partitions on the first 2 mirrored drives? Then 4 drives for data, either mirrored or more likely RAIDZ2? Or would you make the OS drive pure-OS-only, and nothing else, always?

Thank you for your excellent post.
 
Automated provisioning, and configuration management, are ubiquitous at the big end of town. But there's a lot to master if you have never been there. And the real benefits come when you have a large fleet of systems. Puppet was probably the first of the configuration management tools. But there's now many more.

If you only have a few near-identical systems to deploy, then keeping meticulous records of everything you do (and aiming to script it) may be a satisfactory mechanism. You start with detailed records of how to install the root.

It's true that the installation of packages and applications will want to make changes in the root, e.g. to create accounts, or install configuration files. But you can easily identify these:
Code:
touch stamp                 # record a reference timestamp in an empty file
# install something (a package, a port, an application)
sudo find / -newer stamp    # list every file created or modified since the stamp
In that way you can identify what your applications have added to the root and immediately document all the changes.

If you keep good documents, repairing broken systems, and automating new builds, gets a whole lot easier.

As Phishfry implies above, iff you get enterprise class low latency SSDs with end-to-end data protection for the root mirror (think Intel SSD D3 Series) you could place a ZFS intent log (ZIL), and a ZFS L2ARC on the SSDs. The benefit of this would very much depend on the nature of the I/O load, but it could be substantial. However it doesn't make sense to buy expensive SSDs for backups...

I'm guessing the backups (in this explicit context) are lower in value because they are recoverable in other ways, so that means that they can be relegated to less premium storage.

You would need to flesh out your ideas on how the backups are going to work. i.e. how much data routinely goes offsite, when it goes, how long it stays offsite, and how easily it can be accessed for recovery before it's possible to comment in detail on the local backup question.

Consider that if you can contain your application data in one or more dedicated ZFS file systems, you can pause your application, snapshot the file system (it takes no time), re-start the application and send the snapshot to backup media as a file.
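Sketched out, that sequence is something like this (the service name, dataset and paths are hypothetical):
Code:
service myapp stop                                  # quiesce the application
zfs snapshot tank/appdata@nightly                   # effectively instantaneous
service myapp start                                 # downtime is roughly the time to snapshot
zfs send tank/appdata@nightly | gzip > /backup/appdata-nightly.zfs.gz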

The obvious advantage of tape is that, occasionally, you can put one away "for ever". But that's not quite true, as the media will eventually become unreadable unless it goes through a routine refreshment cycle. You may be able to get a similar benefit by the long term retention of a compressed snapshot on an external USB disk. e.g. a 14TB external USB disk costs a few hundred bucks, and I suspect you could get ten of these drives for the price of a single LTO tape drive (ignoring the controller and tapes).

At some stage you should look at what the LTO tape drives and controllers and tapes are going to cost, and the benefits you expect to derive. Once that's on the table, and we know a little more about data volumes, and if there any network connections available, the arguments can be better prosecuted.
 
OPTION 2:

6 Drives total, as follows:
VDEV #1 = 2 drives @ 2TB ea. (Mirrored) = 2TB total size, 1 partition for 250GB (x.8=200GB avail.) [OS], 1 partition for 1750GB (x.8=1400GB avail.) [LOCAL_BACKUP] ==> [OS + LOCAL_BACKUP]
(can lose 1 drive)
VDEV #2 = 4 drives @ 2TB ea. (RAIDZ2) = 4TB total size, 1 partition for 4TB (x.8 = 3.2TB avail.) [DATA] ==> [DATA]
(can lose 2 drives)

**HOWEVER**
re: Erichans ... Does the 4TB in my OPTION 2, VDEV #2 come from one VDEV= 4TB available, or two mirrored vdev's = 4TB avail?
The capacity of 4TB is a little confusing. I don't want to risk losing any 2 drives *AS LONG AS* they are not in the same VDEV; that sounds scary to me.

I've extended my earlier message #17 a bit for clarification.

"[...] or two mirrored vdev's = 4TB avail?"
With 4 disks you have disk 1 and 2 in one mirror (first vdev); disk 3 and 4 in the other mirror (second vdev). The two vdevs are combined into one pool. With the RAID-Z2 layout you have one pool that consists of one RAID-Z2 vdev of 4 drives. Note the distinction between a pool and a vdev.

As to available space: a pool with a RAID-Z2 layout of 4*2TB disks has the same 4TB available for your data as a pool of two mirrors. With mirrors: if one disk fails, then the vdev to which the failed disk belongs has only one instance of its data (that data isn't anywhere else in the pool) and that vdev has lost its redundancy; the pool as a whole has lost its redundancy. With RAID-Z2: if one disk fails, then the (only) vdev still has one disk of redundancy left; the pool as a whole still has redundancy, even though it is reduced. As you can see, the robustness of RAID-Z2 is better than the mirror alternative when a pool consists of 4 disks.

A ZFS pool gets its redundancy from the redundancy of each of its constituent vdevs (image below from here):
[image: zfs-overview.png - a ZFS pool built from one or more vdevs]

If vdev x in a pool loses its redundancy it affects the whole pool. If vdev x has already lost its redundancy completely and another disk in vdev x fails then the whole pool is lost.

Traditional RAID configurations can be compared to ZFS configurations to a certain extent, but you have to think a little differently because the concepts are not really the same. Perhaps have a look at Dan Langille's ZFS for newbies mentioned here; you might be familiar with a lot of its contents, but the different point of view - looking at redundancy in ZFS rather than at parity in traditional RAID configurations - is important. If you want easy access to valuable and complete information, perhaps have a look at the two ZFS (e)books: FreeBSD Development: Books, Papers, Slides
 
I'm just going to collect a few small replies to minor points that may have been ignored earlier.

Could I add a card?
Do you know of any non-SCSI SATA cards so I can add another drive?
(I've heard that ZFS does NOT like hardware SCSI. Even though I could disable it,
best not to have a card with SCSI, if possible)?
On the contrary. ZFS is perfectly fine with SCSI cards. There are lots of "large" systems around, running FreeBSD, using LSI/Avago/Broadcom SAS cards, with dozens of disks. One can also use one of those cards to connect extra SATA disks (nearly all SCSI cards can handle SATA disks). And also: you can get multi-port SATA cards too.

On the other hand: With 6 ports on the motherboard, I really don't think that more disk drive ports will be needed. In particular with modern disk capacity.

Current data size: Approx. 300GB on each server.
That's very small. Modern disk drives typically come in sizes such as 16 or 20 TB. If you were, for example, to buy four of these drives and then use them in a RAID-Z2 layout (which can handle two failures), you would have 40TB of usable disk space, and your file system would be less than 1% full. Even buying small inexpensive disks (I think the sweet spot for new, not used, disks may be 4TB drives), capacity is not a problem for the foreseeable future. So you don't need a huge number of drives; you only need several because you want redundancy.


For the price of an LTO tape drive you could buy a bunch of 500 GB USB disks for off-site rotation.
True.

I'm planning on building a backup routine using:
External HDs in quick-swap drive enclosures
Internal SCSI Tape Drive
Cloud (encrypted)
External disks in a professional setting? Risky. Now you're assuming that there are regular visits to the site, you're relying on disks that are transported and thrown around, you are relying on cables that are plugged and unplugged. I know it can be done, but I would try to avoid it. The good news about this approach is that capacity is really cheap: Put a 20TB drive into an external enclosure, using a good interface (USB-3 or eSATA), and for less than $1000 you have an enormous amount of backup capacity.

Tapes? That's all the downsides of external disks, and then some. Again you need site visits. The reliability of tape drives ... is nasty. On paper they look great. But they have the nasty habit of failing in the real world. If I had to rely on tapes, I would (a) use enterprise-grade drives (3480/3490/3590 style, perhaps LTO if you can tolerate risk, definitely not small cartridges), and (b) write redundant tapes. But look at the cost of drives and media: last I looked, a good LTO-8 or -9 drive plus a 20-pack of cartridges brings you to about $5K or $10K. For that, you can get lots of other stuff.

If you have reasonable bandwidth, backing up to the cloud seems like the best option.

Here's an idea for a hybrid: Set up most of your data disks to be 2-fault tolerant (for example 4 disks, and then make big data partitions on them, which you arrange as RAID-Z2). Also put a small backup partition on each drive, and then make a non-redundant backup partition out of them (you get more capacity out of those). Use the backup disk partition for a first level backup, then copy these backups over the network offsite to the cloud. That gives you relatively cheap unlimited capacity in the cloud, rapid access to a local backup (even when the network is slow or down), and disaster recovery: If something destroys the whole server, you still have a (slightly older) offsite backup.
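As a pool-layout sketch of that hybrid (da0-da3 each carry a big p2 and a small p3 partition; the off-site host and pool names are placeholders):
Code:
zpool create tank    raidz2 da0p2 da1p2 da2p2 da3p2   # big data partitions, 2-fault tolerant
zpool create scratch        da0p3 da1p3 da2p3 da3p3   # small backup partitions, striped, no redundancy
zfs snapshot -r scratch@nightly
zfs send -R scratch@nightly | ssh offsite zfs receive -F cloudpool/backup   # ship the local backup off site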

Just my opinions:
I like the idea of separate OS and Data drives. Makes upgrading easier.
A fine opinion to have, and I won't disagree with you. Matter of fact, the moment you start partitioning drives and using the partitions in different ZFS pools, the operator needs to think. For example, if you have four disks, each with 3 partitions (one tiny for OS, somewhat redundant; one big for data, highly redundant; and one small for backups), then if one physical disk fails, you have three slightly sick pools. Orchestrating disk replacement is perfectly possible, but it requires multiple commands, and not getting things wrong. Using extra disks just to simplify the system is not a bad idea. A lot of this is a tradeoff: How much training do your operators have? How much will this system be modified in the future? Are you power- or size-constrained?

Why Mirror boot device? Protection against the boot device physically failing. You may have to be physically present to boot from the other one in the pair, but it should come up and let you replace the failed device.
For a professionally managed system, this is a great idea. For a home system, where your users (in my case spouse and child) can handle a multi-day outage, it's less important.

But one warning: If one of the boot drives fails, you will probably have to be physically present, to convince the BIOS to actually boot. Even worse, I've seen SATA drives that fail so thoroughly, they completely disable the motherboard. So in a failure case, you may have to be physically present to pull disks (one at a time), until the system starts breathing again. Not fun when it happens, but sadly it does.

In the following post, I'm going to skip all the capacity calculations, but those are clearly important.
I don't want to risk losing any 2 drives *AS LONG AS* they are not in the same VDEV; that sounds scary to me.
Modern disk drives are so large that the probability of an unpredicted and uncorrectable single-sector error is getting to be significant. And the fastest way to lose data is to get the following double fault: One drive dies completely (rubber side up, shiny side down). It happens. No problem, you have redundancy, meaning ZFS will now read a whole disk's worth of capacity from the other drives to put the data onto the spare. Unfortunately, during that giant rebuild/reading operation, you get a single sector error. You only lose one sector (one file), but using the "a barrel of wine with one spoonful of sewage in it is still sewage" theorem, the customer is now (justifiably) pissed off.

To guard against that, for enterprise-grade professionally managed systems, one should really have a system that can tolerate two faults.

1.) I've heard about keeping the applications separate from the operating system, but it sounded like there was no clean or simple way to do that short of major surgery by a tech with greater knowledge and ability than what I currently possess.
If by applications you mean "packages and ports": Those go into /usr/local. You could theoretically create separate file systems (or even pools) for that. In practice, that's probably silly, since they are typically quite small (dozens of GB total). In the past, the tradition was to have many separate file systems (for root, /usr, /usr/local, /var, /var/log and so on); these days pretty much the only splitting of file systems that's still commonly done is OS, user data, and backups.

Automated provisioning, and configuration management, are ubiquitous at the big end of town. But there's a lot to master if you have never been there. And the real benefits come when you have a large fleet of systems.
...
If you only have a few near-identical systems to deploy, then keeping meticulous records of everything you do (and aiming to script it) may be a satisfactory mechanism. You start with detailed records of how to install the root.
Completely agree. Automated install with updates and customization is possible, but really hard. For a half dozen systems, it will probably not gain you anything; on the contrary, you will waste much time learning the system.

And I completely agree with the "keep a record" system. The way I do this: Whenever I do system administration, I have a separate window open, and I type into a file exactly what I did (the files are in /root/, and named YYYYMMDD.txt). If I type a command, I cut and paste it into there. If I need to explain something, I do that by adding comments. Like that, the resulting file is sort of usable as a script, which means re-doing it (for example on another machine) becomes super easy. It also means that if I lose my OS or want to re-install, I can just work through all these files, and repeat all the required steps.

The obvious advantage of tape is that, occasionally, you can put one away "for ever". But that's not quite true, as the media will eventually become unreadable unless it goes through a routine refreshment cycle.
Media is also REALLY expensive today. I just looked: An LTO-9 cartridge is over $140. Sure, it gives you 45TB, but for that, you can buy an extra disk drive, which is probably more practical.

If your data volume is only 300 GB, and the data doesn't change fast, then I suspect the expected backup volume will be small, and easily handled by things more cost-efficient than tape.
 
ralphbsz That's what I like the most about this forum (speaking generally). Sharing ideas about how to do something. Everyone "knows" their way is the "best", but listening to others expands the knowledge base. Maybe on my next system I'll use your ideas because they are a better solution.
 