[Solved] Advice needed for backing up multiple FreeBSD hosts to each other.

Hello everyone,

I plan to set up two FreeBSD NAS servers, one in each of two locations, plus a third FreeBSD server for offline backups.
To protect against data loss, I want the two NAS servers to back each other up:

Server 1 (NAS)
- 2x 18 TB HDDs in a ZFS mirror
- Location A (home)
- Will hold ~ 7 TB of data
- Pulls snapshots from Server 2

Server 2 (NAS)
- 2x 18 TB HDDs in a ZFS mirror
- Location B (parents)
- Will hold ~ 2 TB of data
- Pulls snapshots from Server 1

Server 3 (Offline backup)
- 8x 3 TB HDDs in a ZFS RAID-Z1
- Location: A
- Will hold ~ 9 TB of data (~21 TB usable)
- Pulls snapshots from Server 1 and 2

Important

Due to bandwidth limitations (slow upload at site B), I want to "buffer" the daily ZFS snapshots from Server 2 (site B) on Server 1 (site A), then pull the snapshots from both sites off Server 1 to Server 3 once a week.


How would you recommend I structure the ZFS datasets on the three pools?
I plan to use zfs-autobackup for taking snapshots and ZFS send/receive.
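For what it's worth, the pull-based setup described above maps fairly directly onto how zfs-autobackup works: you tag source datasets with an autobackup:<name> property and run the tool from the machine that should receive the snapshots. A rough sketch, assuming the hostnames and the buffer/target dataset names are placeholders (and note that received datasets keep the property, so you may need to adjust property handling on Server 1 before Server 3 pulls the buffered copies):

```shell
# On Server 2: mark the datasets to back up under the group name
# "server2"; child datasets inherit the property.
zfs set autobackup:server2=true pool01

# On Server 1 (daily, from cron): pull Server 2's snapshots over SSH
# into a buffer dataset.
zfs-autobackup --ssh-source server2.example.org server2 pool01/buffer/server2

# On Server 3 (weekly): pull both groups from Server 1.
zfs-autobackup --ssh-source server1.example.org server1 backuppool/server1
zfs-autobackup --ssh-source server1.example.org server2 backuppool/server2
```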

This is the general dataset layout I am using right now:

Code:
pool01
pool01/users/
pool01/users/user1
pool01/users/user2
pool01/users/user3
pool01/services
pool01/services/plex
pool01/services/nextcloud
pool01/services/torrent
pool01/services/syncthing
pool01/backups
pool01/backups/pc1
pool01/backups/pc2
pool01/jails
 
I use zap to help with managing ZFS snapshots and cloning. I set ZFS properties on each dataset to record where its backups should go and how long they should live, and my scripts read those properties and act accordingly. The scripts are called via cron, but you could also use periodic(8) if precise timing isn't important.
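As a sketch of that property-driven approach (the backup:* property names here are made up for illustration; zap has its own conventions):

```shell
# Tag datasets with custom user properties describing where their
# backups should go and how long snapshots should live.
zfs set backup:target=server3:backuppool/server1 pool01/users
zfs set backup:retention=8w pool01/users

# A cron-driven script can then enumerate datasets and act on them.
zfs list -H -o name -r pool01 | while read -r ds; do
    target=$(zfs get -H -o value backup:target "$ds")
    [ "$target" = "-" ] && continue   # property not set: skip dataset
    echo "would replicate $ds to $target"
done
```

User properties (anything with a colon in the name) are inherited by child datasets, which makes it easy to set a policy once at pool01/users and override it per dataset where needed.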

Does the dataset structure matter that much? If it's a lot of data, you might consider making an on-site copy first, then shipping the drive to the other site, to avoid waiting for the initial ZFS send/receive to complete over the network.

Or, take advantage of the beauty of snapshots: you can create as many as you want, as frequently as you want. Load a little data onto the source drive, take a snapshot, load a bit more, take another snapshot, and so on. Meanwhile, send incremental snapshots to the target; that keeps each transfer small, so you can run the backups during off hours without spillover. The trade-off is that data which hasn't been sent yet is at risk in the meantime, if you can tolerate that.
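That incremental staging boils down to something like this (dataset and host names are placeholders):

```shell
# Initial full send of the first snapshot to the backup host.
zfs snapshot pool01/users@stage1
zfs send pool01/users@stage1 | ssh server3 zfs receive backuppool/users

# Load more data, snapshot again, and send only the delta since @stage1.
zfs snapshot pool01/users@stage2
zfs send -i @stage1 pool01/users@stage2 | ssh server3 zfs receive backuppool/users
```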

Once you're done synchronizing content, drop the earlier snapshots if you want; otherwise they cost almost no storage until the underlying data changes, and keeping them lets you easily and safely bring other drives up to date later.

If you want to explore something different and challenging, you could consider exporting a remote disk over iSCSI, attaching it to the pool as a mirror device, and letting ZFS resilver it automatically over the network. You can take the drive offline whenever you like, and ZFS SHOULD be able to determine where it left off and continue the resilver, but that would be risky. It might be a fun exercise :).
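Roughly, on FreeBSD that would mean exporting a disk with ctld on the remote host and attaching it locally with iscsictl; the IQN, addresses, and device names below are placeholders:

```shell
# Remote host: export a whole disk via ctld.  /etc/ctl.conf:
#   portal-group pg0 {
#       discovery-auth-group no-authentication
#       listen 0.0.0.0
#   }
#   target iqn.2024-01.org.example:backupdisk {
#       auth-group no-authentication
#       portal-group pg0
#       lun 0 { path /dev/ada4 }
#   }
service ctld start

# Local host: log in to the target, then add the new disk (here da5)
# as a mirror of an existing pool device; the resilver starts at once.
iscsictl -A -p server2.example.org -t iqn.2024-01.org.example:backupdisk
zpool attach pool01 ada1 da5
```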
 