Currently my backup server runs a different OS (OpenBSD) which does not support ZFS, so the backup is simply an rsync of the current ZFS filesystem (not of a specific snapshot, but the active filesystem). In a few weeks I will be moving to a backup server running FreeBSD. I would like to find a way to actually replicate datasets from the file server to the backup server along with all existing snapshots. I want the structure of the data and snapshots on the backup server to be identical, and I would like nothing outside the backup process to be able to touch the replicated datasets (which is what I mean by "immutable"). Most likely the destination zpool will either be exported or mounted read-only during normal operation, and only imported or made read-write while a backup is running. If anything has happened to the destination datasets (for example, a change made since the last snapshot), I need that to be discarded the next time the backup runs.
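To make the "discard anything that touched the destination" part concrete, here is the kind of workflow I'm imagining on the FreeBSD side, based on my reading of the zfs(8)/zpool(8) man pages (the pool and dataset names here are placeholders, not my real layout):

    # Between backups, on the backup server: lock the pool down.
    zfs set readonly=on backup          # replicas can't be written through the FS layer
    zpool export backup                 # or take the pool offline entirely

    # At backup time, on the backup server:
    zpool import -N backup              # re-import without mounting anything

    # From the file server: receive -F first rolls the destination back
    # to its newest snapshot, discarding any local drift, then applies
    # the incoming stream. As I understand it, readonly=on does not block
    # zfs receive, since receive operates below the filesystem layer.
    zfs send -i tank/data@prev tank/data@latest | \
        ssh backup zfs receive -F backup/tank/data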
On the file server, I have a Python script which handles all of the snapshots (including purging old ones). Each dataset has yearly/monthly/weekly/daily/hourly snapshots, which I would like to replicate to the backup server. The backups only run once per day, so it would be acceptable to simply replicate each dataset up to the most recent snapshot (whatever it is), because any additional snapshots or changes made during the day should be replicated the next time it runs. The source data is heavily snapshotted, so this is intended solely to protect against complete pool loss on the file server; losing up to 24 hours of data in that case is acceptable.
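If zfs send/recv is the right tool here, I assume the daily job would look roughly like this: an incremental replication stream from the newest snapshot both sides already share up to the newest snapshot on the source (snapshot and dataset names below are placeholders):

    # -R builds a replication stream (properties plus all snapshots);
    # -I includes every intermediate snapshot between the two endpoints.
    COMMON='@daily-2024-06-01'    # placeholder: newest snapshot both sides have
    LATEST='@hourly-2024-06-02'   # placeholder: newest snapshot on the source
    zfs send -R -I "$COMMON" "tank/data${LATEST}" | \
        ssh backup zfs receive -Fu backup/tank/data
    # -u keeps the received datasets unmounted on the backup server.

If I'm reading the man page right, receiving a -R stream with -F also destroys destination snapshots that no longer exist on the sending side, which would mean my purge script only needs to run on the file server.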
I could replicate the ZFS datasets manually on the backup server, continue backing up via rsync, and simply have the backup server itself handle snapshotting and purging using the same script via cron. It feels like there has to be a better way to do this, though (possibly using zfs send/recv). If anyone has any great ideas on how to accomplish this, I'm all ears. I'm pretty good at scripting things, so (up to a point) it doesn't matter how complicated the method is. I've looked into syncoid, but at first glance it doesn't look like it does quite what I want as-is. However, if someone knows a set of options that would let me use it (either as the entire solution, or as a part that can be wrapped in a shell script to handle the rest), that's just fine with me; the invocation I've been experimenting with is below.
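For what it's worth, this is the syncoid invocation I was trying (host and dataset names are placeholders); --no-sync-snap is supposed to stop syncoid from creating its own snapshots, so only the ones my script makes would be replicated:

    # Recursively replicate tank/data and its children, carrying only the
    # snapshots that already exist (no syncoid-created sync snapshots).
    syncoid --recursive --no-sync-snap \
        tank/data root@backup:backup/tank/data

What I couldn't tell from the docs is whether this will also prune destination snapshots that my script has purged on the source, which is part of why I'm not sure it does what I want out of the box.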