Other: Doing incremental backups without ZFS (sigh)

Hey,
I have a little non-standard BSD question. We have to do incremental backups of logs (i.e. append-only data) totaling hundreds of GB, but we cannot use ZFS, for reasons I cannot share publicly.

What would you guys use in such a scenario? Formally: machine A (where the logs are produced), machine B connected over ssh (quite close by). The backups don't need to be realtime; a few times per day is enough. Say, 5 GB of new logs per day. Some incremental rsync scheme? Other ideas? Any tricks?
 
rsync's --compare-dest and --copy-dest options are also interesting.

You can also specify multiple targets for --link-dest.

On the other hand, if you have large append-only files, increments using hardlinks will not save space: rsync hardlinks a file against the previous copy only when it is completely unchanged, and a file that is still being appended to never is. Only block-level dedup, such as ZFS provides, would help there. Keeping the files small by rotating them more often would help, since rotated files no longer change.
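To make the hardlink approach concrete, here is a minimal sketch of a daily pull from machine B, assuming hypothetical paths (/var/log/app on A, /backups/logs on B) and frequently rotated logs so that most files are unchanged between runs:

    #!/bin/sh
    # Daily incremental pull of logs from machine A (sketch; paths are examples).
    TODAY=$(date +%Y-%m-%d)
    YESTERDAY=$(date -v-1d +%Y-%m-%d)   # BSD date(1) syntax

    # Files identical to yesterday's copy become hardlinks (no extra space);
    # only new or changed files are actually transferred over ssh.
    rsync -a \
        --link-dest=/backups/logs/$YESTERDAY \
        userA@machineA:/var/log/app/ \
        /backups/logs/$TODAY/

Each day's directory then looks like a full backup but only costs the space of that day's new files.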
 
archivers/zpaqfranz offers compression, incremental (versioned) archiving, and deduplication. I'm not sure how well append-only logs would perform, but the port maintainer is also the maintainer of this fork of zpaq, in case known-append-only data leaves room for performance optimization.
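A minimal sketch, assuming zpaqfranz keeps upstream zpaq's command syntax (archive path and log directory are hypothetical):

    # Each run appends a new deduplicated "version" to the same archive,
    # so repeated runs over append-only logs are effectively incremental.
    zpaqfranz a /backups/logs.zpaq /var/log/app

    # Later: list the versions stored in the archive.
    zpaqfranz l /backups/logs.zpaq

The archive itself grows append-only, too, which should also keep it cheap to ship to machine B.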
 
[…] We have to do incremental backups of logs […] of size in hundreds of GB. […]

What would you guys use in such scenario? […] Other ideas? Any tricks?
  • Rotate the logs (newsyslog(8)) just as you're about to back up a new increment; see the newsyslog.conf sketch after this list.
  • Send "mini increments" over the network (e.g. sysutils/rsyslog or sysutils/syslog-ng) to avoid I/O spikes.
  • Don't do backups at all. (← Without knowing what kind of logs we're talking about, I'd probably do this. I don't do backups for fun.)
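A minimal sketch of the first two ideas; the log path, rotation count, and destination host are assumptions for illustration:

    # /etc/newsyslog.conf: rotate /var/log/app.log daily at midnight, keep
    # 30 bzip2-compressed archives (see newsyslog.conf(5) for the fields).
    # logfilename        [owner:group]  mode  count  size  when  flags
    /var/log/app.log                    644   30     *     @T00  JC

    # rsyslog on machine A: forward every message to machine B over TCP
    # ("@@" is TCP, a single "@" would be UDP), shipping logs as they arrive.
    *.* @@machineB:514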
[…] But we cannot use ZFS, for reasons I cannot share publicly. […]
Oh boy, now I’m curious what “top secret” reasons one might have.
  • You can use file vdevs (see zpoolconcepts(7)) to create a separate pool (and dataset) just for those logs and still benefit from the whole ZFS workflow; see the sketch after this list.
  • You can abuse revision control systems like svn(1) to obtain (and sync) just the delta.
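A minimal sketch of the file-vdev idea, with a hypothetical backing file and pool names; file vdevs are mainly meant for testing, so treat this as a workaround, not a recommendation:

    # Create a sparse 200 GB backing file on the existing non-ZFS filesystem
    # and build a pool on top of it (zpool wants an absolute path here).
    truncate -s 200G /storage/zpool-file
    zpool create logpool /storage/zpool-file
    zfs create logpool/logs

    # From here the usual incremental ZFS workflow applies:
    zfs snapshot logpool/logs@monday
    # ...a day later...
    zfs snapshot logpool/logs@tuesday
    zfs send -i logpool/logs@monday logpool/logs@tuesday | \
        ssh machineB "zfs receive backuppool/logs"   # assumes B has a pool, too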
 