ZFS Freezing (archiving) zfs snapshot: looking for beta testers

  • Thread starter: Deleted member 67440

Deleted member 67440

Guest
I'm looking for someone interested in freezing (archiving) point-in-time snapshots of a folder.

It's common to use some kind of script (typically run from crontab) to take multiple time-stamped snapshots, getting (just as an example) something like
Code:
tank/d@2021-06-06_00.01.00--60d
tank/d@2021-06-07_00.01.00--60d
tank/d@2021-06-08_00.01.00--60d
tank/d@2021-06-09_00.01.00--60d
(...)
tank/d@2021-06-17_00.01.00--60d
tank/d@2021-06-18_00.01.00--60d
tank/d@2021-06-19_00.01.00--60d
tank/d@2021-06-20_00.01.00--60d
tank/d@2021-06-21_00.01.00--60d
(...)
tank/d@2021-08-03_13.00.00--7d
tank/d@2021-08-03_15.00.00--7d
tank/d@2021-08-03_17.00.00--7d
tank/d@2021-08-03_19.00.00--7d
tank/d@2021-08-04_00.01.00--60d
tank/d@2021-08-04_09.00.00--7d
tank/d@2021-08-04_11.00.00--7d
tank/d@2021-08-04_13.00.00--7d
tank/d@2021-08-04_15.00.00--7d
tank/d@syncoid_bakrem_aserver_2021-08-04:15:02:08
temporaneo/dedup@quickdelete
zroot/interna@syncoid_antoz2_aserver_2021-08-04:15:01:52
zroot/ssd@2021-07-30_21.30.00--5d
zroot/ssd@2021-07-31_21.30.00--5d
zroot/ssd@2021-08-01_21.30.00--5d
zroot/ssd@2021-08-02_21.30.00--5d
zroot/ssd@2021-08-03_21.30.00--5d

After some time (in this example 7 or 60 days for tank/d, and 5 days for zroot/ssd) the snapshots get pruned and the oldest ones are deleted.
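
A minimal sketch of how a pruning script could read the retention period back out of such names. It assumes, as the example suggests but nothing here confirms, that a trailing "--Nd" means "keep for N days"; the function name retention_days is hypothetical.

```shell
#!/bin/sh
# retention_days SNAPSHOT
# Prints N for a name ending in "--Nd"; returns 1 for any other name.
# Assumption: the "--Nd" suffix encodes a retention period in days.
retention_days() {
    suffix=${1##*--}                       # text after the last "--", e.g. "60d"
    [ "$suffix" != "$1" ] || return 1      # the name contains no "--" at all
    days=${suffix%d}                       # drop the trailing "d"
    [ "$days" != "$suffix" ] || return 1   # the suffix did not end in "d"
    case $days in
        ''|*[!0-9]*) return 1 ;;           # not a plain number
    esac
    printf '%s\n' "$days"
}

retention_days 'tank/d@2021-06-06_00.01.00--60d'    # prints 60
retention_days 'temporaneo/dedup@quickdelete' || echo "no retention suffix"
```

Names without the suffix (the syncoid ones, or quickdelete) are simply rejected, so a pruning job built on this would leave them alone.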
OK, that's very common.

But there is a loss of data: the older ones simply disappear.

In fact it is not very convenient to use, for example, tar, 7z, rar etc. to create a single archive holding all the snapshots together, for reasons of both time (one compression run per snapshot) and space (~one full copy per snapshot).

More clearly: if you do something like this (in this case with 5 snapshots, but it could be 500)
Code:
rar a /temporaneo/mygoodbackup.rar  /zroot/ssd/.zfs/snapshot/2021-07-30_21.30.00--5d/* /zroot/ssd/.zfs/snapshot/2021-07-31_21.30.00--5d/* /zroot/ssd/.zfs/snapshot/2021-08-01_21.30.00--5d/* /zroot/ssd/.zfs/snapshot/2021-08-02_21.30.00--5d/* /zroot/ssd/.zfs/snapshot/2021-08-03_21.30.00--5d/*
you will need 5 times the time and 5 times the space to store the 5 snapshots in mygoodbackup.rar.
For big folders (hundreds of GB) this is unmanageable.
In this example a (say) 2.5 GB source folder that compresses to 2 GB will need ~2.0 GB x 5 = 10 GB of target space.

After the rar, tar, 7z or whatever, you can happily delete the older snapshots without losing data.


So I have prepared a small program that stores one or more snapshots (even hundreds) in a single file.
It actually creates a ready-to-run script file, filtering the snapshots by name.

Code:
zpaqfranz zfsadd "zroot/ssd@2021" "--5d" "/tmp/47/zpaqfranz" "/temporaneo/kongo5d.zpaq"  -output ./jobba.sh
This example creates a "jobba.sh" script
Code:
/tmp/47/zpaqfranz a /temporaneo/kongo5d.zpaq /zroot/ssd/.zfs/snapshot/2021-07-30_21.30.00--5d/ -to /zroot/ssd/ -timestamp 2021-07-30_21.30.00
/tmp/47/zpaqfranz a /temporaneo/kongo5d.zpaq /zroot/ssd/.zfs/snapshot/2021-07-31_21.30.00--5d/ -to /zroot/ssd/ -timestamp 2021-07-31_21.30.00
/tmp/47/zpaqfranz a /temporaneo/kongo5d.zpaq /zroot/ssd/.zfs/snapshot/2021-08-01_21.30.00--5d/ -to /zroot/ssd/ -timestamp 2021-08-01_21.30.00
/tmp/47/zpaqfranz a /temporaneo/kongo5d.zpaq /zroot/ssd/.zfs/snapshot/2021-08-02_21.30.00--5d/ -to /zroot/ssd/ -timestamp 2021-08-02_21.30.00
/tmp/47/zpaqfranz a /temporaneo/kongo5d.zpaq /zroot/ssd/.zfs/snapshot/2021-08-03_21.30.00--5d/ -to /zroot/ssd/ -timestamp 2021-08-03_21.30.00
After running it you will get, in a single kongo5d.zpaq file, ALL the files of the 5 snapshots, archived by snapshot time, in 2 GB.

Code:
root@aserver:/tmp/47 # ./zpaqfranz i /temporaneo/kongo5d.zpaq
zpaqfranz v52.14-experimental snapshot archiver, compiled Aug  4 2021
/temporaneo/kongo5d.zpaq:
5 versions, 3.492 files, 36.617 fragments, 2.071.595.645 bytes (1.93 GB)
Long filenames (>255)         1

Version(s) enumerator
-------------------------------------------------------------------------
< Ver  > <  date  > < time >  < added > <removed>    <    bytes added   >
-------------------------------------------------------------------------
00000001 2021-07-30 21:30:00  +00003472 -00000000 ->        2.063.765.530
00000002 2021-07-31 21:30:00  +00000003 -00000000 ->            4.755.217
00000003 2021-08-01 21:30:00  +00000001 -00000000 ->                  608
00000004 2021-08-02 21:30:00  +00000007 -00000000 ->            1.136.884
00000005 2021-08-03 21:30:00  +00000009 -00000000 ->            1.937.406
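
As a quick check of the numbers above, the "bytes added" column can be summed and compared with the archive size reported in the header. The comparison with five separate copies is only a rough estimate: it assumes each copy would be about as large as version 1. The helper name sum_bytes is hypothetical, and the shell needs 64-bit arithmetic for the last line.

```shell
#!/bin/sh
# sum_bytes N...: add up the byte counts given as arguments
sum_bytes() {
    total=0
    for b in "$@"; do
        total=$((total + b))
    done
    printf '%s\n' "$total"
}

# "bytes added" column of the listing, with the thousands separators removed
sum_bytes 2063765530 4755217 608 1136884 1937406    # prints 2071595645

# Rough comparison (assumption): five independent copies, each about the size of version 1
printf '%s\n' "$((2063765530 * 5))"                 # prints 10318827650
```

The sum matches the 2.071.595.645 bytes shown for the whole archive: versions 2 to 5 together add less than 8 MB on top of the first one.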

How does it work?
zpaqfranz => the archiver name
zfsadd => the command
"zroot/ssd@2021" => select all snapshots whose name contains this (the "header")
"--5d" => and this (the "footer")
"/tmp/47/zpaqfranz" => full path of the archiver to be executed
"/temporaneo/kongo5d.zpaq" => the output archive
-output ./jobba.sh => the script to be created
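
The header/footer filtering described above can be approximated in plain sh, which may help when checking whether your own snapshot names would be picked up. This is only a sketch, not the parser inside zpaqfranz: the function name gen_script is hypothetical, it treats the footer as a suffix of the name, and it assumes every dataset is mounted at /<dataset name>.

```shell
#!/bin/sh
# gen_script HEADER FOOTER ARCHIVER ARCHIVE
# Reads snapshot names (one per line) on stdin and prints one archiver
# call per matching snapshot, in the same shape as the jobba.sh example.
gen_script() {
    header=$1 footer=$2 archiver=$3 archive=$4
    while IFS= read -r snap; do
        case $snap in
            *"$header"*"$footer") ;;   # contains the header and ends with the footer
            *) continue ;;             # anything else is skipped
        esac
        dataset=${snap%%@*}            # e.g. zroot/ssd
        snapname=${snap#*@}            # e.g. 2021-07-30_21.30.00--5d
        stamp=${snapname%"$footer"}    # e.g. 2021-07-30_21.30.00
        # Assumption: the dataset is mounted at /<dataset name>.
        printf '%s a %s /%s/.zfs/snapshot/%s/ -to /%s/ -timestamp %s\n' \
            "$archiver" "$archive" "$dataset" "$snapname" "$dataset" "$stamp"
    done
}
```

It could be fed with: zfs list -H -t snapshot -o name | gen_script "zroot/ssd@2021" "--5d" /tmp/47/zpaqfranz /temporaneo/kongo5d.zpaq > ./jobba.sh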

Short version: if anyone is interested, I'd like to get some beta testers for the snapshot-name parser.

If you have g++ installed, it is trivial to compile this little program.
Thanks for any reply.
 
You can also contribute simply by posting your list of snapshots here (created by some scheduled mechanism), i.e. the output of the command below, so I can improve the parser.
Thank you!
Code:
zfs list -t snapshot
 
Any reason why you want to go such a complex route? With a simple backup program like borgbackup, restic or bupstash you can achieve that very easily.
 
Any reason why you want to go such a complex route? With a simple backup program like borgbackup, restic or bupstash you can achieve that very easily.
In fact, AFAIK, no.
You cannot get a single append-only archive with the snapshots inside.

borg, restic and hb (hashbackup) all work with very complex "repositories", which are very fragile and way too hard to manage (at least for my taste).

I do not know "bupstash"; it looks similar to what I am doing.

EDIT: yes, it seems to be a Rust-based archiver, a mix of ZPAQ and other "repository-based" tools, primarily for Linux and Btrfs.
It doesn't seem to run on Windows.
In fact it is a "reinvented wheel": you can find an ancient version of ZPAQ, from 2014, in the ports (paq).

EDIT/2: https://bupstash.io/doc/man/bupstash-repository.html
Yes, it seems to use very complex repositories, so it is a big no-no.

EDIT/3: of course, once this script-based development stage is over, the archiver will do everything by itself. But, for debugging, it is WAY easier to inspect a script with vim.
The real problem is non-monotonic snapshots. But with just about zero users, I can tailor it to my own needs :)
 
I'd be very happy if some FreeBSD users who use scripts to create point-in-time snapshots (i.e. with a timestamp embedded in the name) could post the output of zfs list -t snapshot

Or, using zpaqfranz (https://github.com/fcorbelli/zpaqfranz), the output of

zpaqfranz zfslist "*" (just about the same).

It would help me improve my snapshot freezer. Thanks for any reply.
 