Deleted member 67440
Since I frequently deal with backup and disaster recovery for BSD machines, both physical and virtual, I find it useful to share my experiences with other users in order to learn something new.
The topic is long and complex, involving "standard" tools (rsync, 7z), advanced ones (hb), system facilities (zfs), utilities (zfsSnap, syncoid), and programs written by me.
Since, with the help of the forum, I managed (maybe) to create a working ports package for a fork of a specific program for versioned copies (zpaq, http://mattmahoney.net/dc/zpaq.html), I think it reasonable to start from there, that is, from zpaqfranz (https://github.com/fcorbelli/zpaqfranz).
STEP ONE: zpaqfranz
Code:
mkdir /tmp/testme
cd /tmp/testme
wget http://www.francocorbelli.it/zpaqfranz/ports-51.10.tar.gz
tar -xvf ports-51.10.tar.gz
make install clean
Hopefully a zpaqfranz executable will be created in /usr/local/bin.
If anyone is kind enough to try it and report any anomalies, I would be very grateful.
So suppose we have compiled zpaqfranz.
Why is it so relevant to FreeBSD backups (in my style, of course)?
Because it has a feature that pairs well with snapshots, zfs ones in particular: the ability to keep data forever, like a sort of Time Machine.
Please note that the original author of this software is NOT me, so I am not trying to give myself credit that I don't have.
In the next post I will try to explain why it is the ideal medium for backups in general, and zfs in particular, with one real flaw.
Otherwise it is, in my opinion, something that simply cannot be compared with other programs: it is to rar or 7z what zfs is to NTFS.
It is a program I have been using for about 5 years, so it is extremely well tested. However, I had to make a series of changes to make it easier to compile on systems such as ESXi servers and QNAP NAS, so caution should be used before entrusting it with exceptionally important data.
This is an important clarification, because it is not exactly a trivial program (you can, of course, read the source directly).
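The "keep data forever" idea mentioned above can be illustrated with a toy append-only journal of snapshots. This is a hedged conceptual sketch of my own, not the actual zpaq journaling format: every backup run appends a new version and nothing is ever deleted, so any past state remains reachable.

```python
# Toy model of an append-only, versioned backup journal.
# Illustrative only: the real zpaq format is far more sophisticated.
import hashlib

class VersionedStore:
    def __init__(self):
        self.versions = []            # append-only journal: never pruned

    def backup(self, files):
        """files: dict name -> bytes; record a snapshot of content hashes."""
        snapshot = {name: hashlib.sha256(data).hexdigest()
                    for name, data in files.items()}
        self.versions.append(snapshot)
        return len(self.versions)     # 1-based version number

s = VersionedStore()
s.backup({"a.txt": b"v1"})
s.backup({"a.txt": b"v2"})            # the old version is still there
print(len(s.versions))                # 2
```

Restoring an old state is then just a matter of reading an earlier journal entry; the archive only ever grows at the tail.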
What should be the characteristics of the "ideal" program for backups?
What would you choose, if you could rub Aladdin's lamp?
0) Nothing complicated, neither the program, nor the archives. No complex mechanisms like hashbackup or borg. No archives divided into hundreds of files and folders, each one essential. If it's simple, maybe it works.
1) Keep all data, without ever deleting anything
2) Reliably deduplicate the information
3) Compress them in a reasonable time
4) Have several methods to check the integrity of the data
5) Easily check that the backups exactly match the original information
6) Encrypt the information (optional)
7) Have a format particularly suitable for the use of rsync --append (cloud copies of minimum size)
8) "Understand" the .zfs directories (i.e. exclude them)
9) Run on various systems in an almost identical way (Windows, Linux, FreeBSD, QNAP)
10) Have specific functionalities for storage managers, i.e. commands to compare folders, calculate hashes etc.
11) Take full advantage of modern systems (i.e. solid state disks, CPUs with multiple cores, HW instructions), in particular of the Xeon type (i.e. many cores, but not very high frequency)
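Point 2, reliable deduplication, can be sketched in a few lines: blocks are keyed by a cryptographic hash, identical content is stored once, and repeats only add a reference. This is a minimal illustration of the general technique, not zpaqfranz's actual on-disk format.

```python
# Minimal hash-based deduplication sketch (concept only).
import hashlib

def dedup_store(blocks):
    """Store each unique block once, keyed by its SHA-256 digest.
    Returns (store, refs): refs holds one key per input block."""
    store, refs = {}, []
    for block in blocks:
        key = hashlib.sha256(block).hexdigest()
        if key not in store:          # new content: store it
            store[key] = block
        refs.append(key)              # repeat: reference only, no new data
    return store, refs

data = [b"alpha", b"beta", b"alpha", b"alpha"]
store, refs = dedup_store(data)
print(len(store), len(refs))          # 2 unique blocks, 4 references
```

Combined with the append-only journal above, this is why repeated backups of mostly unchanged data cost very little extra space.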
To get an idea of the last point, here is a quick example of hash calculation (something like hashdeep) on a Xeon machine with 8 physical cores and NVMe disks, over large files (Thunderbird mboxes), at an actual rate of 2.8 GB/s.
Yes, GB, not MB, read from a zfs volume.
Code:
root@aserver:/ # zpaqfranz sha1 /tank/mboxstorico/ -all -xxhash
zpaqfranz v51.10-experimental journaling archiver, compiled Apr 5 2021
franz:use xxhash
Getting XXH3 ignoring .zfs and :$DATA
Computing filesize for 1 files/directory...
Found 116.085.569.679 bytes (108.11 GB) in 0.001000
Creating 16 hashing thread(s)
010% 0:00:34 11.608.569.604 of 116.085.569.679 3.869.523.201/sec
020% 0:00:29 23.217.133.760 of 116.085.569.679 3.316.733.394/sec
030% 0:00:26 34.825.715.537 of 116.085.569.679 3.165.974.139/sec
040% 0:00:22 46.434.282.548 of 116.085.569.679 3.316.734.467/sec
050% 0:00:18 58.042.795.926 of 116.085.569.679 3.224.599.773/sec
060% 0:00:14 69.651.367.897 of 116.085.569.679 3.165.971.268/sec
070% 0:00:11 81.259.924.731 of 116.085.569.679 3.250.396.989/sec
080% 0:00:07 92.868.456.770 of 116.085.569.679 3.202.360.578/sec
090% 0:00:03 104.477.058.202 of 116.085.569.679 3.165.971.460/sec
XXH3: 0076C91D4183AFC8A6363DEA77BAEA01 /tank/mboxstorico/inviata_20140630_20130524.sbd/inviata_20100517_20101207
(...)
XXH3: FBDE63D258722379047B92FC83779D4A /tank/mboxstorico/cestino_2016
Algo XXH3 by 16 threads
Scanning filesystem time 0.001000 s
Data transfer+CPU time 41.053000 s
Data output time 0.000000 s
Worked on 116.085.569.679 bytes avg speed (hashtime) 2.827.700.038 B/s
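A quick sanity check of the figures in the log above: the bytes worked on, divided by the reported hashing time, should reproduce the average speed line.

```python
# Verify the avg speed reported in the zpaqfranz log above.
worked_bytes = 116_085_569_679       # "Worked on ... bytes"
hash_time_s = 41.053                 # "Data transfer+CPU time"
speed = worked_bytes / hash_time_s
print(round(speed))                  # 2827700038 B/s, matching the log
print(round(speed / 1e9, 2))         # 2.83 -> roughly 2.8 GB/s
```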
On small files (84.743.675.893 bytes in 134.990 files) it still reaches about 2 GB/s:
Code:
root@aserver:/ # zpaqfranz sha1 /tank/d -all -xxhash
zpaqfranz v51.10-experimental journaling archiver, compiled Apr 5 2021
franz:use xxhash
Getting XXH3 ignoring .zfs and :$DATA
Computing filesize for 1 files/directory....250.034)
Found 84.743.675.893 bytes (78.92 GB) in 2.422000
Creating 16 hashing thread(s)
010% 0:00:38 8.474.399.005 of 84.743.675.893 2.118.599.751/sec
020% 0:00:34 16.948.753.177 of 84.743.675.893 2.118.594.147/sec
030% 0:00:29 25.423.129.487 of 84.743.675.893 2.118.594.123/sec
040% 0:00:24 33.897.500.349 of 84.743.675.893 2.118.593.771/sec
050% 0:00:20 42.371.898.749 of 84.743.675.893 2.118.594.937/sec
060% 0:00:16 50.846.225.845 of 84.743.675.893 2.118.592.743/sec
070% 0:00:12 59.320.600.443 of 84.743.675.893 2.118.592.872/sec
080% 0:00:07 67.794.943.633 of 84.743.675.893 2.186.933.665/sec
090% 0:00:03 76.269.316.408 of 84.743.675.893 2.179.123.325/sec
If this is of interest, I can proceed (after dinner).