ZFS: Complete tool for automated backups (to Amazon AWS, Google Cloud, Microsoft Azure, or INTERNAL storage)

While in a mathematical sense you are correct, in a practical sense you are not. Other storage service providers have outage statistics that are MUCH MUCH better than Cloudflare's. But even if that doesn't worry users, the real worry is that it indicates that their engineering has a cavalier attitude towards reliability.
As you may see in this (and other threads on this forum), I vote for CF as a DNS/proxy service, not as a reliable storage service.
From my point of view, for small-to-midsize companies there is only one type of storage: your own storage system with 2+ controllers connected by FC cables and FC switches to 2+ shelves of 10-15k rpm HDDs, with hardware from a GOOD manufacturer like IBM, EMC, Toshiba, etc., and the ZFS filesystem.

Yes, but Cloudflare's track record of leaking information demonstrates, yet again, that they are careless and/or incompetent. Significantly worse than other providers.

But the real problem with Cloudflare is not that they are bad at what they should be doing (providing a reliable service, without data leakage). The real problem is that Cloudflare explicitly and deliberately serves customers that it knows are doing things that are either outright illegal (such as killing people; they were the provider to ISIS/ISIL), or ethically very bad but not yet illegal (such as providing the backend for 8chan, which various hate groups connected to mass shootings have used as a platform to organize themselves). While nobody has been able to prove that Cloudflare is itself a criminal enterprise, it willingly provides service to criminals and terrorists.

Interestingly, it has that in common with the cryptocurrency industry: While it might have been originally well-intended, today it is mostly a tool of scammers and organized crime; plus fools who think they can make a fortune from it (and usually end up poorer).
In a common-sense way you are right. But in the real world, MUCH MORE DANGEROUS RIGHT NOW is the situation where russia is using chemical weapons in Ukraine, targeting nuclear power plants (and occupying one, in Zaporizhzhia), blocking the Black Sea (which leads to mass deaths in poor countries in the East and in Africa), and making public statements that it is "ready to strike the US with strategic nuclear weapons"...
And officially russia is making friends with the Taliban and ISIS/ISIL.
And all of this is happening right now, since 24 Feb 2022.

Or maybe you forgot the russian attacks on US fuel pipeline infrastructure in the past year?

In which world, real or digital, do you live now???

P.S. I propose we return to the topic. :)
 
From my point of view, for small-to-midsize companies there is only one type of storage: your own storage system with 2+ controllers connected by FC cables and FC switches to 2+ shelves of 10-15k rpm HDDs, with hardware from a GOOD manufacturer like IBM, EMC, Toshiba, etc., and the ZFS filesystem.
That can indeed be a good solution. Whether it is good or bad depends mostly on the quality of the personnel that implements it.
But today I think it is no longer a cost-effective solution. Add up all the cost of the system (two servers, disk enclosures, SAS or FC hardware, software licenses and/or support contracts, system administration personnel costs), and it is likely more expensive than outsourcing your storage to a cloud provider. But this tradeoff needs to be evaluated specifically for each situation. Part of that tradeoff is what vendor to use.

But in the real world, MUCH MORE DANGEROUS RIGHT NOW is the situation where russia is using chemical weapons in Ukraine, targeting nuclear power plants (and occupying one, in Zaporizhzhia), blocking the Black Sea (which leads to mass deaths in poor countries in the East and in Africa), and making public statements that it is "ready to strike the US with strategic nuclear weapons"...
And officially russia is making friends with the Taliban and ISIS/ISIL.

Absolutely true. And I'm following the news in detail, and some of my friends are rather directly involved in the conflict. But that doesn't change the fact that there are good cloud storage providers, and bad ones, with Cloudflare pretty solidly on the bad side. For a variety of reasons, some technical, some ethical.
 
That can indeed be a good solution. Whether it is good or bad depends mostly on the quality of the personnel that implements it.
Totally agree that there are extremely few quality personnel: most young people nowadays pay little attention to hardware, OS, and protocol basics, and are much more addicted to "works right out of the box" solutions, even when the TCO of those solutions is higher for the company they work for.
But today I think it is no longer a cost-effective solution. Add up all the cost of the system (two servers, disk enclosures, SAS or FC hardware, software licenses and/or support contracts, system administration personnel costs), and it is likely more expensive than outsourcing your storage to a cloud provider. But this tradeoff needs to be evaluated specifically for each situation. Part of that tradeoff is what vendor to use.
Hm... Let's calculate:
EMC/NetApp 24+ bay 3.5" HDD shelf (the best HGST 3TB legendary SAS drives cost $130-180 each, new) costs ~$200-250 on the aftermarket
2 x Xeon X5670 + 64GB DDR3 Chipkill memory, IBM-based server, ~$200-250 on the aftermarket
a great QLogic 6Gb/s FC controller, ~$150-200 on the aftermarket
7-8 kVA Emerson/Liebert rack online/line-interactive UPS, ~$500-600 (with new batteries)
smart SDU/PDU (Aten/ServerTech) + environmental monitoring (like APC EMS), ~$300-350 total on the aftermarket
floor-standing cooling for a 25 m² room (1 main unit + 1 hot-swap spare for failures), new, ~$800 for both
FC cables + Ethernet cables, ~$50-100

FreeNAS - free license
FreeBSD + a lot of packages - free
OK, the setup fee would be $30-60/h.

Utility bills - according to your country's rates.

So for that price you have a robust system, based on high-quality components, expandable, with VERY low maintenance cost. (The line items above sum to roughly $2,200-2,550, plus about $3,100-4,300 for 24 drives, so call it $5,300-6,900 all-in.)
And the price for all of this equals 2-4 months of service from a great cloud provider. (And EACH additional step, like adding HDDs or reconfiguration, costs you more and more...)

Where am I wrong in this calculation?
Absolutely true. And I'm following the news in detail, and some of my friends are rather directly involved in the conflict. But that doesn't change the fact that there are good cloud storage providers, and bad ones, with Cloudflare pretty solidly on the bad side. For a variety of reasons, some technical, some ethical.
As you may see, cyber-terror from russia, North Korea, and Iran has only been increasing in recent months...
 
Guess what's going on again right now? Another massive Cloudflare outage!
https://twitter.com/patelkjoel/status/1539151292685332481

Here in Ukraine, at the ISP level we see some small delays, but not that many.

Mostly we use CF as a DNS service (reverse proxy), and do not see huge outages/delays.

Maybe for cryptocurrency geeks this outage was more painful... ;) Am I wrong?
 
Regarding your tool selection problem: you just need to use ZFS snapshots and integrate them with restic if you are not using a ZFS-enabled backup provider; a sketch follows the steps.

Like this:

1. create throwaway ZFS snapshot for backup
2. use restic to backup that ZFS snapshot offsite
3. delete your ZFS snapshot
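
A minimal sketch of that workflow as a shell script (the dataset name, restic repository URL, and snapshot name here are placeholders to adapt):

Code:
#!/bin/sh
DATASET=zroot/data
SNAP=restic-tmp
MNT=$(zfs get -H -o value mountpoint "$DATASET")

# 1. create a throwaway ZFS snapshot for the backup
zfs snapshot "${DATASET}@${SNAP}"

# 2. back up the snapshot's frozen view offsite via the .zfs directory
restic -r sftp:backup@example.com:/repo backup "${MNT}/.zfs/snapshot/${SNAP}"

# 3. delete the ZFS snapshot
zfs destroy "${DATASET}@${SNAP}"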
Seems to be a nice workaround. Did you actually test it?
I actually want to back up snapshots of a restic repo.
 
Well, since you've asked, I'll add a few more points. Cloudflare started back then as a honeypot project and later added DDoS protection to its portfolio. Its "free" DDoS protection is now "protecting" many web sites. They also ventured into the field of DNS servers, and are one of the main propagators of that abomination called DNS over HTTPS.

They operate the free DNS server 1.1.1.1 as well as the DoH DNS server that Mozilla Firefox connects to out of the box, amongst many other things.

Just a few highlights from their "career": in 2014, when Heartbleed was all the rage, they opened up a challenge website, claiming people could abuse Heartbleed but not retrieve their SSL certificates. Of course somebody was successful.

No other company in the world causes more issues for the Tor network than Cloudflare.

Tavis Ormandy (Google Project Zero) found the Cloudbleed bug in 2017: their reverse proxies were dumping uninitialized memory.

Of course 1.1.1.1 is there to grab all our DNS query data, just like 8.8.8.8 is there for Google.

DNS over HTTPS became the default in Mozilla Firefox in spring 2020, and of course it uses Cloudflare. Bert Hubert from PowerDNS wrote about that move: https://blog.powerdns.com/2018/09/04/on-firefox-moving-dns-to-a-third-party/

They have also had big DNS outages, like in 2019: https://ianix.com/pub/dnssec-outages/20190321-www.cloudflare.com/, often taking half of the internet down with them.

A complete breakdown in 2019 affected lots of web sites: https://metro.co.uk/2019/07/02/cloudflare-outage-means-websites-including-detector-10103471/

And they want you to believe that public keys are not enough for SSH security, so you should integrate them into your security architecture: https://blog.cloudflare.com/public-keys-are-not-enough-for-ssh-security/ - what could possibly go wrong?

In 2020 they created cloud-based web browsers and wanted to offer this as a service: https://www.techradar.com/in/news/cloudflare-wants-to-run-your-web-browser-in-the-cloud

And they are unable to handle DNS root zones correctly. https://lists.dns-oarc.net/pipermail/dns-operations/2020-January/019684.html

Cloudflare was rate limiting npm - by mistake. https://github.com/npm/cli/issues/836#issuecomment-587019096

And of course, if people are too lazy to create SSL certificates, they just let Cloudflare handle that - OMG.

Cloudflare considered harmful. And there's oh so much more about it...
Dear HardworkingNewbie!

Recently I re-read all the links that you posted in this thread, and a dozen related ones, to be sure that my arguments are still valid.

So, I need to add a little bit more to my previous answer: sooner or later, the "PROfessional internet inside the ORDINARY internet" will become reality, whether we agree with it or not.
There are a lot of factors: in the next decade only truly transglobal companies (or the governments of rich western countries) will be able to invest in infrastructure building (because of a decade of regional wars and the economic recession they cause), the connectivity protocol stack is changing dramatically and quickly, network devices are becoming more intelligent and powerful, the post-industrial world needs more people who consume rather than produce, social networks and the metaverse are becoming the new reality for the next generations of people, ...
In fact, the world will be divided into two categories: PRO - high speed, guarantees, high availability, etc. - and ORDINARY - those whose lives are not so critically bound to the net.

And Cloudflare, Google, Akamai, Amazon & Facebook are just trying to grab as big a piece as possible of this "big cake of future money".

And of course all of them will make mistakes and take wrong decisions... Just as government regulators in each country will also make unprofessional, wrong & just plain stupid decisions...

So, the bottom line: of course Cloudflare is not ideal, but it is much better in all senses than Google, cheaper (in TCO) than Amazon, and a great solution for those who have no big budget for, say, an Akamai enterprise solution.

Am I wrong?
 
What is your opinion about
backup/znapzend
https://github.com/oetiker/znapzend

backup/zfs_autobackup


Please give me your opinion on the pros/cons of these two solutions.

Note:
1. We need a strongly secured connection to the outside remote machine/cloud instance (in this case data compression would also be OK, because routing is unstable and the speed varies; but to the inside local bare-metal machine data compression is not needed at all - there, speed is more important. So: for outside, secure + data compression; for inside, little/no security and no data compression);
2. We use ZFS only, on each machine.
3. The speed of the whole backup procedure to the remote machine/cloud instance is IMPORTANT;
4. Less pressure on CPU/RAM/disk is IMPORTANT.

P.S.
Let's say to an outside remote machine/cloud instance.

Thank you all for the detailed suggestions!
 
You won't go wrong with Sanoid/Syncoid. Take a look at its GitHub repo, particularly the number of forks, contributors, users, issues and so on. These metrics can be helpful in deciding which solution to explore. Apply this to other interests too.
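
For reference, a minimal sketch of what using the pair looks like (the dataset, user, and host names here are invented examples; the snapshot policy itself lives in sanoid's sanoid.conf):

Code:
# sanoid takes policy-based snapshots according to its sanoid.conf; run from cron
sanoid --cron

# syncoid replicates a dataset tree to a remote pool over SSH
syncoid -r zroot/data backup@remote.example.com:tank/backups/data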
 
Please give me your opinion on the pros/cons of these two solutions.

Note:
1. We need a strongly secured connection to the outside remote machine/cloud instance (in this case data compression would also be OK, because routing is unstable and the speed varies; but to the inside local bare-metal machine data compression is not needed at all - there, speed is more important. So: for outside, secure + data compression; for inside, little/no security and no data compression);
2. We use ZFS only, on each machine.
3. The speed of the whole backup procedure to the remote machine/cloud instance is IMPORTANT;
4. Less pressure on CPU/RAM/disk is IMPORTANT.

P.S.
Let's say to an outside remote machine/cloud instance.

Thank you all for the detailed suggestions!
Any comment?
 
Which method (or utility set) would be best for making EVERYDAY ZFS snapshots of the server's current FreeBSD system (the whole installed system, but NOT THE WHOLE DRIVE) to an internal USB 2.0 pen drive?

Need to NOTE: this USB 2.0 pen drive is FORMATTED as bootable (by periodically downloading a new FreeBSD RELEASE image and writing it with the dd command, INTENDED FOR INITIAL INSTALL together with the guided bsdinstall), but I also need to place a DAILY SNAPSHOT on another volume ON THE SAME INTERNAL USB 2.0 pen drive (INTENDED FOR RESCUE RESTORE, if something happens to the internal main drives or the whole main board malfunctions or crashes).

Another aspect that may affect the choice of toolset: the server is frequently under high load, with both INTENSIVE network I/O and disk I/O.

Ansible-native or 3rd-party roles/modules OR an sh/bash script would be preferable.
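To be concrete, something in the spirit of this sh sketch is what I have in mind (the dataset, mount point, and file names are invented):

Code:
#!/bin/sh
TODAY=$(date +%Y%m%d)

# daily recursive snapshot of the system pool (the installed system, not the whole drive)
zfs snapshot -r zroot@daily-${TODAY}

# stream it, compressed, to the rescue volume on the USB 2.0 pen drive
zfs send -R zroot@daily-${TODAY} | zstd -o /mnt/usb-rescue/zroot-daily-${TODAY}.zfs.zst

# keep only the newest stream so it fits on the pen drive
find /mnt/usb-rescue -name 'zroot-daily-*.zfs.zst' ! -name "*${TODAY}*" -delete

# prune the week-old snapshot (BSD date syntax); ignore if absent
zfs destroy -r zroot@daily-$(date -v-7d +%Y%m%d) 2>/dev/null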
Thank you all for the detailed answers and explanations.

Have a nice sunny day!
 
I'm new to FreeBSD and ZFS, and in need of a backup solution. This thread helped me a lot. I think I'm going to start off with, for me, the simplest solution possible:

* maybe local snapshots (question at end of post)
* rsync to rsync.net (rsync.net maintains daily snapshots in zfs, so rsyncing to them is easy and relatively foolproof, see https://rsync.net/resources/howto/rsync.html)
* periodic database dumps (which get rsync'd to rsync.net)

I'm certain this is not the most efficient (or cheapest) way to do this, but the cost of rsync.net isn't out of line for the service they provide. And it'll get me further down the road quickly. I'll report back how things work out.

Of course, any pointers to sleeping dragons are welcome. One thing that's not optimal in my current server is that I have only 2 NVMe drives, so I have them mirrored in a vdev. The FreeBSD installer put the OS into this zpool:

Code:
# zpool status
  pool: zroot
 state: ONLINE
config:

    NAME            STATE     READ WRITE CKSUM
    zroot           ONLINE       0     0     0
      mirror-0      ONLINE       0     0     0
        nda0p3.eli  ONLINE       0     0     0
        nda1p3.eli  ONLINE       0     0     0

And so (unfortunately) my OS files are on the same zpool as my applications (jails with databases, webservers, the usual):

Code:
# zfs list
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
zroot                                      4.10G  3.37T    96K  /zroot
zroot/ROOT                                 2.13G  3.37T    96K  none
zroot/ROOT/15.0-RELEASE_2026-02-03_142353     8K  3.37T  1.95G  /
zroot/ROOT/default                         2.13G  3.37T  1.98G  /
zroot/home                                  158M  3.37T   104K  /home
zroot/home/ansible                          128K  3.37T   128K  /home/ansible
zroot/home/toddg                            158M  3.37T   158M  /home/toddg
zroot/tmp                                    96K  3.37T    96K  /tmp
zroot/usr                                  1.81G  3.37T    96K  /usr
zroot/usr/ports                             871M  3.37T   871M  /usr/ports
zroot/usr/src                               980M  3.37T   980M  /usr/src
zroot/var                                  1.09M  3.37T    96K  /var
zroot/var/audit                              96K  3.37T    96K  /var/audit
zroot/var/crash                              96K  3.37T    96K  /var/crash
zroot/var/log                               512K  3.37T   512K  /var/log
zroot/var/mail                              216K  3.37T   216K  /var/mail
zroot/var/tmp                               104K  3.37T   104K  /var/tmp

I'm assuming I should back all of this up except /tmp...

So I guess my question is, how would you suggest I go about backing this up? Should I zfs snapshot various datasets and then rsync those (and if so, which datasets)? Or should I forego snapshots and just rsync the mount points directly (and if so, which mount points)?
 
rsync.net is good for personal use if you're using their basic SSH plan because they can have small volumes. Note that their minimum for zfs is 5TB which is $60/mo. If that fits your budget, cool, just be aware of it. Also they set up a VM for you, they upgrade it at their own pace, blow away anything that's not on your zpool, and you can't really run other services on it. It's intended as a zfs snapshot target only. They're pros though, so if you just need a place to send snapshots and can afford it, I say go for it. Also would have no hesitation with their basic rsync service. I keep meaning to set up a hosthatch server so I can run my own VM with additional service, but it's low on priority list.

If you do go with zfs, my preference is sanoid for policy-based snapshots and zelta for snapshot replication. zelta is more of a "do the right thing for backups" out of the box than syncoid. Syncoid is excellent and capable, but it's a sharper tool. It will let you receive datasets that share a mountpoint with /, which means your next commands on the receiver might be pretty wacky. That's not a shortcoming of syncoid at all - it's just how zfs works - but for a minimum of fuss, zelta gets the job done.

What I personally do is set up a zroot/SAFE dataset, and anything that I want to back up descends from there and is mounted into the right place. For things under /var/db I do zfs create -o mountpoint=/var/db -o canmount=off zroot/SAFE/var/db (you'll need a similar one for just /var first), as sketched below. Then I create datasets for the things that I want to back up, e.g. zfs create zroot/SAFE/var/db/postgres.
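
A minimal sketch of that layout (the postgres dataset is the example from above; anything else you want backed up descends the same way):

Code:
# parent dataset that holds everything worth backing up; never mounted itself
zfs create -o canmount=off -o mountpoint=none zroot/SAFE

# skeleton datasets so children inherit the right mountpoints
zfs create -o canmount=off -o mountpoint=/var zroot/SAFE/var
zfs create -o canmount=off -o mountpoint=/var/db zroot/SAFE/var/db

# the actual data dataset; inherits and mounts at /var/db/postgres
zfs create zroot/SAFE/var/db/postgres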

This way my sanoid and zelta only need to deal with zroot/SAFE. Anything I want in the backup descends from that dataset. I can snapshot anything else but that's just local snapshots so I can recover files easily if needed, not the long-term robust backup plan.

Should I zfs snapshot various datasets and then rsync those (and if so, which datasets)? Or should I forego snapshots and just rsync the mount points directly (and if so, which mount points)?

You should do an offsite backup of whatever data you want to be able to recover in case your server is lost. For me, that's data, not OS installation, applications, or config. All of that can be reproduced from other sources (e.g. install from upstream, config in version control).

There are two ways you can use snapshots in this sort of scenario. The first is to replicate the snapshots directly using something like zelta. The other is to make a snapshot, and then rsync the files from that snapshot. This avoids the problem of files changing while rsync is running. To do that you would do something like rsync -avz --delete /.zfs/snapshot/foo-snapshot-name/ user@remote-server:/backups or maybe use a backup tool like restic or borg, but sourcing your latest snapshot dir. It's a nifty way to use zfs in an environment where you don't want to / can't receive snapshots. Also note that this approach is NOT recursive, so if you look at /.zfs/snapshot that will just be the stuff on your bectl root; you'll need to do the same for /home, /var, etc.

Personally I prefer replicating snapshots; it's very simple.

One last thing: it's a good idea to test your backups automatically. You may need to restore from them one day, and it'll be a lot more comfortable if you know it works rather than crossing your fingers when the time comes. For example, you can zfs snapshot the postgres data dir and get a crash-consistent backup (equivalent to pulling the plug on the server). It's usually reliable, but at $JOB we've seen < 0.1% of backups not be restorable. We know this because we verify every snapshot by mounting it into a jail and starting postgres on it to make sure it recovers.
 
patmaddox Thank you for the detailed response!

rsync.net is good for personal use if you're using their basic SSH plan because they can have small volumes. Note that their minimum for zfs is 5TB which is $60/mo. If that fits your budget, cool, just be aware of it. Also they set up a VM for you, they upgrade it at their own pace, blow away anything that's not on your zpool, and you can't really run other services on it. It's intended as a zfs snapshot target only. They're pros though, so if you just need a place to send snapshots and can afford it, I say go for it. Also would have no hesitation with their basic rsync service. I keep meaning to set up a hosthatch server so I can run my own VM with additional service, but it's low on priority list.

I'm bootstrapping a business, and I need to keep my burn rate low until I have cash flow. For that reason, I'll stick with rsync on the basic SSH plan for now. When things take off, I'll transition to zfs + sanoid + zelta (those look like a great combo!).

What I personally do is set up a zroot/SAFE dataset, and anything that I want to back up descends from there and is mounted into the right place. For things under /var/db I do zfs create -o mountpoint=/var/db -o canmount=off zroot/SAFE/var/db (you'll need a similar one for just /var first). Then I create datasets for the things that I want to back up, e.g. zfs create zroot/SAFE/var/db/postgres.

This way my sanoid and zelta only need to deal with zroot/SAFE. Anything I want in the backup descends from that dataset. I can snapshot anything else but that's just local snapshots so I can recover files easily if needed, not the long-term robust backup plan.

Setting up a zroot/SAFE dataset and backing up its descendants is simple and smart. I'll do that. I plan to build my product in nested jails, e.g. a top-level jail for each environment: [dev, stage, prod]. Then in each top-level jail, I'll have: [webserver, database]. As the webserver is basically stateless, the db state is all I really need to back up. I assume I can mount the db datasets something like: zroot/SAFE/dev/db, zroot/SAFE/stage/db, etc.

You should do an offsite backup of whatever data you want to be able to recover in case your server is lost. For me, that's data, not OS installation, applications, or config. All of that can be reproduced from other sources (e.g. install from upstream, config in version control).
That makes sense. I'll have my IaC in Ansible, checked into a git repo. So the host and jails should be fully reproducible from that.

There are two ways you can use snapshots in this sort of scenario. The first is to replicate the snapshots directly using something like zelta. The other is to make a snapshot, and then rsync the files from that snapshot. This avoids the problem of files changing while rsync is running. To do that you would do something like rsync -avz --delete /.zfs/snapshot/foo-snapshot-name/ user@remote-server:/backups or maybe use a backup tool like restic or borg, but sourcing your latest snapshot dir. It's a nifty way to use zfs in an environment where you don't want to / can't receive snapshots. Also note that this approach is NOT recursive, so if you look at /.zfs/snapshot that will just be the stuff on your bectl root; you'll need to do the same for /home, /var, etc.

Personally I prefer replicating snapshots; it's very simple.

Bectl is new to me, looks very cool. I don't fully understand this sentence: "Also note that this approach is NOT recursive, so if you look at /.zfs/snapshot that will just be the stuff on your bectl root; you'll need to do the same for /home, /var, etc."

But overall, cool! It seems that replicating snapshots fits my brain and should work for me.

One last thing: it's a good idea to test your backups automatically. You may need to restore from them one day, and it'll be a lot more comfortable if you know it works rather than crossing your fingers when the time comes. For example, you can zfs snapshot the postgres data dir and get a crash-consistent backup (equivalent to pulling the plug on the server). It's usually reliable, but at $JOB we've seen < 0.1% of backups not be restorable. We know this because we verify every snapshot by mounting it into a jail and starting postgres on it to make sure it recovers.

I'm no expert here, but wouldn't it be preferable to pg_dump the live database and take a snapshot of that? It seems that snapshotting the db state files directly could be problematic... the postgres docs are pretty emphatic about that (linked below). Do you have scripts or suggestions for how to automate mounting and testing db snapshots?

patmaddox Thanks again! Your comments have been very helpful.

--ToddG

Links:
* https://github.com/jimsalterjrs/sanoid
* https://freebsdfoundation.org/blog/zfs-automatic-snapshots-with-sanoid-on-freebsd/
* https://zelta.space/home/start
* https://man.freebsd.org/cgi/man.cgi?query=bectl
* https://thedistrowriteproject.blogs...ntial-Guide-to-FreeBSD-Boot-Environments.html
* https://www.postgresql.org/docs/current/backup-dump.html
* https://www.postgresql.org/docs/current/backup-file.html
 
I assume I can mount the db datasets something like: zroot/SAFE/dev/db, zroot/SAFE/stage/db, etc.

Yep, you can set datasets to any mountpoint you want. Jails are a little trickier because you need to mount things in the right order, and potentially delegate the dataset to the jail if you want to be able to take snapshots and create child datasets within the jail. If you don't want to do that, you can just mount it where you want it and go. This is easy with a simple script to start and stop jails (or a few lines of jail.conf, sketched below), and I suspect the various jail managers handle this as well.
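
For instance, a hypothetical jail.conf fragment that handles the ordering (the jail name, path, and dataset are all invented):

Code:
# /etc/jail.conf excerpt - mount the data dataset before the jail starts
dev {
    path = "/jails/dev";
    exec.prestart = "zfs mount zroot/SAFE/dev/db";
    exec.start    = "/bin/sh /etc/rc";
    exec.stop     = "/bin/sh /etc/rc.shutdown";
    exec.poststop = "zfs umount zroot/SAFE/dev/db";
    mount.devfs;
}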

I don't fully understand this sentence, "Also note that this approach is NOT recursive, so if you look at /.zfs/snapshot that will just be the stuff on your bectl root; you'll need to do the same for /home, /var, etc."

This is very, very important to understand when doing zfs snapshots. Every dataset has its own set of snapshots. So when you do a recursive snapshot like zfs snapshot -r zroot, what it does is create a snapshot of zroot and all its children that are consistent with one another. Meaning there's no data race condition - they are all snapshotted at the same moment in time (really, the same block written at the zpool level). But the snapshots themselves are totally separate. You can verify this by looking in /.zfs/snapshot. You might think that because it is mounted at / you will see everything. But the dataset layout you shared (and the default in FreeBSD) has /home provided by a different dataset - zroot/home. You'll find its snapshots in /home/.zfs/snapshot.

Why this matters: let's say you write a script to have restic back up from /.zfs/<latest-snapshot>, thinking that it will be a complete backup - you will be very sad when you try to restore. To provide the same backup functionality as replicating all the snapshots, you have to iterate over each dataset and rsync its respective snapshot, as in the sketch below.
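
A rough sketch of that iteration (it assumes a snapshot named backup exists on every dataset; the remote user/host and target path are invented):

Code:
#!/bin/sh
SNAP=backup

# walk every dataset in the pool and rsync its view of the snapshot
zfs list -H -o name,mountpoint -r zroot | while read -r ds mnt; do
    # skip unmounted/legacy datasets and those without this snapshot
    [ -d "${mnt}/.zfs/snapshot/${SNAP}" ] || continue
    rsync -a "${mnt}/.zfs/snapshot/${SNAP}/" "user@remote-server:/backups${mnt}/"
done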

wouldn't it be preferable to pg_dump the live database and take a snapshot of that? It seems that snapshotting the db state files directly could be problematic... the postgres docs are pretty emphatic about that (linked below). Do you have scripts or suggestions for how to automate mounting and testing db snapshots?

There are many backup procedures available in the postgres ecosystem; pg_dump(1) and zfs snapshots are just two of them. One thing to be aware of is that pg_dump only dumps a single database. It is not necessarily the full state of the postgres cluster (the postgres term for what most people think of as "the server"). There's also the practical matter of whether you can regularly do a full dump. This all depends on db size, resources available, and your RTO / RPO. As for zfs snapshots, here's a quote from the docs on file system level backup:

An alternative file-system backup approach is to make a “consistent snapshot” of the data directory, if the file system supports that functionality (and you are willing to trust that it is implemented correctly). The typical procedure is to make a “frozen snapshot” of the volume containing the database, then copy the whole data directory (not just parts, see above) from the snapshot to a backup device, then release the frozen snapshot. This will work even while the database server is running. However, a backup created in this way saves the database files in a state as if the database server was not properly shut down; therefore, when you start the database server on the backed-up data, it will think the previous server instance crashed and will replay the WAL log. This is not a problem; just be aware of it (and be sure to include the WAL files in your backup). You can perform a CHECKPOINT before taking the snapshot to reduce recovery time.


If your database is spread across multiple file systems, there might not be any way to obtain exactly-simultaneous frozen snapshots of all the volumes. For example, if your data files and WAL log are on different disks, or if tablespaces are on different file systems, it might not be possible to use snapshot backup because the snapshots must be simultaneous. Read your file system documentation very carefully before trusting the consistent-snapshot technique in such situations.

So as I mentioned before, it's just as if you pulled the plug on the server. Postgres is designed to be resilient to this condition. The caveat it gives is that the data and WAL must be consistent in the snapshot - meaning either you keep the WAL on the same dataset as the rest of the data, or you make the WAL a sibling or child of the data dir and do a zfs recursive snapshot. It's simpler to keep it on the same dataset; the only reason not to is if you have to tune the data and WAL datasets differently - and by that point you have likely developed a thorough understanding of what you're doing and why.
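
For instance, a crash-consistent snapshot with the CHECKPOINT the docs recommend could be as simple as this (the dataset name is carried over from the zroot/SAFE example above):

Code:
# reduce WAL replay time on restore, then snapshot data + WAL atomically
psql -U postgres -c 'CHECKPOINT;'
zfs snapshot -r zroot/SAFE/var/db/postgres@$(date +%Y%m%d-%H%M%S)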

In any case, confidence in the backups comes from the restore process. zfs snapshots that are verified every time beat pg_dump that is first tested when it's for all the marbles, and vice versa.

I can't share the script (I'll see if we can make it public), but it's a pretty straightforward jail + zfs trick (a rough sketch follows the list):
  • Clone snapshot to temporary dataset
  • Strip production config and write barebones postgresql.conf and pg_hba.conf for verification
  • Remove stale postmaster.pid file
  • Start in a jail with no network access, mount dataset to /var/db/postgres in jail
  • Start postgres with pg_ctl start -w (wait for startup)
  • Log result
  • Clean up: stop jail, destroy dataset
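
A hypothetical reconstruction of those steps as an sh sketch (every name here - pool, datasets, jail, paths - is invented, and the real script surely differs):

Code:
#!/bin/sh
# Hypothetical sketch of the verification flow described above; adapt all names.
SRC=zroot/SAFE/var/db/postgres
SNAP=$(zfs list -H -t snapshot -o name -S creation "$SRC" | head -1)
WORK=zroot/verify-pg

# clone the latest snapshot to a temporary dataset
zfs clone "$SNAP" "$WORK"
MNT=$(zfs get -H -o value mountpoint "$WORK")

# strip production config, write barebones configs, remove stale pid file
echo "listen_addresses = ''" > "$MNT/postgresql.conf"
echo "local all all trust"   > "$MNT/pg_hba.conf"
rm -f "$MNT/postmaster.pid"

# start a throwaway jail with no network; nullfs-mount the clone into it
jail -c name=pgverify path=/jails/verify ip4=disable ip6=disable persist \
     mount="$MNT /jails/verify/var/db/postgres nullfs rw 0 0"

# try to start postgres on the cloned data dir and log the result
if jexec pgverify su -m postgres -c \
     "/usr/local/bin/pg_ctl -D /var/db/postgres -w start"; then
    echo "OK: $SNAP recovered"
else
    echo "FAIL: $SNAP did not recover"
fi

# clean up: stop jail, destroy dataset
jail -r pgverify
zfs destroy "$WORK"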
 
patmaddox Ok, that was super helpful.

This is very, very important to understand when doing zfs snapshots. Every dataset has its own set of snapshots. So when you do a recursive snapshot like zfs snapshot -r zroot, what it does is create a snapshot of zroot and all its children that are consistent with one another. Meaning there's no data race condition - they are all snapshotted at the same moment in time (really, the same block written at the zpool level). But the snapshots themselves are totally separate. You can verify this by looking in /.zfs/snapshot. You might think that because it is mounted at / you will see everything. But the dataset layout you shared (and the default in FreeBSD) has /home provided by a different dataset - zroot/home. You'll find its snapshots in /home/.zfs/snapshot.

Why this matters: let's say you write a script to have restic back up from /.zfs/<latest-snapshot>, thinking that it will be a complete backup - you will be very sad when you try to restore. To provide the same backup functionality as replicating all the snapshots, you have to iterate over each dataset and rsync its respective snapshot.

Ahh, this is something I did not fully understand. So if I create a dataset:
Code:
 zroot/SAFE
and then create child datasets
Code:
zroot/SAFE/dev, zroot/SAFE/stage, etc
then I'll need to back up
Code:
 zroot/SAFE
recursively. Btw this reddit thread helped me understand the difference and importance of datasets vs folders.

So my main takeaways are:

* put state that needs to be backed up in an obvious place, like zroot/SAFE
* create datasets for each grouping of things that need to be dealt with differently (permissions, backups, etc.). In my case, the dev, stage, prod environments for jails definitely qualify
* understand how zfs creates the snapshots
* practice restoring snapshots
* for now, use ssh + rsync (or similar) to back up snapshots; think about transitioning to zfs + sanoid + zelta in the future
* understand postgres backup strategies. The file replication strategy you suggest sounds like it would be a good fit, so long as I test the backups

My mail queue now includes books by MWLucas on zfs and jails, a couple of books on Ansible, and from this conversation I ordered some books on postgres.

Thx again!
 