Solved How to create new full snapshots?

I have created full snapshots before. If I want to create full snapshots again, must I destroy all existing snapshots?
Thanks.
 
There is no need to delete old snapshots. Ok, nihr43 was faster. To add to that: the content of a previous snapshot is the difference to the next one, and the latest snapshot represents the difference to the current content of the pool. You can monitor this behaviour with zfs list -t snapshot.
 
Thanks.
My understanding is that when I execute zfs snapshot -r zroot@backup1 for the first time, I create a full snapshot, and when I execute zfs snapshot -r zroot@backup* later, I get incremental snapshots. Is that right?

If I want to do a full backup every Monday and save it to another server, like this
zfs send -Rv zroot@backup1 | gzip > ssh host2 /root/backup1.gz ,
then next Monday, how do I create a full snapshot instead of an incremental snapshot?
 
Thanks.
My understanding is that when I execute zfs snapshot -r zroot@backup1 for the first time, I create a full snapshot, and when I execute zfs snapshot -r zroot@backup* later, I get incremental snapshots. Is that right?
No, that's incorrect.

Also see zfs(8) on snapshots:
Code:
     A snapshot is a read-only copy of a file system or volume. Snapshots can
     be created extremely quickly, and initially consume no additional space
     within the pool. As data within the active dataset changes, the snapshot
     consumes more data than would otherwise be shared with the active
     dataset.
A snapshot is just what its name implies: a snapshot of the current state of your filesystem. Right after you make one it consumes virtually no space; as data on the live filesystem changes, the snapshot grows because it keeps referencing the old blocks.

So if you make a second snapshot, all you're doing is saving the new state your filesystem is in at that time. This has nothing to do with incremental backups or anything.

For example:

Code:
$ zfs list -rt all zroot/home
NAME                USED  AVAIL  REFER  MOUNTPOINT
zroot/home         26.1G  51.2G  25.9G  /home
zroot/home@230718  10.9M      -  26.1G  -
zroot/home@240718  10.8M      -  26.1G  -
zroot/home@250718  10.8M      -  26.1G  -
zroot/home@260718  10.9M      -  26.1G  -
zroot/home@270718  10.9M      -  26.1G  -
zroot/home@280718  10.9M      -  25.9G  -
zroot/home@290718   430K      -  25.9G  -
There's a reason why 290718 is only 430K at this time: nothing has happened on the filesystem since that snapshot was taken.

If I want to do a full backup every Monday and save it to another server, like this zfs send -Rv zroot@backup1 | gzip > ssh host2 /root/backup1.gz , then next Monday, how do I create a full snapshot instead of an incremental snapshot?
Once again, see zfs(8). You'd need -i to (quote): "Generate an incremental stream from the first snapshot (the incremental source) to the second snapshot (the incremental target)." Since you're not using -i, why assume that you'd be creating an incremental backup? You're not.

It even says so in the main description (quote): "By default, a full stream is generated.".

The commands above merely send the state of the (whole) filesystem as it was when the snapshot was created. As such, it'll be a full backup. Context matters here, and within ZFS some aspects behave a bit differently. Having said that, I also wonder if you really want to use -R if all you care about is a full backup of your data. ZFS sends a full stream by default, so there's no direct need.
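To make the distinction concrete, here is a sketch using the snapshot names from this thread (zroot@backup1, zroot@backup2). It's guarded so it stays harmless and merely illustrative on a machine without ZFS or without that pool:

```shell
# Sketch only: full vs. incremental zfs send streams.
# Guarded so it stays harmless on a machine without ZFS or this pool.
if command -v zfs >/dev/null 2>&1; then
    zfs_present=yes
    # Default: a full stream -- everything the snapshot references.
    zfs send zroot@backup1 > /tmp/backup1.full.zfs
    # -i: an incremental stream -- only the blocks changed between snapshots.
    zfs send -i zroot@backup1 zroot@backup2 > /tmp/backup2.incr.zfs
else
    zfs_present=no
    echo "zfs not available; commands are illustrative only"
fi
```

Either way, what makes a stream incremental is the -i flag, not the order in which the snapshots were created.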

-R is only useful if you want a full ZFS filesystem replication. Note: ZFS filesystem, so not just the data that's on it. Using -R implies that you want to save the full state of the filesystem, including all relevant snapshots. Yet that means that your backup doesn't merely cover the current state of your data, it also includes all of the snapshots. Which seems highly counter productive to me, depending on your setup of course.

Finally... how did you come up with this: zfs send -Rv zroot@backup1 | gzip > ssh host2 /root/backup1.gz?

That obviously won't work, from ssh(1):
Code:
SYNOPSIS
     ssh [-1246AaCfGgKkMNnqsTtVvXxYy] [-b bind_address] [-c cipher_spec]
         [-D [bind_address:]port] [-E log_file] [-e escape_char]
         [-F configfile] [-I pkcs11] [-i identity_file]
         [-J [user@]host[:port]] [-L address] [-l login_name] [-m mac_spec]
         [-O ctl_cmd] [-o option] [-p port] [-Q query_option] [-R address]
         [-S ctl_path] [-W host:port] [-w local_tun[:remote_tun]]
         [user@]hostname [command]
Note how it mentions [command] at the end? This means that ssh expects a command, not a file. This is also showcased in zfs(8). Details are important in a Unix-like environment; you should try not to gloss over them like that.

And you're missing out on many of those details. Because... what do you think > does? From sh(1): "[n]> file redirect stdout (or file descriptor n) to file".

So what you'd actually be doing is creating a local file called ssh containing your gzipped ZFS snapshot data.
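This is easy to demonstrate with a harmless stand-in for the zfs stream:

```shell
# '>' makes the shell create a local file literally named "ssh".
workdir="$(mktemp -d)"
cd "$workdir"
echo "pretend this is a zfs stream" | gzip > ssh
ls              # shows a file called "ssh" -- nothing was sent anywhere
gunzip -c ssh   # the "snapshot" data ended up in that local file
```

The shell performs the redirection before anything runs, so ssh is never invoked at all; host2 and /root/backup1.gz just become stray arguments to gzip.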

This is what I'd do:

Code:
zfs send zroot@backup | gzip | ssh backup@host2 "dd of=/opt/backups/backup1.zfs.gz"

I'm sorry that my post may start to look like a summing-up of dozens of failures, as if I'm trying to rub it in. Actually, that last part is somewhat true, I am doing just that, but not for anyone's amusement, only to warn you that there are some serious problems here. If you just continue in the direction you're headed then, no joking, I foresee your system getting taken over sooner or later (note: there are some assumptions at work on my end; I'm not saying this is going to happen, but there are signs of major risks involved here).

See: the command you showed us earlier almost makes it look as if you can "just" log in to a remote backup host as the root user, even without using any passwords. If that is your current situation then that's seriously bad. Especially if this server can somehow be reached from the outside and is using password authentication (hence my previous comments about assumptions).

At the very least you'd make a non-privileged backup user, give it a decent password (better yet: set up key-based authentication for SSH) and obviously give it write access to a specific backup area. Then use the command I mentioned earlier.
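A sketch of that setup (the user name backup, the host host2 and the backup path are hypothetical; only the key generation runs locally, the rest needs the real host):

```shell
# Generate a dedicated key for the backup job (runs locally).
keydir="$(mktemp -d)"
if command -v ssh-keygen >/dev/null 2>&1; then
    ssh-keygen -q -t ed25519 -N "" -C "zfs-backup" -f "$keydir/backup_key"
    ls "$keydir"
else
    echo "ssh-keygen not available; commands are illustrative only"
fi
# On the real setup you would then install the key and use it:
#   ssh-copy-id -i "$keydir/backup_key.pub" backup@host2
#   zfs send zroot@backup | gzip | \
#       ssh -i "$keydir/backup_key" backup@host2 "dd of=/opt/backups/backup1.zfs.gz"
```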

Hope this can give you some ideas.
 
Oh, I was totally wrong! I thought that a zfs send of the first snapshot would give a full backup, like a traditional backup.

Now I finally understand what a snapshot is, but I am lost. I want to back up data the traditional way, and it seems that ZFS makes that hard. That's really bad, isn't it? If I plan to recover onto a newly installed system, snapshots alone look useless, right?
 
I want to back up data the traditional way, and it seems that ZFS makes that hard. That's really bad, isn't it?

I probably won't give you such a long answer.
But to me all this is not at all "bad".

The advice I took was:
1) don't do it like the traditional tape backup
2) don't put the backup into .gz files
but instead set up another zpool and put the filesystems there. From personal experience: for a whole installed system, I do not mount this backup pool, simply because it would get mounted at /, just like the original pool.

If you really have tape and back up to tape, I don't know the ZFS solution for that. I don't have tape.

Here, in short, is how I do a backup on a local machine with a local backup pool:

Code:
sudo zpool import -N backup-pool   # -N: don't mount that pool
sudo zfs snapshot rpool@2018-08-06T21h30m
su
zfs send -RceLv rpool@2018-08-06T21h30m | zfs receive -duv backup-pool
# you possibly should simulate with "zfs receive -duvn backup-pool" before actually writing
# for incremental:
zfs snapshot rpool@2018-08-06T22h30m
zfs send -RceLvI rpool@2018-08-06T21h30m rpool@2018-08-06T22h30m | zfs receive -duv backup-pool

It's already a bit late here, hopefully no big errors in that.

And if you want it more like the traditional scheme, maybe take a hard disk for each day, or three disks for three weeks, and only send incrementally on the specified day or week. I have all the data in the backup-pool and don't see the disadvantage.
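Restoring onto a freshly installed system from such a backup pool is the same mechanism in reverse. A sketch using the pool/snapshot names from above (the exact dataset names on the backup pool depend on how the receive mapped them; guarded so it stays illustrative without ZFS):

```shell
if command -v zfs >/dev/null 2>&1; then
    zfs_present=yes
    # Import the backup pool without mounting anything.
    zpool import -N backup-pool
    # Dry-run first (-n), then stream the replicated filesystem back.
    zfs send -Rv backup-pool/rpool@2018-08-06T22h30m | zfs receive -duvn rpool
    zfs send -Rv backup-pool/rpool@2018-08-06T22h30m | zfs receive -duv rpool
else
    zfs_present=no
    echo "zfs not available; commands are illustrative only"
fi
```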
 
Oh, I was totally wrong! I thought that a zfs send of the first snapshot would give a full backup, like a traditional backup.

You can send just the delta between snapshots (called incremental send/receive), but this would be useless to restore the filesystem as it includes only the delta. You need to first replicate a pool, then you can send|receive all further deltas (snapshots) from the origin pool, which will only include the blocks that have been changed.

Have a look at zfs(8). The "description" part essentially covers all the questions you've asked here. For send/receive variations just look at these parts in the manpage. The zfs and zpool manpages are exceptionally thorough, so make sure to have a look at them first as they usually answer almost all basic questions much faster than searching/asking on the interwebs...

And the usual reminder: snapshots and replications are _NOT_ backups!
 
No,
You can send just the delta between snapshots (called incremental send/receive), but this would be useless to restore the filesystem as it includes only the delta. You need to first replicate a pool, then you can send|receive all further deltas (snapshots) from the origin pool, which will only include the blocks that have been changed.

Have a look at zfs(8). The "description" part essentially covers all the questions you've asked here. For send/receive variations just look at these parts in the manpage. The zfs and zpool manpages are exceptionally thorough, so make sure to have a look at them first as they usually answer almost all basic questions much faster than searching/asking on the interwebs...

And the usual reminder: snapshots and replications are _NOT_ backups!

Yes, "snapshots and replications are _NOT_ backups!"
 
And the usual reminder: snapshots and replications are _NOT_ backups!
Of course they are. And they can be quite reliable too, depending on your setup. I mean; on a RAID system the chances that your pool would suddenly go corrupt are decently slim.

Seriously, what do you think a backup actually is? By definition it's a copy of your data which can then be restored in the event of data loss. Which is exactly what snapshots can do for you. Especially if you keep a (daily) retention.

Of course keeping an offsite backup is a better idea, because it will also protect you from physical damage to the system which could make it impossible to recover your data. But just because a snapshot is on-site doesn't automatically mean it isn't a backup. That reasoning is flat-out wrong, because this is exactly what a snapshot can do for you.

I've been using snapshots for backup (and restoration!) purposes for over 5 years now. And with success. Which I'd deem impossible if it didn't actually back up data.
 
If Lost server or damaged operating system, only the snapshots is kept, then don't you can't recover it?

No other sync server.
 
If Lost server or damaged operating system, only the snapshots is kept, then don't you can't recover it?
In normal English please?

This has nothing to do with snapshots being a backup or not.

At a customer's place backups are made on tape and then kept in a locked storage on-site. So just because these tapes could get lost if a fire breaks out, you can no longer consider this a backup? I beg to disagree ;)

The only thing this showcases is that you should also keep off-site backups. But just because a backup is kept on-site (or in this case on the server) doesn't imply that it's suddenly no longer a valid backup. It is.
 
You simply need to analyze your risk.

You can do that on your own, just need to learn how to do it.

The house with your server gets destroyed, all robbed, for whatever reason your server disappears? Your data is gone, you know that yourself.

All the redundant disks fail? Your data is gone, you know that yourself.

Your server does not boot up anymore?
You may boot it with an install medium.
Here you get the chance to destroy all your data by administrator accident (human error). Oh, I should not have installed that system on the disk, deleting all of its contents?

Now you go and make your own analysis.

How much is your data worth?

A whole company with 10? 100? employees depends on it, for how long? How much money will it cost to carry on?

How much can you therefore invest in a backup solution? How often do you need to backup? Where do you need to place and where can you place your backups?

What does it cost? What is it worth? Private data? Put a copy to a relative once a year? In a bank locker? In a fireproof safe in the same house? In the cloud?

To snapshots:
As long as your pool is in working order, and your snapshots did not get deleted you can for sure get the data out of your snapshots.

Is that backup? Here people argue, I'm not sure to take a side.

Physically it's not necessarily a copy at first; it only becomes a copy once the blocks get changed.

I don't see much use in splitting hairs on this.

But, is a copy on the same harddisk a backup?

The usual saying was: RAID is no backup.
Taking that, it would apply to any zpool.
 
Of course they are. And they can be quite reliable too, depending on your setup. I mean; on a RAID system the chances that your pool would suddenly go corrupt are decently slim.

Seriously, what do you think a backup actually is? By definition it's a copy of your data which can then be restored at the event of data loss. Which is exactly what snapshots can do for you. Especially if you keep a (daily) retention.

To be a bit more precise: a backup should be an immutable copy of the data. That's only half true for snapshots: they aren't writable, but they can be deleted, e.g. by a buggy backup routine which one day ends up with an empty value for the snapshot(s) it should discard and nukes all snapshots instead.
Also a filesystem bug in ZFS (e.g. like the one ZoL was hit by recently) could propagate to the replicated pool.
The key is to keep regular cold copies of a pool to be able to go back in time even if something bad (e.g. a filesystem bug) managed to propagate to the replicated pool.

An automated replication is a very good first tier for backups - that's what we do here and I do at home too because it is very efficient and easily accessible. But it shouldn't be the only and last stage for backups; especially for important data.

I replicate (almost) the whole pool of most systems to a local backup server, which then replicates a smaller subset (e.g. minus system datasets) to a NAS in another building and (with more exceptions) to another branch.
The _very_ important datasets (user/customer/business data) are then also backed up to cold storage AND the data they contain is tar'ed and also backed up to cold storage (and partially to tarsnap), so it can be recovered independently from the filesystem.

I had to learn the hard way a few years ago to _never_ rely on _any_ filesystem for backups - a single dead sector within a filesystem superblock on a backup drive led to ~70% of the data lost. With raw (or tar'ed) data (-> tape), the impact would have been most likely limited to a single file. So I always keep "filesystem-independent" backups at least of all important stuff around.
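That "filesystem-independent" copy can be as simple as a plain tar archive of the data. A minimal sketch (all paths are made up for the demonstration):

```shell
# Make-believe dataset contents and a tar'ed, filesystem-independent copy.
src="$(mktemp -d)"
out="$(mktemp -d)"
echo "customer data" > "$src/important.txt"
tar -czf "$out/data-backup.tar.gz" -C "$src" .
# Verify the archive is readable independently of any filesystem state:
tar -tzf "$out/data-backup.tar.gz"
```

Because the archive is just a byte stream, a bad sector typically costs you one file, not the metadata of the whole tree.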

Another lesson well learned: keep redundant backups of important data. For "warm offsite" backups I'm using tarsnap and the private Keybase storage for my most important stuff. With only ~100MB and relatively slow/minor deltas, tarsnap is dirt-cheap (less than $0.10 per month!!) and my $10 deposit will probably last for many, many years :D

Just to clarify: I'm not saying you shouldn't have pool replications around as backups, in fact you definitely should. But always keep in mind that replicating something also means you'll replicate everything that might happen to the original and therefore it shouldn't be considered a "true" backup (=>immutable!).
 