ZFS Exclude specific datasets from full incremental pool backup

zirias@

Developer
For my private server, I want to have a full backup of the ZFS pool, including all datasets, snapshots and clones (also ones newly created since the last backup). The replication mode (flag -R) of zfs send/recv basically does the right thing. So currently, I'm using this script, which has served me well for a long time now and which I created by modifying some script I found somewhere...
Bash:
#!/bin/sh -e

KEEPOLD=3
PREFIX=backup

MASTERPOOL=${1:-zroot}
BACKUPPOOL=${2:-backup}

echo Backup from $MASTERPOOL to $BACKUPPOOL.

zpool import -N $BACKUPPOOL

recentBSnap=$(zfs list -rt snap -H -o name $BACKUPPOOL/$MASTERPOOL \
        | grep "$BACKUPPOOL/$MASTERPOOL@${PREFIX}-" | tail -1 | cut -d@ -f2)
NEWSNAP=$MASTERPOOL@$PREFIX-$(date '+%Y%m%d-%H%M%S')

if test -z "$recentBSnap"; then
        zfs snapshot -r $NEWSNAP
        zfs send -Rcv $NEWSNAP | zfs recv -Fuv $BACKUPPOOL/$MASTERPOOL
else
        origBSnap=$(zfs list -rt snap -H -o name $MASTERPOOL \
                | grep $recentBSnap | head -n1 | cut -d@ -f2)

        if test "$recentBSnap" != "$origBSnap"; then
                echo Error, snapshot $recentBSnap does not exist in $MASTERPOOL.
                zpool export $BACKUPPOOL
                exit 1
        fi

        zfs snapshot -r $NEWSNAP
        zfs send -RcvI @$recentBSnap $NEWSNAP \
                | zfs recv -Fuv $BACKUPPOOL/$MASTERPOOL

        zfs list -rt snap -H -o name $BACKUPPOOL/$MASTERPOOL \
                | grep "$BACKUPPOOL/$MASTERPOOL@${PREFIX}-" | tail -r \
                | tail -n +$(($KEEPOLD + 1)) | xargs -n 1 zfs destroy -r
        zfs list -rt snap -H -o name $MASTERPOOL \
                | grep "$MASTERPOOL@${PREFIX}-" | tail -r \
                | tail -n +$(($KEEPOLD + 1)) | xargs -n 1 zfs destroy -r
fi

zpool export $BACKUPPOOL

The only issue is: there are a few datasets that are just "scratch data", so a backup doesn't make any sense. The most massive of them is my ccache for the poudriere builder. I don't mind the space it consumes, but it also consumes time, and it's a bit annoying having to wait for a backup of stuff that really really doesn't need a backup.

So, I'm looking for a simple and reliable way to just exclude a few specific datasets from my full backup, while still automatically including any new datasets etc (and also deleting datasets in the backup pool that don't exist any more).

If there's some premade tool based on zfs send/recv that could be configured to do what I want, please suggest that as well, I don't insist on using a simple script here ;) – but restoring the pool should be possible with just stock ZFS tooling.
 
Something I do:
Code:
ZT/usr/home                                   22.5G   190G  21.8G  /usr/home
ZT/nosnapshot                                  821M   190G    96K  none
ZT/nosnapshot/xcache                           821M   190G   821M  /usr/home/x/.cache
When I back up /usr/home, I don't back up .cache.
 
Alain De Vos this kind of describes my goal, but I'm still missing how you actually exclude it. Or do you just back up ZT/usr/home (as opposed to ZT)? I do want a full backup of the pool...
 
zirias@ I don't have the books in front of me at the moment, but one of the Michael Lucas ZFS books talked about something like this (going by memory, so it may not have been exactly the same as your setup).
I think part of the solution was adding a property to datasets you wanted backed up, so only things with that property set were included, everything else was ignored.
It sounds like you want the inverse of that: "Include all datasets except for ones with this property set".
I'm not sure how to implement it, but maybe it gives you something to think about.
 
mer "properties" might be a keyword to continue investigation, so, thanks :) (edit, and, as they are inherited by default, it might be enough to set it on the root dataset and explicitly clear it on the few datasets I don't want ... let's search the web now, hehe)
 
Yeah, I just back up ZT/usr/home.
I don't need a backup of the full zpool, as it contains data I don't need.
Note i also do a backup of:
Code:
ZT/ROOT/default                               52.7G   186G  52.7G  legacy
 
I found the section I was referring to. It's in his first book, which talks about the zfstools package, specifically zfs-auto-snapshot. The example sets a property on the pool so that it is inherited by everything, then explicitly turns it off for the datasets you want to ignore.

Sounds pretty much like what you want, no?

It also mentions zfs-auto-snapshot is written in Ruby, so you should be able to pull out the logic you need.
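
For anyone who'd rather stay in plain sh, here is a minimal sketch of that inherited-property idea. The property name `local:backup`, the function name, and the dataset `zroot/ccache` are made up for illustration (any user property with a colon in its name should do). The filter only reads `name<TAB>value` lines, so it can be tried without touching a pool:

```shell
#!/bin/sh
# Hypothetical user property; this is not a built-in ZFS property.
BACKUP_PROP=${BACKUP_PROP:-local:backup}

# Read "name<TAB>value" lines (the format of `zfs list -H -o name,PROP`)
# and print the names of datasets whose value is "off".
excluded_datasets() {
    awk -F '\t' '$2 == "off" { print $1 }'
}

# Intended use on a real pool (not run here, names are examples):
#   zfs set local:backup=on zroot            # inherited by all children
#   zfs set local:backup=off zroot/ccache    # opt out the scratch dataset
#   zfs list -H -r -o name,"$BACKUP_PROP" zroot | excluded_datasets
```

The resulting list still has to be acted on by the backup script; zfs send -R itself knows nothing about this property.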
 
mer sounds exactly like my use case indeed! And if I need some Ruby tool to do the backups, that would be fine as well, as long as a restore remains possible with nothing but stock ZFS tools. So, thanks again for providing some hints on what to look for!
 
Hello everyone, my first post here.

Alain gave a good solution. However, the simplest method is this: right after creating the recursive snapshot, destroy the snapshots of the child datasets you want to omit.
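
A rough sketch of that step, meant to run right after the `zfs snapshot -r` call in a script like the one in the first post. The function name, the `ZFS` override variable, and the dataset name in the usage line are placeholders of mine. That `zfs send -R` then leaves those datasets out is taken from the suggestion above, so it's worth trying on a scratch pool first:

```shell
#!/bin/sh
# Command used to talk to ZFS; set ZFS="echo zfs" for a dry run.
ZFS=${ZFS:-zfs}

# drop_child_snapshots SNAPNAME DATASET...
# Destroys DATASET@SNAPNAME for each listed dataset. The -r flag also
# removes the same-named snapshot from that dataset's descendants.
drop_child_snapshots() {
    snapname=$1
    shift
    for ds in "$@"; do
        $ZFS destroy -r "$ds@$snapname"
    done
}
```

In the original script this could be called as `drop_child_snapshots "${NEWSNAP#*@}" zroot/ccache`, where `${NEWSNAP#*@}` strips the pool name and leaves only the snapshot name.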
 
To elaborate on Alain De Vos's comment, the least complicated way to achieve this is to keep the filesystems you don't wish to back up in a separate portion of the ZFS tree, and to set their mountpoints so that they appear where you would like them logically.
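
Roughly, such a layout could be created as sketched below. The function name and the `ZFS` override variable are my own; the dataset names in the usage line are borrowed from the listing earlier in the thread. Keep in mind this helps when you replicate individual subtrees: a `zfs send -R` from the pool root would still include the separate branch.

```shell
#!/bin/sh
# Command used to talk to ZFS; set ZFS="echo zfs" to only print the commands.
ZFS=${ZFS:-zfs}

# create_scratch_dataset CONTAINER NAME MOUNTPOINT
# Creates an unmounted container dataset and, below it, a child that is
# mounted where the scratch data should logically appear.
# Assumes CONTAINER does not exist yet; the first command fails otherwise.
create_scratch_dataset() {
    container=$1
    name=$2
    mountpoint=$3
    $ZFS create -o mountpoint=none "$container"
    $ZFS create -o mountpoint="$mountpoint" "$container/$name"
}
```

For the layout shown earlier, that would be `create_scratch_dataset ZT/nosnapshot xcache /usr/home/x/.cache`.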
 