ZFS how-to guides or good resources for ZFS

The internet is filled with how-to guides on ZFS, and so is YouTube. I saw the Handbook chapter on ZFS, but it only scratches the surface.

I basically want to keep a backup of my laptop, which is running ZFS. I only have USB drives, so I would like to mirror what is on my laptop, so that if something happens to the laptop I have a backup of it.

What are some things I can do with ZFS?

What guides do you guys recommend?

Since I move between Linux (Debian) and a Mac, can I access my ZFS backup data or the pool from those systems?

Sorry for the basic question, but I want to hear first-hand what is recommended instead of going on a wild goose chase.
 
If you use ZFS on the external drives too, you can use one of the many ZFS backup scripts to replicate the data from pool to pool.

Be sure to run zpool scrub on the laptop and external drives regularly, so you'll get a warning before they go bad.
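For what it's worth, a scrub is a single command, and on FreeBSD periodic(8) can run it on a schedule. The pool name zroot here is just a placeholder:

Code:
# run a scrub by hand and check on its progress
zpool scrub zroot
zpool status zroot

# or let periodic(8) schedule it; add to /etc/periodic.conf:
daily_scrub_zfs_enable="YES"
daily_scrub_zfs_default_threshold="35"   # days between scrubs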

I've had good luck using syncoid to replicate to an external drive using a command like this:

syncoid --recursive --force-delete --no-privilege-elevation --debug zroot backup/zroot
 
I agree that the Handbook's chapter on ZFS only scratches the surface of what's possible with ZFS.

BUT - even that much is frankly enough to start designing a backup strategy. The manpage (zfs(8)) also provides plenty of examples.

Instead of going on a wild goose chase and trying to collect as much ZFS knowledge as possible, I'd recommend a strategy of keeping things simple. See if you can apply examples from the manpage or the Handbook to your scenario. For example, try thinking through the zfs clone example from the manpage.
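For instance, here's roughly what walking through that clone example might look like; the dataset name zroot/home/demo is made up:

Code:
# snapshot a dataset, then clone the snapshot into a writable copy
zfs snapshot zroot/home/demo@before-experiment
zfs clone zroot/home/demo@before-experiment zroot/home/demo-clone
# experiment in the clone's mountpoint, then clean up
zfs destroy zroot/home/demo-clone
zfs destroy zroot/home/demo@before-experiment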

So, in a nutshell: Keep It Simple. And stick to ZFS as much as possible; don't bother with other stuff. ZFS is THAT capable.

But if you wanna get fancy, did you know that poudriere-image(8) takes care of a LOT of details for you? Well, even its manpage says that it's very alpha-quality and not for machines you don't want to take chances on. 😏
 
And stick to ZFS as much as possible; don't bother with other stuff.
But make sure that you can definitely access ALL your files on your backup device(s) when using Linux & Mac.

If you find that you can't read the backups, then you'll need a different strategy. So it's best to do some simple tests to make sure it's going to work for you and your data.
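As a sketch of such a test, assuming OpenZFS is installed on the Linux or Mac side and the backup pool is named backup:

Code:
zpool import                        # scan attached disks for importable pools
zpool import -o readonly=on backup  # import read-only, so nothing gets written
ls /backup                          # check the files are readable (mountpoint may differ)
zpool export backup                 # detach cleanly before unplugging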
 
My ZFS server is also a backup server, and an NFS server.

I use rsnapshot(1) to maintain a time-series of de-duplicated FreeBSD and Linux client backups on-line.

The backups, along with photos, video recordings, software archives, and other sundry data on the ZFS server are all available to NFS clients.

All the data on my ZFS server are routinely copied to external media and sent off-site. To do that, I just zfs-send(8) the entire tank to a removable disk.
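In case the mechanics are useful to anyone, with made-up pool and snapshot names that amounts to something like:

Code:
# replicate the whole pool (all datasets and snapshots) to a pool on the removable disk
zfs snapshot -r tank@offsite-20240101
zfs send -R tank@offsite-20240101 | zfs receive -uF offsite/tank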

The Michael Lucas ZFS books are worth reading.
 
Got it. I've learned so much in this VM and from this forum. I love this file system. Once I know more and I'm comfortable, I'm going to do a fresh install with ZFS :p It's that good.
 
If you use ZFS on the external drives too, you can use one of the many ZFS backup scripts to replicate the data from pool to pool.

Be sure to run zpool scrub on the laptop and external drives regularly, so you'll get a warning before they go bad.

I've had good luck using syncoid to replicate to an external drive using a command like this:

syncoid --recursive --force-delete --no-privilege-elevation --debug zroot backup/zroot
So according to that command, I can back up my whole zroot pool to an external drive, and if something happens or I reinstall, I can re-import it and it should be fine? I think that's what you are saying?
 
Klara Systems regularly posts interesting articles about ZFS. And you should be able to find some good articles in the FreeBSD Journal too.

 
So according to that command, I can back up my whole zroot pool to an external drive, and if something happens or I reinstall, I can re-import it and it should be fine? I think that's what you are saying?

My recovery plan is to reinstall, then replicate the backup to /tmp, then cherry-pick what I need to restore. The server is mostly just a jail host though, so it's easy. (The backup drives contain data from multiple machines, so the backup pools are named backup/<host>/zroot.)

For a personal machine, I would be curious to see what other people do.
  • Is there a reasonable way to back up to an external drive and then swap drives and boot from the backup?
  • Or can I replicate the backup on top of a new currently-running zroot to restore it to the previous state?
  • Or do people make an effort to separate the system files from the data to make recovery easier?
What are some of the options?
 
So according to that command, I can back up my whole zroot pool to an external drive, and if something happens or I reinstall, I can re-import it and it should be fine? I think that's what you are saying?
I actually took a look at that syncoid / sanoid GitHub page. It is an interesting tool, for sure. It takes care of some of the complexities of ZFS features by providing convenient meta-scripting flags.

I'd caution against using that tool blindly until you have a good handle on using straight ZFS commands like zfs-snapshot(8) or zfs-rollback(8). It's important to pay attention to whether you're working with a zpool, a zvol, or a dataset; those ideas are covered in the Handbook. And I had no idea that snapshots can be taken for a whole pool's datasets at once (zfs snapshot -r), while rollbacks are strictly per-dataset.

ccammack :
Is there a reasonable way to back up to an external drive and then swap drives and boot from the backup?
I can refer you to the poudriere-image tool that I mentioned earlier in this thread. If you want warm replication, you can take a snapshot of what you have, then zfs send that snapshot to the external drive (see the sketch below).
Or can I replicate the backup on top of a new currently-running zroot to restore it to the previous state?
That's what zfs-snapshot(8) (recursive with -r) and zfs-rollback(8) (per-dataset) are for. I had a discussion about that on these Forums; just use the search feature at the top of the page.
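To sketch the warm replication from the first answer (the external pool name backup is hypothetical):

Code:
# first replication: a full copy of everything
zfs snapshot -r zroot@repl-1
zfs send -R zroot@repl-1 | zfs receive -uF backup/zroot

# later replications: only send what changed since repl-1
zfs snapshot -r zroot@repl-2
zfs send -R -i zroot@repl-1 zroot@repl-2 | zfs receive -uF backup/zroot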
 
My recovery plan is to reinstall, then replicate the backup to /tmp, then cherry-pick what I need to restore. The server is mostly just a jail host though, so it's easy. (The backup drives contain data from multiple machines, so the backup pools are named backup/<host>/zroot.)

For a personal machine, I would be curious to see what other people do.
  • Is there a reasonable way to back up to an external drive and then swap drives and boot from the backup?
  • Or can I replicate the backup on top of a new currently-running zroot to restore it to the previous state?
  • Or do people make an effort to separate the system files from the data to make recovery easier?
What are some of the options?
Here's what I do (and I do not recommend following this). My backup software is home-written, and I'm the only user. It is intended to protect me against (a) physical destruction of my server at home (such as fire or water), and (b) deletion or modification of files. It doesn't need to protect against single disk errors, since the file system being backed up is already using mirroring (and yes I know I should move to dual-fault tolerant RAID, but there are bigger problems on my to-do list at home). It (c) protects against a catastrophic bug in the file system software stack (ZFS and FreeBSD), but recovery from that would be inconvenient.

It sweeps the whole file system hourly, looking for files that have changed since the last backup (judging by mtime and size), for new files, and for files that have vanished (been deleted). It then stores all the backups on a separate ZFS file system, which is on a separate disk, about 2 m (6 feet) away from the server in a fire-resistant safe. The backup is deduplicated (using whole-file dedup) and never deletes any backups, so for files that change all the time, there tend to be hundreds of versions stored. I sometimes clean those up manually. The backup is only 1.5x larger than the real file system, which shows that my file system at home is mostly used in an archival fashion; rapidly changing files are manually excluded from backup.
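The core of such a sweep, stripped down to a toy sh sketch (the marker file path is invented, and this is nothing like the real program):

Code:
#!/bin/sh
# toy version of the hourly sweep: list files changed since the last run
MARK=/var/db/backup-sweep.stamp
[ -f "$MARK" ] || touch -t 197001010000 "$MARK"   # first run: treat everything as new
find /home -type f -newer "$MARK" > /tmp/changed-files   # candidates for backup
touch "$MARK"                                     # reset the marker for the next run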

This "local" backup is then supposed to be automatically replicated to a remote backup (which used to be a machine physically running in my office, and is now a cloud server). That remote replication has sadly been broken since last winter (nearly a year ago), when we had weeks of network outages at home; so for now the remote replication is done by having a pair of USB-connected disks which are stored far away from home in a secure location, and every week or two one of them is brought home and manually refreshed using rsync. I need to put a weekend of work into re-engineering the remote backup to (a) be up to date again and run automatically, and (b) be resilient against long network outages.

The remote backup intentionally does not use FreeBSD and ZFS; it used to run on Linux using ext4, and now uses macOS and APFS. But if my server at home were destroyed due to a failure of FreeBSD or ZFS, restoring it would be very inconvenient, since it would require a giant cross-platform rsync.

One of the design principles of my backup system is: I don't bother backing up the OS install itself. So the backup only contains /home and, to make re-installation easier, /etc and /usr/local/etc. If my server were physically destroyed, I would have to first install the OS and get it working (using old copies of /etc as a guide), then copy /home back.

I used to have a system where every hour I would take a copy of the non-home file systems (initially by dd'ing the boot SSD to a second SSD, later by rsyncing from the boot SSD to a small spare area on the data disk). Since my recent re-install (when the root file system moved from UFS to ZFS), that has been abandoned too. I need to get it back to life, and that's pretty high up on my to-do list.

So to your questions:
  1. No, I have not yet accomplished being able to boot from my local ZFS backup disk, but that hasn't been a goal. It would be nice to have, and I should invest a weekend of work into getting there, but there's always something more urgent to work on first.
  2. Replicating the backup on top of the currently running system is not easy. The backup file system contains lots of "deleted or modified" files, which are marked by having a "#" in their file name. So a simple "zfs send ... | zfs receive ..." would not work. Plus, some metadata (file attributes such as owner/permissions) are not kept in the backup file system itself, but in a separate database. So the restore script works by issuing a huge series of copy commands, and restore is not automated. I've never had to perform a full restore of the backup, and it would be a multi-day ordeal if I ever wanted to. But given the next answer, that's probably not too bad.
  3. I deliberately do not back up the system files, meaning everything that can be restored by simply doing a new OS install, followed by ports and packages, and non-FreeBSD software (such as Python modules from pip, and my own software from my own Mercurial server). This immediately implies that a full restore will be a massive multi-day ordeal anyway.
So far, the only restores I've had to do were for files that were unintentionally deleted, and then only a few files or directories at a time. I know the old saying "it's not a backup until you have done a restore", and I should really do a fire drill of attempting a full restore from the remote sometime. Maybe after I retire and before I have to look after grandkids ...
 
So a simple "zfs send ... | zfs receive ..." would not work.
Yeah, it would - zfs send / receive works with snapshots, not just files.

Just send a snapshot over to the target machine, and use zfs rollback on the target machine to restore. To pull this off, you gotta know where the sent snapshots land on the target machine.
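With made-up names, the whole round trip looks something like:

Code:
# source machine: ship a snapshot to the target over ssh
zfs send zroot/home@known-good | ssh target zfs receive -F tank/restore/home

# target machine: see where it landed, then roll back to that snapshot
# if the dataset has changed since it was received
zfs list -t snapshot -r tank/restore
zfs rollback -r tank/restore/home@known-good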
 
So I read (and got the concepts):

FreeBSD Mastery: ZFS
FreeBSD Mastery: Advanced ZFS

Good books. I feel like I've got the basics down enough to be dangerous lol.

I'm having some issues though, and I need some clarification.

Let's say I send a snapshot to an external drive with zfs send zroot/home/test@test123 | gzip > /mnt/backuptest.gz. That's fine, right? Three days later I want to send it to a newly installed system. How do I send it to the new system? I know it's zfs recv, but I don't know what to do beyond that point.

Can you explain it or give me a command so I can study it?

Thanks
 
zfs-receive receives snapshot data into a dataset. So in your case you would gunzip and pipe into receive. Something like ssh backup-server "cat /mnt/backuptest.gz" | gunzip | zfs receive zroot/new-dataset. Lots of examples out there. The basic idea is that you have compressed data on another machine that you want to receive. You need to get the data to your machine, uncompress it, and then receive it. I did it in one line (which may still need tweaking for your setup), but you could do it in separate steps with scp etc.
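The separate-steps version might look like this (paths invented):

Code:
# step 1: fetch the compressed stream
scp backup-server:/mnt/backuptest.gz /tmp/

# step 2: uncompress it and receive it into a new dataset
gunzip -c /tmp/backuptest.gz | zfs receive zroot/home/test-restored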
 
zfs-receive receives snapshot data into a dataset. So in your case you would gunzip and pipe into receive. Something like ssh backup-server "cat /mnt/backuptest.gz" | gunzip | zfs receive zroot/new-dataset. Lots of examples out there. The basic idea is that you have compressed data on another machine that you want to receive. You need to get the data to your machine, uncompress it, and then receive it. I did it in one line (which may still need tweaking for your setup), but you could do it in separate steps with scp etc.
Thanks for the reply. Why do I have to create a new dataset? Is it because the metadata in the snapshot is using the old dataset, or is that just how it works? Also, what do you think is a good naming convention? test23 and the like don't work; I get confused. I'm doing this in a VM, so when I'm more comfortable I will do it on my main machine.
 
You said you want to restore that snapshot to a fresh machine, right? In that case, the new machine doesn't have an existing dataset with a common parent to your saved snapshot. So it has to create a new one.

btw, for ZFS-specific testing, you can create a zpool backed by md(4) or even a plain file. That way you can experiment to your heart's content with zfs commands, without messing with a disk (even a virtual one in a VM).

Here's an example Makefile, a script that calls it, and the results of running it. It's an example of how you can do very lightweight experimentation with zfs commands to learn how they work.

Makefile
Code:
.PHONY: zpool clean snapshot snapshots send-receive list

ZPOOL=        test-zpool
SPARSE_FILE=    ${ZPOOL}.sparse
SPARSE_SIZE=    100MB
ZPOOL_LIST=    zpool list ${ZPOOL} > /dev/null 2>&1

# create the pool, backed by the sparse file, unless it already exists
${ZPOOL}: ${SPARSE_FILE}
    if ! ${ZPOOL_LIST}; then doas zpool create ${ZPOOL} $$(realpath ${SPARSE_FILE}); fi

${SPARSE_FILE}:
    truncate -s ${SPARSE_SIZE} ${.TARGET}

# destroy the pool and remove its backing file
clean:
    if ${ZPOOL_LIST}; then doas zpool destroy ${ZPOOL}; fi
    rm -f ${SPARSE_FILE}

# take a snapshot named after the current UTC timestamp
snapshot:
    TZ=UTC doas zfs snapshot ${ZPOOL}@$$(date "+%Y%m%d%H%M%S")

snapshots:
    zfs list -t snap ${ZPOOL}

# replicate a randomly chosen snapshot into a new dataset with a random name
send-receive:
    doas zfs send $$(zfs list -H -t snap ${ZPOOL} | awk '{print $$1}' | sort -R | head -n 1) | doas zfs receive ${ZPOOL}/received-$$(pwgen 6 1)

list:
    zfs list -r -t filesystem ${ZPOOL}

script
Code:
#!/bin/sh
make clean
make
for i in $(jot 3); do make snapshot; sleep 1; done
make snapshots
for i in $(jot 3); do make send-receive; done
make list

output
Code:
$ ./example.sh
if zpool list test-zpool > /dev/null 2>&1; then doas zpool destroy test-zpool; fi
rm -f test-zpool.sparse
truncate -s 100MB test-zpool.sparse
if ! zpool list test-zpool > /dev/null 2>&1; then doas zpool create test-zpool $(realpath test-zpool.sparse); fi
TZ=UTC doas zfs snapshot test-zpool@$(date "+%Y%m%d%H%M%S")
TZ=UTC doas zfs snapshot test-zpool@$(date "+%Y%m%d%H%M%S")
TZ=UTC doas zfs snapshot test-zpool@$(date "+%Y%m%d%H%M%S")
zfs list -t snap test-zpool
NAME                        USED  AVAIL     REFER  MOUNTPOINT
test-zpool@20231214225643     0B      -       96K  -
test-zpool@20231214225644     0B      -       96K  -
test-zpool@20231214225645     0B      -       96K  -
doas zfs send $(zfs list -H -t snap test-zpool | awk '{print $1}' | sort -R | head -n 1) | doas zfs receive test-zpool/received-$(pwgen 6 1)
doas zfs send $(zfs list -H -t snap test-zpool | awk '{print $1}' | sort -R | head -n 1) | doas zfs receive test-zpool/received-$(pwgen 6 1)
doas zfs send $(zfs list -H -t snap test-zpool | awk '{print $1}' | sort -R | head -n 1) | doas zfs receive test-zpool/received-$(pwgen 6 1)
zfs list -r -t filesystem test-zpool
NAME                         USED  AVAIL     REFER  MOUNTPOINT
test-zpool                   852K  39.2M      104K  /test-zpool
test-zpool/received-aePh3C    96K  39.2M       96K  /test-zpool/received-aePh3C
test-zpool/received-aiMuv9    96K  39.2M       96K  /test-zpool/received-aiMuv9
test-zpool/received-aiph1O    96K  39.2M       96K  /test-zpool/received-aiph1O
 
If you have absorbed the ZFS essentials, then you probably will already know this, but nevertheless.

As you seem to want to access your ZFS pool(s) from various OSes, be very careful when considering upgrading your zpool with zpool-upgrade(8). Upgrading may well jeopardise your access from one OS or the other, and also from previous OS versions that do not support the upgraded feature flags. Be aware of which tools or properties are available on one OS and not on the other(s), for example BEs and their accompanying commands like bectl(8).
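If your OpenZFS is recent enough (2.1 or later, if I remember correctly), the compatibility pool property exists for exactly this; the pool and device names below are placeholders:

Code:
zpool upgrade                  # lists pools with features that could be enabled
zpool get compatibility tank   # see what feature set a pool is pinned to

# create a portable pool restricted to a conservative feature set
zpool create -o compatibility=openzfs-2.0 backup /dev/da0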
 
If you have absorbed the ZFS essentials, then you probably will already know this, but nevertheless.

As you seem to want to access your ZFS pool(s) from various OSes, be very careful when considering upgrading your zpool with zpool-upgrade(8). Upgrading may well jeopardise your access from one OS or the other, and also from previous OS versions that do not support the upgraded feature flags. Be aware of which tools or properties are available on one OS and not on the other(s), for example BEs and their accompanying commands like bectl(8).
Thanks
 
If it's just files, I frankly recommend NFS or Samba in addition to ZFS. That makes the OP's files accessible from other OSes, while making zpool-upgrade(8) a safer option. What matters here is the direction in which the file share is mounted.
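For example, ZFS can hand a dataset straight to NFS via the sharenfs property; the network and dataset names are placeholders, and nfsd has to be enabled in rc.conf:

Code:
# FreeBSD passes the property value through as exports(5) options
zfs set sharenfs="-network 192.168.1.0 -mask 255.255.255.0" tank/files
zfs get sharenfs tank/files
# a Linux client would then mount it the usual way:
# mount -t nfs server:/tank/files /mnt/files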
 
zfs-receive receives snapshot data into a dataset. So in your case you would gunzip and pipe into receive. Something like ssh backup-server "cat /mnt/backuptest.gz" | gunzip | zfs receive zroot/new-dataset. Lots of examples out there. The basic idea is that you have compressed data on another machine that you want to receive. You need to get the data to your machine, uncompress it, and then receive it. I did it in one line (which may still need tweaking for your setup), but you could do it in separate steps with scp etc.
Everyone in this thread has been helpful, thank you all. This specific post helped me get to what I wanted in the end. The books are awesome, and I'm testing things out in a VM as I go.

I backed up the snapshot to a zvol that I had formatted as an MBR/DOS volume, then deleted the original snapshot, did the zfs recv, rolled back, and bam: it was the way it was originally.

This helped so much

Thanks again everyone
 
Also, what do you think is a good naming convention? test23 and the like don't work; I get confused.

A great convention for naming snapshots is the date or time they were taken, as Patmaddox demonstrated in his script. You immediately see which snapshot came before and which came after. For my automatic snapshots taken by cron, I use the following command:

zfs snapshot [name of dataset]@`date +%y%m%d`

And for datasets snapshotted more than once a day:

zfs snapshot [dataset]@`date +%y%m%d%H%M`
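One gotcha if you put that in a crontab: % is special to cron(8) and has to be escaped:

Code:
# run daily at 03:00; the dataset name is a placeholder
0 3 * * * zfs snapshot tank/home@`date +\%y\%m\%d`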
 
A great convention for naming snapshots is the date or time they were taken, as Patmaddox demonstrated in his script. You immediately see which snapshot came before and which came after. For my automatic snapshots taken by cron, I use the following command:

zfs snapshot [name of dataset]@`date +%y%m%d`

And for datasets snapshotted more than once a day:

zfs snapshot [dataset]@`date +%y%m%d%H%M`
That can make it a little difficult to read, especially if you don't have good notes on why exactly you took that snapshot in the first place.
 
That can make it a little difficult to read, especially if you don't have good notes on why exactly you took that snapshot in the first place.
For automatic snapshots, there's not a distinct "why".

But for one-off snapshots, I add a suffix - zfs snapshot [dataset]@$(date +%y%m%d%H%M)-before-i-break-my-system
 