I recently upgraded from 12.1 to 13.1 and my backup process (using zfs-auto-snapshot + zxfer) has started failing with out of space errors.
Code:
Aug 8 06:26:49 salus root[43373]: Sending zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-01h00 to backup/venus/zroot/iocage/jails/minerva/root.
Aug 8 06:26:49 salus root[43373]: cannot receive new filesystem stream: out of space
Aug 8 06:26:49 salus root[43373]: Error when zfs send/receiving.
zxfer sends snapshots from the file server (venus/zroot) to the backup server (salus/backup). Other, similar snapshots transfer without error during the same backup run, and there should be plenty of room in the destination pool. The pools look like this after the failure:
Code:
ccammack@venus:~ $ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
zroot 3.62T 1.72T 1.90T - - 0% 47% 1.00x ONLINE -
Code:
ccammack@salus:~ $ zpool list
NAME SIZE ALLOC FREE CKPOINT EXPANDSZ FRAG CAP DEDUP HEALTH ALTROOT
backup 3.62T 1.71T 1.91T - - 0% 47% 1.00x ONLINE -
zroot 117G 18.6G 98.4G - - 22% 15% 1.00x ONLINE -
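One thing worth noting: zpool list reports raw pool capacity, but zfs receive fails against the destination *dataset's* available space, which quotas and reservations can shrink well below the pool's FREE column. A diagnostic sketch (dataset paths assumed from the log above), to be run on salus:

```shell
# zpool list shows raw pool space; a receive is limited by the
# destination dataset's "available" property, which quota/reservation
# settings anywhere up the tree can reduce.
zfs list -o space backup/venus/zroot/iocage/jails/minerva/root

# Look for any quota or reservation set on the destination hierarchy.
zfs get -r quota,refquota,reservation,refreservation backup/venus
```

If every value comes back "none", dataset-level limits can be ruled out.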
As I'm writing this, the root@zfs-auto-snap_hourly-2022-08-07-01h00 snapshot has already been deleted, so I don't know what it looked like, but the data doesn't change much, so the hourly snaps look mostly the same.
Code:
ccammack@venus:~ $ zfs list -t snapshot | grep -E 'minerva.+hourly'
zroot/iocage/jails/minerva@zfs-auto-snap_hourly-2022-08-08-08h00 0B - 108K -
zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-09h00 340K - 1.71T -
zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-10h00 340K - 1.71T -
zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-11h00 340K - 1.71T -
Missing snapshots produce the error "dataset does not exist" during the transfer, so I don't think missing snapshots are mistakenly triggering the new out of space errors.
Code:
Aug 7 21:45:17 salus root[74987]: Sending zroot/var/audit@zfs-auto-snap_frequent-2022-08-07-21h30 to backup/salus/zroot/var/audit.
Aug 7 21:45:17 salus root[74987]: (incremental to zroot/var/audit@zfs-auto-snap_hourly-2022-08-07-21h00.)
Aug 7 21:45:17 salus root[74987]: cannot open 'zroot/var/audit@zfs-auto-snap_frequent-2022-08-07-21h30': dataset does not exist
Aug 7 21:45:17 salus root[74987]: cannot receive: failed to read from stream
Aug 7 21:45:17 salus root[74987]: Error when zfs send/receiving.
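A related observation about the first log: "cannot receive new filesystem stream" means the failing transfer was a full, non-incremental send. If the destination has lost the snapshot it shares with the source, a full 1.71T stream would genuinely not fit in 1.91T of free space alongside the existing backup data. A hedged check (path assumed), run on salus:

```shell
# Confirm the destination still holds a recent snapshot in common
# with the source; without a shared base, zxfer must fall back to a
# full stream rather than an incremental one.
zfs list -t snapshot -o name \
    backup/venus/zroot/iocage/jails/minerva/root | tail -n 5
```

Comparing that list against the source's snapshots would show whether an incremental base still exists on both sides.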
After upgrading, I did have to manually change the zxfer script to ignore some of the new ZFS properties to get it to run. Both the source and destination pools are GELI-encrypted rather than native ZFS-encrypted.
Code:
$ diff /usr/local/sbin/zxfer.old /usr/local/sbin/zxfer
181c181
< userrefs"
---
> userrefs,objsetid,keylocation,keyformat,pbkdf2iters,special_small_blocks"
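Since zxfer replicates dataset properties, one property is worth singling out: if the source dataset carries a refreservation, the receive must pre-allocate that much space on the destination and can fail with "out of space" even though the pool shows plenty free. A diagnostic sketch, with paths assumed from the log:

```shell
# On venus: does the source dataset carry a reservation that zxfer's
# property copying would replicate to the backup?
zfs get refreservation,reservation zroot/iocage/jails/minerva/root

# On salus: how much space can the destination actually provide?
zfs get available backup/venus/zroot/iocage/jails/minerva
```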
Is there anything that might have changed between 12.1 and 13.1 that could cause a conflict between zfs-auto-snapshot and zxfer that makes zfs send | zfs receive think the destination is out of space? Maybe I misconfigured something after upgrading? The obvious answer is that it's actually out of space, but I just don't see how.
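One way to test the "actually out of space" theory directly: a dry-run send prints the stream size without transferring anything, so it can be compared against the destination's available space. A sketch using one of the surviving hourly snapshots from above:

```shell
# -n: dry run (no data sent); -v/-P: print a parsable size estimate.
# Full stream for one snapshot:
zfs send -nvP zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-09h00

# Incremental stream between two hourlies, for comparison:
zfs send -nvP -i @zfs-auto-snap_hourly-2022-08-07-09h00 \
    zroot/iocage/jails/minerva/root@zfs-auto-snap_hourly-2022-08-07-10h00
```

If the full-stream estimate approaches the 1.71T dataset size, the failure is simply a full send that can't fit next to the existing backup data.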