ZFS The nature of zfs send | zfs receive

arapaima · Oct 15, 2015

As far as I understand it, zfs send | zfs receive works the same way as snapshots: it stores only the differences from the origin dataset (in this case as a browsable dataset rather than a snapshot).
So if I completely remove the disk from the pool I'm sending from, will the pool I received into still contain the complete datasets, even if I physically move that disk to a new computer?

usdmatt · Oct 15, 2015

The send|recv commands completely duplicate the contents of a snapshot.

When you run the first send, you have a dataset full of data on the source pool and none of that data on the receiving pool. At this point you have no option other than a full send, which sends all the data and creates a second, completely independent copy.

Code:

# zfs snapshot pool1/dataset@snap1
# zfs send pool1/dataset@snap1 | zfs receive pool2/dataset

You can do what you want with those two pools; they both contain the full data. Importantly, they also both have the @snap1 snapshot, which ZFS can use as a reference point at which it can be certain both datasets are identical.

Using that reference point, you can then send just what's changed, but you still end up with both pools containing completely independent copies of the data.

Code:

# zfs snapshot pool1/dataset@snap2
# zfs send -i snap1 pool1/dataset@snap2 | zfs receive pool2/dataset
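
If it helps, both cases can be wrapped in one small helper. This is only a sketch: the function name send_snapshot and its argument order are made up for illustration, and it does nothing clever beyond stopping if the snapshot step fails.

```shell
# Hypothetical helper, not part of ZFS: snapshot SRC as NEW and send it to DST.
# With a fourth argument PREV, only the changes since PREV are sent.
# usage: send_snapshot SRC DST NEW [PREV]
send_snapshot() {
    src=$1 dst=$2 new=$3 prev=${4:-}
    zfs snapshot "$src@$new" || return 1
    if [ -n "$prev" ]; then
        # Incremental: the PREV snapshot must already exist on both
        # the source and the destination dataset.
        zfs send -i "$src@$prev" "$src@$new" | zfs receive "$dst"
    else
        # Full send: DST must not exist yet on the receiving pool.
        zfs send "$src@$new" | zfs receive "$dst"
    fi
}
```

You would call it as send_snapshot pool1/dataset pool2/dataset snap1 for the first full copy, then send_snapshot pool1/dataset pool2/dataset snap2 snap1 for the next incremental one.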

Note that if you want to continue doing incremental sends (which are brilliant for backups), you always need to keep hold of at least one snapshot that you sent over*, so that ZFS has an identical snapshot on both sides that it can use as a reference point to know what data each end already has.

*In more recent versions you can zfs bookmark the snapshot, then delete it. This frees the space used by the snapshot, but the bookmark keeps hold of that reference point and can be used for an incremental send.
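
Roughly, the bookmark variant looks like the sketch below. The helper name and the snap3 snapshot in the example call are made up for illustration. It assumes your ZFS version supports bookmarks and that the receiving dataset still has its own copy of the old snapshot, since only the sending side can swap the snapshot for a bookmark.

```shell
# Hypothetical helper, not part of ZFS.
# usage: bookmark_and_send SRC DST OLD NEW
bookmark_and_send() {
    src=$1 dst=$2 old=$3 new=$4
    # Keep the reference point as a bookmark, then free the snapshot's space.
    zfs bookmark "$src@$old" "$src#$old" || return 1
    zfs destroy "$src@$old" || return 1
    # Take the next snapshot and send only the changes since the bookmark.
    zfs snapshot "$src@$new" || return 1
    zfs send -i "$src#$old" "$src@$new" | zfs receive "$dst"
}

# Example call (not run here):
#   bookmark_and_send pool1/dataset pool2/dataset snap2 snap3
```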

arapaima · Oct 15, 2015

So what zfs send | zfs receive does is actually transfer a completely independent copy? Meaning the copy stays fully accessible even if the source is removed?

I was thinking of this as a way of temporarily storing data on a secondary disk. Something like this:

zfs snapshot myPool/usr/local@snap
zfs send myPool/usr/local@snap | zfs receive backup/storage

...and then remove the myPool disks and transfer the data back later from the backup pool.

Another option for this specific task would be just to use rsync or cp, but my impression is that using ZFS tools is more accurate and gives me the best performance.

usdmatt · Oct 15, 2015

Yes, it makes a completely independent copy. And yes, it's a lot more efficient than cp or rsync, especially for incremental copies.
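
A minimal sketch of that round trip, using the pool and dataset names from your post. The helper names stash and unstash are made up, and the zpool export/import steps are my addition for the case where the backup disk itself gets moved to another machine.

```shell
# Hypothetical helpers, names made up for this sketch.

# Copy dataset $1 into a new dataset $2 on the backup pool, via snapshot "snap".
stash() {
    zfs snapshot "$1@snap" || return 1
    zfs send "$1@snap" | zfs receive "$2"
}

# Copy it back later: $1 is the dataset on the backup pool, $2 the new target.
# The target must not exist yet, but its parent dataset must.
unstash() {
    zfs send "$1@snap" | zfs receive "$2"
}

# Example (not run here):
#   stash myPool/usr/local backup/storage
#   zpool export backup    # only if the backup disk itself is being moved
#   zpool import backup    # after attaching it again
#   unstash backup/storage myPool/usr/local
```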
