ZFS: Can I dd copy a ZFS disk?

I am using ZFS on thumb drives for testing right now.
Can I do a literal dd backup of the 16GB USB disk like I do with UFS2? I know it is a waste because not all 16GB is used.

A disk is a disk, right? The filesystem is not important to dd, correct? Bits are bits?
I now understand that zfs send, streams, and clones are more appropriate.
Can you still use dd if you don't mind the wasted space?
Just to be clear, I am talking about single-disk ZFS.
 
Interesting question.
I would say, sure, you should be able to dd a device holding ZFS to somewhere else.
I think the problems start when you try to USE that copy.
I'm thinking about the metadata here: whatever is on the original is, in theory, copied bit for bit to the destination.

Try it, then zpool export the original, and see what zpool import shows
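Something like this, for example, assuming the pool is called testpool and the two thumb drives show up as da0 (original) and da1 (copy); all names here are placeholders:
Code:
# export the pool first so nothing writes to the source while it is copied
zpool export testpool
# raw, bit-for-bit copy of the whole thumb drive
dd if=/dev/da0 of=/dev/da1 bs=1M
# ask ZFS what it makes of the copy
zpool import -d /dev/da1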
 
In theory, it should work.

Except that it can't work perfectly. What you really have on that thumb drive is a zpool, containing a single device (the thumb drive itself). That zpool has some metadata, which says that "this is pool XYZ, and it contains a single device called ABC". That metadata is written on the thumb drive itself. And then the information that "this physical disk is a ZFS volume called ABC" is also stored on the thumb drive.

Now imagine what happens when you plug BOTH thumb drives in at the same time. ZFS (the software, not the data structures on disk) will see two zpools, but both of them say that they are XYZ. It will see two physical devices, both of which claim that they are the device ABC. At least one of them must be lying.

And another example of a problem: You unplug the original thumb drive, and then plug the copy in. Will ZFS notice that this is a copy? It could if it wanted to (by recording the hardware identity of the physical device), but I don't think it does that. Dealing with identity of physical disks is possible (on all existing disk interfaces I know of), but tedious, and requires lots of testing and hacks for special cases.

So as mer said: Try it sometime, and then tell us what "zpool import" said.
 
It will work fine if the ZFS file system is not mounted (or mounted readonly). Getting away with dd-cloning "yourself" while you are actively running on that ZFS is more dangerous than it is with UFS.
 
I tried it with md devices.
I have two identical files.
Only one can be imported at a time (either of them).

The other one is not even available when one is imported.
Code:
[root@hp14 ~]# zpool import -d /dev/md1
   pool: bollocks
     id: 16792511843539772792
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    bollocks    ONLINE
      md1       ONLINE
[root@hp14 ~]# zpool import -d /dev/md0
   pool: bollocks
     id: 16792511843539772792
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    bollocks    ONLINE
      md0       ONLINE
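For reference, an md-backed test like that can be set up roughly as follows (file names, sizes and unit numbers are made up for illustration):
Code:
# create a backing file, attach it as md0 and build a pool on it
truncate -s 1G /tmp/zdisk0
mdconfig -a -t vnode -f /tmp/zdisk0 -u 0
zpool create bollocks /dev/md0
# export the pool, clone the backing file, and attach the clone as md1
zpool export bollocks
dd if=/tmp/zdisk0 of=/tmp/zdisk1 bs=1M
mdconfig -a -t vnode -f /tmp/zdisk1 -u 1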
 
I am using ZFS on thumb drives for testing right now.
Can I do a literal dd backup of the 16GB USB disk like I do with UFS2? I know it is a waste because not all 16GB is used.

A disk is a disk, right? The filesystem is not important to dd, correct? Bits are bits?
I now understand that zfs send, streams, and clones are more appropriate.
Can you still use dd if you don't mind the wasted space?
Just to be clear, I am talking about single-disk ZFS.

With what goal in mind? To distribute to multiple systems that will run independently? Sure. As a backup in case the “golden” one fails? Also sure. To then use both at the same time on the same machine? No. (At least, not without zpool-reguid(8) being run on one of them first.)
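If both copies really do need to live on the same machine, the general idea (a sketch with placeholder names, starting with the original pool exported or absent) would be something like:
Code:
# import the clone from its device and give it a different pool name
zpool import -d /dev/da1 mypool mypool_copy
# assign the copy a fresh pool GUID so it no longer collides with the original
zpool reguid mypool_copy
# the original can then be imported alongside it
zpool import mypool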

Only perform the actual copy when the resident pool is not live on the system.

That said, zfs send/recv with snapshots and incremental updates work really well.
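Roughly, with placeholder pool and dataset names:
Code:
# initial full copy of a snapshot to the backup pool
zfs snapshot mypool/data@monday
zfs send mypool/data@monday | zfs recv backup/data
# later, send only what changed since the previous snapshot
zfs snapshot mypool/data@tuesday
zfs send -i @monday mypool/data@tuesday | zfs recv backup/data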
 
A disk is a disk, right? The filesystem is not important to dd, correct? Bits are bits?
Absolutely, yes.
As I say: a zpool is just a collection of (rather big, usually) files. And dd will copy them literally.

There are, however, a few gotchas:
  • As long as a vdev is online, you cannot copy it, because it is constantly being written to, and what dd would collect would be inconsistent and therefore useless.
  • A single vdev from a pool is useless. You need all of them, or at least enough to bring the pool online. And all of them need to be copied while offline, together! Otherwise they will be inconsistent with each other, and therefore useless.
This being said, using dd copies for a backup is not practical, because you need to take the pool down for the whole time that dd needs to copy it.

But, if perchance you need to relocate a single vdev in a pool to a different place on disk, that works perfectly with dd:
Take the vdev offline; the pool now runs with reduced redundancy. Copy it with dd to the desired partition, make sure the old partition is no longer visible to the system and the new one gets the name of the old. Take the vdev online again.
You need to work precisely and do the math. But if executed correctly, this works flawlessly. (I do it all the time when moving my specials and caches around.)
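As a rough outline of those steps (pool, label and device names are placeholders; partition sizes and labels have to be set up to match):
Code:
# take the vdev out of service; the pool keeps running with reduced redundancy
zpool offline tank gpt/special0
# bit-copy the old partition onto the new one (target must be at least as large)
dd if=/dev/gpt/special0 of=/dev/da3p2 bs=1M
# remove or relabel the old partition so only the copy carries the label
# gpt/special0, then reactivate the vdev
zpool online tank gpt/special0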

I do not recommend such operations if you're not really comfortable with disks and partition tables and pool structures. Obviously the risk is that you do something wrong and destroy the pool. But there is also the risk of ending up with two identical vdev images visible to the system, and I don't want to know what happens then.
 
But, if perchance you need to relocate a single vdev in a pool to a different place on disk, that works perfectly with dd:
Take the vdev offline; the pool now runs with reduced redundancy. Copy it with dd to the desired partition, make sure the old partition is no longer visible to the system and the new one gets the name of the old. Take the vdev online again.
Would zpool-replace(8) achieve the same ends? (Assuming the old and new locations are both available concurrently?)
 
I tried it with md devices.
I have two identical files.
Only one can be imported at a time (either of them).

The other one is not even available when one is imported.
Code:
[root@hp14 ~]# zpool import -d /dev/md1
   pool: bollocks
     id: 16792511843539772792
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    bollocks    ONLINE
      md1       ONLINE
[root@hp14 ~]# zpool import -d /dev/md0
   pool: bollocks
     id: 16792511843539772792
  state: ONLINE
 action: The pool can be imported using its name or numeric identifier.
 config:

    bollocks    ONLINE
      md0       ONLINE
Because after dd'ing they contain the *same* pool, with the same UUID and metadata.

While dd'ing a disk or partition containing a ZFS pool (or even just a single provider) is perfectly fine and possible, it is usually much more efficient to just send|recv the data on that pool or resilver the new provider.

However, dd'ing a pool from a single provider (e.g. a thumb drive) has one big advantage if used as a backup: you can mount that image, import the pool, and let ZFS do its housekeeping, e.g. detecting errors. So even if you don't have redundancy, you at least *know* whether that backup suffers from bitrot, and any other data in that pool that isn't affected is perfectly fine. This is not possible when directly saving zfs send streams to a file as a "backup": errors in such a stream render it useless (i.e. any recv attempt will fail).
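For example (image path, unit number and pool name are illustrative), the image can be attached and checked like this:
Code:
# attach the dd image as a memory disk
mdconfig -a -t vnode -f /backup/usbpool.img -u 9
# import the pool from that device and let ZFS verify the data
zpool import -d /dev/md9 usbpool
zpool scrub usbpool
zpool status usbpool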
 
Would zpool-replace(8) achieve the same ends? (Assuming the old and new locations are both available concurrently?)
I don't think that makes sense. Replace, AFAIK, recreates the vdev from the other vdevs - so the old one is no longer required.

Or do you mean the whole dd operation is useless because replace would achieve the same thing? Maybe, but at least caches, I think, cannot be replaced, only destroyed and then recreated, so they build up again over time.
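For a cache device, the usual route of dropping the old one and adding a new one would look roughly like this (device names are placeholders):
Code:
# remove the old L2ARC device and add its replacement;
# the cache simply warms up again over time
zpool remove tank gpt/cache0
zpool add tank cache gpt/cache1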
 
I use dd of the root pool to a file as an OS backup, for quick recovery in the event of a total disaster such as root drive failure. Honestly, my backups are untested, as this scenario has never happened. I don't see a reason to dd data pools; backing up selected files is much more convenient.

I have one horror story with dd-ing ZFS. I installed the system on a new machine and configured it. There was only one HDD so far, with one ZFS pool on the entire drive. I moved the hard drive to another computer and did dd to a file. After this I was curious what would happen when I tried to import the new pool. I expected this to fail, because the ZFS version was higher and there can't be two root pools anyway. The import failed (I forget the actual error), I shut down the OS and disconnected the drive. After that, the operating system refused to boot. I discovered that the OS hard drive had a partition, but it was empty, without any slices inside! All my slices were deleted: ZFS, UFS, everything. As far as I remember, even the hard drive geometry was reset to the default.

Moral of the story? Never try to import an additional root pool, and always note down your slice configuration. I did the latter, so after the initial panic I recreated my slices by carefully editing the partition map. All the data was in place; only the information about the slices had been cleared.
 
you can zfs send to an md-based pool
Thanks. I didn't even consider that approach.
This 'data in transit' seems a little fragile. Is a memory disk more/less prone to errors?

I am sorry to flog a dead horse with this topic, but I have just enough information to be dangerous.
I am tiptoeing into single-disk ZFS. Old habits like dd are hard to break.

Only perform the actual copy when the resident pool is not live on the system.
Thanks for that. It helps to know.
 
One more question about zfs send and large file transfers.
What happens on a network failure when a zfs send transfer goes over NFS or scp?
Upon reconnection, will it pick up and resume? Or is that a network protocol question rather than a ZFS one?
 