ZFS: What are the options for resuming an interrupted send -R / receive -s?

In looking for an answer, I see that it was true in past years (and in my testing still seems true) that when you're performing a "zfs send -R", using "receive -s" does not help as much as I'd hoped. In my case I was sending an entire pool from a snapshot created for this purpose, and it's tens of TB in size.

So, having already transferred half of what was going to be a two-week data transfer, what is the recommendation for transferring everything else? Has someone written a script to look at the two systems and determine which send/receive commands need to be used to finish the whole-pool "send -R"?

Thanks. I think I understand the situation well enough that I could likely do the above by hand or script it, but I want to confirm both that things are as I think they are and whether there is already advice or tooling for this.

Thank you.
 
Thanks, but that is only the basic resume. That doesn't cover how to resume a "zfs send -R" at all; it only covers resuming a single filesystem/snapshot. I apologize that I didn't make it clear in my original post that I understood that.

I understand "receive -s" and how to use the token to resume with "send -t". But if the original send was a "send -R", then the "send -t" only resumes _one_ transfer, not the remainder of them. This is what I'm trying to find a way to recover from.
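For anyone following along, here is a minimal Python sketch of finding which dataset on the receiving side holds the single resumable transfer. It assumes the destination exposes the receive_resume_token property and that "zfs get -H" prints tab-separated name/value pairs with "-" for an unset value. The function names and the sample token in the usage note are made up for illustration.

```python
import subprocess
from typing import Dict


def parse_resume_tokens(output: str) -> Dict[str, str]:
    """Parse `zfs get -H -o name,value receive_resume_token` output.

    Returns only the datasets that actually hold a token ("-" means unset).
    """
    tokens: Dict[str, str] = {}
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        name, value = parts
        if value and value != "-":
            tokens[name] = value
    return tokens


def find_resume_tokens(root: str) -> Dict[str, str]:
    """Run zfs on the receiving host; `root` is a hypothetical dataset name."""
    result = subprocess.run(
        ["zfs", "get", "-H", "-r", "-t", "filesystem,volume",
         "-o", "name,value", "receive_resume_token", root],
        check=True, capture_output=True, text=True,
    )
    return parse_resume_tokens(result.stdout)
```

Each token found this way can be fed to "zfs send -t" on the sending side, which finishes that one dataset only. For example, parse_resume_tokens("backup/tank\t-\nbackup/tank/home\t1-abc\n") keeps just the backup/tank/home entry (the token "1-abc" is a placeholder, not a real token).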
 
Gotcha. I think you’ve got the picture right: it’s a bit of a manual process to determine what has and hasn’t finished, and then to perform send/recvs as necessary to get every dataset up to the @B you’re trying to end at. If you have a sub-tree that is wholly not updated (still at @A), it can be consolidated into a single send -R execution. Note that you’ll need to adjust the recv destinations as you go through, too. Alternatively, some tactical rollbacks on the destination, so that entire sub-trees start at @A and go to @B, are worth considering to minimize the number of executions, if the additional data transfer is manageable.
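The "determine what has and hasn't finished" step could be sketched roughly like this in Python. This is not an existing tool, just an illustration under some assumptions: the two listings come from "zfs list -H -t snapshot -o name -s creation -r <root>" run on each side, snapshots are matched by name only (a careful version would compare guids), and the names tank, backup/tank, A and B are placeholders. A dataset that was interrupted mid-receive will show up as "full" or "incremental" here, so resume it with its token first.

```python
from typing import Dict, List, Optional, Tuple


def parse_snapshots(listing: str, root: str) -> Dict[str, List[str]]:
    """Map each dataset path (relative to root) to its snapshot names, in listing order."""
    result: Dict[str, List[str]] = {}
    for line in listing.splitlines():
        line = line.strip()
        if "@" not in line:
            continue
        dataset, snap = line.split("@", 1)
        if dataset == root:
            rel = ""
        elif dataset.startswith(root + "/"):
            rel = dataset[len(root) + 1:]
        else:
            continue  # not under the requested root
        result.setdefault(rel, []).append(snap)
    return result


def classify(src: Dict[str, List[str]], dst: Dict[str, List[str]],
             target: str) -> Dict[str, Tuple[str, Optional[str]]]:
    """For each source dataset, decide what is still needed to reach `target`."""
    plan: Dict[str, Tuple[str, Optional[str]]] = {}
    for rel, snaps in src.items():
        if target not in snaps:
            continue  # dataset has no such snapshot, so it was not in the stream
        have = dst.get(rel, [])
        if target in have:
            plan[rel] = ("done", None)
            continue
        # Latest snapshot older than target that both sides have (by name).
        older = snaps[:snaps.index(target)]
        common = [s for s in older if s in have]
        plan[rel] = ("incremental", common[-1]) if common else ("full", None)
    return plan
```

With that, a sub-tree whose datasets all come back as "incremental" from the same snapshot is a candidate for one consolidated send -R, and everything marked "full" or "incremental" on its own needs an individual send with the matching recv destination.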

I’ve never seen a clean “recover” script. Normally (assuming stable connections) this is a rare event, but very long transfers are certainly a pain point. At least there is -s!

If you look at tools like zrepl, you’ll see they decide to take the pain up-front and plan/execute minimal update steps (no recursion, no -I) so that if one fails, the recovery is much easier to plan. Tradeoffs.
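To illustrate the "minimal update steps" idea, here is a rough Python sketch that expands one dataset into a chain of single-snapshot sends, each resumable on its own with -s. It is not how zrepl is actually implemented; the function name and the dataset names in the usage note are hypothetical, and it only builds command strings rather than running anything.

```python
import shlex
from typing import List, Optional


def stepwise_commands(src: str, dst: str, snaps: List[str],
                      last_common: Optional[str] = None) -> List[str]:
    """Build one send/receive pipeline per snapshot (no -R, no -I).

    `snaps` is the source dataset's snapshot names, oldest first.
    `last_common` is the newest snapshot the destination already has, if any.
    """
    if last_common is None:
        pending = list(snaps)
        prev = None
    else:
        pending = snaps[snaps.index(last_common) + 1:]
        prev = last_common

    commands: List[str] = []
    for snap in pending:
        to_snap = shlex.quote(f"{src}@{snap}")
        if prev is None:
            send = f"zfs send {to_snap}"  # first step is a full send
        else:
            from_snap = shlex.quote(f"{src}@{prev}")
            send = f"zfs send -i {from_snap} {to_snap}"
        commands.append(f"{send} | zfs receive -s {shlex.quote(dst)}")
        prev = snap
    return commands
```

For example, stepwise_commands("tank/home", "backup/tank/home", ["A", "B", "C"], "A") gives two pipelines, A to B and then B to C. If one of them is interrupted, only that step needs a "send -t" resume and the remaining steps are unchanged, which is the tradeoff against a single long send -R.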
 