Sending large snapshots over the Internet

The story goes like this:

We have a server with a large ZFS capacity in a data center (DC). The server itself is being backed up.

We have many clients who use ZFS storage boxes ranging from 2 TB to 4 TB. Each client sends a full first snapshot of their data on a USB disk, which is shipped to the DC. The server then receives that data.

Now, the clients need to be able to send differential snapshots to that server. Their daily differential could be anything from a few megabytes to 10 gigabytes! The bottleneck is their upload bandwidth, which ranges from 1 Mbit/s to 2 Mbit/s.
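To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbit = 10^6 bits) and ignores protocol overhead and compression, so treat the result as a ballpark only. The function name is just for illustration.

```python
def upload_hours(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal transfer time in hours, ignoring overhead and compression."""
    return size_bytes * 8 / link_bits_per_second / 3600


GB = 10**9    # decimal gigabyte (assumption: the post does not say GB vs GiB)
MBIT = 10**6  # decimal megabit

for mbit in (1, 2):
    hours = upload_hours(10 * GB, mbit * MBIT)
    print(f"10 GB at {mbit} Mbit/s: {hours:.1f} h")
# prints:
# 10 GB at 1 Mbit/s: 22.2 h
# 10 GB at 2 Mbit/s: 11.1 h
```

So a worst-case daily differential needs most of a day on the slowest link, which is why losing a half-finished transfer hurts so much.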

The problem here is not so much their upload speed but the fact that a zfs send operation over ssh can stall due to network issues, without the ability to resume. That means they have to start the process again.

Here are my thoughts, please comment!

  • Save the differential snapshot to a compressed file.
  • Use a reliable means of transferring that file to the server.
  • Restore the snapshot once the transfer has been completed. (Repeat the procedure if the restore fails.)

I was looking for a reliable means to transfer files over the Internet with the ability to resume.
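
The first and last steps in the list above could look roughly like this. It is only a sketch that builds the shell pipelines as strings. The dataset names, snapshot names and file paths are made up for illustration, and gzip is just one possible compressor.

```python
import shlex


def dump_command(dataset, old_snap, new_snap, out_file):
    """Pipeline for the client: incremental stream -> gzip -> file."""
    send = ["zfs", "send", "-i", f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]
    return f"{shlex.join(send)} | gzip -c > {shlex.quote(out_file)}"


def restore_command(in_file, target_dataset):
    """Pipeline for the server, run only after the file has fully arrived."""
    receive = ["zfs", "receive", target_dataset]
    return f"gzip -dc {shlex.quote(in_file)} | {shlex.join(receive)}"


# Hypothetical names, only to show the shape of the commands.
print(dump_command("tank/data", "monday", "tuesday", "/var/tmp/tuesday.zfs.gz"))
print(restore_command("/backup/incoming/tuesday.zfs.gz", "backup/client1/data"))
```

Note that an incremental receive only works if the newest snapshot on the server matches the one the differential was taken from, so the client should keep the previous snapshot until the restore has succeeded.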

I haven't found a way to do this with SFTP, FTPS or SCP. Please let me know if I am wrong.

So, I thought of torrent. Torrent was designed as a means to reliably transfer files peer-to-peer. I think that this could be a solution.

I would appreciate your input and comments on the above. If what I am saying sounds stupid, please don't hesitate to point it out.
If you are already doing this in a different way, I would be glad to know how!

Thanks
 
gkontos said:
So, I thought of torrent. Torrent was designed as a means to reliably transfer files peer-to-peer. I think that this could be a solution.
Do you really want to have your data lingering about on a P2P network? I don't know Greek law but I would hope the data is encrypted.

I'll second kpa's suggestion of net/rsync.
 
SirDice said:
Do you really want to have your data lingering about on a P2P network?

I'll second kpa's suggestion of net/rsync.

I was thinking of encrypted, private P2P.

rsync(1) is a solution for replicating the files. That way we would have to forget about snapshots.

The reason I would prefer to send snapshots instead is that I can also send their properties. For example, if a new dataset is created, that would be sent as well. With rsync(1) I would just get a new directory.
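
What I have in mind is something like a recursive snapshot followed by an incremental replication stream (zfs send -R), which carries properties and descendant datasets along. A rough sketch of the two commands, with a made-up dataset and snapshot names:

```python
def recursive_snapshot_argv(dataset, snap):
    """Snapshot the dataset and all of its descendants under one name."""
    return ["zfs", "snapshot", "-r", f"{dataset}@{snap}"]


def replication_send_argv(dataset, old_snap, new_snap):
    """Incremental replication stream between two recursive snapshots."""
    return ["zfs", "send", "-R", "-i", f"{dataset}@{old_snap}", f"{dataset}@{new_snap}"]


# Hypothetical names, only to show the shape of the commands.
print(" ".join(recursive_snapshot_argv("tank/data", "tuesday")))
print(" ".join(replication_send_argv("tank/data", "monday", "tuesday")))
```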
 
The --append and --append-verify options can be used for resuming aborted transfers if you also use the --partial option to keep partially transferred files on the receiving side.
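
A minimal sketch of how that could be wrapped in a retry loop, assuming rsync 3.0.0 or later on both ends (needed for --append-verify) and transport over ssh. The host name, paths, retry count and helper names are made up for illustration.

```python
import subprocess
import time


def rsync_resume_argv(local_file, remote_dest):
    """rsync invocation that keeps partial files and appends to them on retry."""
    return ["rsync", "--partial", "--append-verify", "-e", "ssh", local_file, remote_dest]


def transfer_with_retries(argv, attempts=10, wait_seconds=60, run=subprocess.call):
    """Re-run the same command until it exits 0; return True on success."""
    for attempt in range(1, attempts + 1):
        if run(argv) == 0:
            return True
        if attempt < attempts:
            time.sleep(wait_seconds)  # give the link a moment to recover
    return False


# Hypothetical file and destination; nothing is executed here.
argv = rsync_resume_argv("/var/tmp/tuesday.zfs.gz", "backup.example.org:/backup/incoming/")
print(" ".join(argv))
```

Calling transfer_with_retries(argv) would then start the upload and pick it up where it left off after each failure, instead of starting from zero.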
 
kpa said:
The --append and --append-verify options can be used for resuming aborted transfers if you also use the --partial option to keep partially transferred files on the receiving side.

Thank you!

I was just reading about this in the man pages. How did I miss it? :e
 