Hi,
I'm trying to replicate my production ZFS system to a backup server at a different site.
We use ZFS snapshots, and I want to sync them daily so that the backup server keeps a longer snapshot history.
Our ZFS system is big: 200TB, so a full replication takes around 10 days (we have a 9GB connection between the two sites).
Our last full replication was interrupted by a power outage. The first snapshot and the data it refers to were received, but I can see that the second snapshot is incomplete because of the outage.
Since then, I've been trying to restart the transfer by sending the missing snapshots, but without success. Every time I try, the zfs send command starts and tries to send the first packet, then after around 15 minutes it just stops. There is no error message on either the source or the destination.
Here are the snapshots I have on the source:
Code:
root@fs01:/etc# zfs list -t snapshot
NAME                                  USED  AVAIL  REFER  MOUNTPOINT
pool01/ds01@2021-12-28_12.00.01--2w  6.18T      -   188T  -
pool01/ds01@2021-12-29_12.00.01--2w  1.63T      -   188T  -
pool01/ds01@2021-12-30_12.00.01--2w  1.72T      -   187T  -
pool01/ds01@2021-12-31_12.00.01--2w  1.73T      -   186T  -
pool01/ds01@2022-01-01_12.00.01--2w  1.76T      -   186T  -
pool01/ds01@2022-01-02_12.00.01--2w  1.67T      -   185T  -
pool01/ds01@2022-01-03_12.00.01--2w  1.37T      -   184T  -
pool01/ds01@2022-01-04_12.00.01--2w  1.46T      -   184T  -
pool01/ds01@2022-01-05_12.00.01--2w  1.85T      -   184T  -
pool01/ds01@2022-01-06_12.00.01--2w   489G      -   185T  -
Here is what I have on the destination:
Code:
root@04fs01:/tmp# zfs list -t snapshot
NAME                                        USED  AVAIL  REFER  MOUNTPOINT
pool01/repl2-ds01@2021-12-28_12.00.01--2w  6.20T      -   188T  -
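Comparing the two lists by eye, the only snapshot present on both sides is 2021-12-28_12.00.01--2w. An incremental `zfs send -i` or `-I` stream can only be received if its starting snapshot already exists on the destination, so a small helper that reports the newest common snapshot may be a useful sanity check. The following is a minimal sketch, not a tested tool: the function name `latest_common_snapshot` is hypothetical, and it assumes both lists were produced oldest-first, for example with `zfs list -H -r -t snapshot -o name -s creation <dataset>`.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): print the newest snapshot name
# (the part after '@') that appears in both lists.
#   $1 = newline-separated snapshot names on the source, oldest first
#   $2 = newline-separated snapshot names on the destination
latest_common_snapshot() {
    src_list=$1
    dst_list=$2
    common=""
    while IFS= read -r line; do
        snap=${line#*@}              # strip "pool/dataset@"
        [ -n "$snap" ] || continue   # ignore empty lines
        # Compare only the snapshot part, since the dataset names differ
        # between the two hosts (ds01 vs repl2-ds01 in this post).
        if printf '%s\n' "$dst_list" | sed 's/^[^@]*@//' | grep -qxF -- "$snap"; then
            common=$snap             # later matches overwrite earlier ones
        fi
    done <<EOF
$src_list
EOF
    printf '%s\n' "$common"
}

# Example use (not run here); the lists are gathered on each host:
#   src=$(zfs list -H -r -t snapshot -o name -s creation pool01/ds01)
#   dst=$(zfs list -H -r -t snapshot -o name -s creation pool01/repl2-ds01)
#   latest_common_snapshot "$src" "$dst"
```

With the two lists shown above as input, this prints `2021-12-28_12.00.01--2w`.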
Here is the command I run to try to resume the sync:
Code:
root@fs01:/home# zfs send -cvRI pool01/ds01@2021-12-30_12.00.01--2w pool01/ds01@2021-12-31_12.00.01--2w | nc 192.168.13.2 54321
skipping snapshot pool01/ds01@2022-01-01_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
skipping snapshot pool01/ds01@2022-01-02_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
skipping snapshot pool01/ds01@2022-01-03_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
skipping snapshot pool01/ds01@2022-01-04_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
skipping snapshot pool01/ds01@2022-01-05_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
skipping snapshot pool01/ds01@2022-01-06_12.00.01--2w because it was created after the destination snapshot (2021-12-31_12.00.01--2w)
send from @2021-12-30_12.00.01--2w to pool01/ds01@2021-12-31_12.00.01--2w estimated size is 5.57T
total estimated size is 5.57T
TIME SENT SNAPSHOT pool01/ds01@2021-12-31_12.00.01--2w
21:49:07 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:08 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:09 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:10 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:11 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:12 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:13 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:14 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:15 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:16 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:17 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:18 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:19 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:20 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:21 392K pool01/ds01@2021-12-31_12.00.01--2w
21:49:22 392K pool01/ds01@2021-12-31_12.00.01--2w
On my destination I'm running the following:
Code:
nc -l 54321 -w 60 | zfs receive pool01/repl2-ds01
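For context on resuming: when a receive is started with `zfs receive -s`, OpenZFS keeps the partial state of an interrupted stream and exposes a `receive_resume_token` property on the destination dataset, which `zfs send -t` can use to continue the stream. Whether a token exists here is an assumption, since the receive command above does not use `-s`. A minimal sketch of checking for one follows; the function name `resume_token` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): print the resume token of a
# partially received dataset, or print nothing when none was saved.
# `zfs get` reports "-" for receive_resume_token when there is no token.
resume_token() {
    token=$(zfs get -H -o value receive_resume_token "$1")
    [ -z "$token" ] || [ "$token" = "-" ] || printf '%s\n' "$token"
}

# Example use (not run here), following the nc pipeline from this post:
#   destination:  nc -l 54321 | zfs receive -s pool01/repl2-ds01
#   destination:  resume_token pool01/repl2-ds01
#   source:       zfs send -t <token> | nc 192.168.13.2 54321
```

If the function prints nothing, no resumable state was saved for that dataset, and the transfer cannot be continued this way.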
So I'm looking for a way to resume the transfer without restarting from the beginning and spending another 10 days copying. If that's the only way, I will do it, because I need this replication to work.
I am by no means an expert with ZFS, so I'm thankful for any help.