Happy new year to all!
I am having an issue with a ZFS send/receive backup script that keeps stalling overnight, and I don't know how to troubleshoot it.
The same script runs on many servers without any problems. In this case we have two servers with ZFS on root in the US and an offsite server in Germany that receives the differential snapshots of both US servers overnight via a cron job.
The link runs over an IPsec VPN, and I have to admit that I am very disappointed with the throughput on the receiving side in Germany: 4.5 Mbit/s at most. The latency is not that bad though, about 105 ms on average.
The first server completes all its transfers successfully, but the second does not.
Code:
#!/bin/sh
pool="zroot"
destination="zxf2"
host="10.49.0.10"
# NOTE: $type is not set anywhere in this excerpt; if it is empty, the
# snapshot names simply start with "-".
today=$(date +"$type-%Y-%m-%d")
yesterday=$(date -v -1d +"$type-%Y-%m-%d")   # -v is the BSD date(1) syntax
# create today's snapshot
snapshot_today="$pool@$today"
# look for a snapshot with this exact name (-x: whole line, -F: fixed string)
if zfs list -H -o name -t snapshot | grep -qxF "$snapshot_today"; then
    echo " snapshot, $snapshot_today, already exists"
    exit 1
else
    echo " taking today's snapshot, $snapshot_today"
    zfs snapshot -r "$snapshot_today"
fi
# look for yesterday's snapshot
snapshot_yesterday="$pool@$yesterday"
if zfs list -H -o name -t snapshot | grep -qxF "$snapshot_yesterday"; then
    echo " yesterday's snapshot, $snapshot_yesterday, exists, let's proceed with the backup"
    # The pipeline's exit status is that of ssh, i.e. of the remote zfs receive.
    # Only destroy yesterday's snapshot if the transfer succeeded; otherwise the
    # next incremental send would have no common base snapshot.
    if zfs send -R -i "$snapshot_yesterday" "$snapshot_today" | ssh "root@$host" zfs receive -Fdv "$destination"; then
        echo " backup complete, destroying yesterday's snapshot"
        zfs destroy -r "$snapshot_yesterday"
        exit 0
    else
        echo " backup failed, keeping $snapshot_yesterday"
        exit 1
    fi
else
    echo " missing yesterday's snapshot, aborting, $snapshot_yesterday"
    exit 1
fi
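Since I have no evidence of what happens during the overnight runs, one thing I could do is record when the transfer starts, when it ends and with what exit status. The following is only a sketch of that idea: `run_logged` and the log path in the usage note are hypothetical names of my own and are not part of the script above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original script): run a command,
# appending a timestamped start line, the command's output, and a
# timestamped end line with the exit status to a log file.
run_logged() {
    log_file="$1"
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') start: $*" >> "$log_file"
    "$@" >> "$log_file" 2>&1
    status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') end: exit=$status" >> "$log_file"
    # hand the command's exit status back to the caller
    return "$status"
}
```

In the backup script the send pipeline could then be wrapped as `run_logged /var/log/zfs-backup.log sh -c 'zfs send ... | ssh ... zfs receive ...'` (the log path is an assumption). A start line without a matching end line would at least show that the transfer hung rather than failed.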
For some reason the second server does not complete, and I have to perform the operation manually. When I run it by hand afterwards, it looks as if most of the data has already been transferred, and it finishes within 2-3 hours. We are talking about 77 GB of data in total, and the differential is usually no more than 1 GB.
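As a rough sanity check on those numbers, here is a back-of-the-envelope calculation. It is only a sketch under my own assumptions: the link sustains the full 4.5 Mbit/s, ssh and IPsec overhead are ignored, and 1 GB is counted as 1000 MB. The function names are hypothetical.

```shell
#!/bin/sh
# Seconds needed to move size_mb megabytes over a link of rate_kbit kbit/s.
# 1 MB = 8 * 1000 kbit; integer arithmetic, so the result is rounded down.
transfer_seconds() {
    size_mb="$1"
    rate_kbit="$2"
    echo $(( size_mb * 8 * 1000 / rate_kbit ))
}

# Megabytes that fit through a link of rate_kbit kbit/s in the given seconds.
mb_transferred() {
    seconds="$1"
    rate_kbit="$2"
    echo $(( seconds * rate_kbit / 8 / 1000 ))
}

transfer_seconds 1000 4500    # 1 GB differential   -> 1777 (about 30 minutes)
mb_transferred 7200 4500      # 2 hours at full rate -> 4050
mb_transferred 10800 4500     # 3 hours at full rate -> 6075
```

Under these assumptions a 1 GB differential should need about half an hour, whereas a manual run of 2-3 hours corresponds to roughly 4-6 GB at full link speed. So either the link is not delivering 4.5 Mbit/s during the transfer, or more data is being sent than I expect.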
If you need more information please let me know. I would appreciate any help at this point.