Remote backups server using FreeBSD, ZFS, and Rsync

Updates
2010-04-04:
We've rolled out version 3 of our rsbackup system. See this thread for more information on it. (Version 2 never really saw the light of day, but was used as a stepping stone to the refactored version 3.)

Intro
A co-worker and I developed a centralised backup solution using FreeBSD, ZFS, and Rsync. The following set of posts describe how we did it.

Note: this is fairly long, and includes code dumps from all the scripts and config files used.

Server Hardware
Our central backup server uses the following hardware:
  • Chenbro 5U rackmount case, with 24 hot-swappable drive bays, and a 4-way redundant PSU
  • Tyan h2000M motherboard
  • 2x dual-core Opteron 2200-series CPUs at 2.2 GHz
  • 8 GB ECC DDR2-SDRAM
  • 3Ware 9550SXU PCI-X RAID controller in a 64-bit/133 MHz PCI-X slot
  • 3Ware 9650SE PCIe RAID controller in an 8x PCIe slot
  • Intel PRO/1000MT 4-port gigabit PCI-X NIC
  • 24x 500 GB SATA harddrives
  • 2x 2 GB CompactFlash cards in CF-to-IDE adapters

OS Configuration
We're currently running the amd64 (64-bit) build of FreeBSD 7.1. We'll upgrade to 7.2 once it's released, and we're anxiously awaiting 8.0 and its ZFS v13 support.

Two of the gigabit NIC ports are combined using lagg(4) and connected to one gigabit switch. We're considering adding the other two ports to the lagg interface, but we're waiting for a new managed switch that supports LACP before we do.

The 2 CF cards are configured as gm0 using gmirror(8). / and /usr are installed on gm0.
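For reference, setting up the mirror looks something like this (the ad* device names are just examples for the CF-to-IDE adapters, not necessarily what this box uses):

Code:
# echo 'geom_mirror_load="YES"' >> /boot/loader.conf
# gmirror label -v -b round-robin gm0 /dev/ad0 /dev/ad1

The filesystems then live on the mirror provider (e.g. /dev/mirror/gm0s1a for /).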

The 3Ware RAID controllers are configured basically as glorified SATA controllers. Each drive is configured as a "SingleDrive" array and appears to the OS as a separate drive. Using SingleDrive instead of JBOD allows the RAID controller to use its onboard cache, and allows us to use the 3dm2 monitoring software. Each drive is also named after the slot/port it is connected to (disk01 through disk24).
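If you prefer the command line over the 3dm2 web interface, the SingleDrive units can also be created with the tw_cli utility, roughly like this (controller and port numbers are illustrative):

Code:
# tw_cli /c0 add type=single disk=0
# tw_cli /c0 add type=single disk=1
# tw_cli /c0 show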

The 24 harddrives are also labelled using glabel(8), according to the slot they are in, using the same names as the RAID controller uses (disk01 through disk24).
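The labelling is a one-time step per drive, along these lines (the da* device numbers are examples and depend on how the controllers enumerate the disks):

Code:
# glabel label -v disk01 /dev/da0
# glabel label -v disk02 /dev/da1

(and so on, through disk24)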

The drives are added to a ZFS pool as 3 separate 8-drive raidz2 vdevs, as follows:
Code:
# zpool create storage raidz2 label/disk01 label/disk02 label/disk03 label/disk04 label/disk05 label/disk06 label/disk07 label/disk08
# zpool add    storage raidz2 label/disk09 label/disk10 label/disk11 label/disk12 label/disk13 label/disk14 label/disk15 label/disk16
# zpool add    storage raidz2 label/disk17 label/disk18 label/disk19 label/disk20 label/disk21 label/disk22 label/disk23 label/disk24

This creates a "RAID0" stripe across the three "RAID6" arrays. The total storage pool size is just under 11 TB.

Code:
# zpool status
  pool: storage
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        storage           ONLINE       0     0     0
          raidz2          ONLINE       0     0     0
            label/disk01  ONLINE       0     0     0
            label/disk02  ONLINE       0     0     0
            label/disk03  ONLINE       0     0     0
            label/disk04  ONLINE       0     0     0
            label/disk05  ONLINE       0     0     0
            label/disk06  ONLINE       0     0     0
            label/disk07  ONLINE       0     0     0
            label/disk08  ONLINE       0     0     0
          raidz2          ONLINE       0     0     0
            label/disk09  ONLINE       0     0     0
            label/disk10  ONLINE       0     0     0
            label/disk11  ONLINE       0     0     0
            label/disk12  ONLINE       0     0     0
            label/disk13  ONLINE       0     0     0
            label/disk14  ONLINE       0     0     0
            label/disk15  ONLINE       0     0     0
            label/disk16  ONLINE       0     0     0
          raidz2          ONLINE       0     0     0
            label/disk17  ONLINE       0     0     0
            label/disk18  ONLINE       0     0     0
            label/disk19  ONLINE       0     0     0
            label/disk20  ONLINE       0     0     0
            label/disk21  ONLINE       0     0     0
            label/disk22  ONLINE       0     0     0
            label/disk23  ONLINE       0     0     0
            label/disk24  ONLINE       0     0     0
Code:
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
storage                10.9T   5.11T   5.76T    47%  ONLINE     -

We then created ZFS filesystems for basically everything except / and /usr:
  • /home
  • /tmp
  • /usr/local
  • /usr/obj
  • /usr/ports
  • /usr/ports/distfiles
  • /usr/src
  • /var
  • /storage/backup

We enabled lzjb compression on /usr/ports and /usr/src, and disabled it on /usr/ports/distfiles. And we enabled gzip-9 compression on /storage/backup. We also disabled atime updates on everything except /var.
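As a rough sketch of those settings (the dataset names below are illustrative; the real layout just needs the mountpoints to line up):

Code:
# zfs create -o mountpoint=/usr/ports storage/ports
# zfs create storage/ports/distfiles
# zfs create -o mountpoint=/usr/src storage/src
# zfs create storage/backup
# zfs set compression=lzjb storage/ports
# zfs set compression=lzjb storage/src
# zfs set compression=off storage/ports/distfiles
# zfs set compression=gzip-9 storage/backup
# zfs set atime=off storage/backup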
 
RSBackup
We developed a "simple" set of shell scripts that perform remote backups of Linux and FreeBSD systems using rsync and ZFS snapshots. The scripts run a sequential series of rsync connections for all servers at a remote site, while also doing multiple sites in parallel. It uses SSH (as user rsbackup, with a password-less RSA key) to connect to the remote server, then uses rsync to send data back through the SSH connection. Backups are stored on a ZFS filesystem (/storage/backup/), with a separate directory for each site, and separate sub-directories for each server. Before each nightly backup run, a ZFS snapshot is taken of the /storage/backup filesystem, named using the current date, in YYYY-MM-DD format.

We called our solution rsbackup.

rsbackup is configured to run every night starting at 7 pm, via root's crontab. The crontab looks like this:

Code:
SHELL=/bin/sh
MAILTO=root
PATH=/sbin:/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin

#min    hour    day     month   weekday         command
*/15    *       *       *       *               /root/scripts/check-fs.sh

# Take a snapshot of the backups filesystem
50      18      *       *       *               /root/rsb/rsb-snapshot

# Run the rsbackup script
0       19      *       *       mon-fri         /root/rsb/rsb-wrapper force
0       19      *       *       sat-sun         /root/rsb/rsb-wrapper start
50      6       *       *       mon-fri         /root/rsb/rsb-wrapper stop

The crontab above shows most of the helper scripts in use; the full set is:
  • check-fs
  • rsb-snapshot
  • rsb-wrapper
  • rsb-one

check-fs checks the status of the gmirror and the zpool to make sure there are no checksum errors, dying drives, missing drives, degraded vdevs, and so on. If there are, then an e-mail is delivered with the details of the issues.

rsb-snapshot pulls in the rsbackup config file to determine which filesystem to snapshot, then creates a snapshot using the current date as the snapshot name (YYYY-MM-DD format).

rsb-wrapper pulls in the rsbackup config file, then checks whether any other rsbackup processes are running. If there are, a warning is displayed and the wrapper exits. If there are none, the backup process is started. rsb-wrapper is also run just prior to 7 am, to check whether any rsync processes are still running, and to kill them if they are (we don't want backups running during the day, as they would hog all the upload bandwidth at the remote sites). Error and warning messages from all the log files are then sent via e-mail to the address listed in the crontab.

rsb-one can be used from the command-line to do a manual backup of a single server at a single site. It uses the same config file as the rest of the scripts. Command syntax is:
# rsb-one -s sitename -h hostname

The gist of the backup process is this:
  • every night, a ZFS snapshot is created of the /storage/backup filesystem. This becomes the historical backup for everything, as one can navigate through all the snapshots via /storage/backup/.zfs/snapshot/<snapname>/<sitename>/<server>/.
  • every night, a full rsync is done of virtually every file on the remote systems against a local directory for that server.

We are currently backing up 102 remote servers. The backups start at 7pm, the rsync for the last server starts around 2am, and everything is finished by 4am.

The size of the snapshots fluctuates daily, but the average is under 10 GB. The base storage required for those 102 servers is ~4 TB, which leaves roughly 7 TB of the pool for snapshots; at under 10 GB per day, that gives us over 500 days of daily backups, well over the 13 months we were hoping for.
 
Remote Server Config
The rsbackup system requires a bit of setup on both the central backup server and the remote server(s). The following shows how to configure a Debian Linux host for backups; a consolidated command sketch follows the numbered steps.

On the remote host

  1. install rsync (preferably 3.0.x, as it has much reduced CPU and RAM usage, and it starts sending file changes while generating the file list)
  2. create the backups group and the rsbackup user
    addgroup --system backups
    adduser rsbackup
  3. manually set the password field to * in /etc/shadow to prevent password logins; the shell can be left as /bin/sh, since there are no interactive logins
  4. add rsbackup to group backups
    adduser rsbackup backups
  5. edit sudo config to allow backups group to run rsync with no password
    visudo
    Cmnd_Alias RSYNC = /usr/bin/rsync
    %backups ALL=(ALL) NOPASSWD: RSYNC
  6. create .ssh/ directory in ~rsbackup/
    mkdir ~rsbackup/.ssh
  7. create blank authorized_keys file
    touch ~rsbackup/.ssh/authorized_keys
  8. set correct permissions on .ssh/ directory and .ssh/authorized_keys file
    chmod 700 ~rsbackup/.ssh
    chmod 600 ~rsbackup/.ssh/authorized_keys
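For convenience, here are the remote-host steps collected into a single sketch (Debian commands as above; run as root on the host being added):

Code:
#!/bin/sh
# create the group and the backup user, and put the user in the group
addgroup --system backups
adduser rsbackup
adduser rsbackup backups

# block password logins by hand: set the password field to * in /etc/shadow

# prepare the SSH key location with the right permissions
mkdir ~rsbackup/.ssh
touch ~rsbackup/.ssh/authorized_keys
chmod 700 ~rsbackup/.ssh
chmod 600 ~rsbackup/.ssh/authorized_keys

# then, via visudo, allow the backups group to run rsync without a password:
#   Cmnd_Alias RSYNC = /usr/bin/rsync
#   %backups ALL=(ALL) NOPASSWD: RSYNC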

On the central backup server
  1. copy public SSH key for rsbackup to remote server
    scp /root/rsb/conf/rsbackup.rsa.pub remoteserver:
  2. on the remote server, move the rsbackup.rsa.pub file to ~rsbackup/.ssh/authorized_keys
  3. test SSH logins using the key (must be done as root)
    ssh -l rsbackup -i /root/rsb/conf/rsbackup.rsa -p <portnum> <server>
  4. test that rsbackup can run rsync via sudo without passwords, but cannot run any other commands via sudo
    sudo /bin/ls (should fail)
    sudo rsync --version (should work)
 
Central Backup Server Config
All rsbackup-related stuff is (currently) stored under /root/rsb/ (ideally, it should be stored under /usr/local/ to follow hier(7)).

The example below shows the configuration steps used for testserver.

  1. If this is the first server added for a site, create a site directory, using the DNS name for the site, under /root/rsb/sites/
    mkdir sites/testsite
  2. Create/edit the site_defaults file
    cp /root/rsb/conf/site_defaults /root/rsb/sites/testsite/
    ee sites/testsite/site_defaults
  3. Create a config file for the server.
    ee sites/testsite/testserver
    Add/edit at least the following:
    RSYNC_SERVER=testserver.hostname
    SERV_DIR=testsite/
  4. Add any overrides for items in the global defaults (mainly SSH port to use)
  5. If there are special excludes for this server, add the following
    RSYNC_EX_SERVER=exclude.testserver
  6. Add/edit the exclude file listed above
    ee sites/testsite/exclude.testserver
  7. Connect via ssh to add the host to the known_hosts file
    ssh -l rsbackup -i /root/rsb/conf/rsbackup.rsa testserver.hostname
  8. Add the site to the global sites list
    ee conf/sites.lst
  9. Rename the server config file to end in .cfg (only server config files ending in .cfg are processed)
    mv sites/testsite/testserver sites/testsite/testserver.cfg
  10. The site and server(s) will be picked up in the next run of rsbackup via cron

Try to only add 1 or 2 new servers per day. The initial rsync run takes a long time, as it has to copy over every file in the system. Any still-running rsync processes will be killed at 7 am weekdays, so the initial sync may be spread across multiple days. Adding servers on Friday is best, as the rsync processes will run until complete or Monday at 7 am, whichever comes first.
 
Restoring From Backups
Every snapshot that is created can be navigated via the hidden .zfs/snapshot/<snapshotname>/ directory hierarchy. The .zfs directory is placed in the root of the ZFS filesystem. As you navigate through the snapshot hierarchy, ZFS automatically mounts the snapshot as a read-only filesystem. You can also manually mount the snapshot as read-only using mount -t zfs. In this way, you can restore files from either the most recent backup (the normal filesystem hierarchy) or from any previous backup (the snapshot hierarchy).

To manually mount a snapshot (as root):
# mount -t zfs -r storage/backup@2008-09-12 /mnt

You can clean up the output of mount by periodically running (as root):
# mount | grep 'backup@' | awk '{ print $3 }' | xargs -n 1 umount

Individual Files/Folders
  1. SSH to the central backups server
  2. Switch to root
  3. cd into the /storage/backup/.zfs/snapshot/ directory
  4. Do an ls to see all the available snapshot dates
  5. cd into the desired snapshot directory
  6. cd into the <site>/<server>/ directory
  7. find the file/folder you need and scp it back to the server in question, as shown in the example below
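For example, pulling a single config file out of a particular day's snapshot might look like this (date, site, server, and path are all illustrative):

Code:
# cd /storage/backup/.zfs/snapshot/2008-09-12/testsite/testserver/etc
# scp fstab root@testserver.hostname:/etc/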

Complete System Restore - Linux
In order for this to work correctly, the username you use in the rsync command will need to be part of the sudoers users/groups that can run rsync on the central backup server.
  1. Boot replacement server off a Linux LiveCD (Knoppix/Kanotix/etc).
  2. Partition the drive(s) as needed using cfdisk (see fstab in the server's etc directory on the central backup server).
  3. Format the partitions as needed (see fstab in the server's etc directory on the central backup server).
    mkfs -t ext3 /dev/sda1
    mkfs -t xfs /dev/sda5
    mkfs -t xfs /dev/sda6
    and so on
  4. Mount the partitions under /mnt.
    mount -t xfs /dev/sda5 /mnt
    mkdir /mnt/boot /mnt/usr /mnt/home /mnt/var
    mount -t ext3 /dev/sda1 /mnt/boot
    mount -t xfs /dev/sda6 /mnt/usr
    and so on
  5. cd to /mnt (not really needed, but a good safety-net, just in case).
  6. Run rsync to copy everything from the central backup server to the local server
    Note 1: --numeric-ids is *very* important, do not forget this option, or things will fail in spectacular ways!
    Note 2: -H is needed to restore hardlinks to various files. Without this, the restore will be significantly larger.
    # rsync -vaH --partial --stats --numeric-ids --rsh=ssh --rsync-path="sudo rsync" username@backupserver:/storage/backup/<site>/<server>/ /mnt/
  7. Grab a coffee as it does the transfer. Time depends on the size of the dataset being restored.
  8. Install GRUB into the boot sector of the harddrive.
    grub-install --no-floppy --recheck /dev/sda
    grub-install --no-floppy /dev/sda
  9. Reboot the server to make sure everything comes up correctly.

For the last step, where you run rsync, you can use a ZFS snapshot directory to restore the server to any day. Instead of /storage/backup/<site>/<server>/ you can use /storage/backup/.zfs/snapshot/<snapshotdate>/<site>/<server>/
 
Complete System Restore - FreeBSD
In order for this to work correctly, you will need to be part of the sudoers users/groups on the central backup server that can run rsync without requiring a password.

First, do a minimal install of FreeBSD, to make the drives bootable:
  1. Boot replacement server using the FreeBSD install CD.
  2. Select Canada as the country.
  3. Select USA ISO as the keymap.
  4. Select Standard install.
  5. Select OK on the warning message.
  6. Delete all existing partitions. Press A to create a single partition for FreeBSD. Mark it as Bootable. Press Q to save the changes.
  7. Select Standard MBR (no boot manager).
  8. Select OK on the warning message.
  9. Create the partitions needed (see the fstab under /storage/backup/<site>/<server>/etc/). Press Q to save the changes.
  10. Select Minimal install.
  11. Select FTP Passive for the installation media (or CD/DVD if using the full CD1).
  12. Select Main Site.
  13. Select the correct network device (xl0 on my test server).
  14. Select No for IPv6.
  15. Select Yes for DHCP.
  16. Enter the correct hostname.
  17. Select Yes on the warning message.
  18. Wait as it does the minimal install.
  19. Select OK on the completion message.
  20. Select No for "function as a network gateway".
  21. Select No for "configure inetd".
  22. Select No for "enable SSH login".
  23. Select No for "anonymous FTP".
  24. Select No for "NFS server".
  25. Select No for "NFS client".
  26. Select No for "customize system console".
  27. Select Yes for "set this machine's time zone".
  28. Select No for "Is this machine's CMOS clock set to UTC".
  29. Select America -- North and South for region.
  30. Select Canada for country.
  31. Select Pacific Time - west British Columbia for timezone.
  32. Select Yes for "PDT".
  33. Select No for "Linux compatibility".
  34. Select No for "mouse".
  35. Select No for "browse the package collection".
  36. Select Yes for "add any initial user accounts".
  37. Select User for "User and group management".
  38. Fill in the blanks. The exact contents don't matter, as the rsync restore will wipe this out. This is just for testing during the initial boot.
  39. Select Exit for "User and group management".
  40. Select OK on the warning message.
  41. Type root's password twice.
  42. Select No on the warning message.
  43. Press Tab key to get to "Exit Install". Press enter.
  44. Select Yes on the warning message to exit the installer and reboot the system.

Test that the new install boots correctly, and that you can login from the console.

Then follow the steps below to restore the data from the backups server.

  1. Boot replacement server off a FreeBSD LiveCD that includes rsync (Frenzy/FreeSBIE/etc). Frenzy 1.1 seems to work best.
  2. Type nohdmnt at the boot menu, to prevent the existing filesystems from being mounted automatically.
  3. Enable modifying of drives while the system is running.
    sysctl -w kern.geom.debugflags=16
  4. Create a directory to use for the mount point of the harddrive partitions.
    mkdir /root/media
  5. Mount the partitions under /root/media
    mount /dev/ad4s1a /root/media
    mkdir /root/media/usr /root/media/var /root/media/home
    mount /dev/ad4s1d /root/media/usr
    mount /dev/ad4s1e /root/media/var
    mount /dev/ad4s1f /root/media/home
    and so on
  6. Change to /root/media (not really needed, but a good safety-net, just in case).
  7. Run rsync to copy everything from the central backup server to the local server.
    Note 1: --numeric-ids is *very* important, do not forget this option, or things will fail in spectacular ways!
    Note 2: -H is needed to restore hardlinks to various files. Without this, the restore will be huge, and will fail. FreeBSD uses hardlinks a lot!
    rsync -vaH --partial --inplace --stats --numeric-ids --rsh="ssh" --rsync-path="sudo rsync" username@backupserver:/storage/backup/<site>/<server>/ /root/media/
  8. Grab a coffee as it does the transfer. Length of the restore depends on the size of the dataset being restored.
  9. Reboot the server, without any CDs in the drive, to make sure everything comes up correctly. Test that you can login from the console.

For the last step, where you run rsync, you can use a ZFS snapshot directory to restore the server to any day. Instead of /storage/backup/<site>/<server>/ you can use /storage/backup/.zfs/snapshot/<snapshotdate>/<site>/<server>/
 
The rsbackup Script
This is version two of our prototype rsbackup; it's still a little rough around the edges, and spread across too many separate files. It works well for us, but it's not as pretty as it should be. :) There are a couple of different coding styles, and some options may not be in use anymore. We're hoping to clean it up over the summer, when school is not in session (we don't want to disrupt the backups during the school year). We'd also like to amalgamate rsbackup, rsb-one, and rsb-snapshot together.

Code:
#!/bin/sh                                                                                                                                              

Defaults="rsbackup.conf"
. $Defaults             

# Functions used in this script
do_rsync()                     
{                              
        SITE_CONF="${SERVERS_DIR}/${1}"

        #find each .cfg file in the passed dir, load defaults, site_defaults, server_defaults, and run the rsync
        for I in $( find $SITE_CONF  -type f -name "*.cfg" ); do                                                  

                # Load Standard defaults
                . $Defaults            

                # Load site wide defaults
                if [ -f $SITE_CONF/site_defaults ]; then
                     . $SITE_CONF/site_defaults         
                fi                                      

                # Load server specific options
                if [ ! -z $I ]; then         
                        if [ -f $I ]; then   
                          . $I               
                        fi                   
                fi                           

                # make sure the site directory exists
                if [ ! -e $BACKUP_DIR/$SITE_DIR ]; then
                        mkdir $BACKUP_DIR/$SITE_DIR    
                fi                                     

                # just to make typing easier
                S_DIR="$BACKUP_DIR/$SITE_DIR/$SERV_DIR"

                # make sure the directory for the server itself exists
                if [ ! -e $S_DIR ]; then
                        mkdir $S_DIR
                fi

                echo ""
                echo "====>> $( date "+%b %d %Y: %H:%M" )  Starting rsync for $RSYNC_SERVER" >> $logfile
                echo ""

                # The actual rsync command
                rsync $RSYNC_OPTIONS $RSYNC_SITE_OPTIONS $RSYNC_SRV_OPTIONS \
                        --exclude-from=$RSYNC_EX_DEF $RSYNC_EXTRA_EXCLUDE \
                        --rsync-path="$RSYNC_EXEC" --rsh="$RSYNC_SSHCMD -p $RSYNC_PORT -i $RSYNC_SSH_KEY" \
                        --log-file=/var/log/rsbackup/$RSYNC_SERVER.log \
                        $RSYNC_USER@$RSYNC_SERVER:/ $S_DIR

                echo ""
                echo "====>> $( date "+%b %d %Y: %H:%M" )  Ending rsync run for $RSYNC_SERVER" >> $logfile

        done
}


# run the rsync for each directory listed in sites.conf
for site in $( cat ${CONF_DIR}/sites.lst ); do
        echo ""
        echo "****>> $( date "+%b %d %Y: %H:%M" )  Starting sequential run for servers at ${site}" >> $logfile
        do_rsync ${site} &
        sleep $SLEEPTIME
done
 
rsbackup.conf
This is the main configuration file that all the scripts use. It lists where the log files should be stored, how long to wait between sites, the default options used for the rsync command, and so on.

Code:
RS_DIR="/root/rsb"
SERVERS_DIR="$RS_DIR/sites"
CONF_DIR="$RS_DIR/conf"
logfile="/var/log/rsbackup/rsbackup.log"

# Where all the backups are stored
BACKUP_DIR="/storage/backup"

#Default options for rsync
# RSYNC_OPTIONS      are the defaults for the rsbackup system
# RSYNC_SITE_OPTIONS are the overrides that apply to all systems at one site (set in sites/<site>/site_defaults file)
# RSYNC_SRV_OPTIONS  are the overrides that apply to one specific server (set in sites/<site>/<server>.cfg file)

RSYNC_OPTIONS="--archive --stats --numeric-ids --delete-during --partial --inplace --hard-links"
RSYNC_SITE_OPTIONS="--compress --compress-level=9"
#RSYNC_SITE_OPTIONS=""
RSYNC_USER="rsbackup"
RSYNC_PORT="55556"
RSYNC_EX_DEF="$CONF_DIR/exclude.default.linux"
RSYNC_SSH_KEY="$CONF_DIR/rsbackup.rsa"
RSYNC_EXEC="sudo rsync"
RSYNC_EX_MEDIA="$CONF_DIR/exclude.pass1"
RSYNC_EX_SERVER=""
RSYNC_SSHCMD="/usr/local/bin/ssh"

SLEEPTIME=250
 
rsb-wrapper
This is the wrapper script that is run via cron.

When called with the parameter force, it will start rsbackup, no questions asked.

When called with the parameter start, it will check for other running rsbackup processes. If there are any, then it outputs a warning message and exits without starting rsbackup. If there are no running rsbackup processes, then it starts one.

When called with the parameter stop, it will unconditionally kill any running rsync and rsbackup processes. It will then tail and grep all the log files for warnings and errors, and echo them so cron can send them as an e-mail.

Code:
#!/bin/sh                               

# Set custom PATH
export PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin

# Set the default exit value
exitval=0                   

# Grab the PID of the current script
pid=$$                              

# Get info on how we were called
curdir=$( /usr/bin/dirname ${0} )

# Pull in the config file for rsbackup, which should be in the same directory we are called from
if [ -e ${curdir}/rsbackup.conf ]; then                                                         
        . ${curdir}/rsbackup.conf                                                               
        cd $RS_DIR                                                                              
else                                                                                            
        echo "Error:  unable to load the config file."                                          
        exit 1                                                                                  
fi                                                                                              

# Functions used in this script
check_logs()                   
{                              
        local word="${1}"      

        cd /var/log/rsbackup

        for log in $( ls *.log ); do
                msg_head="${log}: " 
                msg_body="$(tail -1 ${log} | grep "${word}" )"
                if [ "${msg_body}x" != "x" ]; then            
                        echo ${msg_head}                      
                        echo ${msg_body}                      
                        echo ""                               
                fi                                            
        done                                                  
}                                                             


# Main script
case "$1" in 
        [Ff][Oo][Rr][Cc][Ee])
                echo "Forcing rsbackup to start"
                ./rsbackup > /dev/null 2>&1 &
                ;;
        [Ss][Tt][Aa][Rr][Tt])
                # Check if any rsync/rsbackup processes are already running, and abort if there are
                numrunning=$( pgrep -lf rsbackup | grep rsync | wc -l | cut -c 8- )

                if [ ${numrunning} -eq 0 ]; then
                        echo "Starting rsbackup"
                        cd ${RS_DIR}
                        ./rsbackup > /dev/null 2>&1 &
                else
                        echo "Warning:  other rsbackup processes are running.  Not starting."
                        exitval=2
                fi
                ;;
        [Ss][Tt][Oo][Pp])
                # Check if there are any running rsync/rsbackup processes, and abort if there aren't
                numrunning=$( pgrep -lf rsbackup | grep rsync | grep -v rsb-wrapper | wc -l | cut -c 8- )

                if [ ${numrunning} -gt 0 ]; then
                        echo -n "Attempting to forcibly stop rsbackup ... "

                        pkill -9 -f rsbackup

                        sleep 3
                        numrunning=$( pgrep -lf rsbackup | grep rsync | grep -v rsb-wrapper | wc -l | cut -c 8- )
                        if [ ${numrunning} -gt 0 ]; then
                                echo "ERROR!"
                                echo "Unable to stop all processes."
                                exitval=1
                        else
                                echo "done."
                                echo ""
                                exitval=0
                        fi
                else
                        echo "No running rsbackup processes.  Nothing to stop."
                        exitval=0
                fi


                echo "Checking logs for warnings"
                echo "----------------------------------"
                check_logs "warning"

                echo ""

                echo "Checking logs for errors"
                echo "----------------------------------"
                check_logs "error"
                ;;
esac

exit $exitval
 
rsb-snapshot
This script just pulls in the central config file, figures out which ZFS filesystem is being used, and creates a snapshot of it. The snapshot is named after the current date, using YYYY-MM-DD as the format.

Code:
#!/bin/sh

export PATH=/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin

# Get info on how we were called
curdir=$( /usr/bin/dirname ${0} )

# Pull in the config file for rsbackup, which should be in the same directory we are called from
if [ -e ${curdir}/rsbackup.conf ]; then
        . ${curdir}/rsbackup.conf
else
        echo "Error:  unable to load the config file."
        exit 1
fi

# Get today's date, formatted as YYYY-MM-DD
today=$( date "+%Y-%m-%d" )

# Remove any leading slashes from storage directory
if [ $( echo $BACKUP_DIR | /usr/bin/cut -c 1 ) = "/" ]; then
        backupdir=$( echo $BACKUP_DIR | /usr/bin/cut -c 2- )
else
        backupdir=$BACKUP_DIR
fi

# Create a snapshot using the date in the name
/sbin/zfs snapshot ${backupdir}@${today}

if [ $? -ne 0 ]; then
        echo "Error:  unable to create the snapshot (${backupdir}@${today})."
        exit 1
fi

exit 0


rsb-one
This script can be used to do a manual backup of a single server at a single site. Mainly used for testing, but has also come in handy on a couple of occasions when the automatic backup failed.

This pulls in the central config file, but duplicates the rsync command, so one has to keep this file and the rsbackup file in sync. We're planning on amalgamating this into the main rsbackup script.

Code:
#!/bin/sh                          

which="/usr/bin/which"
basename=$( ${which} basename )
dirname=$( ${which} dirname )  
scriptname=$( ${basename} ${0} )
scriptdir=$( ${dirname} ${0} )  
scriptversion=1.0               
sshcmd="/usr/local/bin/ssh "    

# Pull in the defaults file
defaults="rsbackup.conf"   
if [ -r ${scriptdir}/${defaults} ]; then
        . ${scriptdir}/${defaults}      
else                                    
        echo "Error!  Main config file doesn't exist."
        exit 1                                        
fi                                                    

# Arguments passed to this script are:
#  -s sitename     this tells the script where to find the site settings
#  -h hostname     this tells the script which host config file to grab
if [ $# -gt 0 ]; then
        while getopts "s:h:" OPTION; do
                case "${OPTION}" in
                        "s")
                                # Check if site config file exists, and read it in
                                if [ -r ${RS_DIR}/sites/${OPTARG}/site_defaults ]; then
                                        sitedir=${RS_DIR}/sites/${OPTARG}
                                        . ${sitedir}/site_defaults
                                else
                                        echo "Error!  Site directory doesn't exist."
                                        exit 1
                                fi
                                ;;
                        "h")
                                # Check if host config file exists, and read it in
                                if [ -r ${sitedir}/${OPTARG}.cfg ]; then
                                        hostconf=${sitedir}/${OPTARG}.cfg
                                        . ${hostconf}
                                else
                                        echo "Error!  Host conf file doesn't exist."
                                        exit 1
                                fi
                                ;;
                        *)
                                echo "Usage: ${0} -s sitename -h hostname"
                                ;;
                esac
        done
else
        echo "No arguments given.  Nothing to do."
        echo ""
        echo "Usage: ${0} -s sitename -h hostname"
        exit 1
fi

# Check whether there's a server-specific exclude file needed
if [ -z $RSYNC_EX_SERVER ]; then
        RSYNC_EXTRA_EXCLUDE=""
else
        RSYNC_EXTRA_EXCLUDE="--exclude-from=${sitedir}/$RSYNC_EX_SERVER"
fi

# Make sure that the backup directory exists
if [ ! -e $BACKUP_DIR/$SITE_DIR/$SERV_DIR ]; then
        mkdir $BACKUP_DIR/$SITE_DIR/$SERV_DIR
fi

# Do the rsync
rsync $RSYNC_OPTIONS $RSYNC_SITE_OPTIONS $RSYNC_SRV_OPTIONS \
        --exclude-from=$RSYNC_EX_DEF $RSYNC_EXTRA_EXCLUDE \
        --rsync-path="$RSYNC_EXEC" --rsh="$sshcmd -p $RSYNC_PORT -i $RSYNC_SSH_KEY" \
        --log-file=/var/log/rsbackup/$RSYNC_SERVER.log \
        $RSYNC_USER@$RSYNC_SERVER:/ $BACKUP_DIR/$SITE_DIR/$SERV_DIR
 
Example site_default file
This is the config file that lists defaults for all servers at a specific site, as well as the main directory to use for the backups for all the servers at that site.

Code:
#Site wide options
#required
SITE_DIR=site

Example server config file
This is the config file that each remote server would have. It lists any server-specific exclude files to use, the hostname of the server, and the name of the directory to store the backup under (usually named after the server).

Code:
# adding an additional exclude file
RSYNC_EX_SERVER=exclude.server

# These 2 are required, and specific to each server
RSYNC_SERVER=server.hostname
SERV_DIR=server

Default exclude file for Linux servers
This is an example of the default exclude file used for all Linux servers.

Code:
/sys/*
/proc/*
*mozilla/firefox/*/Cache/**
/var/lib/vservers/vs1/home/*
*/.googleearth/Cache/**
*/.googleearth/Cache/temp/**
/var/spool/squid/**
/backup/*
/var/spool/cups/**
/var/log/**.gz
*/cache/apt/archives/**
/var/lib/vservers/vs1/var/tmp/**
/home/programs/tmp/**
/home/programs/vmware/**
/home/**/.thumbnails/**
/home/**/.java*/deployment/cache/**
/home/**/profile/**
/home/**/.local/Trash/**
/home/**/.macromedia/**
 
check-fs
This script monitors the health of the gmirror and the zpool. It runs via cron. If any anomalies are detected, an e-mail is sent with all the details. It's based on the zpool check script run via periodic(8).

Code:
#!/bin/sh

send=0

# Check zpool status
status=$( zpool status -x )

if [ "${status}" != "all pools are healthy" ]; then
        zpoolmsg="Problems with ZFS: ${status}"
        send=1
fi

# Check gmirror status
status=$(gmirror status)

if echo "${status}" | grep -q DEGRADED; then
        gmirrormsg="Problems with gmirror: ${status}"
        send=1
fi

# Send status e-mail if needed
if [ "${send}" -eq 1 ]; then
        echo "${zpoolmsg} ${gmirrormsg}" | mail -s "Filesystem Issues on backup server" someone@somewhere.com
fi

exit 0
 
Filesystem Layout
And finally, here's the directory structure used, to show where the different files go, where the backups go, etc.

/root/rsb/
conf/
rsb-one
rsb-snapshot
rsb-wrapper
rsbackup
rsbackup.conf
sites/

/root/rsb/conf/
exclude.default.bsd
exclude.default.linux
rsbackup.rsa
rsbackup.rsa.pub
server.rs.example
site_default.example
sites.lst

/root/rsb/sites/
site1/
site2/
site3/
site4/

/root/rsb/sites/site1/
exclude.host1
exclude.host2
exclude.host3
host1.cfg
host2.cfg
host3.cfg
host4.cfg
site_defaults
 
Thank you for the great howto (*very* informative btw :)). I was wondering though: how stable do you find ZFS on FreeBSD? Did you have any issues? How about performance under heavy load?
 
The original server setup, using a single 24-drive raidz2 vdev in the storage pool, was not very good. We learnt the hard way that the IOps performance of a raidz vdev is equivalent to that of a single drive. IOW, a 24-drive raidz2 is no faster than a single SATA drive!!

Plus, when you have to replace a drive in the vdev, as we had to, it will thrash all the drives in the raidz vdev ... and thrashing 24 drives 24 hours a day *really* slows things down, usually leading to restarts of the resilver process. After a week of that, we rebuilt the box with three 8-drive raidz2 vdevs. Performance went through the roof after that.

It turns out the official recommendation from Sun is to use <= 10 drives per raidz vdev, preferably 6-8.

The original setup would complete ~60 server backups between 7pm and 7am. We really fiddled with the sleep times between starting the parallel rsyncs, and with the ordering of the sites, but we couldn't really get it much better than ~60 servers in one run.

Moving to the 3x raidz setup, we can complete 102 server backups within 5 hours, leaving plenty of time for extra servers.

We did have to do some manual tuning of various sysctls and loader tunables. And we switched to using OpenSSH from the ports tree, with the HPN patches (which took us from ~30 Mbit/s max network throughput to over 90 Mbit/s per SSH connection).
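The exact tuning depends on how much RAM is in the box; for FreeBSD 7.x with ZFS it mostly happens in /boot/loader.conf, with entries along these lines (illustrative values only, not necessarily what we use):

Code:
vm.kmem_size="1536M"
vm.kmem_size_max="1536M"
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"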

We monitor the server using SNMP, MRTG, and Routers2. Even though we can only poll the 32-bit disk counters every 60 seconds, we average 80 MBytes/sec disk I/O during the backup run, with the odd peak at 120 MBytes/sec. The system is still very responsive to SSH connections, log tailing, and other interactive duties.

We also push the contents of the /storage/backup directory out to a second, identical system at an off-site location, using a slightly modified rsync script (basically just a for loop through the directories under /storage/backup, with a separate rsync per sub-directory). That takes a little under 4 hours.
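A sketch of what that push loop looks like (the off-site hostname is illustrative, and this isn't the exact script):

Code:
#!/bin/sh
# push each site's backup directory to the off-site box, one rsync at a time
OFFSITE="offsite-backup.example.com"

for site in /storage/backup/*; do
        rsync --archive --hard-links --delete --numeric-ids \
                --rsh="ssh" \
                "${site}" "${OFFSITE}:/storage/backup/"
done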

The kicker to all this: ~$10,000 CDN for each storage server!! And we're working on a method to automate backups for the few Windows stations we still have (also using ssh and rsync).

Another school district in the province spent over $250,000 CDN for their backup setup, with less storage space, a lot more administrative overhead, and more physical servers. Without off-site redundancy. :) Sometimes, I really like working with FreeBSD and Linux systems!!
 
Nice. I've also built a server, but without ZFS. I run rsnapshot and shell scripts to back up 3 MySQL servers and 5 webservers. I'm using 4x 1 TB hard disks in RAID 10. We make a full backup to tape.

Your setup is awesome. Were you able to run any disk I/O tests? If so, could you paste your results?

TIA.
 
I did, way back when we first started, but didn't keep them (had nothing to compare them to). Just simple dd runs, so nothing really useful.

Any suggestions on disk benchmarks to run?
 
I ran some iozone benchmarks on one of the servers. I created a new ZFS filesystem with all the default settings (atime on, compression off).

The iozone commands used:
# iozone -M -e -+u -T -t <threads> -r 128k -s 40960 -i 0 -i 1 -i 2 -i 8 -+p 70 -C
I ran the command using 32, 64, 128, and 256 for <threads>

Write speeds range from 236 MBytes/sec to 582 MBytes/sec for sequential; and from 242 MBytes/sec to 550 MBytes/sec for random.

Read speeds range from 3.3 GBytes/sec to 5.5 GBytes/sec for sequential; and from 1.8 GBytes/sec to 5.5 GBytes/sec for random.

All the gory details are below.

Code:
32-threads:  Children see ...  32 initial writers =  582468.13 KB/sec
32-threads:  Parent sees  ...  32 initial writers =  108808.46 KB/sec
64-threads:  Children see ...  64 initial writers =  236144.47 KB/sec
64-threads:  Parent sees  ...  64 initial writers =   86942.94 KB/sec
128-threads: Children see ... 128 initial writers =  284706.68 KB/sec
128-threads: Parent sees  ... 128 initial writers =   10850.40 KB/sec
256-threads: Children see ... 256 initial writers =  258260.59 KB/sec
256-threads: Parent sees  ... 256 initial writers =    9882.16 KB/sec

32-threads:  Children see ...  32 rewriters =  545347.52 KB/sec
32-threads:  Parent sees  ...  32 rewriters =  339308.08 KB/sec
64-threads:  Children see ...  64 rewriters =  419838.51 KB/sec
64-threads:  Parent sees  ...  64 rewriters =  335620.45 KB/sec
128-threads: Children see ... 128 rewriters =  350668.51 KB/sec
128-threads: Parent sees  ... 128 rewriters =  319452.97 KB/sec
256-threads: Children see ... 256 rewriters =  317751.52 KB/sec
256-threads: Parent sees  ... 256 rewriters =  295579.66 KB/sec

32-threads:  Children see ...  32 random writers =  379256.37 KB/sec
32-threads:  Parent sees  ...  32 random writers =   95298.44 KB/sec
64-threads:  Children see ...  64 random writers =  551767.68 KB/sec
64-threads:  Parent sees  ...  64 random writers =  113397.95 KB/sec
128-threads: Children see ... 128 random writers =  241980.60 KB/sec
128-threads: Parent sees  ... 128 random writers =   74584.01 KB/sec
256-threads: Children see ... 256 random writers =  398427.84 KB/sec
256-threads: Parent sees  ... 256 random writers =   20219.56 KB/sec

32-threads:  Children see ...  32 readers = 5023742.86 KB/sec
32-threads:  Parent sees  ...  32 readers = 4661309.72 KB/sec
64-threads:  Children see ...  64 readers = 5516460.71 KB/sec
64-threads:  Parent sees  ...  64 readers = 3949337.61 KB/sec
128-threads: Children see ... 128 readers = 4748635.74 KB/sec
128-threads: Parent sees  ... 128 readers = 3208982.03 KB/sec
256-threads: Children see ... 256 readers = 4358453.38 KB/sec
256-threads: Parent sees  ... 256 readers = 2741593.08 KB/sec

32-threads:  Children see ...  32 re-readers = 5502926.62 KB/sec
32-threads:  Parent sees  ...  32 re-readers = 4650327.75 KB/sec
64-threads:  Children see ...  64 re-readers = 5509400.02 KB/sec
64-threads:  Parent sees  ...  64 re-readers = 4526444.40 KB/sec
128-threads: Children see ... 128 re-readers = 4072363.55 KB/sec
128-threads: Parent sees  ... 128 re-readers = 2840317.47 KB/sec
256-threads: Children see ... 256 re-readers = 3329375.95 KB/sec
256-threads: Parent sees  ... 256 re-readers = 2183894.33 KB/sec

32-threads:  Children see ...  32 random readers = 5555090.45 KB/sec
32-threads:  Parent sees  ...  32 random readers = 4602383.62 KB/sec
64-threads:  Children see ...  64 random readers = 4402270.77 KB/sec
64-threads:  Parent sees  ...  64 random readers = 2059081.52 KB/sec
128-threads: Children see ... 128 random readers = 3070466.93 KB/sec
128-threads: Parent sees  ... 128 random readers =  525076.11 KB/sec
256-threads: Children see ... 256 random readers = 1888676.12 KB/sec
256-threads: Parent sees  ... 256 random readers =  293304.53 KB/sec

32-threads:  Children see ...  32 mixed workload = 3130000.18 KB/sec
32-threads:  Parent sees  ...  32 mixed workload =  123281.78 KB/sec
64-threads:  Children see ...  64 mixed workload = 1587053.33 KB/sec
64-threads:  Parent sees  ...  64 mixed workload =  294586.82 KB/sec
128-threads: Children see ... 128 mixed workload =  807349.95 KB/sec
128-threads: Parent sees  ... 128 mixed workload =   98998.77 KB/sec
256-threads: Children see ... 256 mixed workload =  393469.55 KB/sec
256-threads: Parent sees  ... 256 mixed workload =  112394.90 KB/sec
 
Great post, useless benchmarks ;)

Hi

Fantastic posts, I wish I had found this thread earlier. I created a similar setup, though using higher-capacity drives.

One note, however: the iozone benchmarks are useless here, especially the read speeds.
All they show is that the data is coming out of RAM or the CPU cache...

A more valid test would be:
Code:
iozone -R -a -i 0 -i 1 -i 2 -g <size> -f <testfile> -b <excelfile>

size needs to be at least twice the amount of RAM, rounded up to a power of 2 (for more accuracy). E.g. with 6 GB of RAM, use size = 16g.

testfile is the path to the file on the disk to test...

There's physically no way a RAID array of 24 disks, each with a physical limit of around 100 MB/s, could achieve over 5 GB/s reading; heck, that's more than a single PCI-e lane can carry!
The 3Ware 9650 is a PCI-e 1.0 x8 card; a PCI-e 1.0 x8 slot can carry 2 GB/s maximum...

I tested a raidz setup with 6x 2 TB Western Digital RE4 drives and achieved 280 MB/s write and 320 MB/s read. Which is OK (faster than what the dual NIC could output), but not exceptionally great.

A Linux box with an E8200 (2.6 GHz) dual-core CPU, 2 GB of RAM, and 5x 1.5 TB consumer-level drives achieved 270 MB/s write but 455 MB/s read with md...
 
Nice Setup & Howto :)

I have a question about your settings on the 3Ware controller.

Do you have the write cache enabled in the 3dm2 web interface?

I got very poor throughput without the write cache enabled on 7.2-RELEASE amd64,
on different machines with different 3ware controllers. And it looks like I'm not the only person who hit that bug(?):

twa, 3ware performance issue

thx & best regards
 
Yes, we have the write cache enabled on the controller, and use the performance profile for each of the disks. This gives us a nice, fast, "2nd level" cache (disk cache -> controller cache -> ZFS ARC), and it allows the controller to re-order writes to the drives, as needed.

Makes it a bit more intelligent than a plain JBOD setup, where the controller would be just a dumb SATA controller.
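For anyone who'd rather flip those settings from the shell than from 3dm2, tw_cli can set them per unit, something like this (controller/unit numbers are examples):

Code:
# tw_cli /c0/u0 set cache=on
# tw_cli /c0/u0 set storsave=perform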
 