Remote dd

Hi everyone!

I manage multiple servers and I have to back them up.

I'd like to create an image of my hard drive to ensure fast recovery, and I'd like to back up this image over the network.

I know the old-school dd if= of=.

I was wondering if such a tool could export the content of my hard drive (read: with partitions, MBR,...) to an image on a remote server.

Maybe by mounting an NFS share and making my image on it?

And to restore, I could simply boot a Linux live CD and dd if= of= from the share onto my disk.
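
Something like this is what I have in mind (just a sketch; device names, hostnames and paths are made up, and ad0 would be sda or similar on Linux):

# image the whole disk onto an NFS share mounted from the backup server
mount -t nfs backupserver:/exports/images /mnt/backup
dd if=/dev/ad0 of=/mnt/backup/server1.img bs=1m conv=sync,noerror

# or stream it over ssh instead of NFS (GNU dd spells the block size 1M)
dd if=/dev/ad0 bs=1m | gzip | ssh backupserver "cat > /backups/server1.img.gz"

# and to restore, from the live CD
ssh backupserver "cat /backups/server1.img.gz" | gunzip | dd of=/dev/ad0 bs=1m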

I'd run this (the backup) once a week in addition to the rsync data backup.

Is there another way to create an image of my disk? Maybe with incremental support (I seriously doubt it)?

I don't know if exporting an image over the network on the fly is a good idea...

Thanks for your advice!

François
 
You don't know it yet, but you really want ZFS.

Other than that, fdisk can write boot sectors to a file, which the FreeBSD live CD can write back to disk; then use plain old dump/restore as usual.
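
For example, something along these lines (a rough sketch; device and host names are made up, double-check the flags against the man pages):

# save the slice table in a format fdisk can read back with -f
fdisk -p ad0 > /backups/ad0.fdisk

# level-0 dump of a live UFS filesystem (-L snapshots it first), streamed over ssh
dump -0Lauf - /usr | ssh backupserver "cat > /backups/server1-usr.dump"

# restore from the live CD, after newfs'ing and mounting the target at /mnt/usr
cd /mnt/usr
ssh backupserver "cat /backups/server1-usr.dump" | restore -rf -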
 
Yeah, I've read good things about ZFS. But it doesn't help with doing remote backups, does it?

So dd doesn't back up boot sectors? I thought it did. Thanks for the warning, that could have been dangerous :S
 
like really, wikipedia, get over it

dd "does" everything, including boot sectors. The problem being that it's pretty indiscriminate.

I would advise dump/restore, or tar (if you're like me and only care about certain bits).

That said:
http://en.wikipedia.org/wiki/Dd_(Unix)
is (as of 15 Apr 2009; it'll probably be deleted someday for violating NPOV, screw you, wikipedia) a pretty decent rundown of things you can do with dd, and it even includes an example of your remote backup setup.
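
If you go the tar route, it pipes over ssh just as easily (paths and hostname here are only examples):

# archive just the bits you care about and ship them to the backup host
tar -czf - /etc /usr/local/etc /home | ssh backupserver "cat > /backups/server1.tar.gz"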
 
franz said:
Yeah i've read good things about ZFS. But it doesn't help in doing remote backup does it?

:D Oh, but it most certainly does. FreeBSD using ZFS and rsync makes for a most excellent remote backup server. We do this, with automated remote backups for ~90 servers. The entire backup run takes ~5 hours, mostly due to the crappy upload speeds of the remote ADSL links. Then we push those changes over to another backup server at a remote site, and use ZFS snapshots to keep a daily record of what changed. We're averaging 10 GB of changed data per day. We figure we'll be good for at least a year of daily snapshots on our 10 TB storage servers. :)

If you (or anyone else) are interested, I can post our methods. The gist of it, though, is to have the central server connect to the remote servers using rsync-over-ssh, save the data into a separate directory for each server, and then snapshot the filesystem that houses those directories.
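
In rough terms, each server's run boils down to something like this (hostnames, paths, and the dataset name are made up for the example):

# pull / from the remote server into its own directory on the backup box
rsync -aH --delete --numeric-ids -e ssh \
    root@server1.site1:/ /storage/backups/site1/server1/

# once every server has been synced, snapshot the dataset that holds them all
zfs snapshot storage/backups@$(date +%Y-%m-%d)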

System restores then become (rough sketch of the commands below):
* boot from LiveCD
* partition and format the disk(s)
* mount the filesystems into /mnt
* rsync the data from the central backup server to /mnt
* make disk bootable
* reboot to test things work
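
In rough commands (device names and hostnames are just examples, and the details vary per server):

# from the live CD: partition and format (interactive fdisk, ext3 as an example)
fdisk /dev/sda
mkfs.ext3 /dev/sda1
mkswap /dev/sda2
mount /dev/sda1 /mnt

# pull everything back from the backup server
rsync -aH --numeric-ids -e ssh \
    backupserver:/storage/backups/site1/server1/ /mnt/

# make the disk bootable again, then reboot
mount --bind /dev /mnt/dev
chroot /mnt grub-install /dev/sda
reboot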

We can restore a firewall image in under 20 minutes, and a full school server in about an hour (gigabit network for restores).

For ~$10,000 CDN, we have a custom, homebuilt backup solution that beats the one another district in the province spent over $200,000 CDN on. :D Using all off-the-shelf, replaceable parts.
 
Yes, I'm interested, a lot!

Looks like your solution is efficient.

If I understand correctly, your solution is file-based?

You're rsync'ing files from "/" to a remote server?

Does it use ZFS features such as snapshots?

I'll be very interested in reading your solution, thanks ;-)
 
Okay, I've received the go-ahead from the powers-that-be. I'll sanitise the scripts, and post them up here shortly.

Looks like your solution is efficient.

Mostly efficient. It only stores the changed files going forward, but it does store duplicates of files that are common across servers (all our servers are Debian Linux, so there are multiple copies of everything under /bin, /sbin, /usr/bin, and so on). The initial backup run can take multiple days, as it's limited by the upload speed of the remote sites, but the daily backup after that takes only a few minutes.

There's no de-duplication of data on the backup server. In theory, we could probably reduce the amount of storage space used by a good chunk, but in practice it hasn't been a hindrance.

If I understand correctly, your solution is file-based?

Correct. We have a ZFS filesystem /storage/backups/ and a separate sub-folder for each remote site (/storage/backups/site1/, /storage/backups/site2/, etc), and separate sub-folders for each server at the site (/storage/backups/site1/server1/, /storage/backups/site1/server2/, etc).

Each rsync run syncs /storage/backups/<site>/<server>/ with / on the remote server (i.e., it syncs everything). We use an exclude file to skip directories like /proc, /sys, /tmp, /dev, and so on, and to skip non-essential directories like Mozilla Firefox cache directories, Squid cache directories, and stuff like that.
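
The exclude file is just a plain rsync exclude list passed with --exclude-from; an abbreviated example (the path and entries here are made up):

# /storage/backups/excludes.txt
/proc/*
/sys/*
/tmp/*
/dev/*
/var/run/*
/var/cache/squid/*
/home/*/.mozilla/firefox/*/Cache/*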

You're rsync'ing files from "/" to a remote server?

Other way around. We sync / on the remote server to a local directory on the backup server. Everything is done from the backup server.

Does it use ZFS features such as snapshots?

It uses ZFS snapshots to keep track of the changes each day. This is the "daily" backup. Using snapshots, we can go back in time and restore either individual files or the entire server. That's the core of our setup. :)

Our eventual goal is to wrap a web-based GUI around the .zfs/ directory, and allow IT people access to the backups for their sites, so that they can grab individual files and groups of files as needed. Right now, they call the helpdesk, and we use scp to manually transfer files.
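
To give an idea, a manual restore of a single file today is roughly this (snapshot names and paths are examples):

# snapshots show up under the hidden .zfs directory of the dataset
ls /storage/backups/.zfs/snapshot/

# copy yesterday's version of a file back to the server that needs it
scp /storage/backups/.zfs/snapshot/2009-04-14/site1/server1/etc/apache2/apache2.conf \
    root@server1.site1:/etc/apache2/apache2.conf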
 