dump vs GNU/BSD tar, which one do you use?

I use GNU tar. I like the file-based backup scenario: I just back up my ~/ to archive.tar.bz2 on an external HD every week, and that's pretty much my backup plan. I don't get too fancy because I'm lazy. I have a script for tar, so all I do is plug in the portable HD and run ./backupscript.sh... done. I don't have to track which dump archive belongs to which partition (it's easy, but I don't like details). I just want to untar and go!

I don't like dump because the restore partition has to be similar to the original partition. Tar is like a girl that is easy but not too easy. Dump is like that girl that puts all those requirements on you.

What do you guys use?
 
lockfile said:
I don't like dump because the restore partition has to be similar to the original partition.
If that's your reason for not liking dump you should think again. There's no such limitation. You can restore any part of a dump to anywhere you like.
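For example (just a sketch; the dump file and paths here are made up), restore(8) extracts whatever you name into the current working directory, which can live on any disk:

Code:
# pull a single directory out of a dump file into the
# current working directory
cd /mnt/scratch
restore -xf /backups/home.dump ./projects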
 
Seconded for rsync. There are a couple of limitations to your plan:

* You're going to use a lot of disk each time your entire ~/ dir is backed up
* CPU time is spent tarring up such a large amount of data, plus the CPU/USB overhead

There are positives in that you get a day-by-day snapshot going back in time, but I would implement some kind of script to delete by a certain date range. Maybe use find(1) to delete everything older than a month, or you will have a lot of data taking up space that is all but duplicated.
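For instance (a sketch only, with a made-up path), the pruning could be as simple as:

Code:
# delete archives whose modification time is more than 30 days old
find /media/backupdisk -name '*.tar.bz2' -mtime +30 -delete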

Below is my own script as an example:

Code:
ubuntu@ubuntu:~/Desktop$ cat nitebackup.sh 

#
#	Macbook backup script: emails leschnik@gmail.com with the results
#

USERFILE="/Users/orange"                          # home directory being backed up
LOCFILE="$USERFILE/Pictures $USERFILE/Desktop"    # source directories
DESTIN="/sharePoint/.Backups/PC\ Backup/Macbook"  # destination path on the remote host
SENDTO="leschnik@gmail.com"
THEDATE=`date "+On Date: %Y-%m-%d @ Time: %H:%M:%S"`
EMAILSUB="Backup has been completed:"

cd "$USERFILE"
rsync -aH --rsh='ssh' -v --stats $LOCFILE jason@hotdog:"$DESTIN" | mail -s "$EMAILSUB $THEDATE" $SENDTO

I actually only created this yesterday and still need to add a lot of error checking, but at the time I had no fault tolerance for my photos or desktop files, so this is a good start.
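One easy first step on the error checking (just a sketch; the log path and subject lines are made up) is to branch on rsync's exit status instead of piping straight into mail:

Code:
LOG=/tmp/nitebackup.log
if rsync -aH --rsh='ssh' -v --stats $LOCFILE jason@hotdog:"$DESTIN" > "$LOG" 2>&1
then
	mail -s "$EMAILSUB $THEDATE" $SENDTO < "$LOG"
else
	mail -s "Backup FAILED $THEDATE" $SENDTO < "$LOG"
fi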
 
The downside of rsync is that it doesn't keep historical data. Any error on the original data is propagated to the mirror. It can be helped somewhat with a healthy number of snapshots on source and/or destination, but it's still not comparable to 'real' backup.
 
jalla said:
The downside of rsync is that it doesn't keep historical data. Any error on the original data is propagated to the mirror. It can be helped somewhat with a healthy number of snapshots on source and/or destination, but it's still not comparable to 'real' backup.

rsync(1) with the -b backup option should fix that.

You can then use --suffix= with something like the current date:

Code:
DATE=`date "+%Y%m%d"`
rsync -a -b --suffix=.$DATE <Source> <Destination>

or

rsync -a -b --suffix=.`date "+%Y%m"` <Source> <Destination>

to append the date to all files that get overwritten...

e.g. NewFile.txt
NewFile.txt.20101005
NewFile.txt.20101004 ... etc.
 
shitson said:
rsync(1) with the -b backup option should fix that.

You can then use --suffix= with something like the current date:

Code:
DATE=`date "+%Y%m%d"`
rsync -a -b --suffix=.$DATE <Source> <Destination>

or

rsync -a -b --suffix=.`date "+%Y%m"` <Source> <Destination>

to append the date to all files that get overwritten...

e.g. NewFile.txt
NewFile.txt.20101005
NewFile.txt.20101004 ... etc.

The date appended would be the date the backup was run, not the date the file changed. That would matter more the farther apart backups are. Also, it looks like this would eliminate the speed/size advantage of rsync.

Despite that, it's an interesting idea.
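A variation that might keep the clutter down (a sketch; the paths are invented) is rsync's --backup-dir option, which moves each run's replaced files into their own dated directory instead of renaming them next to the originals:

Code:
rsync -a -b --backup-dir=/backups/replaced-`date "+%Y%m%d"` /src/ /dest/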
 
jalla said:
If that's your reason for not liking dump you should think again. There's no such limitation. You can restore any part of a dump to anywhere you like.

The filesystem type must be the same (ext3, btrfs, ufs2, etc.). With tar, I can go from ISO to UFS to NTFS to EXT4 all within a short time. Dump is FS/partition-specific.
 
jalla said:
The downside of rsync is that it doesn't keep historical data. Any error on the original data is propagated to the mirror. It can be helped somewhat with a healthy number of snapshots on source and/or destination, but it's still not comparable to 'real' backup.

That's why you rsync to a filesystem running ZFS, and take a snapshot before running rsync. :) Then you get historical data, incremental backups, and only use disk space for files that change. :) Best of all worlds.
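As a rough sketch of that routine (the pool and dataset names are made up):

Code:
# freeze the previous state, then refresh the mirror
zfs snapshot tank/backup/home@`date "+%Y%m%d"`
rsync -aH --delete /home/ /tank/backup/home/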
 
Wow, I had no idea that so many people liked rsync. I used it for a backup a couple of months ago onto a compressed NTFS drive; because of the compression, I turned an 80GB portable into 120GB :). This was in 2007, when I bought the Maxtor portable drive for around $120. Now in 2010 I can get a terabyte for less money! I stopped using rsync because of its peculiarities. Come on, you know what I mean if you've ever used rsync: it has enough quirks to fill a couple of manpages.

How do you do compression with rsync on a portable drive?
 
lockfile said:
How do you do compression with rsync on a portable drive?

The same way you do with any drive: via the filesystem. The destination filesystem is where all the magic happens; it has nothing to do with rsync. Rsync just transfers data from one filesystem to another.
 
phoenix said:
The same way you do with any drive: via the filesystem. The destination filesystem is where all the magic happens; it has nothing to do with rsync. Rsync just transfers data from one filesystem to another.

I guess the reason for the question is that tar supports compression natively, and since rsync creates a copy of every individual file rather than a backup archive, it's not exactly comparable. However, as you mention, ZFS is very nice with rsync: you can do snapshots for historical versioning, and you can also turn on filesystem-level compression. Quite a neat solution :)
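Turning the compression on is a one-liner (the dataset name is an assumption):

Code:
# compress everything written to the backup dataset from now on
zfs set compression=on tank/backup
# see how much you're actually saving
zfs get compressratio tank/backup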
 
If you want to do rolling backup sets, you can write a script that will do a backup similar to Apple's Time Machine using rsync. Check the rsync manpage and look at the --link-dest option.

Basically, run your first backup:
Code:
rsync -avH /path/to/src /path/to/dest/timestamp.X

Then time passes, run another backup with the link-dest argument:
Code:
rsync -avH --link-dest=/path/to/dest/timestamp.X /path/to/src /path/to/dest/timestamp.X+1

More time passes:
Code:
rsync -avH --link-dest=/path/to/dest/timestamp.X+1 /path/to/src /path/to/dest/timestamp.X+2

Voilà, you get to keep both images of your disk at those points in time. Since you are using link-dest, it hard-links files that are the same between the timestamp.X and timestamp.X+1 directories. This keeps your disk consumption to a minimum; only files that have changed actually use more disk space on the destination.
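If you want the script to pick the previous timestamp for you, something like this works (purely a sketch; it assumes the destination holds nothing but backup directories that sort oldest-to-newest by name):

Code:
#!/bin/sh
SRC=/path/to/src
DEST=/path/to/dest
NEW=`date "+%Y%m%d%H%M%S"`
# the newest existing backup sorts last by name
LAST=`ls "$DEST" | tail -n 1`
if [ -n "$LAST" ]; then
	rsync -avH --link-dest="$DEST/$LAST" "$SRC" "$DEST/$NEW"
else
	rsync -avH "$SRC" "$DEST/$NEW"
fi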

rsync is quirky, though. It's horribly designed and backwards when executing server/client transfers (it puts all of the processing load on the server instead of the client, which is just kind of dumb). Despite that, it's an incredibly powerful tool. At my work, we use it to move huge datasets around daily. We couldn't live without it.
 
So it's official: people like rsync and ZFS snapshots. That's okay, but I'll just stick to tarballs. Filesystem/partition-based backup schemes are too much work for me. Details are bad, bad things. I simply keep my archives in a cool, dark place.

Code:
mount /media/XternalHD
# $dir_file_list holds the directories and files to archive
tar -pcjf /media/XternalHD/home_backup_20100112.tar.bz2 $dir_file_list
umount /media/XternalHD
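And restoring really is just untar and go (the target directory here is only an example):

Code:
tar -xpjf /media/XternalHD/home_backup_20100112.tar.bz2 -C /home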
 
phoenix said:
That's why you rsync to a filesystem running ZFS, and take a snapshot before running rsync. :) Then you get historical data, incremental backups, and only use disk space for files that change. :) Best of all worlds.

This is what I do as well :).

I have encountered one problem with this setup, though: I can't rsync directories when accessing snapshots via /path/to/fs/.zfs/snapshot/.
 
I use dump/restore for backups. I've used it for years with UFS, but have no experience with ZFS yet.

* The -L option for dump (take a snapshot of the live filesystem) is very nice.
* Incremental backups work fine (and fast) because dump "looks" inside the filesystem instead of walking every file to check its modification date (see the sketch after this list).
* The interactive mode of restore is very nice, and it even works with multiple incremental files!
* Pressing CTRL+T gives nice progress info ;-)
* Don't try to specify -C for dump to increase the cache size ;-)
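As a sketch of what that looks like in practice (the file names are made up):

Code:
# level 0 (full) dump of /usr: -L snapshots the live filesystem,
# -u records the dump in /etc/dumpdates, -a sizes the output automatically
dump -0Lauf /backups/usr.level0.dump /usr
# later, a level 1 dump picks up only what changed since level 0
dump -1Lauf /backups/usr.level1.dump /usr
# browse and extract interactively
restore -if /backups/usr.level0.dump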

For me, tar is a very good "archive builder", but dump/restore is the better backup/restore solution.
 
dennylin93 said:
This is what I do as well :).

I have encountered one problem with this setup, though: I can't rsync directories when accessing snapshots via /path/to/fs/.zfs/snapshot/.

You need to set the snapdir property to visible in order for shells and commands to be able to access the .zfs directory tree. With that property set to hidden, you can manually enter the directory, but things that need to determine the directory tree will fail (ls, stat, cd, rsync, etc.).
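For reference, that's just (the dataset name is invented):

Code:
zfs set snapdir=visible tank/home
ls /tank/home/.zfs/snapshot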

I do rsync from old snapshots all the time, works great for restoring systems. :)
 
phoenix said:
You need to set the snapdir property to visible in order for shells and commands to be able to access the .zfs directory tree. With that property set to hidden, you can manually enter the directory, but things that need to determine the directory tree will fail (ls, stat, cd, rsync, etc.).

I do rsync from old snapshots all the time, works great for restoring systems. :)

Excellent! I've been looking for this for ages.
 