Rsync-Incr

What is rsync-incr

rsync-incr is a Linux wrapper shell (bash) script around rsync that performs automated, unattended, incremental, disk-to-disk backups, automatically removing old backups to make room for new ones. It produces standard mirror copies that can be browsed and restored without specific tools. I have been using it in production daily at work and at home since 2004.

This page is at http://colas.nahaboo.net/Software/RsyncIncr

Goals

I wanted to have a backup system with the following properties:
  • standard: based on standard tools (rsync), and restorable with only standard tools.
  • as simple as possible.
  • automatable: can be run daily (or more often) by crontab, handling error conditions reliably so we can mail on errors, and automatically making room for new backups.

User Manual

rsync-incr [options] N sourcedir destdir

rsync-incr will create destdir as a perfect mirror, and save in destdir.past one directory per run, holding the old versions of the files that changed. These N (at most) directories of old versions are named by their dates in the form destdir.past/YYYY-MM-DD.HHhMN-SIZEm

  • e.g: for rsync-incr 10 /home /backups/home
    /backups/home.past/2005-01-24.04h23-122m

The date of the last backup is stored in the file destdir.past/LAST_DATE. There is no need to append a trailing / to source and destination.
destdir must be on the local machine (possibly NFS-mounted); sourcedir can be on a remote machine via the ssh syntax host:dir.
SIZE in the name is the disk space taken by this backup, in megabytes, before any optional compression via --cbf. It helps you find a good value for N when using the "Nm" form (e.g. the max of past SIZEs), as this size is hard to find in --snap mode. SIZE is rounded up: 0m means 0 bytes, 2m means less than 2 megabytes.
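As a rough aid for picking that value, here is a small bash sketch that reads the SIZE out of each directory name and prints the largest one. The function name max_past_size is mine, not part of rsync-incr, and it assumes the names follow the YYYY-MM-DD.HHhMN-SIZEm form given above.

```shell
# Print the largest SIZE (in megabytes) found in the names of the
# per-run directories under a destdir.past directory.
# Hypothetical helper, not part of rsync-incr.
max_past_size() {
    local past_dir=$1 max=0 name size
    for name in "$past_dir"/*-*m; do
        [ -d "$name" ] || continue          # skips LAST_DATE and unmatched globs
        size=${name##*-}                    # text after the last "-", e.g. "122m"
        size=${size%m}                      # drop the trailing "m", e.g. "122"
        case $size in
            ''|*[!0-9]*) continue ;;        # ignore names whose SIZE is not a number
        esac
        [ "$size" -gt "$max" ] && max=$size
    done
    echo "$max"
}

# Example: max_past_size /backups/home.past
```

The printed number (plus some margin) is a reasonable starting point for the megabyte form of N.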

This is a simple script, making backups usable by standard rsync (no need for a dedicated restore script). It has 2 basic modes of operation:

  • default: makes a perfect copy, preserving all hard links, devices and sparse files, and just stores in the dated directories the previous versions of only the changed files. This makes it easy to find the different states a file went through, but makes it harder to get a perfect snapshot of the full state N days before. E.g. if a directory contains files A and B, and B is modified today, yesterday's backup will contain only yesterday's version of B, and today's will contain the current versions of A and B.
  • --snap: makes full snapshots of what the source was like at backup times, as described in Mike Rubel's article. It is easier to get to full snapshots of previous states, and it should run faster than the default. E.g. if a directory contains files A and B, and B is modified today, both yesterday's and today's backups will contain A and B, but A will be the same file (hard links to the same inode), and the Bs will be two different files.
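
To make the A and B example for --snap concrete, the following bash lines build two tiny "snapshots" by hand and check which files share an inode. This is only an illustration of the resulting layout, with made-up directory names; it is not the code rsync-incr runs.

```shell
# Illustration only: two hand-made snapshots in a scratch directory.
demo=$(mktemp -d)
mkdir "$demo/yesterday" "$demo/today"

echo "contents of A"     > "$demo/yesterday/A"
echo "old contents of B" > "$demo/yesterday/B"

ln "$demo/yesterday/A" "$demo/today/A"        # A unchanged: hard link, same inode
echo "new contents of B" > "$demo/today/B"    # B changed: a separate file

# -ef is true when both names refer to the same inode
[ "$demo/yesterday/A" -ef "$demo/today/A" ] && echo "A is shared"
[ "$demo/yesterday/B" -ef "$demo/today/B" ] || echo "B is stored twice"
```

Both snapshots can be browsed as complete trees, yet the unchanged file A takes disk space only once.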

If N has "m" appended (2m, 34m, ...), old versions are removed before the backup until dest has at least N megabytes free, and also at least the max of the space taken by previous backups (+ 10%, see --pbsm). Otherwise, just the last N backups are kept.
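
The space-making idea can be sketched in bash as below. This is my own simplified dry run under stated assumptions, not the script's code: free_mb and would_remove are hypothetical names, free space is read with df -P, and the SIZE in each directory name is taken as an estimate of what its removal gives back.

```shell
# Free space, in whole megabytes, on the filesystem holding directory $1.
# df -P gives the one-line-per-filesystem POSIX format, -k reports kilobytes.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Dry run: print the oldest per-run directories that would have to be
# removed until $2 megabytes are free. Nothing is deleted.
would_remove() {
    local past_dir=$1 wanted=$2 free dir size
    free=$(free_mb "$past_dir")
    for dir in "$past_dir"/*-*m; do          # date-based names sort oldest first
        [ "$free" -ge "$wanted" ] && break
        [ -d "$dir" ] || continue
        size=${dir##*-}; size=${size%m}      # SIZE part of the directory name
        echo "$dir"
        free=$((free + size))                # estimate of space after removal
    done
}

# Example: would_remove /backups/home.past 2000
```

Printing the candidates instead of deleting them makes it safe to try against a real destdir.past.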

Options are passed to rsync, but each option must be a single word (e.g. use --rsh=ssh, not -e ssh)

  • e.g: rsync-incr -z --bwlimit=12 --rsh=ssh server:/home/me /backups/me

(this will create a perfect backup in /backups/me and a series of previous versions as dirs like /backups/me.past/2004-10-26.04:40:20-234 ...)
rsync-incr uses rsync with the options: -HSax --delete --force

Special non-rsync options:

  • --nohl do not use the -H / --hard-links option (do not preserve hard links), faster if you do not need to preserve hard links.
  • --cbf compresses (gzip -r) all backed-up files (will not compress files with hard links)
  • --snap old backups are full snapshots of previous states, as in Mike Rubel's article, but this does not preserve hard links
  • --grem global remove: will remove oldest backups globally on the filesystem (otherwise space-making on a small backup could wipe out all backups in order to desperately make room, not noticing that removing a single old backup of a bigger backup could do the job).
    You should place a list (one per line) of the absolute paths of all the LAST_DATE files on the system in the environment variable RSYNCINCR_LASTDATES, for instance with the statement:
    • export RSYNCINCR_LASTDATES=`locate LAST_DATE`
    Otherwise a global find will be used, which can be very slow. Only backups on the same filesystem as destdir will be removed, so you can list the LAST_DATE paths of all disks. For instance, if all the backups are organized as /backups/host/partition, the following should be included at the start of backup scripts to decrease startup time:
    • export RSYNCINCR_LASTDATES=`ls -1 /backups/*/*.past/LAST_DATE`
  • --pbsm=P Previous Backups Space Margin: reserve space before the backup for at least the max size of previous backups + P% (P defaults to 10). If P ends with "m" (like 7m), it is taken as P megabytes to add rather than as a percentage.
  • --clean Just make enough room for the backups, but do not actually perform the backups
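
The arithmetic behind --pbsm can be sketched in a few lines of bash. The function name reserve_mb is hypothetical, and rounding the percentage up is a choice made for this sketch; the script itself may round differently.

```shell
# Space to reserve before a backup, in megabytes, given the largest
# previous backup size ($1, in MB) and a --pbsm style margin ($2):
# "10" means +10%, "7m" means +7 megabytes. The margin defaults to 10 (%).
reserve_mb() {
    local max=$1 margin=${2:-10}
    case $margin in
        *m) echo $(( max + ${margin%m} )) ;;
        *)  echo $(( max + (max * margin + 99) / 100 )) ;;   # percentage, rounded up
    esac
}

reserve_mb 122        # 122 + 10%, rounded up: prints 135
reserve_mb 122 25     # prints 153
reserve_mb 122 7m     # prints 129
```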

To restore a backup, use standard rsync (trailing slashes are IMPORTANT):
  • rsync -HSax --delete --force backup/ original/

Rsync errors are propagated (the script exits with rsync's exit status), except for error #24, which is trapped, as this error can happen on backups of live systems (files being modified while they are backed up).
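
Rsync exit status 24 means "partial transfer due to vanished source files". If you call plain rsync from your own scripts and want the same tolerance, a wrapper along these lines would do. It is a sketch of the idea, with a function name of my choosing, not an extract of rsync-incr.

```shell
# Run a command and pass its exit status through, except that status 24
# (rsync: some source files vanished during the transfer) counts as success.
run_ignoring_24() {
    "$@"
    local status=$?
    [ "$status" -eq 24 ] && status=0
    return "$status"
}

# Example:
#   run_ignoring_24 rsync -HSax --delete --force /home/ /backups/home/ \
#       || echo "backup failed" >&2
```

Every other non-zero status still reaches the caller, so a cron job can keep mailing on real errors.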

Installation

Just copy the rsync-incr shell script anywhere in your PATH, e.g. /usr/local/bin.

Implementation

A 120-line shell script (excluding the embedded doc), based on the fantastic rsync and Mike Rubel's article

License

Pure Open source: GPL

More details

See also what inspired me:

New versions announcements

New releases and important issues will be announced on the Rsyncincr blog, so I strongly suggest you monitor it, either
  • by RSS to be warned of new releases and major changes (recommended)
  • by email, via an email gateway to the above RSS
  • by email, for instance via Changedetection.com

Examples of use

See a detailed example of what rsync-incr does

Also, here are (modified for privacy) real scripts I use daily:

  • Example1: a script run daily on the host backserv to incrementally archive various machine partitions, mailing $email in case of errors
  • Example2: this script is auto-run at startup of the backup server: it connects to the main server (named "m"), backs it up, and halts the backup server. It does so in 2 parallel processes hitting disks on different controllers for added speed. Only m's root partition is backed up incrementally
  • To connect to sites that use a non-standard port for ssh, say 26, the trick is to create a shell script, for instance named ssh-p26, containing the line: exec ssh -p 26 "$@" and to call rsync-incr with the option --rsh=ssh-p26
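
The ssh-p26 trick from the last item can be set up like this. The wrapper is created in the current directory; where you then install it (anywhere in your PATH) is up to you, and the host and paths in the final comment are placeholders.

```shell
# Create the wrapper script in the current directory.
cat > ssh-p26 <<'EOF'
#!/bin/sh
# Forward every argument to ssh, forcing port 26.
exec ssh -p 26 "$@"
EOF
chmod +x ssh-p26

# Once ssh-p26 is somewhere in PATH:
#   rsync-incr --rsh=ssh-p26 10 server:/home/me /backups/me
```

Since --rsh=ssh-p26 is a single word, it satisfies the one-word rule for options passed through to rsync.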

Other backup systems

Rsync-incr is not the only smart and simple open source backup system. It is, I think, unique in its automated calculation of free space and smart removal of old backups to make room for new ones. But others can be more relevant to your needs, for instance:
  • rdiff-backup, which only stores differences in files. For instance, for big log files, rsync-incr will archive a full copy of the log (although it will have transferred only the changed lines over the network), whereas rdiff-backup will only store the actual added lines and is much more space-efficient on the backup side. However, you do not get the easy browsing, access and comparison of backups that rsync-incr offers
  • dirvish: nice fast backup, but does not automate expiration of backups
  • rsnapshot is more powerful and integrated, written in Perl, with features that rsync-incr leaves to a wrapper script

History

  • v1.7 2009-01-01, bug fix: could fail if it was restarted on the same dirs in the same minute
    2009-01-05 this documentation updated: the --no option is in fact --clean. Code is unchanged, no need to re-download.
    new subpage to give a detailed example
  • v1.6 2008-12-19, now works with destdir on NFS (previously old backups were not removed, because the free space calculations did not work on NFS),
    new --no option.
  • v1.5 2008-10-12, nothing changed, only packaging doc, and web page moved here. No need to upgrade
  • v1.4 2007-02-26, bug fix by Jeremy Lingmann: on systems where wrapping of long device names is enabled, rsync-incr was unable to compute free space. We now use df -P to fix this. This is the only change, upgrade recommended.
  • v1.3 2006-08-08, bug fix: rsync options with metacharacters were not working (e.g the * in: --exclude='/tmp/*')
  • v1.2 2006-06-21, bug fix by Jiri Voves: --pbsm option worked only for sizes given in megabytes (with appended "m")
  • v1.1 2005-03-25, bug fix: in some cases some old backups were not deleted if the day of month started with 0.
  • v1.0 2005-02-19, first public release
  • v0.9 2004-12-15, internal beta test

Topic revision: r23 - 2009-05-01 - 12:57:34 - ColasNahaboo