
#1 2010-04-03 16:49:40

lagagnon
Member
From: an Island in the Pacific...
Registered: 2009-12-10
Posts: 1,087
Website

Simple home network backup strategy ?

I have a simple home network of 4 machines on a router, all running Arch. None of the boxes is a dedicated server, and I have neither NFS nor Samba running. I want a simple, incremental, cron-based weekly backup process that backs up to a spare hard drive on one of the computers (e.g. hostname "master"). I believe rsync over ssh might be the easiest way to go, something like: "rsync -e ssh -varuzP /home/whoever master:/mnt/tmp/backups".

The cron job would first check that the master computer is online, and notify the user if it is not. My problem is that I do not know how to remotely mount the second hard drive on "master" from a cron job. I suppose I could have "master" auto-mount that drive every session, but I would prefer not to do that. Any ideas appreciated.
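Roughly, what I have in mind for the cron script is something like this (untested; the hostname and paths are just what I would use):

#!/bin/sh
# Rough weekly backup idea -- "master" and the paths are placeholders.
if ping -c 1 -W 2 master > /dev/null 2>&1; then
    rsync -e ssh -varuzP /home/whoever master:/mnt/tmp/backups
else
    echo "$(date): master unreachable, backup skipped" >> "$HOME/backup-failed.log"
fi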


Philosophy is looking for a black cat in a dark room. Metaphysics is looking for a black cat in a dark room that isn't there. Religion is looking for a black cat in a dark room that isn't there and shouting "I found it!". Science is looking for a black cat in a dark room with a flashlight.

Offline

#2 2010-04-03 21:45:37

briest
Member
From: Katowice, PL
Registered: 2006-05-04
Posts: 468

Re: Simple home network backup strategy ?

Well, I think the whole setup would be simpler if the backups were initiated on the master side, so it would be master's cron that mounts, rsyncs and umounts. But to keep closer to your initial idea -- maybe --rsync-path will do? Something like --rsync-path="mount /mnt/tmp && rsync; umount /mnt/tmp". I have to admit I have never done it this way, but I think it should work.
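The full command might then look roughly like this (untested; one caveat is that rsync appends its own --server arguments after the --rsync-path string, so the umount is probably safer as a separate ssh call):

rsync -e ssh -varuzP --rsync-path="mount /mnt/tmp && rsync" /home/whoever master:/mnt/tmp/backups
ssh master umount /mnt/tmp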

Offline

#3 2010-04-05 15:06:21

lagagnon
Member
From: an Island in the Pacific...
Registered: 2009-12-10
Posts: 1,087
Website

Re: Simple home network backup strategy ?

Yes, what I have done in the end is not ideal and not yet automated: I run a small script on the master computer which mounts the secondary drive and starts up sshd, then go to each computer in turn and run a small rsync script which does an incremental backup to that drive. It only takes about 5 minutes altogether, so it is reasonable. Automating it via a cron job is the next task...
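For the record, the two scripts are nothing fancy, roughly along these lines (device names and paths are just examples):

#!/bin/sh
# on "master": prepare the backup drive and accept ssh connections
mount /dev/sdb1 /mnt/backup
/etc/rc.d/sshd start

#!/bin/sh
# on each client: incremental backup of /home to master's backup drive
rsync -e ssh -varuzP /home/whoever master:/mnt/backup/$(hostname)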


Philosophy is looking for a black cat in a dark room. Metaphysics is looking for a black cat in a dark room that isn't there. Religion is looking for a black cat in a dark room that isn't there and shouting "I found it!". Science is looking for a black cat in a dark room with a flashlight.

Offline

#4 2010-04-06 14:07:18

IgnorantGuru
Member
Registered: 2009-11-09
Posts: 640
Website

Re: Simple home network backup strategy ?

Another strategy for you to consider...

At a slightly different time each night (or whatever interval you want), have each computer on your network create a backup archive of itself. I recommend xz compression for this because it compresses slowly but tightly, yet decompresses quickly. eg

# Build up the list of paths to back up and the patterns to skip
include="$include /root"
include="$include /home"
exclude="$exclude --exclude=/home/*/.mozilla/firefox/*/*Cache/*"
exclude="$exclude --exclude=/home/*/.mozilla/firefox/*/bookmarkbackups/*"
exclude="$exclude --exclude=/home/*/.mozilla/firefox/*/urlclassifier3.sqlite"
# Timestamp used in the archive name
datestamp="$(date +%Y-%m-%d-%H%M%S)"

# Create the tar archive and compress it with xz
tar --ignore-failed-read $exclude -cf - $include | xz > "/backups/backup-$HOSTNAME-$datestamp.txz"

Next, if security is needed, have the computer encrypt the archive using a GPG public key created for this purpose. (By using a public key, you don't need to be there to type in a passphrase, and you don't need to include a passphrase on the command line.)

gpg --always-trust -r 0xAAAAAAAA -o "/backups/backup-$HOSTNAME-$datestamp.txz.gpg" -e "/backups/backup-$HOSTNAME-$datestamp.txz"

(Where "0xAAAAAAAA" is your public key ID)
You can then delete the .txz original.

Then have each computer copy its backup file to each of the other computers on the network. You can do this by sharing an NFS backup folder on each computer, by using rsync, or by any other method or protocol. In some cases you could alternatively have a computer fetch a backup from another computer.
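For example, with rsync over ssh it could be as simple as the loop below (assumes passwordless keys are already set up; hostnames are placeholders):

# push today's encrypted archive to the other machines
for host in box1 box2 box3; do
    rsync -a "/backups/backup-$HOSTNAME-$datestamp.txz.gpg" "$host:/backups/"
done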

You can write a single script to do all of this on every computer, and have it read the $HOSTNAME variable so it can determine what machine it is running on.
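For example (hostnames and folder choices are made up):

# per-machine settings inside the one shared script
case "$HOSTNAME" in
    desktop) include="$include /root /home /etc" ;;
    laptop)  include="$include /home /etc" ;;
    *)       include="$include /home" ;;
esac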


Later in the night, have each computer run a separate script that checks that all the backups for today are present in its own backup folder, including those from all other computers. If there is a problem, have it send an admin alert, perhaps using local email or a CLI SMTP client like msmtp (a rough sketch of such a check follows the cleanup example below). If there isn't a problem, have each computer run a cleanup routine to delete old backups. eg

# Remove backups older than 10 days
find -L /backups/ -type f -mtime +10 -name "backup-*.gpg" -execdir rm {} \;
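The presence check and alert mentioned above can also be very simple, roughly like this (hostnames and the admin address are examples; msmtp must already be configured):

# warn the admin if any machine's backup for today is missing
today="$(date +%Y-%m-%d)"
for host in box1 box2 box3; do
    if ! ls /backups/backup-$host-$today-*.txz.gpg > /dev/null 2>&1; then
        printf "Subject: backup missing for %s\n\nNo backup from %s dated %s in /backups\n" \
            "$host" "$host" "$today" | msmtp admin@example.com
    fi
done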

This strategy is good for relatively low-volume dynamic data, like a /home folder and /etc.  I find I like having separate backups from each of the past 10 days, rather than one incremental backup.  For larger static data (music, movies, etc) you may want to use an incremental backup.


Advantages:

Redundancy - every computer keeps backups of all the others

Decentralized - no need for, or reliance on, a dedicated master

If the network is down, each computer at least creates a local backup.

With static tar archives instead of incremental, you have snapshots of the folders as they were several days or weeks ago.  This way if you discover a problem later, you can go further back than an incremental backup allows.

Each computer can back up a custom set of folders for itself. And each computer can store backups for a different number of days, depending on its storage capacity.


Issues:

Be sure to have a copy of the GPG secret key! If the only backups of it you have are encrypted with the public key, you're in trouble. One way to avoid this is to have your backup script support a manual mode which encrypts with a conventional passphrase instead of the public key. You can run this manual mode occasionally to create a static backup, and it will prompt you for the passphrase. That backup will thus contain an accessible copy of the key pair used for the automatic backups.
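The manual mode only needs to swap the public-key step for symmetric encryption, e.g.:

# manual run: gpg prompts for a passphrase instead of using the public key
gpg --symmetric -o "/backups/backup-$HOSTNAME-$datestamp.txz.gpg" "/backups/backup-$HOSTNAME-$datestamp.txz"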

Sometimes NFS will hang for hours rather than producing an error.  This is why it's good to have a separate script on a separate cron job which checks if the backups were created successfully and sends an admin alert.

Always test backups!  It is terrible to discover your backup wasn't being created as you expected when you go to use it.  Test to make sure you can decrypt and decompress it, and that it contains everything you expect.
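A quick sanity check is to decrypt, decompress, and list the archive without extracting it (gpg will ask for the secret key's passphrase):

# should list the archive contents without errors
gpg -d "/backups/backup-$HOSTNAME-$datestamp.txz.gpg" | xz -d | tar -tf - > /dev/null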

Last edited by IgnorantGuru (2010-04-06 14:31:22)

Offline

#5 2010-04-07 14:09:49

cpslanet
Member
Registered: 2010-01-10
Posts: 12

Re: Simple home network backup strategy ?

You might want to try something like backupninja, which uses rdiff-backup to create incremental backups. Basically it will ssh into your host(s) and use rdiff-backup to back them up to your "master" box.

I don't understand why you need to mount the disk; why not have it mounted at boot?

Offline

#6 2010-04-08 12:37:06

Profjim
Member
From: NYC
Registered: 2008-03-24
Posts: 658

Re: Simple home network backup strategy ?

rsnapshot is a Perl script that uses rsync to create incremental backups. Each incremental backup looks like a full image of the source, but hard links are used, so 5 or 10 backups of a source that hasn't changed much use only a bit more space than a single backup.
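A minimal /etc/rsnapshot.conf might look roughly like this (hostnames and paths are examples; note that rsnapshot requires literal TAB characters between fields, which I haven't tried to reproduce here):

config_version  1.2
snapshot_root   /mnt/backupdrive/snapshots/
cmd_rsync       /usr/bin/rsync
cmd_ssh         /usr/bin/ssh
interval        daily   7
interval        weekly  4
backup          root@box1:/home/        box1/
backup          root@box1:/etc/         box1/
backup          root@box2:/home/        box2/

# and in root's crontab on the backup machine (weekly rotation runs just before a daily run)
45 3 * * 1      /usr/bin/rsnapshot weekly
0  4 * * *      /usr/bin/rsnapshot daily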

You'd run rsnapshot on the machine to which the backup filesystem is attached. It connects over ssh to the source machines that you want to back up. I've set up passwordless keypairs to allow root ssh connections into a restricted shell that only runs rsync. That way, I can do proper backups of the whole source machine. If you only want to back up directories owned by source_user1, you could instead set up passwordless keypairs for ssh connections as source_user1. That would be more secure, although even then you may want to look into the restricted shell options. Search on "rrsync".
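For the restricted shell, the usual trick is a forced command in the relevant authorized_keys file on each source machine, something like the line below, restricting that key to read-only rsync access of /home (all on one line; the rrsync script ships in rsync's support/ directory and its installed path varies by distro, and the key itself is truncated here):

command="/usr/lib/rsync/rrsync -ro /home",no-pty,no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-rsa AAAA... backup@master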

If you have the rsync daemon running on your source machines, some of this may be easier to set up. I don't know.

rdiff-backup is a Python script that uses librsync (not the same as rsync, but it uses the same basic algorithm, as I understand it) to create incremental backups that are structured differently. Here you have a single complete copy of your source, as well as compressed diff snapshots going backwards in time. If any of the diffs get screwed up, rdiff-backup may catch it during the next backup and try to fix it. But if not, then I think you'll have lost everything earlier than the screwed-up diff. This happened to me a couple of times. The rsnapshot method takes up a bit more space (changed files exist in full on the backup drive, though only once for each time they're changed), but it's easier to work with and less fragile.
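For comparison, a typical rdiff-backup run and its cleanup look something like this (hostnames and paths are examples):

# pull /home from a source machine into a mirror plus reverse diffs
rdiff-backup root@box1::/home /mnt/backupdrive/rdiff/box1-home
# drop increments older than ten days
rdiff-backup --remove-older-than 10D /mnt/backupdrive/rdiff/box1-home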

Offline
