Experimenting with Backups: From EC2 (or any networked unix box) to S3

If you’re wondering whether there’s a difference between backups in the service-oriented cloud and the backups the rest of the world is familiar with… well, there are a few key differences, mostly stemming from the fact that cloud backups are service-based (surprise, surprise). Then again, most of the world probably isn’t backing things up at all, so let’s continue.

In SOA, a “cloud backup” is done by taking a snapshot outside the virtual device being backed up. It’s not really new: the same thing can be done with Xen, Veritas, Amazon Elastic Block Store snapshots, and so on.
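
For example, with Amazon EBS a snapshot is one command away. Here’s a minimal sketch using the EC2 API tools, assuming they’re installed and configured; the volume ID is made up:

# a minimal sketch, separate from the backup script below:
# snapshot an EBS volume from the command line using the EC2 API tools
# vol-4d826724 is a hypothetical volume ID -- substitute your own
ec2-create-snapshot vol-4d826724 -d "daily snapshot of the data volume"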

Some people don’t like the idea of backing up an entire instance with its binaries, log files, and duplicate data. I believe that redundancy is useful, if not necessary, for reliable backups, so I take the big snapshots once per day or so, but I also back up smaller files more frequently for added roll-back-ability, or whatever you want to call it. Here’s how I back up scripts from cloud appliances to my S3 bit buckets. Remember that a backup is only as good as your ability to restore it, so automatic backups should be tested often (a restore spot-check sketch follows the sample output below). You might also want to periodically delete old backups that you don’t need, but that’s optional, and deleting too hastily can burn you. Redundancy buys better data integrity for backups, but at the cost of disk space and some network bandwidth… and you have to keep the backups themselves safe!

#!/bin/bash
# Asher Bond 2010
# http://www.asherbond.com/blog/2010/09/15/service-oriented-backups-from-ec2-to-s3/
# backup-scripts.sh
# backup scripts every hour
# slightly tested on Debian Lenny
# put this in your /etc/cron.hourly


# no trailing slashes
local_backup_dir='/var/backups';
remote_backup_dir='/mnt/backups.asherbond.com';

# script directories to recursively back up (space-separated list)
script_dirs='/etc'

# build an hourly date stamp (YYYY-MM-DD-HH) from the RFC 3339 date
date=`date --rfc-3339=seconds | cut -d: -f1 | tr ' ' '-'`;

hostname=`hostname`;

file_prefix="$hostname-backup-scripts-";

# files look like: myhostname-backup-scripts-YYYY-MM-DD-HH.tar.gz when they're done
filename="$file_prefix$date.tar";

cd "$local_backup_dir" || exit 1

# delete any local backups older than 7 days
echo "Deleting backups older than 7 days..."
find . -type f -ctime +7 -name "$file_prefix*.tar.gz" -exec rm -f {} \;

echo "Archiving files..."
tar -cvf $filename $script_dirs
gzip $filename

# mount the S3 backup bit bucket using FUSE (a sketch of this helper follows the script)
# http://www.asherbond.com/blog/2010/09/14/mount-an-amazon-s3-bit-bucket-as-a-drive-in-unix-using-fuse/
/etc/asher-bond-cloud/s3-mount.sh backups.asherbond.com

# copy to remote bit bucket
echo "Copying backup to Amazon S3 bit bucket..."
cp "$filename.gz" "$remote_backup_dir"

# unmount the FUSE mount point when done (give S3 a moment to settle first)
echo "Dismounting from S3, what a trusty workhorse..."
sleep 30 && umount "$remote_backup_dir"

echo "FIN."

The output will look something like this:

Deleting backups older than 7 days...
Archiving files...
/etc/
/etc/mysql/
/etc/mysql/debian.cnf
/etc/mysql/my.cnf
/etc/asher-bond-cloud/
/etc/asher-bond-cloud/loadavg.py
/etc/asher-bond-cloud/backup-scripts.sh
/etc/asher-bond-cloud/s3-mount.sh
/etc/etc/etc/etc/lol
Getting object list from S3 ...
Validating cache ...
Setup complete
Copying backup to Amazon S3 bit bucket...
Dismounting from S3, what a trusty workhorse...
FIN.
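
And since a backup is only as good as your ability to restore it, here’s the promised restore spot-check. It re-mounts the bucket, grabs the newest archive, and test-extracts it to a scratch directory; paths and the helper script are assumed to match the backup script above, and none of this is production-hardened:

#!/bin/bash
# restore-check.sh -- a minimal sketch of a restore test, assuming the
# same paths and mount helper as backup-scripts.sh above
/etc/asher-bond-cloud/s3-mount.sh backups.asherbond.com
latest=`ls -t /mnt/backups.asherbond.com/*.tar.gz | head -n 1`
tmp=`mktemp -d`
# tar exits non-zero if the archive is damaged, so complain loudly
tar -xzf "$latest" -C "$tmp" || echo "RESTORE CHECK FAILED: $latest"
rm -rf "$tmp"
umount /mnt/backups.asherbond.com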

It’s a long way down if your head is in the CLOUD.
– Asher Bond
