An automated EBS backup solution using Amazon EC2

11th Mar 2010 // By Chris // Systems

For a while now, we've been looking for a solution to a common problem on Amazon EC2 hosting: how do you automatically create regular snapshots of your many EBS (Elastic Block Store) volumes, but also not create so many snapshots that it becomes unmanageable? We might have found the answer.

Creating a snapshot of any given volume is easy. The difficulty is that you want snapshots taken regularly (in our case, once a day) so that if something goes wrong, you always have a very recent copy of your clients' data. At that rate you quickly build up a huge collection, most of which you will never need.

'The Cloud Walker' evidently ran into the same problem. He has written a very nice PHP script, based on work by Oren Solomianik, that manages older snapshots for you. You give it an EBS volume ID, for example vol-12345678, and it finds all of the snapshots for that volume. It then deletes any that are older than a week, unless they were taken on a Sunday, in which case they're kept for a month, or on the first of the month, in which case they're kept indefinitely.

What you're left with is a full week of daily backups, then weekly backups going back a month, then monthly backups after that. This is based on the notion that the more time elapses, the less likely you are to need backups made on a specific date.
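
To make the retention rule concrete, here is a minimal bash sketch of the keep-or-delete decision described above. It is not The Cloud Walker's code: the function name retention_decision, the 7-day and 31-day thresholds, and the reliance on GNU date's -d option are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: prints "keep" or "delete" for one snapshot.
# Usage: retention_decision SNAPSHOT_DATE TODAY   (both as YYYY-MM-DD)
# Assumes GNU date (for -d). The 7 and 31 day thresholds are assumptions.
retention_decision() {
  local snap="$1" today="$2"
  local snap_s today_s age_days dow dom
  snap_s=$(date -u -d "$snap" +%s)
  today_s=$(date -u -d "$today" +%s)
  age_days=$(( (today_s - snap_s) / 86400 ))
  dow=$(date -u -d "$snap" +%u)   # 1 = Monday ... 7 = Sunday
  dom=$(date -u -d "$snap" +%d)   # day of month, zero-padded

  if [ "$dom" = "01" ]; then
    echo keep      # first of the month: kept indefinitely
  elif [ "$age_days" -le 7 ]; then
    echo keep      # taken within the last week
  elif [ "$dow" = "7" ] && [ "$age_days" -le 31 ]; then
    echo keep      # a Sunday within the last month
  else
    echo delete
  fi
}
```

For example, retention_decision 2010-02-28 2010-03-11 prints keep, because 28 February 2010 was a Sunday less than a month before 11 March.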

The script itself is written in PHP, and comes bundled with Amazon's PHP tools. There are, however, a few requirements:

Firstly, you will need the PHP command line interpreter. Many systems will come with PHP for the web server, say Apache, but not PHP that can be run from a command line. On Ubuntu (and probably Debian too), you can just sudo apt-get install php5-cli to install it, and then you will be able to run PHP scripts by typing php myscript.php from the command line.

Secondly, you will need the PHP XSLT processor, or you will see a message like Fatal error: Class ‘XsltProcessor’ not found. On Ubuntu systems (and probably Debian too) you can simply sudo apt-get install php5-xsl to overcome this.

Thirdly, The Cloud Walker's script, at the time of writing, uses long command line options. These are switches such as --region=foo rather than simply -r foo. Unfortunately, before PHP 5.3, PHP's getopt() only supported long options on some systems. Because we use Drupal 6 on our production sites and, at the time of writing, Drupal 6 does not work with PHP 5.3, we have had to stay with PHP 5.2 and strip the long options out of the script.
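
The first two requirements can be checked from the shell before you go any further. The following is only a sketch: the helper name module_listed is made up for this example, and it assumes the XSL extension appears as xsl in the output of php -m.

```shell
#!/bin/bash
# Sketch: check the prerequisites described above before running the scripts.

# module_listed NAME: succeeds if NAME appears as a whole line on stdin
# (case-insensitive). Intended to be fed the output of `php -m`.
module_listed() {
  grep -qix -- "$1"
}

if ! command -v php >/dev/null 2>&1; then
  echo "PHP CLI not found; install the command-line package first"
else
  # PHP_VERSION is a built-in PHP constant.
  echo "PHP CLI version: $(php -r 'echo PHP_VERSION;')"
  if php -m | module_listed xsl; then
    echo "XSL extension loaded"
  else
    echo "XSL extension missing"
  fi
fi
```

The version line is there because of the third requirement: if it reports anything older than 5.3, expect to need the short-option workaround.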

We opened up The Cloud Walker's script and inserted our Amazon access key and secret key. Since we only use one region, we hard-coded that region into the script as well. After that we could pass a volume ID to the script, for example php ec2-manage-snapshots.php -v vol-12345678, and it automatically decided which snapshots to keep and which to delete.

Now we just needed an automated way of actually creating the snapshots. In the bundled Amazon PHP tools, in the examples folder, is CreateSnapshotSample.php, which allows you to pass in a volume ID and have a snapshot created for that volume. We made a copy of this, and also a copy of the .config.inc.php in the same folder, because this is where Amazon wants you to put the access key and secret key.

We needed to change the code slightly, because it is not 'region aware' and we needed to instruct it to use a particular region. Luckily, the code for this is inside The Cloud Walker's script, and this worked fine for us:

  $ec2Config = array('ServiceURL' => 'https://eu-west-1.ec2.amazonaws.com');
  $service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, $ec2Config);

We tested this by creating a few snapshots from the command line, and things looked like they were ready to rock and roll. The only problem was that our snapshot creation and snapshot management scripts could each only operate on one volume ID at a time. We needed a way to manage a whole bunch of EBS volumes in one go, so we put together a bash script that looks a bit like this:

  #!/bin/bash

  DATESTAMP=$(date +%Y%m%d)
  TIMESTAMP=$(date +%H%M)
  LOGFILE="/var/log/tigerfish_ebs_backup.log"

  # The EBS volumes to back up.
  VOLUMES=( vol-12345678 vol-23456789 vol-34567890 )

  echo "TIGERFISH EBS BACKUP $DATESTAMP $TIMESTAMP" | tee -a "$LOGFILE"
  echo " " | tee -a "$LOGFILE"

  # Create a snapshot of each volume.
  for volume in "${VOLUMES[@]}"
  do
    php CreateSnapshot.php -v "$volume" 2>&1 | tee -a "$LOGFILE"
  done

  # Remove older snapshots we don't need to keep any more.
  for volume in "${VOLUMES[@]}"
  do
    php ec2-manage-snapshots.php -v "$volume" 2>&1 | tee -a "$LOGFILE"
  done

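Don't forget to make sure your log file is writable before you run this script. tee in append mode will create the file if it is missing, but only if the user running the script is allowed to write to /var/log; otherwise tee reports an error and nothing is logged. The simplest fix is to touch the log file once, as root, and give ownership of it to the user that runs the backup.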
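
If you would rather have the script check this for itself, something along these lines would do. This is a sketch only; the function name ensure_log_file is hypothetical and is not part of the script we actually run.

```shell
#!/bin/bash
# ensure_log_file PATH: create PATH if needed and confirm it is writable.
# Returns non-zero (with a message on stderr) if logging would fail.
ensure_log_file() {
  local logfile="$1"
  # touch creates the file when it is missing; errors are checked below.
  touch "$logfile" 2>/dev/null
  if [ ! -w "$logfile" ]; then
    echo "Cannot write to $logfile; create it as root and chown it to the backup user" >&2
    return 1
  fi
}
```

Calling ensure_log_file "$LOGFILE" || exit 1 near the top of the backup script would stop it early rather than letting it run without a log.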

The last step was simple: set up a cron job to run the bash script daily. There is one thing to note here: we have our databases on an EBS volume, and Amazon's tools make particular mention of the fact that you should shut off your database server while you're performing the snapshot, otherwise the snapshot could become corrupted. This is something we won't cover here, but is reasonably straightforward.
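
As a rough illustration of how the cron job and the database shutdown might fit together, here is a hedged sketch of a wrapper. Everything specific in it is an assumption: the service mysql commands, the script paths, the 02:30 schedule and the function name backup_with_db_stopped are placeholders, so substitute whatever your own database and system use.

```shell
#!/bin/bash
# Hypothetical wrapper around the backup script. The service commands, the
# script path and the schedule below are assumptions for illustration only.
#
# Example crontab entry to run the wrapper daily at 02:30:
#   30 2 * * * /usr/local/bin/ebs_backup_wrapper.sh

DB_STOP_CMD=${DB_STOP_CMD:-"service mysql stop"}
DB_START_CMD=${DB_START_CMD:-"service mysql start"}
BACKUP_CMD=${BACKUP_CMD:-"/usr/local/bin/tigerfish_ebs_backup.sh"}

backup_with_db_stopped() {
  local status
  # Left unquoted on purpose so each command string splits into words.
  $DB_STOP_CMD || return 1
  $BACKUP_CMD
  status=$?
  # Always try to bring the database back, even if the backup failed.
  $DB_START_CMD || return 1
  return "$status"
}
```

The wrapper script would end with a call to backup_with_db_stopped. The one design point worth keeping, whatever commands you use, is that the database is restarted even when the backup itself fails.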

To sum up, a big thanks to The Cloud Walker and Oren Solomianik for doing the work that we based our automatic EBS snapshots upon.

About The Author

Chris

Chris Cohen is a seasoned web developer who began in Perl, before moving on to Java, C# and currently PHP and Drupal. He regularly attends DrupalCon and is "Certified to Rock" (certifiedtorock.com).

18 Comments

Anonymous's picture

I am unclear on how you got the CreateSnapshotSample.php script to run via the command line. Can you give an example?

Here is what I got...

$ php CreateSnapshotSample.php -v vol-xxxxxxxx
Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0"?>
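<Response><Errors><Error><Code>MissingParameter</Code><Message>The request must contain the parameter volume</Message></Error></Errors><RequestID>xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx</RequestID></Response>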

Chris's picture

Hi there, It looks as though when you are using the -v switch, that's not working properly for some reason. We actually copied CreateSnapshotSample.php to make CreateSnapshot.php and modified it a little bit. For the sake of clarity, I didn't cover the modifications here. Check PHP's getopt() to make sure that the -v parameter is getting passed into the script properly, and that it is being passed to the code responsible for issuing the actual EC2 command.

Anonymous's picture

FWIW, there's no getopt() in the CreateSnapshotsample.php script. It'd be more clear if you published your CreateSnapshot.php script. Thanks!

Mahrob's picture

Hi Otto,

Great thread, I did all my setup following this.

I am also getting the above mentioned error

Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0"?>
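<Response><Errors><Error><Code>MissingParameter</Code><Message>The request must contain the parameter volume</Message></Error></Errors><RequestID>xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx</RequestID></Response>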

Can I hardcode a volume id instead of passing it via prompt? If yes, where should I put it?

Thanks for the great article!
Mahrob

Anonymous's picture

Hi,

I'm not able to add the request parameter in the create snapshot API. Can you guide me on how to do it?

Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: 7aff6baa-863c-4e3a-8e2b-50daca6135f0
XML:
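<Response><Errors><Error><Code>MissingParameter</Code><Message>The request must contain the parameter volume</Message></Error></Errors><RequestID>7aff6baa-863c-4e3a-8e2b-50daca6135f0</RequestID></Response>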

Ganesh's picture

Hi,
Does anyone know where to place the ec2-manage-snapshots.php file? Also, what does "Don't forget to touch your log file ....." mean? Thanks.

Chris's picture

It doesn't actually matter where the PHP script goes. You can simply run it from wherever it lies, with: php /path/to/my/phpfile.php

Touching a file that does not exist causes an empty file to be created. Doing this up front, as the user who will run the script, makes sure the log file exists and is writable; if it isn't, nothing will be logged. It's easy and you only need to do it once: touch /path/to/my/logfile.log

tommy's picture

Hi

Great work.

However I'm getting the following error trying to execute sudo php createsnapshot.php

Fatal error: Class 'Amazon_EC2_Client' not found in /.../.../createsnapshot.php on line 36

I have already included my ACCESS KEY ID & SECRET ACCESS KEY on the .config.inc.php file. Also this file is in the same directory as createsnapshot.php

Please advise.

Chris's picture

@tommy: when you download the Cloud Walker's scripts, there is a directory in the zip file called 'Amazon' which contains all the class libraries needed. You should have an Amazon directory at the same location as your createsnapshot.php. If you don't, you should go back to the Cloud Walker's archive and check you extracted everything.

ajmfulcher's picture

I've put together a web-based application to do this, called Ebs2s3. More details are here: http://ajmfulcher.blogspot.com/2011/04/ebs2s3-automated-backup-for-amazo...

Hope this helps out!

Bijo's picture

Hi,

I get this error. Can you figure out the issue.

# php CreateSnapshotSample.php -v vol-xxxxxxxx
PHP Notice: Undefined variable: request in /var/www/vhosts/ukourt.com/httpdocs/backup/Amazon/EC2/Samples/CreateSnapshotSample.php on line 60
Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0" encoding="UTF-8"?>
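<Response><Errors><Error><Code>MissingParameter</Code><Message>The request must contain the parameter volume</Message></Error></Errors><RequestID>xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx</RequestID></Response>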

I could see the -v option is working fine.

Pls advise.

Thx
Bijo

Chris's picture

Bijo, it seems like lots of people are having this problem. It doesn't seem to be a problem for us, however. If you do figure out what's going on here, do post it and we can update the article!

George's picture

Anyone playing with CreateSnapshotSample.php will need to add a $request array with the volume id and if you are using a different region set the ServiceURL on the connection object.

The $request array is as simple as:

$request = array('VolumeId = > 'vol-12345678');

To use the eu-west-1 region the connection should be something like this:

$ec2Config = array ('ServiceURL' => "https://eu-west-1.ec2.amazonaws.com");

$service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY, $ec2Config);

Hope this helps someone!

Bijo's picture

It works fine now. Thanks George :)

stephen's picture

George's script has a small syntax error in it... Here is mine which is for US East:

$request = array('VolumeId' => 'vol-12345678');
$ec2Config = array ('ServiceURL' => "https://us-east-1.ec2.amazonaws.com");
$service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, $ec2Config);

Anonymous's picture

Anyone else have problems due to snapshots for AMIs? You can't delete them until ami is deregistered, but I want to keep the ami

Matt's picture

Snapshots created for AMIs cannot be deleted until the AMI itself is deleted.

Matt's picture

If you don't want to set this up yourself, you can use Skeddly to create automatic EC2 snapshots.

http://www.skeddly.com
