An automated EBS backup solution using Amazon EC2

Chris
11 March, 2010 - 13:23

For a while now, we've been looking for a solution to a common problem on Amazon EC2 hosting: how do you automatically create regular snapshots of your many EBS (Elastic Block Store) volumes, but also not create so many snapshots that it becomes unmanageable? We might have found the answer.

Creating a snapshot of any given volume is not really a problem. But you want your snapshots created regularly (in our case, once a day) so that, if something goes wrong, you always have a very recent copy of your clients' data. At that rate you quickly build up a huge collection, most of which is not really needed.

'The Cloud Walker' evidently ran into the same problem. He has written a very nice PHP script, based on work by Oren Solomianik, that can manage older snapshots for you. You specify an EBS volume ID, for example vol-12345678, and the script searches for all of the snapshots for that volume. It then deletes any that are older than a week, unless they fall on a Sunday, in which case they're kept for a month, or on the first of the month, in which case they're kept indefinitely.

What you're left with is a full week of daily backups, then weekly backups going back a month, then monthly backups after that. This is based on the notion that the more time elapses, the less likely you are to need backups made on a specific date.
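As a rough illustration, the retention rule can be written as a small bash function. This is only a sketch of the rule as described here, not The Cloud Walker's actual code: the function name retention_action is made up, "a month" is assumed to mean 31 days, and GNU date is assumed for the -d option.

```bash
#!/bin/bash
# Sketch of the retention rule (hypothetical helper, not the real script).
# Assumes GNU date and dates given as YYYY-MM-DD.

# Prints "keep" or "delete" for a snapshot taken on $1, evaluated on day $2.
retention_action() {
  local snap_date="$1" today="$2"
  local snap_secs today_secs age_days day_of_month day_of_week

  # Work in UTC so daylight saving changes cannot skew the day count.
  snap_secs=$(date -u -d "$snap_date" +%s)
  today_secs=$(date -u -d "$today" +%s)
  age_days=$(( (today_secs - snap_secs) / 86400 ))
  day_of_month=$(date -u -d "$snap_date" +%d)
  day_of_week=$(date -u -d "$snap_date" +%u)   # 1 = Monday ... 7 = Sunday

  if [ "$day_of_month" = "01" ]; then
    echo keep      # first of the month: kept indefinitely
  elif [ "$age_days" -le 7 ]; then
    echo keep      # no older than a week: part of the daily set
  elif [ "$day_of_week" = "7" ] && [ "$age_days" -le 31 ]; then
    echo keep      # Sunday snapshot: kept for a month (assumed 31 days)
  else
    echo delete
  fi
}
```

Called as retention_action 2010-02-28 2010-03-11, this prints keep, because 28 February 2010 was a Sunday and the snapshot is eleven days old. A Tuesday snapshot of similar age, such as 2010-03-02, gives delete.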

The script itself is written in PHP, and comes bundled with Amazon's PHP tools. There are, however, a few requirements:

Firstly, you will need the PHP command line interpreter. Many systems will come with PHP for the web server, say Apache, but not PHP that can be run from a command line. On Ubuntu (and probably Debian too), you can just sudo apt-get install php5-cli to install it, and then you will be able to run PHP scripts by typing php myscript.php from the command line.

Secondly, you will need the PHP XSLT processor, or you will see a message like Fatal error: Class 'XsltProcessor' not found. On Ubuntu systems (and probably Debian too) you can simply sudo apt-get install php5-xsl to overcome this.

Thirdly, The Cloud Walker's script, at the time of writing, uses long command line options. These are switches such as --region=foo rather than simply -r foo. Unfortunately, on most systems, prior to PHP 5.3, PHP did not support long options. Because we use Drupal 6 on our production sites, and at the time of writing, Drupal 6 does not work with PHP 5.3, we have had to stay with PHP 5.2 and simply strip the long options out.

We needed to open up The Cloud Walker's script and insert our Amazon access key and secret key. Since we only use one region, we also hard-coded that region into the script. After that, we could supply just the volume ID to the script, as in php ec2-manage-snapshots.php -v vol-12345678, and it automatically decided which snapshots to keep and which to delete.

Now we just needed an automated way of actually creating the snapshots. In the bundled Amazon PHP tools, in the examples folder, is CreateSnapshotSample.php, which creates a snapshot for a given volume. We made a copy of this, called CreateSnapshot.php, and modified it a little so that it takes the volume ID from a -v switch (those modifications are not covered here). We also made a copy of the .config.inc.php in the same folder, because this is where Amazon wants you to put the access key and secret key.

We needed to change the code slightly, because it is not 'region aware' and we needed to instruct it to use a particular region. Luckily, the code for this is inside The Cloud Walker's script, and this worked fine for us:

  $ec2Config = array('ServiceURL' => 'https://eu-west-1.ec2.amazonaws.com');
  $service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, $ec2Config);

We tested this by creating a few snapshots from the command line, and things looked like they were ready to rock and roll. The only problem was that our snapshot creation and snapshot management scripts could only operate on one volume ID at a time. We needed a way to manage a whole bunch of EBS volumes in one go, so we put together a bash script that looks a bit like this:

  #!/bin/bash

  DATESTAMP=$(date +%Y%m%d)
  TIMESTAMP=$(date +%H%M)
  LOGFILE="/var/log/tigerfish_ebs_backup.log"

  VOLUMES=( vol-12345678 vol-23456789 vol-34567890 )

  echo "TIGERFISH EBS BACKUP $DATESTAMP $TIMESTAMP" 2>&1 | tee -a "$LOGFILE"
  echo " " 2>&1 | tee -a "$LOGFILE"

  # Create a snapshot of each volume.
  for volume in "${VOLUMES[@]}"
  do
    php CreateSnapshot.php -v "$volume" 2>&1 | tee -a "$LOGFILE"
  done

  # Remove older snapshots we don't need to keep any more.
  for volume in "${VOLUMES[@]}"
  do
    php ec2-manage-snapshots.php -v "$volume" 2>&1 | tee -a "$LOGFILE"
  done

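Don't forget to touch your log file before you run this script, and make sure the user running the backup can write to it. tee in append mode will create a missing file by itself, but only if that user is allowed to create files in /var/log, which an unprivileged account usually is not, and in that case nothing gets logged.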
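The log file preparation can also be made an explicit guard at the top of the script. The following is a minimal sketch; the function name ensure_logfile is hypothetical and not part of the original script.

```bash
#!/bin/bash
# Hypothetical helper: make sure the log file exists and is writable
# before anything is piped into tee -a.
ensure_logfile() {
  local logfile="$1"
  # touch creates an empty file if none exists and leaves existing content alone.
  if ! touch "$logfile" 2>/dev/null || [ ! -w "$logfile" ]; then
    echo "Cannot write to $logfile" >&2
    return 1
  fi
}
```

In the backup script this could be called as ensure_logfile "$LOGFILE" || exit 1 right after LOGFILE is set, so a permissions problem shows up immediately rather than as a silently empty log.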

The last step was simple: set up a cron job to run the bash script daily. There is one thing to note here: we have our databases on an EBS volume, and Amazon's tools make particular mention of the fact that you should shut off your database server while you're performing the snapshot, otherwise the snapshot could become corrupted. This is something we won't cover here, but is reasonably straightforward.
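For what it's worth, here is a minimal sketch of one way the database step could be wrapped around the backup. Everything named in it is an assumption rather than part of our setup as described: a MySQL server controlled with service mysql stop and service mysql start, the bash script saved at a made-up path, and the function name run_backup_with_db_stopped.

```bash
#!/bin/bash
# Hypothetical wrapper: stop the database, run the backup, always restart.
# The three commands below are assumptions; override them to suit your system.
DB_STOP_CMD=${DB_STOP_CMD:-"service mysql stop"}
DB_START_CMD=${DB_START_CMD:-"service mysql start"}
BACKUP_CMD=${BACKUP_CMD:-"/usr/local/bin/tigerfish_ebs_backup.sh"}

run_backup_with_db_stopped() {
  local status=0

  # Do not snapshot at all if the database could not be stopped.
  $DB_STOP_CMD || return 1

  # Remember the backup's exit status, but carry on to the restart.
  $BACKUP_CMD || status=$?

  # Restart the database even when the backup failed.
  $DB_START_CMD || return 1

  return $status
}
```

A script that ends with a call to run_backup_with_db_stopped could then be scheduled with a crontab entry such as the one below, where the time of day and the wrapper's path are again just examples:

  30 2 * * * /usr/local/bin/ebs_backup_wrapper.sh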

To sum up, a big thanks to The Cloud Walker and Oren Solomianik for doing the work that we based our automatic EBS snapshots upon.

Comments

Anonymous
2 April, 2010 - 21:57

I am unclear on how you got the CreateSnapshotSample.php script to run via the command line. Can you give an example?

Here is what I got...

$ php CreateSnapshotSample.php -v vol-xxxxxxxx
Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0"?>
MissingParameterThe request must contain the parameter volumexxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx

Chris
7 April, 2010 - 08:25

Hi there. It looks as though when you are using the -v switch, that's not working properly for some reason. We actually copied CreateSnapshotSample.php to make CreateSnapshot.php and modified it a little bit. For the sake of clarity, I didn't cover the modifications here. Check PHP's getopt() to make sure that the -v parameter is getting passed into the script properly, and that it is being passed to the code responsible for issuing the actual EC2 command.

Anonymous
7 June, 2011 - 21:06

FWIW, there's no getopt() in the CreateSnapshotsample.php script. It'd be more clear if you published your CreateSnapshot.php script. Thanks!

Mahrob
29 May, 2010 - 10:48

Hi Otto,

Great thread, I did all my setup following this.

I am also getting the above mentioned error

Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0"?>
MissingParameterThe request must contain the parameter volumexxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx

Can I hardcode a volume id instead of passing it via prompt? If yes, where should I put it?

Thanks for the great article!
Mahrob

Anonymous
23 August, 2010 - 13:47

Hi,

I'm not able to add the request parameter in the create snapshot API. Can you guide me on how to do it?

Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: 7aff6baa-863c-4e3a-8e2b-50daca6135f0
XML:
MissingParameterThe request must contain the parameter volume7aff6baa-863c-4e3a-8e2b-50daca6135f0

Ganesh
24 November, 2010 - 12:27

Hi,
Does anyone know where to place the ec2-manage-snapshots.php file? And also "Don't forget to touch your log file ....." what does this mean? Thanks.

Chris
24 November, 2010 - 12:37

It doesn't actually matter where the php script goes. You can simply run it from wherever it lies, with:

  php /path/to/my/phpfile.php

Touching a file that does not exist will cause an empty file to be created. This is important if you need logging because if you don't do this, nothing will be logged. It's easy and you only need to do it once:

  touch /path/to/my/logfile.log

tommy
19 January, 2011 - 18:01

Hi

Great work.

However I'm getting the following error trying to execute sudo php createsnapshot.php

Fatal error: Class 'Amazon_EC2_Client' not found in /.../.../createsnapshot.php on line 36

I have already included my ACCESS KEY ID & SECRET ACCESS KEY on the .config.inc.php file. Also this file is in the same directory as createsnapshot.php

Please advise.

Chris
20 January, 2011 - 10:19

@tommy: when you download the Cloud Walker's scripts, there is a directory in the zip file called 'Amazon' which contains all the class libraries needed. You should have an Amazon directory at the same location as your createsnapshot.php. If you don't, you should go back to the Cloud Walker's archive and check you extracted everything.

ajmfulcher
29 April, 2011 - 20:31

I've put together a web-based application to do this, called Ebs2s3. More details are here: http://ajmfulcher.blogspot.com/2011/04/ebs2s3-automated-backup-for-amazo...

Hope this helps out!

Bijo
16 June, 2011 - 08:02

Hi,

I get this error. Can you figure out the issue.

# php CreateSnapshotSample.php -v vol-xxxxxxxx
PHP Notice: Undefined variable: request in /var/www/vhosts/ukourt.com/httpdocs/backup/Amazon/EC2/Samples/CreateSnapshotSample.php on line 60
Caught Exception: The request must contain the parameter volume
Response Status Code: 400
Error Code: MissingParameter
Error Type: Unknown
Request ID: xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx
XML: <?xml version="1.0" encoding="UTF-8"?>
MissingParameterThe request must contain the parameter volumexxxxxxxx-xxxx-xxxx-xxxxxxxxxxxx

I could see the -v option is working fine.

Pls advise.

Thx
Bijo

Chris
16 June, 2011 - 08:09

Bijo, it seems like lots of people are having this problem. It doesn't seem to be a problem for us, however. If you do figure out what's going on here, do post it and we can update the article!

George
21 June, 2011 - 12:40

Anyone playing with CreateSnapshotSample.php will need to add a $request array with the volume id and if you are using a different region set the ServiceURL on the connection object.

The $request array is as simple as:

$request = array('VolumeId = > 'vol-12345678');

To use the eu-west-1 region the connection should be something like this:

$ec2Config = array ('ServiceURL' => "https://eu-west-1.ec2.amazonaws.com");

$service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID,
AWS_SECRET_ACCESS_KEY, $ec2Config);

Hope this helps someone!

Bijo
22 June, 2011 - 07:45

It works fine now. Thanks George :)

stephen
4 October, 2011 - 02:00

George's script has a small syntax error in it. Here is mine, which is for US East:

$request = array('VolumeId' => 'vol-12345678');
$ec2Config = array ('ServiceURL' => "https://us-east-1.ec2.amazonaws.com");
$service = new Amazon_EC2_Client(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, $ec2Config);

Anonymous
24 March, 2012 - 07:35

Anyone else have problems due to snapshots for AMIs? You can't delete them until ami is deregistered, but I want to keep the ami

Matt
16 May, 2012 - 19:07

Snapshots created for AMIs cannot be deleted until the AMI itself is deleted.

Matt
16 May, 2012 - 19:06

If you don't want to set this up yourself, you can use Skeddly to create automatic EC2 snapshots.

http://www.skeddly.com

TroyB
15 August, 2012 - 22:52

Does the ec2-manage-snapshots.php script ignore errors on volumes that cannot be deleted due to the fact that they are associated with an AMI? We have a script now that stops due to those volumes and our snapshot list grows unnecessarily.

Lera1985
5 September, 2012 - 06:29

Regards for sharing the information with us on tiger-fish.com.

Colin Johnson
20 September, 2012 - 15:12

I have written a very similar open-source backup tool using bash instead of PHP - it is available at http://awsmissingtools.com - look for the tool "ec2-automate-backups." The tool has a few very cool features - it can backup based on tags and it can backup multiple EBS volumes in one run.

stuck in Windows world
26 April, 2013 - 14:29

I am stuck with Windows instances, what do I do?
