Automating AWS Lightsail backups using snapshots and Lambda

Some of the most glaring omissions from Lightsail are scheduled tasks or triggers – which would provide the ability to automate backups. Competitors in this space like DigitalOcean are all set, as they offer a backup option, whereas for AWS I’m assuming they hope you’ll shift over to EC2 as fast as possible to get the extra bells and whistles.

Of course you can manually create snapshots – just log in and hit the button. It’s just the scheduling that’s missing.

I have one Lightsail server that’s been running for 6 months now, and it’s all been rosy. Except – I had been using a combination of first AWS-CLI automated backups (which wasn’t ideal as it needed a machine to run them), and then some GUI automation via Skeddly. However – while Skeddly works just fine, I’d rather DIY this problem using Lambda and keep everything in cloud native functions.

Introducing…a Lambda function written in Python!

There’s a repo under the Amazon Archives called lightsail-auto-snapshots – I used this as the base for my function (I didn’t try the deployment template as I wanted to create it end to end). Although it’s been archived, it still works and was a good starting point.

1: Create the function, and a suitable role

Head over to the Lambda console to create the function. I created mine in the same region as my Lightsail instances, for my own sanity.

To create, hit the “Create function” button and fill in the basics. We’ll need to create a custom role to determine what the function can access.

Name: lightsail-auto-snapshots; Runtime: Python 2.7; Role: Create a custom role

The role will need to be able to put logs to CloudWatch, but I’ll configure the settings in the next step.

Role Name: lightsail-snapshot-lambda; Policy Document: only log permissions for now

Once the role was created, I headed over to IAM to attach a second policy to it. Alternatively, I could have combined the two into a single policy for the role.

The second policy definition:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": [
                "lightsail:GetInstances",
                "lightsail:DeleteInstanceSnapshot",
                "lightsail:GetInstanceSnapshots",
                "lightsail:CreateInstanceSnapshot"
            ],
            "Resource": "*"
        }
    ]
}

The second policy has been added (“LightsailBackup”) to the lightsail-snapshot-lambda role

Back in Lambda, we can see that both services appear on the right – meaning the policy has been applied to the relevant role
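
If you would rather script this step than click through IAM, a minimal sketch with boto3 might look like the following. Note that it swaps in an inline policy (put_role_policy) rather than the separately attached policy from the console walkthrough, and the helper name attach_backup_policy is hypothetical. The role and policy names are the ones used above.

```python
import json

# The same permissions as the policy document above, expressed as a dict.
LIGHTSAIL_BACKUP_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "lightsail:GetInstances",
                "lightsail:DeleteInstanceSnapshot",
                "lightsail:GetInstanceSnapshots",
                "lightsail:CreateInstanceSnapshot",
            ],
            "Resource": "*",
        }
    ],
}


def attach_backup_policy(iam_client,
                         role_name="lightsail-snapshot-lambda",
                         policy_name="LightsailBackup"):
    """Add the policy to the role as an inline policy.

    iam_client is expected to be boto3.client('iam'); it is passed in
    so that nothing talks to AWS until you call this yourself.
    """
    iam_client.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(LIGHTSAIL_BACKUP_POLICY),
    )
```

With credentials configured, calling attach_backup_policy(boto3.client('iam')) should leave the role with both the logging and the Lightsail permissions.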

Now that the function has been created with the right role, we just need to configure the function code and parameters.

2: Add code and parameters

Let’s start by adding some script. Create index.py and take the script from the repository (or my slightly modified version below):

from __future__ import print_function
import boto3
from datetime import datetime, timedelta
from os import getenv
from sys import stdout
from time import time

DEFAULT_RETENTION_DAYS = 14
AUTO_SNAPSHOT_SUFFIX = 'auto'


def handler(event, context):
    client = boto3.client('lightsail')
    retention_days = int(getenv('RETENTION_DAYS', DEFAULT_RETENTION_DAYS))
    retention_period = timedelta(days=retention_days)

    print('Maintaining snapshots for ' + str(retention_days) + ' days')
    _snapshot_instances(client)
    _prune_snapshots(client, retention_period)


def _snapshot_instances(client, time=time, out=stdout):
    for page in client.get_paginator('get_instances').paginate():
        for instance in page['instances']:
            snapshot_name = '{}-system-{}-{}'.format(instance['name'],
                                                     int(time() * 1000),
                                                     AUTO_SNAPSHOT_SUFFIX)

            client.create_instance_snapshot(instanceName=instance['name'],
                                            instanceSnapshotName=snapshot_name)
            print('Created Snapshot name="{}"'.format(snapshot_name), file=out)


def _prune_snapshots(client, retention_period, datetime=datetime, out=stdout):
    for page in client.get_paginator('get_instance_snapshots').paginate():
        for snapshot in page['instanceSnapshots']:
            name, created_at = snapshot['name'], snapshot['createdAt']
            now = datetime.now(created_at.tzinfo)
            is_automated_snapshot = name.endswith(AUTO_SNAPSHOT_SUFFIX)
            has_elapsed_retention_period = now - created_at > retention_period

            if (is_automated_snapshot and has_elapsed_retention_period):
                client.delete_instance_snapshot(instanceSnapshotName=name)
                print('Deleted Snapshot name="{}"'.format(name), file=out)

Filename: index.py; Runtime: Python 2.7; Handler: index.handler

Ensure the runtime is still set to Python 2.7 and the handler refers to the index file.
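
Before pointing the function at real snapshots, it can be reassuring to check the pruning rule on its own. The sketch below pulls the two conditions from _prune_snapshots into a hypothetical helper, should_prune, and runs it against made-up snapshot names and dates. It is written for a local Python 3 interpreter (it uses timezone.utc) and never touches AWS.

```python
from datetime import datetime, timedelta, timezone

AUTO_SNAPSHOT_SUFFIX = 'auto'


def should_prune(name, created_at, now, retention_period):
    """Mirror the rule in _prune_snapshots: only snapshots created by the
    script (name ends in the suffix) AND older than the retention period."""
    is_automated_snapshot = name.endswith(AUTO_SNAPSHOT_SUFFIX)
    has_elapsed_retention_period = now - created_at > retention_period
    return is_automated_snapshot and has_elapsed_retention_period


# Made-up reference time and snapshots, purely for illustration.
now = datetime(2019, 1, 31, tzinfo=timezone.utc)
retention = timedelta(days=14)
old = now - timedelta(days=15)
recent = now - timedelta(days=3)

print(should_prune('web-system-1546300800000-auto', old, now, retention))     # True
print(should_prune('web-system-1548000000000-auto', recent, now, retention))  # False
print(should_prune('before-upgrade', old, now, retention))                    # False
```

The last case is the important one: a manual snapshot is left alone however old it is, because its name does not end with the auto suffix.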

Variable name: RETENTION_DAYS; Value: 14

The only setting I configured through environment variables was the retention period, which I kept low to control cost.

Role: lightsail-snapshot-lambda; Timeout: 30s

The last bit of config here is to ensure the role is correct, and that you’ve set a description and timeout.

3: Create the CloudWatch Event Rule

The event rule will trigger our function, and is very straightforward to set up. You can create a new rule through the Lambda designer.

Rule Name: Daily_Midnight; Schedule expression: cron(0 0 * * ? *)

This creates a rule which executes every day at midnight UTC, using a cron expression.
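
The cron syntax here differs a little from the classic five-field crontab: CloudWatch Events expressions have six fields (minutes, hours, day of month, month, day of week, year), one of the two day fields has to be a ?, and times are evaluated in UTC. As a rough illustration, the hypothetical helper below simply labels each field of the expression used above; it does not validate the values.

```python
# Field order used by CloudWatch Events cron expressions.
CRON_FIELDS = ('minutes', 'hours', 'day_of_month',
               'month', 'day_of_week', 'year')


def describe_schedule(expression):
    """Split a 'cron(...)' schedule expression into its six named fields."""
    if not (expression.startswith('cron(') and expression.endswith(')')):
        raise ValueError('expected an expression of the form cron(...)')
    values = expression[len('cron('):-1].split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError('expected {} fields, got {}'.format(
            len(CRON_FIELDS), len(values)))
    return dict(zip(CRON_FIELDS, values))


fields = describe_schedule('cron(0 0 * * ? *)')
for field_name in CRON_FIELDS:
    print('{:>12}: {}'.format(field_name, fields[field_name]))
```

Read this way, the rule fires at minute 0 of hour 0, on every day of every month and year, with the day of week left unspecified.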

The finished designer view has the trigger from CloudWatch Events on the left, and actions for the two resources on the right

And we’re done!

What about the cost?

As AWS operates on a pay-per-use system, it’s interesting to note the potential cost of both taking and storing these snapshots.

On a test run, the function took 6.3s to snapshot 6 Lightsail instances and remove 6 old snapshots

On the Lambda side, it took ~6 seconds to run for 6 instances in Lightsail (Lightsail will take a few minutes to create these snapshots, but our script doesn’t check for completion).

For a 128MB function there is a considerable free tier, and further requests won’t break the bank

If we assume the function runs up to 31 times a month (i.e. every day), then we’ll consume roughly 6 × 31 = 186 seconds a month. There’s a lot of headroom here on the free tier, but even without that it’ll still cost less than $0.01.
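
To put rough numbers on that, here is the same arithmetic as a small script. The per-GB-second price and the free tier allowance are assumptions based on the commonly published Lambda figures, not values taken from this setup, so check the current pricing page for your region. The tiny per-request charge is ignored.

```python
RUNS_PER_MONTH = 31          # one run per day, worst case
SECONDS_PER_RUN = 6          # observed duration from the test run
MEMORY_GB = 128 / 1024.0     # a 128MB function

# Assumptions: published Lambda pricing, may differ by region and date.
PRICE_PER_GB_SECOND = 0.0000166667
FREE_TIER_GB_SECONDS = 400000

billed_seconds = RUNS_PER_MONTH * SECONDS_PER_RUN
gb_seconds = billed_seconds * MEMORY_GB
cost_without_free_tier = gb_seconds * PRICE_PER_GB_SECOND

print('Billed seconds per month:', billed_seconds)   # 186
print('GB-seconds per month:', gb_seconds)           # 23.25
print('Share of free tier used: {:.4%}'.format(gb_seconds / FREE_TIER_GB_SECONDS))
print('Cost without free tier: ${:.6f}'.format(cost_without_free_tier))
```

Under those assumptions the compute cost is a small fraction of a cent, which is why the storage side is the part worth worrying about.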

The actual snapshot is listed for the provisioned size of the instance or disk

The storage of the snapshot is a different matter, and is billed as one of the Lightsail services. This part is much less clear, both because of the way snapshots are billed and because of the way they are presented in the Lightsail console.

Simply, you pay for the provisioned size for the first snapshot, then a top-up for the delta on the following snapshots

Bearing the above in mind, it suggests that you will pay a base rate of $2/month for a 40GB snapshot. On top of this, you might have a permanent 14 days of backups, each with a 1GB delta (this would be generous for a web server) – adding another $0.70 to total $2.70/month for a rolling 14 days of snapshots.
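
The estimate can be reproduced with a few lines of Python. The $0.05 per GB-month rate is an assumption implied by the $2/month figure for a 40GB snapshot, and the function name monthly_snapshot_cost is made up for this sketch.

```python
# Assumption: $2/month for 40GB implies $0.05 per GB-month.
PRICE_PER_GB_MONTH = 0.05


def monthly_snapshot_cost(provisioned_gb, retained_snapshots, delta_gb,
                          price_per_gb_month=PRICE_PER_GB_MONTH):
    """Base snapshot at the provisioned size plus one delta per retained day."""
    base = provisioned_gb * price_per_gb_month
    deltas = retained_snapshots * delta_gb * price_per_gb_month
    return base + deltas


total = monthly_snapshot_cost(provisioned_gb=40,
                              retained_snapshots=14,
                              delta_gb=1)
print('Estimated storage cost: ${:.2f}/month'.format(total))  # $2.70/month
```

Changing delta_gb or the retention period shows how quickly the rolling deltas, rather than the base snapshot, come to dominate on a busier server.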

Suddenly things are getting expensive!