Migrating a Production PostgresSQL DB to Amazon RDS

Loren BurtonNovember 16, 2013

Scenario: You currently have a production Rails app and a Postgres database server running in a single EC2 instance. You may have heard this is not great.

Why this is bad

As explained by Amazon’s CTO, Werner Vogels, "An EC2 instance is not a server — it’s a building block." Effectively, EC2 instances running your application code are disposable (read more about this). If anything goes wrong on an EC2 instance, or if an instance is unable to keep up with load and needs more resources, that instance should be killed and replaced with a new one. The problem arises if an EC2 instance that's running your application also contains your database - you can't simply terminate it and spin up a new one - you'll lose your data.

The solution is to rip your database from your application and run them as two separate instances, then make them talk to each other. Doing this early also simplifies things down the road when you're scaling and provisioning multiple app instances that need to share a database, but that's a topic for another post.

Hello Amazon Relational Database Service (RDS)

An RDS DB instance is similar to an EC2 instance, but it's preconfigured and optimized for hosting databases - and ONLY databases (no application code). You specify the type of database you want and allocate resources, and Amazon takes care of everything else.

As an alternative, you could spin up a dedicated EC2 instance and manually configure it for your database, but there are still some benefits to RDS:

The biggest benefit to RDS over rolling your own solution is that Amazon handles all of the administrative and management tasks, so you don't have to think about any of it - and it's dead simple to setup and link to your Rails app on EC2.

Getting ready

A few things you should know about your EC2 instance and Rails app before migrating your data to RDS:

Authorize EC2 access with a DB Security Group

A DB security group controls who can talk to your database instance. In our case, we want our EC2 instance to talk to the database, and that's it.
We'll create the security group now, so we can attach it to the the database instance when we provision it.

From the RDS Dashboard in your AWS Console:

  1. Click Security Groups on the left, then Create DB Security Group
  2. Choose a name (eg ec2-access) and description
  3. Click the details icon for the new security group
  4. For Connection Type, select EC2 Security Group
  5. Select the security group of your EC2 instance and click Add

Your (future) DB instance is now set to comminucate with your EC2 instance!

Provision your Postgres RDS DB instance

Setup your DB instance through the RDS Dashboard in AWS console. Choose Launch DB Instance > Postgres. The guide is pretty straightforward, but some things to note:

Give your instance a few minutes to spin up.

Migrate data and point EC2 to RDS

Now the fun part - moving your data to the new DB instance!

  1. Lock down your EC2 database to prevent writes. Easiest way to do this is throw up a maintance page, or shut your app server down, however you see fit.
  2. Run this pg_dump command from your EC2 instance to populate the new database on RDS:

    sudo -u ec2_user pg_dump -U ec2_user ec2_production | psql -h dbinstancename.xxxxx.us-east-1.rds.amazonaws.com -U ec2_user myrds_production
    

    where myrds_production is the database name you specified for your DB instance, and dbinstancename.xxxxx.us-east-1.rds.amazonaws.com is the DB instance endpoint without the port (eg 5432) from RDS Dashboard.

    If you run into any errors here, most likely you:

    a) did not setup security groups correctly (make sure EC2 security group matches DB security group), or

    b) did not specify usernames correctly (make sure new database owner is same as old database owner)

    To verify that your new DB was populated, you can do something like:

    psql -h dbinstancename.xxxxx.us-east-1.rds.amazonaws.com -U ec2_user myrds_production;
    myrds_production=> \d
    

    which should print out the relations of your table. If it's empty, something went wrong.

  3. Point EC2 to RDS by modifying your database.yml file:

    production:
      adapter: postgresql
      encoding: unicode
      database: myrds_production
      pool: 10
      username: ec2_user
      password: rds_password
      host: dbinstancename.xxxxx.us-east-1.rds.amazonaws.com
      # port: 5432
      # socket: /tmp/postgresql.sock
    

    Note that port and socket are not needed. At this point, it's a good idea to fire up a rails console to make sure your app is pointing to RDS and your data is there.

  4. If all went well, you can remove your maintance page and restart your server.

That's it!

Your Rails app is now be using the new database on your PostgreSQL RDS DB instance. Any migrations you run from your app will modify the database on RDS, just as if it were a local database.

Enjoy your new DB instance!