Zero-downtime schema migrations in Postgres using views
Schema migrations are tricky, especially when planned downtime is not an option. Changes need to be carefully designed to stay backwards compatible so that both the new and old schema are available simultaneously. They must also be designed to not interfere with any other queries, for example by locking up tables. Plenty has been written on how to this right but one pattern I haven’t seen mentioned often is using views.
The idea is to encapsulate the data in the database. Clients will not query the actual tables storing data, but instead query views that map directly to the underlying table. Each version of the schema will have a set of views that never change. This way we can perform changes to the underlying table whilst maintaing a stable schema for clients.
The reason this technique isn’t commonly referenced might just be that it’s dumb, but that shouldn’t stop us from exploring it!
Renaming a column
Let’s test this pattern out with a deceptively simple migration: renaming a column. Say we have a non-nilable column named id
which we want to rename to user_id
. The recommended way of achieving this without downtime is to:
- Add a new nillable column called
user_id
- Add triggers that transfer any saved data from
user_id
toid
and vice versa - Backfill the new column with values from the old one
- Set
user_id
to be non-nilable - Once no clients use the old
id
column, remove it
Phew, that’s quite the process! How can views and encapsulation help us here? We start by setting up our table and inserting some data. Remember that this table will never be queried directly.
Now we need some views so that we can actually retrieve this data. A nice trick is to use Postgres schemas to hold our views. Every version of our schema will get its own Postgres schema and they will contain one view for each underlying table. If we call our starting schema “A”, we can set it up like:
Now all the clients have to do is specify which version of the schema they want during queries. To keep the application code agnostic of migrations, we can set the search path when connecting:
Now for the grand finale, how do we rename the id
column? If we perform a regular rename, Postgres will update the existing A.users
view to reference the new name while still calling it id
, keeping the change backwards compatible. We also need to introduce a new version of our schema which uses the new user_id
name, let’s call it “B”:
The rename is done! No downtime, no extra column, no triggers, no backfill. Just good old encapsulation.
Wrapping things up
Renaming a column is admittedly the simplest possible migration and throwing in things like foreign keys and indices will invariably complicate things. Hopefully this view-encapsulation technique will be helpful in more complex scenarios as well. Setting up a clear, versioned interface to the database can help keep application code simpler by keeping out migration-specific implementation details.
I’m currently working on a tool that aims to make zero-downtime schema migrations effortless for developers. If you think this sounds interesting, then please shoot me an email! If you’d like to discuss this article or just have a chat, then feel free to reach out as well.