Fabian Lindfors

Zero-downtime schema migrations in Postgres using views

Schema migrations are tricky, especially when planned downtime is not an option. Changes need to be carefully designed to stay backwards compatible so that both the new and old schema are available simultaneously. They must also be designed to not interfere with any other queries, for example by locking up tables. Plenty has been written on how to this right but one pattern I haven’t seen mentioned often is using views.

The idea is to encapsulate the data in the database. Clients will not query the actual tables storing data, but instead query views that map directly to the underlying table. Each version of the schema will have a set of views that never change. This way we can perform changes to the underlying table whilst maintaing a stable schema for clients.

The reason this technique isn’t commonly referenced might just be that it’s dumb, but that shouldn’t stop us from exploring it!

Renaming a column

Let’s test this pattern out with a deceptively simple migration: renaming a column. Say we have a non-nilable column named id which we want to rename to user_id. The recommended way of achieving this without downtime is to:

  1. Add a new nillable column called user_id
  2. Add triggers that transfer any saved data from user_id to id and vice versa
  3. Backfill the new column with values from the old one
  4. Set user_id to be non-nilable
  5. Once no clients use the old id column, remove it

Phew, that’s quite the process! How can views and encapsulation help us here? We start by setting up our table and inserting some data. Remember that this table will never be queried directly.

CREATE TABLE users (
	id SERIAL PRIMARY KEY,
	email TEXT,
	name TEXT
);

INSERT INTO users (email, name) VALUES
	('jerry.seinfeld@aol.com', 'Jerry Seinfeld'),
	('george.constanza@aol.com', 'George Costanza');

Now we need some views so that we can actually retrieve this data. A nice trick is to use Postgres schemas to hold our views. Every version of our schema will get its own Postgres schema and they will contain one view for each underlying table. If we call our starting schema “A”, we can set it up like:

CREATE SCHEMA A;

CREATE VIEW A.users AS
	SELECT id, email, name
	FROM public.users -- public is the default schema where the table was created

Now all the clients have to do is specify which version of the schema they want during queries. To keep the application code agnostic of migrations, we can set the search path when connecting:

SET search_path TO A;

-- users will now refer to A.users
SELECT id, email, name FROM users;

-- Results:
-- +----+-------------------------+-----------------+
-- | id |          email          |      name       |
-- +----+-------------------------+-----------------+
-- |  1 | jerry.seinfeld@aol.com  | Jerry Seinfeld  |
-- |  2 | george.costanza@aol.com | George Costanza |
-- +----+-------------------------+-----------------+

Now for the grand finale, how do we rename the id column? If we perform a regular rename, Postgres will update the existing A.users view to reference the new name while still calling it id, keeping the change backwards compatible. We also need to introduce a new version of our schema which uses the new user_id name, let’s call it “B”:

ALTER TABLE public.users
RENAME COLUMN id TO user_id;

CREATE SCHEMA B;
CREATE VIEW B.users AS
	SELECT user_id, email, name
	FROM public.users;

The rename is done! No downtime, no extra column, no triggers, no backfill. Just good old encapsulation.

-- Older clients can still access the old column name
SET search_path TO A;
SELECT id, email, name FROM users;

-- Results:
-- +----+-------------------------+-----------------+
-- | id |          email          |      name       |
-- +----+-------------------------+-----------------+
-- |  1 | jerry.seinfeld@aol.com  | Jerry Seinfeld  |
-- |  2 | george.costanza@aol.com | George Costanza |
-- +----+-------------------------+-----------------+


-- New clients just have to update their search path
SET search_path TO B;
SELECT user_id, email, name FROM users;

-- Results:
-- +---------+-------------------------+-----------------+
-- | user_id |          email          |      name       |
-- +---------+-------------------------+-----------------+
-- |       1 | jerry.seinfeld@aol.com  | Jerry Seinfeld  |
-- |       2 | george.costanza@aol.com | George Costanza |
-- +---------+-------------------------+-----------------+

Wrapping things up

Renaming a column is admittedly the simplest possible migration and throwing in things like foreign keys and indices will invariably complicate things. Hopefully this view-encapsulation technique will be helpful in more complex scenarios as well. Setting up a clear, versioned interface to the database can help keep application code simpler by keeping out migration-specific implementation details.

I’m currently working on a tool that aims to make zero-downtime schema migrations effortless for developers. If you think this sounds interesting, then please shoot me an email! If you’d like to discuss this article or just have a chat, then feel free to reach out as well.