Back

Branching

A closer look at database branches: how they work and what to keep in mind when choosing a serverless database provider.

If you’re coming from the Git versioning world of programming, you’ll be familiar with the concept of a branch and regardless of your Git provider, GitHub, GitLab, BitBucket etc, code branching always works in the same way. You branch off from e.g main, which creates an exact replica of the codebase from that point in time.

From here you’re able to develop new features, track down and fix bugs, all from the safety of an isolated environment and when you’re ready, you can create a Pull Request, then merge the changes back into the main branch.

But, does database branching work in the same way?

In this report we’ll discuss: What a database branch is, the different ways they work and how much they cost for the following providers:

What is a database branch?

At first glance, you might assume a database branch will work just like a code branch. But since a database has two distinct components, branching actually plays out a bit differently.

branch types diagram

Schema

First there’s the database schema. This database schema lays out the structure of the data. Data types, table names, and the relationships between tables.

Data

Secondly there’s the actual data, and there could be loads of it!

Understandably, for a database to be a database, both schema and data need to co-exist. So one would assume then, when creating a branch from e.g main, you’d create an exact replica of the schema and data from that point in time. Only some database providers work this way, while others only replicate the schema, and require you to seed the database with dummy data. There are pros and cons to each approach, and in the next section we’ll discuss what they are.

Production vs synthetic data in database branches

Production data is when you branch off from e.g main, and much like with a codebase, your branch is an exact replica of production, and uses the same schema and data.

Synthetic data relates to a database branch that only replicates the schema, but the data is synthetically created using a seeding step, or perhaps an anonymized subset of data from the production database.

Is testing with synthetic data good?

You can use synthetic data to test new features or investigate specific bugs. For instance, if you have populated your branch with synthetic data this approach would let you track down a bug in your application for a field like is_admin that was mistakenly set to VARCHAR(255) instead of BOOLEAN, or perhaps you could create a new feature and add a new field, such as user_age, for a form. In these kinds of cases, synthetic data is fine as it’s the data types that matters, not the amount of data.

Is testing with production data better?

Testing with a smaller amount of synthetic data means the database won’t behave as it does in production environments. Both PostgreSQL’s query planner and MySQL’s optimizer rely on data distribution statistics to create efficient execution plans. With production data, they can accurately assess whether to use an index or a sequential scan, or pick the best path for complex queries involving joins and subqueries. With synthetic data, the lack of realistic distribution can cause the optimizer to make suboptimal choices. This can result in queries that appear fast in development / testing, but slow down significantly in production.

Is production data safe to use?

In some cases replicating production data is simply not permitted. Many industries such as healthcare, finance, education, insurance, and government, restrict access to customer data, making it challenging for developers to test against production data. In such cases, sensitive user data, or specific fields within specific tables must be anonymized, and if this is required for each and every branch that is created, it could take a significant amount of time to create a new branch and subsequently impact developer productivity.

That said, many applications don’t contain PII (Personally Identifiable Information). PII generally refers to “user data”, or more specifically, some fields within user tables but, in many cases, data stored in the database isn’t related to individual users at all. In these situations, you’d almost always want a branch that includes real production data.

Key takeaways

Ultimately, there’s no right answer. If your data is sensitive, you’ll need to branch using schema-only and add a step to either generate synthetic data, or anonymize production data. But if your data isn’t subject to PII regulations, branching with production data is often a better choice.

Provider Data + Schema Schema Only
Neon
Supabase
PlanetScale

Database branching workflows

Like before, there’s no right answer to database branching workflows.

branching workflows diagram

There are many scenarios when there’s only one application using the database, and it can be helpful to link changes made to the database with an application Git branch, so when changes are made, they happen simultaneously.

But this only works with a one-to-one relationship, there are of course scenarios, generally with larger organizations, where the production database will be used by a number of applications.

Linking a database branch to a codebase branch, whilst helpful in the one-to-one relationship scenario, can become a hindrance when working with multiple applications. If all applications need to be updated, which Git repository would you use to link the database branch to?

Key takeaways

In almost all scenarios you’d probably want to keep the database branch completely separate from the codebase branches. This approach gives you much more flexibility to manage migrations and application updates.

Provider Linked with Codebase branch
Neon
Supabase
PlanetScale
It’s also worth mentioning that certain database changes can be made without requiring any changes to the codebase, such as adding indexes, creating or modifying views, increasing column lengths, or setting default values.

Branch creation speeds

Creating a code branch is instant since it’s all usually local. But understandably, setting up a new branch of a managed serverless database takes longer. How long depends on the technology your database provider uses. For example, if you’re using RDS, launching a new instance and provisioning resources can be a slow process. And as mentioned earlier, creating a schema only branch, setting up a production replica, adding seed data, or anonymizing production data can all add to the time it takes to spin up a database branch.

The chart below shows the results of some tests we conducted using 30 MB of data split across two tables for Neon, Supabase and PlanetScale.

Data + Schema Time (minutes) 0 1 2 3 4 5 Neon
Schema Only Time (minutes) 0 1 2 3 4 5 PlanetScale Supabase

Key takeaways

  • With Neon you get instant branches with schema + data and pretty soon Neon will have schema-only branches.

  • Supabase only offers branching with schema-only and took ~5 minutes to create a branch.

  • PlanetScale offers both schema-only, and schema + data (in beta with the Scaler Pro plan). We were only able to test schema-only with PlanetScale, which was able to spin up a new branch in seconds rather than minutes.

Branch Branching

Branching from main or production is typically a common approach, but there are many cases where branching from either a development, testing or staging branch is required.

branch branching diagram

Not every provider allows for this functionally and restricts branching from the main (production) branch, or where linked to a codebase Git branch where further limitations may apply.

Key takeaways

Similar points to Database branching workflows. Needing to create a codebase branch to create a database branch can result in a cumbersome branching strategy. There are many scenarios where the freedom to create a database branch independently of a codebase branch, or from any existing database branch would be a more convenient approach.

Provider Branch from branch
Neon
Supabase
PlanetScale

Resource Isolation

To cite RDS once again, typically when a “dev” instance is created it’ll use the same EC2 instance for compute.

resource isolation diagram

Depending on what kind of development work is being carried out will determine how much of the EC2’s compute is used, but regardless, it’s all still taking compute power away from the production database. Many of you will know this as the noisy neighbor effect.

Key takeaways

With all of the following providers, each branch is given its own compute. Any work being carried out on a branch won’t use resources assigned to the production environment resulting in no performance implications on the production database / end user.

Provider Isolated Resource
Neon
Supabase
PlanetScale

The cost of database branches

Running multiple database environments can add up fast. With RDS, for example, having “dev” and “test” databases running along with production database could triple your monthly costs. Branches are usually cheaper, but providers don’t all charge the same way for storage and compute. Without careful budgeting, you might still end up with a surprisingly high bill each month.

Key Takeaways

Disclaimer We’ve done our best to understand pricing, and it’s a pretty complicated area so check with each provider to confirm the latest pricing options.

branching costs diagram

Neon offers branching at no extra cost. With a free tier account you can create up to 10 branches per project, and with a pro account, up 500 branches per project.

In contrast, neither Supabase or PlanetScale include branching in their free plans, and both base their charges on usage time. Supabase charges ~$0.32 per branch per day, even if the branch isn’t active. PlanetScale, on the other hand gives ~1,440 free hours, and only charges when a branch is in use, at a rate of $0.014 per hour once you exceed the free allowance.

Provider Available to free tier predictable pricing
Neon
Supabase
PlanetScale

Finished

We hope this report gives you a clearer overview of the branching options currently available, along with some key considerations — including a few things you may not have thought of before. We’ve spent a good amount of time exploring branching solutions, and there’s a lot to unpack.

But to us the choice is clear. If you’re working with Postgres and need branching, Neon appears to be the most practical solution. The only feature it’s missing right now is schema-only branches, but rumor has it this feature is coming in 2025, so stay tuned!