What is a merge queue

Greg Foster
Graphite software engineer

A merge queue is a dev tool that ensures zero merge skew in a code repository. Almost all teams adopt a merge queue or similar solution when they reach a fast enough rate of code changes merging into a repo. Why? Because with enough engineers merging code to the same repo, the rate of semantic conflicts reaches a breaking point and starts blocking eng productivity across the team. CI and builds begin failing more and more frequently on trunk, and development becomes cripplingly slow.

There's little harm in adopting a merge queue early besides the generally increased complexity of added tooling. At the simplest level, a merge queue ensures that your changes still build and pass tests even if trunk has advanced since you created the PR. The risk of "breaking trunk" due to these race-condition style collisions can exist on repos with as few as one or two collaborators. However, for slower teams, the cost of breaking trunk may not outweigh the slight complexity of introducing a queue.

As a team increases in size and the rate of merging accelerates, two things happen: the chance of semantic conflicts increases, and the cost of breaking trunk increases. Let's consider both.

As the team and volume of pull requests grow, each individual merge becomes riskier, even if it passes CI before merging. Why?

The median PR takes 14 hours to merge. This duration matters because other PRs may merge during that time, which can lead to what we call a semantic conflict. A semantic conflict occurs when a PR that passed when tested on its own no longer works once another PR merges into trunk ahead of it. These collisions are not simple merge conflicts, which the version control system detects; they are runtime or build failures that only appear once the combined changes are integrated.
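To make the idea concrete, here is a minimal sketch of a semantic conflict. Everything in it is a toy assumption for illustration: the `Repo` and `PR` classes, the `charge` function name, and a stand-in "CI" that only checks that every called function is defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repo:
    """Toy repository: which functions exist and which ones are called."""
    defined: frozenset
    called: frozenset


@dataclass(frozen=True)
class PR:
    """Toy pull request, expressed as a diff against the repo."""
    name: str
    add_defs: frozenset = frozenset()
    remove_defs: frozenset = frozenset()
    add_calls: frozenset = frozenset()


def apply(repo: Repo, pr: PR) -> Repo:
    """Merge a PR into the repo by applying its diff."""
    return Repo(
        defined=(repo.defined - pr.remove_defs) | pr.add_defs,
        called=repo.called | pr.add_calls,
    )


def ci_passes(repo: Repo) -> bool:
    """Stand-in for CI: green only if every called function is defined."""
    return repo.called <= repo.defined


trunk = Repo(defined=frozenset({"charge"}), called=frozenset())

# PR A renames charge() to charge_card(); PR B adds a new call to charge().
# They touch different parts of the repo, mirroring edits to different files.
pr_a = PR("rename-charge",
          add_defs=frozenset({"charge_card"}),
          remove_defs=frozenset({"charge"}))
pr_b = PR("call-charge", add_calls=frozenset({"charge"}))

a_alone = ci_passes(apply(trunk, pr_a))            # tested against the old trunk
b_alone = ci_passes(apply(trunk, pr_b))            # tested against the old trunk
both = ci_passes(apply(apply(trunk, pr_a), pr_b))  # what trunk looks like after both merge

print(a_alone, b_alone, both)  # True True False
```

Each PR is green against the trunk it was tested on, yet trunk is red once both land. Nothing in either diff overlaps, so there is no textual conflict to warn anyone.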

Assume each PR that merges ahead of yours has a 1/100 chance of colliding with it. That is low for any single merge, but it compounds as more changes land between the time a PR is opened and the time it merges. Note that teams can reduce this rate by architecting their repository to minimize coupling.

To assess the impact, we need to estimate the number of PRs that might merge in the typical 14-hour window it takes for a PR to merge. As the team and the PR rate grow, the number of such in-flight PRs increases, amplifying the probability of a collision occurring.

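Suppose the team reaches a point where, on average, 10 PRs merge per day. During the 14-hour window a PR is open, around 7 other PRs might merge. (A strictly uniform spread over 24 hours would give 10 × 14/24 ≈ 5.8, so 7 is a rounded-up estimate. It works out to 0.7 merges in the window per daily merge, the same ratio the table below uses.)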

The probability of experiencing at least one collision while a PR is waiting to be merged is given by:

P(<at least one collision>) = 1 - (1 - <collision probability>) ^ <number of merging PRs>
P(<at least one collision>) = 1 - (1 - 0.01)^7 ≈ 0.0679 (about 6.8%)
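The same calculation as a short Python sketch. The function name is ours, and the formula treats each merge in the window as an independent chance of collision, which is a simplification.

```python
def p_at_least_one_collision(p_collision: float, merges_in_window: float) -> float:
    """Probability that at least one of the merges in the window collides.

    Assumes each merge collides independently with probability p_collision.
    """
    return 1.0 - (1.0 - p_collision) ** merges_in_window


p = p_at_least_one_collision(0.01, 7)
print(f"{p:.4f}")  # 0.0679
```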

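Assuming each collision results in trunk breakage that takes scrambling engineers one hour to notice and resolve, we can estimate the frequency of trunk breakages. Expected daily collisions are the number of daily merges multiplied by the per-PR collision probability, and daily breakage time is that number multiplied by one hour. As the rate of PR merges increases, so does the frequency of these disruptive events.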

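| Daily PR merges | PRs merged during 14-hour window | Probability of collision per PR | Expected daily collisions | Daily trunk breakage time (hrs) |
|---|---|---|---|---|
| 5 | 3.5 | 3.46% | 0.17 | 0.17 |
| 10 | 7 | 6.79% | 0.68 | 0.68 |
| 20 | 14 | 13.13% | 2.63 | 2.63 |
| 30 | 21 | 19.03% | 5.71 | 5.71 |
| 40 | 28 | 24.53% | 9.81 | 9.81 |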
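The columns can be recomputed from the stated assumptions with a few lines of Python. This is a sketch of the model, not a measurement. The constant names are ours, and `WINDOW_FACTOR` encodes the estimate of 7 merges in the window for every 10 daily merges.

```python
COLLISION_PROBABILITY = 0.01  # chance that any one merge collides, as assumed above
WINDOW_FACTOR = 0.7           # merges in the 14-hour window per daily merge (7 of 10)
HOURS_PER_BREAKAGE = 1.0      # time to notice and resolve a broken trunk


def model_row(daily_merges: float):
    """Return (PRs in window, P(collision) per PR, daily collisions, daily breakage hours)."""
    in_window = daily_merges * WINDOW_FACTOR
    p_per_pr = 1 - (1 - COLLISION_PROBABILITY) ** in_window
    daily_collisions = daily_merges * p_per_pr
    return in_window, p_per_pr, daily_collisions, daily_collisions * HOURS_PER_BREAKAGE


rows = {n: model_row(n) for n in (5, 10, 20, 30, 40)}

for n, (in_window, p_per_pr, collisions, hours) in rows.items():
    print(f"{n:>3} {in_window:>5.1f} {p_per_pr:>7.2%} {collisions:>6.2f} {hours:>6.2f}")
```

Changing any of the three constants shows how sensitive the result is to each assumption, in particular the per-merge collision probability, which depends heavily on how coupled the codebase is.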

The table shows how expected daily collisions, and with them daily trunk breakage time, climb as the rate of PR merges per day rises. At 30 or 40 merges per day, the team can expect multiple hours of broken trunk every day, which significantly hampers productivity.

Even small repositories with only a handful of merges a day can benefit from a merge queue to avoid collisions and trunk breakages: at 5 merges a day, the model still predicts a breakage roughly once a week.
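To show what the queue changes, here is a minimal sketch of the simplest possible strategy: process PRs one at a time, re-run CI against the latest trunk, and merge only if it is green. The toy trunk model, the `ci_passes` check, and the PR names are illustrative assumptions. Production merge queues typically add batching or speculative testing on top of this, and this is not a description of Graphite's implementation.

```python
from typing import Callable, FrozenSet, List, Tuple

Trunk = FrozenSet[str]             # toy trunk state: a set of "def:x" / "call:x" facts
Change = Callable[[Trunk], Trunk]  # a PR, modeled as a function that edits trunk


def ci_passes(state: Trunk) -> bool:
    """Stand-in for CI: every 'call:x' fact needs a matching 'def:x' fact."""
    return all(
        f"def:{fact[len('call:'):]}" in state
        for fact in state
        if fact.startswith("call:")
    )


def run_merge_queue(
    trunk: Trunk, queue: List[Tuple[str, Change]]
) -> Tuple[Trunk, List[str], List[str]]:
    """Serially test each PR on top of the latest trunk; merge only if CI is green."""
    merged, rejected = [], []
    for name, change in queue:
        candidate = change(trunk)  # apply the PR onto the *current* trunk
        if ci_passes(candidate):
            trunk = candidate      # trunk advances only on a green run
            merged.append(name)
        else:
            rejected.append(name)  # trunk is untouched, so it stays green
    return trunk, merged, rejected


# Both PRs passed CI against the original trunk, but they conflict semantically.
queue = [
    ("rename-charge", lambda t: (t - {"def:charge"}) | {"def:charge_card"}),
    ("call-charge", lambda t: t | {"call:charge"}),
]

final_trunk, merged, rejected = run_merge_queue(frozenset({"def:charge"}), queue)
print(merged, rejected, ci_passes(final_trunk))
# ['rename-charge'] ['call-charge'] True
```

The second PR is bounced back to its author instead of breaking trunk for everyone, which is the trade the queue makes: a slightly slower merge in exchange for a trunk that stays green.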

The most effective way to reduce collisions and breakages is to add a merge queue, which eliminates merge skew. But there's more teams can do to tilt the math in their favor. First, teams can shorten the window during which PRs are open, which reduces natural merge skew. Put simply, the shorter the median time-to-merge, the more accurately a PR's original CI run reflects trunk at the time of merge. While the median pull request takes 14 hours to merge, we see faster teams bring that number down to 4-6 hours.
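A rough sketch of how much a shorter open window helps, using the same model as before. It assumes the number of merges landing ahead of a PR scales linearly with the hours it stays open. That assumption is ours, as are the names below.

```python
COLLISION_PROBABILITY = 0.01
# 7 merges per 14-hour window at 10 merges/day, i.e. 0.05 merges per open hour
# for each daily merge.
MERGES_PER_OPEN_HOUR = 0.7 / 14


def collision_probability(daily_merges: float, hours_open: float) -> float:
    """P(at least one collision) for a PR that stays open for hours_open."""
    merges_in_window = daily_merges * MERGES_PER_OPEN_HOUR * hours_open
    return 1 - (1 - COLLISION_PROBABILITY) ** merges_in_window


for hours in (14, 6, 4):
    print(f"{hours:>2} h open: {collision_probability(10, hours):.2%}")
```

At 10 merges a day, the per-PR collision probability falls from about 6.8% with a 14-hour window to about 3.0% at 6 hours and 2.0% at 4 hours under these assumptions.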

The second thing teams can do is reduce the average size of PRs, as well as coupling. Smaller, more isolated code changes are less likely to have semantic collisions. Smaller PRs also merge faster, which reinforces the first point.

The third thing engineers can do to reduce collisions is stack their pull requests. By stacking PRs, engineers self-enforce a pre-merge order. The pre-merge CI executions are guaranteed to have no mid-air collisions with other PRs in the stack, though they can still collide with other stacks. Stacking can be done solo, or collaboratively with other engineers on a case-by-case basis.
