Merge skew

Greg Foster
Graphite software engineer

The term "merge skew" is defined as "the number of commits on a trunk branch, ahead of the merge base of a merging code change." Said more simply, the merge skew for a pull request is how many other changes have merged since it was created or last rebased.
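As a minimal sketch of that definition, the snippet below models trunk as an ordered list of commit IDs and counts how many commits sit ahead of a change's merge base. The function name `merge_skew` and the commit IDs are hypothetical, chosen only for illustration.

```python
def merge_skew(trunk, merge_base):
    """Count the trunk commits that landed after `merge_base`.

    `trunk` is a list of commit IDs ordered oldest to newest (an assumed,
    simplified model of a linear trunk history).
    """
    return len(trunk) - 1 - trunk.index(merge_base)


# Hypothetical trunk history, oldest commit first.
trunk = ["a1", "b2", "c3", "d4", "e5"]

# A pull request branched from "b2" has three trunk commits ahead of it.
stale_skew = merge_skew(trunk, "b2")

# A pull request just rebased onto the newest commit has zero skew.
fresh_skew = merge_skew(trunk, "e5")

print(stale_skew, fresh_skew)  # prints: 3 0
```

In a real repository, the same number can be read from Git by counting the commits between the output of `git merge-base` and the tip of trunk, for example with `git rev-list --count`.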

In a zero merge-skew world, engineers would create a code change based off the latest commit on trunk, and then merge that change in before any other changes were started. This world would also be abysmally slow - each pull request would need to hold a lock on the code base while it was written, tested, reviewed, and merged.

In practice, engineers parallelize creating many code changes in the same repository and merging them whenever they're ready. This asynchronous development is faster and appeals to distributed engineering teams, but it also creates a race condition that arises during integration in the form of merge skew.

Why should engineers care if a code change is not perfectly rebased on the latest trunk commit? Merge skew results in two problems.

The first issue, merge conflicts, is widely recognized and straightforward to identify. A merge conflict occurs when a pull request that initially seemed fine—based on the state of the code when it was created—cannot be merged later because of other changes that have been integrated into the codebase since then. Tools like Git and platforms such as GitHub detect these discrepancies and prevent the merge from proceeding. They require users to manually rebase their branches and resolve these conflicts by hand. Although resolving merge conflicts can be frustrating, they generally affect only the engineer working on the branch and can be systematically identified and resolved.

The second problem - semantic conflicts - is less known but more costly at scale. A semantic conflict occurs when a feature works with each of two independent code changes on its own but breaks once both are merged together. Unlike merge conflicts, semantic conflicts cannot be trivially detected. At best, they show up in the form of a failed build or failed unit test - at worst, they quietly regress functionality in production. While merge conflicts can be cheaply checked for at the exact time of the merge, detecting semantic conflicts would require re-running tens of minutes of CI, during which yet more changes might merge.

Consider the practical cost of a semantic conflict within a sizable engineering organization. You create a pull request, dutifully pass CI, and receive approval in code review. But after merging to trunk, the codebase no longer builds. Teammates begin messaging in Slack, complaining that the latest commits on trunk are throwing errors locally. Folks waste time checking if it's "just them" or if the main branch is broken. Eventually, an engineer bisects the trunk, finds your offending change, and reverts it. Had the issue been a merge conflict, you would have been blocked from merging, but you would have been able to fix it in a matter of minutes. Instead, the semantic conflict breaks the trunk and disrupts every engineer within the repository for half an hour or longer.

What causes semantic conflicts? How can CI pass before merge but fail after (ignoring the possibility of flakes)? The answer is merge skew. The open-source project Bors explained the relationship well in its documentation:

If you use CI, this has happened before.

Maybe you've got Travis set to build every pull request.

In which case, there will be a notification at the bottom of the pull request page that describes whether it succeeded or failed.

But it only checks each one in isolation. The merged build can still fail.

This happens when two commits make changes to different parts of the code that clash with each other, like if one adds a new call to a function that the other renames.

You need to check the combination, before it goes to master.

When CI passes on a pull request, it validates the change's original merge base commit and the author's new code change. That is a different snapshot of code compared to the codebase immediately after the change is merged into trunk — unless the change has just been rebased and has a merge skew of zero. With a high merge skew, there's an ever-present chance that the code no longer behaves correctly after merging.
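The rename scenario from the Bors quote can be sketched in a few lines. This is a toy model under stated assumptions, not how any real CI system works: a "snapshot" is a dict of file name to source text, and the hypothetical `run_ci` helper loads every file into one namespace and runs each `check_*` function. The file names and functions (`get_total`, `compute_total`, `report`) are invented for illustration.

```python
def run_ci(snapshot):
    """Hypothetical CI: load every file into one namespace, then run each check_* function."""
    namespace = {}
    try:
        for filename in sorted(snapshot):
            exec(snapshot[filename], namespace)
        checks = [obj for name, obj in list(namespace.items())
                  if name.startswith("check_") and callable(obj)]
        for check in checks:
            check()
    except Exception:
        return False
    return True


# Trunk at the merge base shared by both changes.
BASE = {
    "pricing.py": (
        "def get_total(prices):\n"
        "    return sum(prices)\n"
        "def check_total():\n"
        "    assert get_total([1, 2]) == 3\n"
    ),
}

# Change A renames get_total to compute_total and updates its check.
CHANGE_A = {
    "pricing.py": (
        "def compute_total(prices):\n"
        "    return sum(prices)\n"
        "def check_total():\n"
        "    assert compute_total([1, 2]) == 3\n"
    ),
}

# Change B adds a new caller of get_total in a different file.
CHANGE_B = {
    "report.py": (
        "def report(prices):\n"
        "    return 'Total: ' + str(get_total(prices))\n"
        "def check_report():\n"
        "    assert report([1, 2]) == 'Total: 3'\n"
    ),
}

a_alone = run_ci({**BASE, **CHANGE_A})              # merge base + change A
b_alone = run_ci({**BASE, **CHANGE_B})              # merge base + change B
merged = run_ci({**BASE, **CHANGE_A, **CHANGE_B})   # trunk after both merge

print(a_alone, b_alone, merged)  # prints: True True False
```

The two changes touch different files, so there is no textual conflict to detect. Each passes against the merge base, and only the combined snapshot fails, because `report` still calls a function that no longer exists.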

Here's a simple example in code to illustrate a scenario where merge skew causes a semantic conflict:

# Initial code on trunk
def calculate_discount(price, discount_rate):
    return price * (1 - discount_rate)

# Alice's code change (Pull Request #1)
# Changes the discount calculation to ensure discount rate doesn't exceed 100%
def calculate_discount(price, discount_rate):
    if discount_rate > 1:
        discount_rate = 1
    return price * (1 - discount_rate)

# Bob's code change (Pull Request #2)
# Adds tax calculation to the price calculation
def calculate_discount(price, discount_rate):
    tax_rate = 0.05  # Assume a constant tax rate
    return price * (1 - discount_rate) * (1 + tax_rate)

Explanation:

  1. Initial Code: The calculate_discount function calculates the discounted price based on a given price and discount rate.
  2. Alice's Change: Alice modifies the function to cap the discount rate at 100%. This prevents a discount rate higher than 100% from giving a negative price. Alice's pull request was based on a version of the trunk that did not yet include Bob's changes.
  3. Bob's Change: Simultaneously, Bob modifies the calculate_discount function to include tax in the price calculation. His pull request is based on the same initial version as Alice's.

Resulting conflicts:

  • Merge Conflict: Alice and Bob both edit the body of calculate_discount, and their edits touch adjacent lines. Whichever pull request merges second will most likely hit a merge conflict. This is straightforward to detect: Git flags the overlapping edits and blocks the merge until the author rebases and resolves them.
  • Semantic Conflict: Suppose Alice's change is merged first and Bob, unaware of what it was for, resolves the conflict by keeping his own version of the function. Once Bob's change is merged, the function no longer caps the discount rate, because his version overrides Alice's. Nothing flags this: Bob's CI last passed against a trunk that had no cap, so a semantic conflict occurs and the cap is silently regressed. In general, semantic conflicts are hardest to catch when the two changes touch different lines or files, because Git then reports no conflict at all.
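To make the regression concrete, the sketch below compares Bob's version of the function with one that keeps both changes. It assumes the intended result is to apply Alice's cap and Bob's tax together; the names `bob_only` and `combined` are hypothetical and exist only so the two versions can sit side by side.

```python
TAX_RATE = 0.05  # constant tax rate, as assumed in Bob's change


def bob_only(price, discount_rate):
    """Bob's version: adds tax, but has no cap on the discount rate."""
    return price * (1 - discount_rate) * (1 + TAX_RATE)


def combined(price, discount_rate):
    """Assumed intended merge: Alice's cap plus Bob's tax."""
    if discount_rate > 1:
        discount_rate = 1
    return price * (1 - discount_rate) * (1 + TAX_RATE)


# A discount rate above 100% exposes the difference.
regressed_price = bob_only(100, 1.5)   # negative: the cap is gone
expected_price = combined(100, 1.5)    # capped at 100% off, so the price is 0

print(regressed_price < 0, expected_price == 0)  # prints: True True
```

A test like the one on `combined` would only exist on trunk after Alice's change merged, which is why CI on a pull request that has not been rebased cannot catch the regression.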

This example shows how both types of conflicts are related to merge skew, and why keeping pull requests closely in sync with the trunk (low merge skew) is crucial to avoiding these issues.
