Parallel CI speeds up your merge queue by running your CI checks in parallel for multiple stacks (including individual PRs not part of a stack) at once, without compromising on correctness. This is especially helpful if your repo sees a high volume of PRs, long CI times, or both.
Tip
Companies that have enabled Parallel CI on their Graphite Merge Queues have seen 1.5x faster merges, including time spent running CI (a 33% decrease for p95, 26% for p75).
Orgs heavily using stacked PRs can expect to see even greater speed gains. Customers have seen up to 2.5x faster merges (60% decrease for p95, 34% decrease for p75).
How parallel CI works
Parallel CI uses speculative execution, similar to branch prediction, to run CI for multiple enqueued stacks at the same time. This significantly speeds up time-to-merge: instead of waiting for CI to run one by one, Graphite can run (for example) 3 at a time. For repos with long queues, this can shorten your time-to-merge to a fraction of the time. In many cases, this brings the expected wait time down to just one CI cycle.
Example
Suppose you've configured Graphite to run up to 3 parallel CI runs, and you have 5 unrelated stacks enqueued at a similar time: A, B, C, D, and E.
1. CI starts for A. In parallel, Graphite creates these temporary groupings and starts CI on them at the same time:
   - A ← B (i.e. B rebased on A), thereby testing this group of 2 PRs at once
   - A ← B ← C, thereby testing this group of 3 PRs at once
2. Once A succeeds, it's merged. Graphite then starts CI for the grouping B ← C ← D, thereby testing this group of 3 PRs at once.
3. Once B succeeds, the same process repeats: a group for C ← D ← E is created and CI runs.
4. If at this point C fails, then:
   - C is evicted from the queue.
   - The runs for the groups C ← D and C ← D ← E are both canceled.
5. D then becomes the first PR in the queue:
   - CI starts for D.
   - Graphite starts CI for the grouping D ← E.
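To make the grouping logic concrete, here is a minimal Python sketch that simulates the walkthrough above. It assumes a fixed concurrency of 3 and single-PR stacks, and is purely illustrative: it is not Graphite's actual implementation.

```python
from collections import deque

def speculative_groups(queue, concurrency):
    """Return the groupings that would be running CI in parallel right now."""
    head = list(queue)[:concurrency]
    # Group i tests queue[0..i] rebased together: A, A <- B, A <- B <- C, ...
    return [head[: i + 1] for i in range(len(head))]

def simulate(queue, concurrency, failing):
    queue = deque(queue)
    while queue:
        groups = speculative_groups(queue, concurrency)
        print("CI running for:", [" <- ".join(g) for g in groups])
        first = queue.popleft()
        if first in failing:
            # The failing PR is evicted; speculative runs built on it are canceled.
            print(first, "failed CI and is evicted; dependent runs are canceled")
        else:
            print(first, "passed CI and is merged")

# Reproduces the example: A and B merge, C fails and is evicted, then D and E merge.
simulate(["A", "B", "C", "D", "E"], concurrency=3, failing={"C"})
```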
If your CI tests aren’t flaky, the cost is low and the benefits are high: speculative execution only runs more CI when CI fails in the merge queue.
However, because parallel CI assumes that your CI tests in the merge queue will pass, be careful with flaky tests. If a CI run fails, Graphite not only needs to evict the failing PR, but also restart CI runs on any subsequently enqueued PRs whose CI was running speculatively. While this doesn't make your time-to-merge any slower than when parallel CI is disabled, it does generate more CI runs.
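As a rough way to reason about that cost: in the walkthrough above, when C fails, the two speculative group runs stacked on top of it (C ← D and C ← D ← E) are canceled and redone. A hypothetical back-of-the-envelope helper, assuming the failure happens at the head of the queue:

```python
def extra_runs_per_head_failure(concurrency: int) -> int:
    # The failing PR's own run would happen even without parallel CI; the
    # speculative group runs stacked on top of it are the ones that get
    # canceled and restarted.
    return max(0, concurrency - 1)

print(extra_runs_per_head_failure(3))  # 2 wasted runs, e.g. C <- D and C <- D <- E
```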
Graphite merge queue’s parallel CI
Graphite's merge queue operates on stacks as the primary unit rather than individual PRs (a single PR is equivalent to a stack of size 1), and this applies to parallel CI as well. If any PR in the stack encounters a failure, the whole stack will fail to merge, allowing you to treat stack merges as atomic operations.
When setting up parallel CI mode, you can choose whether to:
Run CI on each PR in the stack individually. This provides the highest level of correctness guarantees: it ensures that no PR in your stack would independently break trunk.
Run CI on the topmost PR in the stack. This relaxes CI guarantees while further reducing CI runs. If you require each merged stack to keep trunk green, but don't have that same strictness for each PR within a stack, we recommend this mode for a combination of higher speed and lower CI costs.
Enabling parallel CI for the Graphite merge queue
Prerequisites:
You must allow the graphite-app bot in GitHub to bypass merge restrictions, via your existing branch protection rules or rulesets. See how to set this up here.
Your repo must support draft PRs.
To enable: go to Merge queue in the Graphite app settings page (https://app.graphite.dev/settings/merge-queue), and:
If you haven't already, enable the Graphite merge queue in your repo
If you already have the merge queue enabled in the repo, find it in the list and click the Edit icon
In the merge queue configuration panel, enable Parallel CI
Select an option for How should CI run? - see the section above for more details.
Specify a Concurrency value, which determines the number of stacks to run CI for in parallel
Tip
Not sure which concurrency value to use? You'll get the most benefit from having enough concurrent runs to handle your PR volume given your typical CI runtimes. For example, if your CI takes 30 minutes and your peak-hour PR volume is 3 PRs per 30 minutes, 3 concurrent runs will give you the most benefit.
If your tests are flaky, you may want to start lower and then gradually increase it as you see how your CI performs.
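As a rough rule of thumb (an assumption on our part, not an official Graphite formula), you can size concurrency to cover the number of PRs that typically arrive during one CI run:

```python
import math

def suggested_concurrency(ci_minutes: float, prs_per_hour: float) -> int:
    """Rule-of-thumb sketch: enough parallel runs to cover the PRs that
    typically arrive while a single CI run is in flight."""
    return max(1, math.ceil(prs_per_hour * ci_minutes / 60))

# Matches the tip above: 30-minute CI, 3 PRs per 30 minutes (6 per hour) -> 3
print(suggested_concurrency(ci_minutes=30, prs_per_hour=6))
```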
End-user experience when using parallel CI
To implement this strategy, Graphite groups stacks into a temporary draft PR, which is used to execute the CI runs. The Graphite PR page will point you to this draft PR. These PRs' titles are always prefixed with [Graphite MQ] Draft PR to make them easy to identify.
When Graphite groups stacks in the merge queue for running CI, it creates a branch with a predictable prefix: gtmq_. You can use this prefix to customize CI runs or other behavior for enqueued PRs.
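For example, a script-based CI step could branch on that prefix. The sketch below assumes a GitHub Actions-style environment where the branch name is exposed via the GITHUB_REF_NAME environment variable; adapt the variable and the skipped work to your own pipeline.

```python
import os

# Branch name as exposed by GitHub Actions; other CI providers use different variables.
branch = os.environ.get("GITHUB_REF_NAME", "")

if branch.startswith("gtmq_"):
    # Temporary merge-queue branch created by Graphite: for example, skip work
    # that only matters for human-authored branches, such as deploy previews.
    print("Merge queue branch detected: skipping deploy preview")
else:
    print("Regular branch: running the full pipeline")
```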
When an enqueued PR merges, it'll be marked as closed in GitHub instead of merged. Graphite will render and treat it as merged across the product, including the PR inbox, PR page, and statistics shared by Insights. This allows you to keep a GitHub branch protection rule that requires a linear history.
Additional resources
Read more about speculative execution in Uber's paper
If you use tools that monitor whether the PR is merged, your integration may stop working. Many tools have options to monitor merged commits rather than PR status: for example, see Linear's guide on linking commits.