
How to do GitHub code reviews that don't take all week

Greg Foster
Graphite software engineer

Code reviews should ideally take about 60 minutes.

Then why did our analysis find pull requests (PRs) that took over five years to merge?

That example is a bit of a joke. Outliers like this are usually caused by something other than a slow code review. Even so, many teams need help maintaining a reasonable timeline for reviewing, resolving, and merging PRs.

Overly complex code, insufficient documentation, and diverse expertise levels slow down reviews, leading to delays from weeks to months, even years.

Delays add up to wasted work hours and developer burnout. Developers are not only stuck waiting but also kept from tackling new features or addressing urgent issues.

To better understand what creates these delays—and how to fix them—we have to start by zooming out on the whole process.

The traditional GitHub code review process can be one cause of development delays. If you use GitHub daily, you know the struggle of waiting to merge code.

Not only does this code review process take time, but it also blocks the rest of your team's workflow.

Sometimes, one blocked task holds up another and drags down the entire team's efficiency. You end up wasting time and resources and increasing costs.

meme of an astronaut with text "Let's do a quick code review. This maneuver is gonna cost us 51 years."

To address this issue, we need to dive deeper into the code review workflow.

Let’s start with a bigger question: what is code review meant to achieve? This might feel a bit like CS201, but it’s valuable to frame the process against its intended purpose. From here, we can identify solutions that might improve the process without undermining the point of reviewing code in the first place.

At its most basic, robust code review addresses issues such as suboptimal code. It also protects against the creation of future technical debt by identifying things like out-of-date solutions and poor code hygiene.

This fundamentally breaks down into a few main buckets:

  • Code quality: You can address development style issues, suggest ways to improve code quality, and encourage cleaner, more maintainable code, improving the codebase health.

  • Bugs: Code reviews are essential to find and fix bugs in code. You can catch logical errors, edge cases, and potential issues before production, reducing costly post-release bug fixes.

  • Security issues: Team code reviews can help identify and mitigate security risks early in development, preventing software breaches and attacks.

  • Code duplication: You can refactor and consolidate redundant code through GitHub code reviews. This simplifies maintenance, improves codebase consistency, and reduces bugs caused by duplicated logic.

Another major benefit and goal of code review is to facilitate team collaboration. Team members can share knowledge during code reviews. Experts within the team can share software development best practices with newer team members. Knowledge of the codebase and previous solutions can be passed down.

We should keep these goals in mind. Our goal is to address the challenges of code review without losing the value that the process creates.

Now, let’s unpack some of the key challenges that lead to slow reviews.

It may seem obvious, but large PRs require more time and effort to review. They also increase the chances of conflicts and issues when merging into the main branch.

How does this affect the rest of the process?

This friction often leads to:

  • Review complexity: Large PRs take more work to analyze. Understanding the entire feature or bug fix can be overwhelming for the reviewer.

  • Increased review time: More lines of code and complex changes require more time for reviewers to provide helpful feedback.

  • Feedback loop delays: Every round of feedback and revision extends the timeline. If a PR needs multiple review rounds, merging can take considerably longer.

  • Blocking developer progress: While a large PR is under review, it blocks you from working on dependent features. It stalls other parts of the project.

average and median time to merge PRs by lines changed

PRs with 500+ line changes might take around nine days to merge, suggesting that more concise PRs could streamline the development process.

One solution: Keep PRs to about 200 lines.

The data we analyzed suggests that this is an optimal size for most teams. We found, on average, PRs with 100-500 lines are merged within about 93 hours. That’s about 40% faster than PRs with 500-1,000 lines.

This strategy allows for quicker review and integration cycles while reducing the likelihood of conflicts and issues when merging into the main branch.
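To run the same kind of size analysis on your own repositories, you could start with something like the sketch below. It groups PRs into size buckets and compares merge times, and it includes a simple check against the 200-line guideline. The sample data, bucket boundaries, and function names are all hypothetical; in practice you would pull lines changed and merge times from your own PR history.

```python
from statistics import mean, median

# Hypothetical sample data: (lines_changed, hours_to_merge) per PR.
SAMPLE_PRS = [(40, 5.0), (180, 30.0), (450, 90.0), (800, 160.0), (1200, 220.0)]

# Assumed size buckets as (inclusive lower bound, exclusive upper bound).
BUCKETS = [(0, 100), (100, 500), (500, 1000)]


def bucket_label(lines_changed):
    """Return the label of the size bucket a PR falls into."""
    for low, high in BUCKETS:
        if low <= lines_changed < high:
            return f"{low}-{high}"
    return "1000+"


def merge_time_by_size(prs):
    """Group PRs by size bucket and compute mean and median hours to merge."""
    grouped = {}
    for lines_changed, hours in prs:
        grouped.setdefault(bucket_label(lines_changed), []).append(hours)
    return {
        label: {"mean": mean(hours), "median": median(hours)}
        for label, hours in grouped.items()
    }


def is_oversized(lines_changed, limit=200):
    """Flag PRs above the size guideline (200 lines by default)."""
    return lines_changed > limit


if __name__ == "__main__":
    for label, stats in merge_time_by_size(SAMPLE_PRS).items():
        print(label, stats)
```

A check like `is_oversized` could run in CI as a warning rather than a hard failure, since some changes (generated files, large renames) are legitimately big.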

Most teams begin the GitHub code review process when you commit local changes to a feature branch. Then, you create a PR against the main development branch to start the review process.

The practice of feature branching and bundling commits itself creates another common bottleneck.

You may work on a complex feature that requires multiple commits. You often bundle these commits into a single large PR. This practice—although it may seem practical—lends itself to the creation of larger PRs.

Thus, you end up with the same problems that we discussed above.

To address this problem, we’ll need to rethink this practice and consider alternative solutions like trunk-based development.

The technical workflow of publishing, reviewing, and merging PRs can also create a bottleneck. This process involves a series of branches and merges, with each branch representing a new feature or fix and each merge representing an integration point.

You must keep your feature branches in sync with the main branch, resolve merge conflicts, and ensure that changes integrate seamlessly with the evolving codebase.

This creates complexity. Even if the code review itself goes smoothly, downstream conflicts can delay the merge process.

The complexity and difficulty of merging increase with the complexity of the PR itself. We can understand this, in part, by considering the number of files that are affected by any given PR.

Bar chart from an analysis of over 1.5 million GitHub PRs: PRs affecting 2-3 files are merged 90% faster than those that change 16-31 files.

According to an analysis of over 1.5 million GitHub PRs, PRs affecting two to three files are merged 90% faster than those that include changes to 16-31 files.

Interestingly, this data also reveals that merge times decrease after peaking at 25.2 hours. It appears that, after this point, increased complexity actually accelerates the merge process.

Why would that be? It’s counterintuitive for PRs affecting 512-1023 files to merge at approximately the same pace as those affecting just four to seven files.

Further analysis into the data shows it likely stems from another issue: Code review apathy.

Humans have a finite capacity for attention. This is a fundamental challenge for code review (especially with complex PRs).

One way this manifests is a decrease in code review rigor as the number of lines and files in a PR increases.

Zooming in on the data shared above, we can see this trend clearly:

Average time to review per file by number of files changed in PR

When PRs exceed a moderate complexity level, the time spent reviewing each file decreases significantly. Our analysis found that reviewers spend about two hours per file reviewing PRs with 8-15 files changed. They only spend about 30 minutes reviewing each file when PRs contain changes to 32-63 files.

This implies the more files contained in a pull request, the less attention each file receives from reviewers.
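A quick back-of-the-envelope calculation makes the pattern easier to see. The sketch below multiplies the rounded per-file figures above by the number of files in each range. The resulting totals are illustrative arithmetic rather than measured values, and the function name is our own.

```python
def total_review_hours(files_range, hours_per_file):
    """Estimate total review time for the low and high end of a file-count range."""
    low, high = files_range
    return (low * hours_per_file, high * hours_per_file)


# Rounded figures from the analysis above.
moderate = total_review_hours((8, 15), 2.0)   # about two hours per file
large = total_review_hours((32, 63), 0.5)     # about 30 minutes per file

print(moderate)  # (16.0, 30.0)
print(large)     # (16.0, 31.5)
```

By this rough estimate, a PR with four times as many files gets about the same total review time, so the attention per file drops to roughly a quarter.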

The friction of code reviews can make them feel like a chore.

There’s the temptation to take review shortcuts so we can meet critical deadlines or improve performance metrics.

People start to phone them in. They attempt to circumvent the process.

screenshot of a twitter post: 10 lines of code equals 10 issues, 500 lines of code equals looks fine.

This mindset and subsequent behavior can lead to code reviews that fail at their intended purpose (to enforce code quality) *and* still waste cycles while they sit in the queue. Even if code review metrics improve superficially, the introduction of defects and creation of technical debt will undoubtedly become a drag on the team in future cycles.

Unpacking each of these issues gives us insight into where, how, and why code reviews often break down. Using this understanding, we can then delve into practices and solutions that can help address these specific challenges.

The stacking workflow simplifies the traditional code review process by altering how commits and pull requests are managed. This subtle change significantly impacts the efficiency and effectiveness of code reviews.

In standard GitHub workflows, PRs often encompass a large set of changes. This size complicates the review process, increasing the likelihood of merge conflicts and delays.

You’re also stuck waiting for that PR to be reviewed and merged into main before you can branch off main again. Only then can you start working on the front-end change that uses the server code you wrote.

developing everything in a large feature branch showing the PR from main with change 1, development is blocked while awaiting review

Stacking addresses this by breaking larger changes down into a series of smaller, more manageable PRs.

Stacking in practice: a smaller PR containing change 1 is opened for review and merged while work continues on the next change, which unblocks development

You can continue adding new commits as separate PRs, each based on the one before it. This method allows you to proceed with subsequent changes without being blocked, as each new PR is stacked on top of the previous, awaiting review, approval, and merging.

Carl Gao, a New York-based software engineer, says:

“Stacking code changes has been such a game changer that I wouldn't even consider working at a company where this developer workflow doesn't exist.”

Let’s look at stacking in the wild.

As an example, let's break down the stacked PRs workflow using an ecommerce checkout flow.

Here, you can see how to break down the checkout flow into smaller pull requests rather than a single PR for the entire functionality:

Ecommerce checkout process broken down into smaller PRs: pr 1 checkout button, pr 2 checkout form, pr 3 integrate checkout with payment gateway, pr 4 handle order submission and completion

  • PR 1: Checkout button.

  • PR 2: Checkout form.

  • PR 3: Integrate checkout with payment gateway.

  • PR 4: Handle order submission and completion.

Each PR represents a portion of the feature being developed. Rather than compiling all related commits into a single feature branch, each commit is its own PR and branch.

These branches are then stacked—meaning each one is dependent on the approval and merge of the preceding PR.

Rather than branching off the main or master branch for each new change, the code for each subsequent PR is branched off from the branch used in the immediately preceding PR.
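The branching rule described above can be modeled in a few lines. This is a minimal sketch of the idea, not how Graphite or git implement it: the `StackedPR` type, the function names, and the branch names are hypothetical. Each PR's base is the branch of the PR below it, and when the bottom PR merges, the next one is retargeted onto the trunk.

```python
from dataclasses import dataclass


@dataclass
class StackedPR:
    title: str
    branch: str
    base: str  # the branch this PR is opened against


def build_stack(changes, trunk="main"):
    """Create one PR per change, each based on the branch of the previous PR."""
    stack = []
    base = trunk
    for title, branch in changes:
        stack.append(StackedPR(title, branch, base))
        base = branch  # the next PR branches off this one
    return stack


def merge_bottom(stack, trunk="main"):
    """Merge the bottom PR and retarget the next PR onto the trunk."""
    remaining = stack[1:]
    if remaining:
        first = remaining[0]
        remaining = [StackedPR(first.title, first.branch, trunk)] + remaining[1:]
    return remaining


# Hypothetical branch names for the checkout example.
checkout_stack = build_stack([
    ("Checkout button", "checkout-button"),
    ("Checkout form", "checkout-form"),
    ("Integrate checkout with payment gateway", "checkout-payments"),
    ("Handle order submission and completion", "checkout-orders"),
])

for pr in checkout_stack:
    print(f"{pr.branch} -> {pr.base}")
```

Only the first PR targets `main`. Every other PR targets the branch directly below it, so a reviewer sees just the changes that PR adds.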

Another way to improve the code review process is to focus on reducing noise and improving communications across the team.

GitHub and its integrated tools (such as Travis CI, CircleCI, Codecov, and Coveralls) notify you of new PR comments, deployments, code test coverage, and other updates.

Often, developers are buried in pings and prods. They wake up and see 31 notifications, 79 emails, and 11 review requests: quite a challenge to tackle in the morning!

The avalanche of notifications across inboxes and workspaces just adds to the complexity of the process and can bury important code review requests.

It steals focus and slows down the work.

Having a single space for seeing, managing, and responding to code review requests can help you prioritize work more effectively. One way to achieve this is with a unified pull request inbox.

This single dashboard gives you a full view of your pull requests across multiple repositories and stages of the review cycle.

screenshot of graphite's pull request inbox

The pull request inbox syncs updated pull requests and adds them to the queue. It also displays vital information such as who has commented, the CI status, and stack information.

The holy grail of code review is an automated process. The reality is that most software is simply too complex to rely entirely on automated code reviews. Even so, augmenting the review process with automation can lead to big improvements in efficiency.

The most obvious automation for expediting code review is linting.

Many teams already automate first-order code reviews using custom-built solutions or commercial and open-source programs like JSLint or PHPLint.

This can help reduce the amount of manual review time taking place across the team.

AI will lead to improved efficiency across many parts of software development. Code review is certainly a promising area.

The question is: In which contexts and for which types of code review is it most effective?

Many teams, including our own, have experimented with AI across a range of functional use cases in the development lifecycle. The findings are different for each team.

Our efforts to implement AI found it most useful for:

  • Generating PR descriptions.

  • Notification summaries.

  • Translating suggestions and comments into code.

  • Codebase Q&A.

Is AI good for reviewing the code itself?

We found some promising use cases that expand on standard linting programs. Examples included checking for unit test coverage, enforcing coding styles and conventions, and detecting duplicate code.

Ultimately, though, we still have concerns about relying on AI for code review.

Maybe in the future.

In many cases, the simplest thing to automate is not the code review process itself. Instead, it’s all of the meta work that surrounds the actual review.

Automation is ideal for tedious tasks like:

  • Assigning specific team members to PRs and review tasks.

  • Adding labels and comments.

  • Following specific deployment practices across different codebases.


Graphite automations is a new way to perform actions on pull requests using if-this-then-that rules.

It allows you to assign reviewers based on file paths, add labels, comment on the PR, and send notifications. You can build rules around almost any attribute of a PR, including its author, review status, labels, or base branch.

While each of these tasks may only take seconds, the aggregate time savings can be substantial.
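To make the idea of rule-based automation concrete, here is a minimal sketch of path-based rules in plain Python. It does not use Graphite's actual rule format or API: the rule structure, reviewer names, labels, and paths are hypothetical, chosen only to illustrate the "if a PR touches these files, then assign these reviewers and labels" pattern.

```python
from fnmatch import fnmatch

# Hypothetical rules: if any changed file matches "paths",
# then add the listed reviewers and labels.
RULES = [
    {"paths": "frontend/*", "reviewers": ["alice"], "labels": ["frontend"]},
    {
        "paths": "server/payments/*",
        "reviewers": ["bob"],
        "labels": ["payments", "security-review"],
    },
]


def apply_rules(changed_files, rules=RULES):
    """Return the sorted reviewers and labels triggered by a PR's changed files."""
    reviewers, labels = set(), set()
    for rule in rules:
        if any(fnmatch(path, rule["paths"]) for path in changed_files):
            reviewers.update(rule["reviewers"])
            labels.update(rule["labels"])
    return sorted(reviewers), sorted(labels)


print(apply_rules(["server/payments/gateway.py", "frontend/checkout/Button.tsx"]))
```

Note that `fnmatch` lets `*` match across directory separators, so `frontend/*` also covers nested paths such as `frontend/checkout/Button.tsx`.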

Each team is unique and differences in workflow can introduce distinct challenges that slow down code review.

Poor communication and code quality issues can lead to slow code reviews.

Measuring and tracking metrics can play a big part in identifying bottlenecks in your process. This type of data provides insights into which workflow aspects might be causing delays.

screenshot of graphite team stats dashboard

Some PR and code review metrics that could be helpful:

  • Total PRs merged.

  • Publish to merge time.

  • Review response time.

  • Review cycles until merge.

Used together, these metrics reveal patterns that clarify which challenges are slowing your workflow.

For example, if the publish to merge time is longer than you’d like, you can investigate whether PRs are spending more time waiting for review or in review. Based on this insight, you can make adjustments to address the underlying problem: improving visibility and response time, or accelerating the review itself with improved processes and guidelines.
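The split between waiting for review and being in review is simple to compute once you have three timestamps per PR. The sketch below assumes you can export when a PR was published, when it received its first review, and when it merged. The function names and sample timestamps are hypothetical.

```python
from datetime import datetime


def hours_between(start, end):
    """Elapsed time between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600


def pr_timings(published_at, first_review_at, merged_at):
    """Split publish-to-merge time into time waiting for review and time in review."""
    return {
        "publish_to_merge": hours_between(published_at, merged_at),
        "review_response": hours_between(published_at, first_review_at),
        "in_review": hours_between(first_review_at, merged_at),
    }


def slower_phase(timings):
    """Name the phase that accounts for more of the publish-to-merge time."""
    if timings["review_response"] >= timings["in_review"]:
        return "waiting for review"
    return "in review"


# Hypothetical PR: published Monday morning, first reviewed Tuesday afternoon.
timings = pr_timings(
    published_at=datetime(2024, 7, 1, 9, 0),
    first_review_at=datetime(2024, 7, 2, 15, 0),
    merged_at=datetime(2024, 7, 2, 18, 0),
)
print(timings)                # 33.0 hours total: 30.0 waiting, 3.0 in review
print(slower_phase(timings))  # waiting for review
```

In this example, the review itself took only three hours. Most of the delay came before anyone looked at the PR, which points toward visibility and response time rather than the review process.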

As with all optimization efforts, this is an ongoing process. Continuously measuring these metrics, running experiments, and analyzing the results can help you uncover the specific unlocks for your team.

A structured workflow and advanced tools can reduce GitHub code review times while increasing code quality, throughput, and collaboration.

Speeding up the code review process also means connecting all the dots, particularly breaking work down strategically and getting the most out of your tools. You may need help managing that.

Using tools like Graphite can help you achieve better code reviews—faster. Graphite simplifies your GitHub code review process by facilitating stacked pull requests.

It allows you to create dependent PRs sequentially, making large code changes easier to process and review. The tool seamlessly integrates with GitHub to automate real-time PR synchronization. It reduces manual rebasing and conflict resolution efforts.

In addition, the insights, automations, and pull request inbox reduce manual effort and effectively organize review tasks to improve efficiency. The merge queue also automates conflict detection and resolution guidance. It allows you to address and avoid potential merge issues ahead of time.

Try Graphite today to see how it can help your team improve code reviews and ship faster.
