
At a certain point, all companies seem to outgrow their technologies, wrap them, and then replace them with a homegrown variant. When I was at Meta, PHP slowly became Hack. On the Instagram side, what was once raw Django slowly got obscured by layers of in-house wrappers.

Yet despite its age and ubiquity, GitHub has so far remained un-wrapped.

Sure, some companies choose to self-host their GitHub instances, but by and large, the GitHub that large companies use (say the Airbnbs and Stripes of the world) looks recognizable to the college student on the free tier of github.com. It’s only once we reach the largest company sizes, the Google/Meta/Amazon level, that we start to see meaningful differentiation and investment.

Even then, that isn’t really a fair comparison. In 2005 at Google, Guido van Rossum (the same Guido who created Python) started working on the company’s first web-based code review tool, Mondrian. Evan Priestley at Facebook started hacking on its first web-based code review tool, Phabricator, in 2007. GitHub, for its part, was only founded in 2008. The fact that these companies don’t use GitHub today isn’t because they graduated out of it; they were already on custom-built alternatives.

But it’s an interesting counterfactual: with all of the not-invented-here that tends to exist at the largest tech companies, had GitHub existed, would they have outgrown it and replaced it?

Critique and Gerrit

One naive way of reaching a conclusion here is to take a look at their modern tooling and use that as a spec. Would these companies have been able to re-create this atop GitHub?

Google is the company that has shared the most about its setup over the years and the one we’ll focus on here. We even have screenshots of some of its tooling, albeit a few years old. (I’m told that the tools in these screenshots have gone through a large facelift since.)

Google has two internal code review tools: Critique, which is used by the majority of engineers, and Gerrit, which is open-sourced and continues to be used by public-facing projects. You can actually play with Gerrit yourself in the Chromium and Android open-source repos.

💡 That Mondrian tool we mentioned Guido created earlier? It was later open-sourced by Guido as “Rietveld,” named after an architect he liked, Gerrit Rietveld. Gerrit was later forked from Rietveld to add access control.

(The Wikipedia page takes more creative license here than the official Gerrit history and states: “Gerrit is a fork of Rietveld started because ACL patches would not get integrated into Rietveld.”)

Let’s take a look at some of their features.

When engineers sign in in the morning, or take a break to review PRs (internally known as change lists or CLs), both Critique and Gerrit provide dashboards where it’s easy to see all of the in-flight changes at a glance (think a more sophisticated and information-dense version of the GitHub repository pull request view).

Gerrit’s dashboard (below) is backed by a single search and pulls information up to the top level, including the size of the change and a more detailed look at the status of the CL (the three columns to the right).

In Critique — which came later — the dashboard introduced multiple sections allowing authors to highlight different groups of CLs (most commonly those that you’ve authored and haven’t merged, and those that need your review).

💡 One weird quirk of Critique: the “Switch user” button on the top toolbar. It allows you to view the sections of another user, e.g. if you want to make sure that your CL is actually on their radar.

Clicking into a CL itself, the view is quite jarring. All of the elements we have on GitHub are present — just in different locations.

A few items get more prominent treatment than GitHub gives them; for example, next to the list of files in Gerrit, code coverage (and change in code coverage) is explicitly called out.

Also, the commit message for the CL appears as a first-class, reviewable entity, located right next to the actual changed files in the diff section.

The greatest departure from GitHub that Critique and Gerrit make is how they decide (at least internally at Google) when a CL is ready to be merged.

💡 In GitHub parlance, we say a PR is ready to be “merged.” Inside Google though, you’d say that the CL is ready to be “submitted.” That’s why you’ll notice in the screenshots above there are “Submit Requirements.”

All changes at Google require three levels of mandatory approvals:

  1. LGTM (“looks good to me”)

  2. Code owners

  3. Readability

The first two of these, LGTM and code owners, will be familiar to an engineer using GitHub. Anyone can give an LGTM, which signifies that the core business logic of the CL checks out (equivalent to a regular approving GitHub PR review). Code owners maps cleanly to the equivalent GitHub concept: when you make a change in a given file, you need the approval of whoever is listed as the owner for the file or containing directory. (If you are the author and are an owner yourself, you automatically pass.)

💡 In fact, GitHub lifted the concept of code owners directly from Google; when GitHub added code owners support in 2017, it included a small footnote at the bottom: “The code owners feature was inspired by Chromium’s use of OWNERS files.”

Readability, however, is new and unique to Google.

From Software Engineering at Google:

In Google’s early days, Craig Silverstein (employee ID #3) would sit down in person with every new hire and do a line-by-line “readability review” of their first major code commit. It was a nitpicky review that covered everything from ways the code could be improved to whitespace conventions. This gave Google’s codebase a uniform appearance but, more important, it taught best practices, highlighted what shared infrastructure was available, and showed new hires what it’s like to write code at Google. The readability program remains in place today with the intention of continuing to uphold this standard across the tens of thousands of developers that merge tens of thousands of changes in each day.

The idea here is that it’s not enough just to have someone double-check and understand your functionality. With tens of thousands of developers committing code, you want to make sure that everyone is committing code that matches the lengthy language standards and is using the recommended patterns and libraries, so you add an additional reviewer for that.

(Like code owners, if you as the author are yourself a “readability expert” in the language you wrote the CL in, you’re automatically good to go here.)
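To make the three requirements concrete, here is a minimal sketch of how a tool might evaluate them for a single CL. It is not Google’s implementation: the `ChangeList` shape, the `owners` and `readability` lookup tables, and the rules that owners are inherited from parent directories and that one reviewer’s LGTM can satisfy several requirements at once are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass
class ChangeList:
    # Hypothetical, simplified model of a CL.
    author: str
    language: str
    files: list[str]
    lgtms: set[str] = field(default_factory=set)  # reviewers who gave an LGTM


def owners_for(path: str, owners: dict[str, set[str]]) -> set[str]:
    """Owners listed for the file's directory or any parent directory (assumed inheritance)."""
    found: set[str] = set()
    for parent in PurePosixPath(path).parents:
        found |= owners.get(str(parent), set())
    return found


def submit_requirements(
    cl: ChangeList,
    owners: dict[str, set[str]],
    readability: dict[str, set[str]],
) -> dict[str, bool]:
    # The author counts toward code owners and readability (the automatic pass),
    # but an LGTM has to come from someone else.
    approvers = cl.lgtms | {cl.author}
    return {
        "LGTM": bool(cl.lgtms - {cl.author}),
        "Code owners": all(owners_for(f, owners) & approvers for f in cl.files),
        "Readability": bool(readability.get(cl.language, set()) & approvers),
    }


cl = ChangeList(author="alice", language="python",
                files=["src/api/handler.py"], lgtms={"bob"})
owners = {"src/api": {"carol"}, "src": {"alice"}}
readability = {"python": {"dave"}}
print(submit_requirements(cl, owners, readability))
```

In this example, bob’s LGTM satisfies the first requirement and alice passes code owners automatically because she owns `src`, but the CL still needs a Python readability reviewer before it can be submitted.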

With multiple possible sets of reviewers per CL — working in another team’s codebase you may find yourself needing an LGTM from your teammate (who best knows the change you’re trying to make for your own product) and a code owner approval and readability review from others — it can become easy to lose track of whose action is needed to unblock a particular change.

Gerrit and Critique introduce a first-class treatment for this.

In the earlier Gerrit screenshot, you’ll notice that a gray arrow sometimes appears next to the name of an author or reviewer.

And in Critique, certain names can appear bolded.

These markers signify that it’s somebody’s “turn” to act on the given CL. Hover over the indicator and Gerrit or Critique will also tell you why it thinks it’s your turn to take action (e.g. your review was just requested or an author responded to your comment). At any given moment, the set of all the individuals whose turn it is makes up the overall “attention set.”

On any given CL, a reviewer can also mark themselves as “not my turn,” removing themselves from the attention set and clarifying to the author — and the others remaining in the attention set — who really needs to take action. This might be because their teammate is a better reviewer or because they need another reviewer to take action first; culturally, Google CLs are typically reviewed by the LGTM reviewer first, then the code owner and readability expert second.
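The attention set can be thought of as a small piece of state that events on the CL keep updating. The sketch below is a hypothetical model, not Gerrit’s or Critique’s actual logic: the class and method names are invented, and the rule that a reviewer’s comment hands the turn back to the author is an assumption added to complete the cycle.

```python
from dataclasses import dataclass, field


@dataclass
class AttentionSet:
    # Maps each user whose turn it is to the reason shown on hover.
    turns: dict[str, str] = field(default_factory=dict)

    def request_review(self, reviewer: str) -> None:
        self.turns[reviewer] = "your review was requested"

    def author_replied(self, author: str, reviewer: str) -> None:
        # The author has acted, so the turn passes to the reviewer.
        self.turns.pop(author, None)
        self.turns[reviewer] = "the author responded to your comment"

    def reviewer_commented(self, reviewer: str, author: str) -> None:
        # Assumed rule: commenting hands the turn back to the author.
        self.turns.pop(reviewer, None)
        self.turns[author] = "a reviewer commented on your change"

    def not_my_turn(self, user: str) -> None:
        # Explicit opt-out, e.g. when another reviewer should go first.
        self.turns.pop(user, None)


attention = AttentionSet()
attention.request_review("bob")
attention.request_review("carol")
attention.not_my_turn("carol")               # carol defers to bob
attention.reviewer_commented("bob", author="alice")
print(attention.turns)
```

Storing the reason alongside each name is what lets the UI explain, on hover, why it thinks it is your turn.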

Not invented here

Where does all of this put our GitHub counterfactual?

Would a version of Google already on GitHub — but with the desire for customizable dashboards, custom approval logic, and turns/attention sets — have eventually given up on GitHub and sunk in the effort to create their own alternative from scratch?

From our earlier feature perspective, despite how unique Critique and Gerrit features seem at first blush, I come out bullish on the side of GitHub. GitHub’s platform and API offering is surprisingly powerful; bots posting comments or managing PR state (e.g. assignees) could create an approximate imitation here. The UI would be far uglier and the overall experience would feel far more homegrown and hacky, but I could easily see a company concluding that creating a tool from scratch would be too much effort in light of this much cheaper possibility.
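As one illustration of the bot approach, a bot could mirror an attention set into a PR’s assignees. The function below is a hypothetical sketch that only computes which users to add and remove; the choice to use assignees as the “whose turn is it” signal is an assumption. A real bot would then send these lists as the `assignees` field of requests to GitHub’s REST endpoints `POST` and `DELETE /repos/{owner}/{repo}/issues/{issue_number}/assignees`.

```python
def plan_assignee_sync(current: set[str], attention: set[str]) -> dict[str, list[str]]:
    """Return the assignee changes needed so the PR's assignees match the attention set."""
    return {
        "add": sorted(attention - current),     # it is now their turn
        "remove": sorted(current - attention),  # no longer their turn
    }


plan = plan_assignee_sync(current={"bob", "carol"}, attention={"alice", "bob"})
print(plan)
```

Computing a diff rather than overwriting the whole list keeps the bot’s requests small and leaves unchanged assignees without a spurious notification.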

This is where I predict most of the largest name-brand companies on GitHub today — Airbnb, Stripe — are.

Another way of looking at this situation though is through the lens of integration with the rest of the company’s tools. At Google and Meta scale, a strong “not invented here” syndrome pervades and all tools are built in-house, justified by complete control over the tool (if we want a feature, we can easily build it) and seamless integration with other existing tooling (we can guarantee that the code review tool can always speak directly to the task tool). An API is nice but you’re still at the whim of an external company; it’d be far better to be in control of your own destiny.

During my time at Meta, I saw countless spirited debates about this attitude. A new hire would join the company and ask why we needed to have our own custom everything. Why did we need to rebuild our own version of a basic task tool from scratch? Hundreds of threaded comments later the debate would go nowhere and the consensus (that we need everything custom-built or we would lose our edge) wouldn’t budge.

From this perspective, even with a GitHub with complete feature parity, it seems doubtful that a Google or Meta would want to turn over control.

This is the most interesting part of the counterfactual to me: when does a company grow to the size where “not invented here” fully takes over? The history would seem to indicate that it’s somewhere between a Stripe and a Meta, somewhere between a few thousand engineers and tens of thousands.

But another thing bugs me: these companies have grown up in different eras. Airbnb and Stripe have come of age in an era where the range of SaaS offerings is deeper than ever before and they’ve leveraged this. Their infrastructure is with cloud providers; they’re more comfortable using outside tools (like GitHub). Is the big company “not invented here” an inevitable product of plain company size? Or is it a byproduct of the era and ecosystem in which a company was founded?

If you grow up using outside tools, do you one day reach the scale where you necessarily have to stop? Or is it that you don’t use outside tools — not because of scale and leverage and impact — but because you never have?

As the earliest startups on GitHub grow larger and larger it’ll be fascinating to see what they choose.
