TL;DR
A typical GitHub repo contains many committers who are not active. In fact, 17% of committers with any repo activity merged only 1 PR in all of 2023.
We propose defining an active committer as someone who merges a median of ≥ 1 PR per week, meaning they ship code on a regular basis.
Defining “active” is important to us because when we report insights such as time to merge a PR or ideal PR size, we tend to do so for a “median user.”
We want this median user to reflect the typical developer experience, and including inactive committers skews every per-user metric toward people who rarely ship code.
The larger the org, the higher the concentration of infrequent committers (a median of < 1 PR merged per week). Smaller orgs (~11-25 committers) tend to have the highest share of active committers.
Even among active developers, there is a long tail: 48% of active developers are responsible for 80% of all PRs merged.
Total: 290k developers, 28 million PRs
Graphite has synced 28 million PRs across 290,000+ developers. (Data excludes PRs from bots.)
First filter: 2023-now
Unless otherwise stated, our analyses include only data from 2023 onwards (160k developers, 12 million PRs) to better reflect today’s trends.
Second filter: minimum PR merge frequency
When commenting on most trends, we internally filter for a subset of developers whom we term “active developers.” Not all users who have ever submitted a PR are shipping code on a regular basis. This matters because when we discuss typical developer metrics, such as time to merge or PR size, we want to reflect the experience of developers who regularly ship code.
- A lot of committers in repos are not active; of those who published at least one PR since Jan 2023, 14% did not actually merge any PRs, and 17% only merged 1 PR.
When we look at weekly PRs merged, we see that of developers who have submitted a PR since 2023, two-thirds have a median of fewer than 1 PR merged per week.*
*We exclude popular holiday periods when code freezes are common (2023-12-17 to 2024-01-06) and only count weeks between a developer’s first and last PR activity, to account for those who join or leave companies mid-year.
Definition: we propose defining an active committer as:
- Merged any PRs from January 2023 to now
- Median PRs merged per week ≥ 1
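As a rough illustration, the definition could be sketched in Python as follows. This is not Graphite's actual pipeline. It assumes each developer is represented by a list of PR merge dates, that weeks run Sunday to Saturday (which happens to line up with the holiday window above), that weeks with no merges between the first and last merge count as zero, and that "PR activity" is approximated by merge dates alone. The function names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from statistics import median

# Holiday window taken from the footnote above; its weeks are skipped.
HOLIDAY_START = date(2023, 12, 17)
HOLIDAY_END = date(2024, 1, 6)


def week_start(d: date) -> date:
    """Return the Sunday that starts the week containing d (assumed convention)."""
    # date.weekday() is Monday=0 ... Sunday=6
    return d - timedelta(days=(d.weekday() + 1) % 7)


def median_weekly_merges(merge_dates):
    """Median PRs merged per week between first and last merge, holiday weeks skipped."""
    dates = [d for d in merge_dates if not (HOLIDAY_START <= d <= HOLIDAY_END)]
    if not dates:
        return 0.0
    per_week = Counter(week_start(d) for d in dates)
    week, last = min(per_week), max(per_week)
    counts = []
    while week <= last:
        if not (HOLIDAY_START <= week <= HOLIDAY_END):
            # Weeks with no merges inside the window count as 0.
            counts.append(per_week.get(week, 0))
        week += timedelta(days=7)
    return float(median(counts))


def is_active(merge_dates, since=date(2023, 1, 1)):
    """Both criteria: merged something since `since`, and median weekly merges >= 1."""
    recent = [d for d in merge_dates if d >= since]
    return bool(recent) and median_weekly_merges(recent) >= 1
```

Under these assumptions, a developer who merges one PR every week is active, while one who merges two PRs eight weeks apart has a median of 0 and is not.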
Why does this matter?
We want user-level metrics to reflect the average developer experience. When the majority of developers in a repo rarely merge PRs, per-user aggregate metrics will overwhelmingly reflect inactive developers.
People who submit PRs infrequently show different behavior than active developers whose primary function is to ship code.
- Example: if we include everyone, the median developer’s median time to merge a PR is 18.3 hours. When we limit to active developers, it drops to 14 hours, a decrease of roughly 23%.
- If we include everyone, it looks like 23% of developers are responsible for 80% of PRs merged. If we include only active developers, 48% of them are responsible for 80% of PRs merged.
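The two comparisons above rely on two aggregate calculations, which might be sketched as below. This is an illustrative sketch rather than the method actually used. It assumes time-to-merge values are grouped per developer in a dict, and that every developer in the input, including those with zero merged PRs, counts toward the denominator of the concentration figure. Both function names are hypothetical.

```python
from statistics import median


def median_of_medians(hours_by_dev):
    """Median across developers of each developer's median time to merge (hours)."""
    per_dev = [median(hours) for hours in hours_by_dev.values() if hours]
    return median(per_dev)


def share_of_devs_for(pr_counts, target=0.8):
    """Smallest fraction of developers (most prolific first) covering `target` of PRs."""
    counts = sorted(pr_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    running = 0
    for n_devs, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return n_devs / len(counts)
    return 1.0
```

Running either function once on all developers and once on the active subset would give the kind of before-and-after comparison quoted in the bullets.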
An aside: who are these infrequent committers?
- Larger companies tend to have a higher concentration of committers who merge very few PRs in a year.