Last week Anthropic released their case study highlighting how we used Claude to power Graphite Reviewer, our high-signal AI code review companion.
When we set out to build our new AI-powered code review feature, one of the main challenges we faced was achieving a deep level of code comprehension. When talking to customers, we often heard: “an AI code review tool needs to have the sophistication and depth of a senior engineer; anything else would just be a novelty.” While plenty of large language models can generate code snippets or offer suggestions, few understand the intricate logic and patterns found in modern codebases. Early experiments with popular models left us underwhelmed: many generated confident but incorrect suggestions that failed to address real bugs.
With any AI tool, trust is hard to build and quick to erode. When we tested different LLMs internally, the false positives and noise these models introduced quickly undermined our confidence in the project. We even started doubting whether AI code review was tenable at all with the current sophistication of LLMs on the market. That apprehension dissipated, however, when we started using Anthropic’s Claude. Of all the models we evaluated, Claude was by far the best at deeply understanding code and leaving meaningful feedback, and it had the lowest false positive rate of any LLM we tested.
With Claude, Graphite Reviewer achieves:
• 40x faster pull request feedback loop, from 1 hour to 90 seconds
• 96% positive feedback rate on AI-generated comments
• 67% implementation rate of suggested changes
• Support for hundreds of thousands of pull requests across our customer base
How AI is shaping the future of code review
As generative AI increases the output of individual developers, more strain is placed on the “outer loop” of the software development lifecycle: more code being produced means more code to be reviewed and tested. AI code review tools like Graphite Reviewer, powered by Anthropic’s Claude, can help bridge this gap. Traditional code review can be a bottleneck, forcing developers to wait hours or even days for feedback. By leveraging Claude, Graphite Reviewer cuts this feedback loop down to seconds and eliminates entire review cycles, giving immediate first-pass feedback on every PR opened in a repository. This means code authors can fix bugs, logical errors, and security vulnerabilities before a human reviewer even looks at the code.
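The case study doesn’t detail Reviewer’s internals, but the core loop is straightforward to picture. Here is a minimal sketch in TypeScript using the `@anthropic-ai/sdk` package, assuming a hypothetical webhook handler and a `postComment` callback for writing feedback back to the PR; the model choice and prompt are illustrative, not Graphite’s actual configuration:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Hypothetical handler: invoked when a "pull request opened" webhook fires.
// `diff` is the PR's unified diff; `postComment` (supplied by the caller)
// writes the feedback back to the PR thread.
async function onPullRequestOpened(
  diff: string,
  postComment: (body: string) => Promise<void>
): Promise<void> {
  const response = await client.messages.create({
    model: "claude-3-5-sonnet-latest", // illustrative model choice
    max_tokens: 1024,
    system:
      "You are a senior engineer reviewing a pull request. " +
      "Flag bugs, logic errors, and security issues; skip style nits.",
    messages: [{ role: "user", content: `Review this diff:\n\n${diff}` }],
  });

  // The first content block holds the review text.
  const first = response.content[0];
  if (first?.type === "text" && first.text.trim().length > 0) {
    await postComment(first.text);
  }
}
```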
These efficiency gains are further amplified for distributed remote teams working across time zones. Graphite Reviewer is always online, ready to review any code you throw at it, so you no longer have to wait hours for a teammate across the world to wake up and review your code. As Brian Michel from The Browser Company put it, “as a single developer you’re really not alone anymore.”
Working with the Anthropic team
When we launched Graphite Reviewer in October 2024, demand quickly outpaced our initial expectations. Anthropic responded immediately, helping us scale our rate limits. They advised on best practices for caching prompts, handling retries, and structuring inputs to support our growing user base. We even had a dedicated Slack channel with the Anthropic team where we could bounce ideas off of them and share technical findings, which proved incredibly helpful for getting advice on how to structure our code when integrating with Claude.
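That caching and retry advice maps directly onto features of Anthropic’s SDK. A rough sketch of what it might look like in TypeScript; the repository-guidelines string and model name are illustrative assumptions, not Graphite’s actual setup:

```typescript
import Anthropic from "@anthropic-ai/sdk";

// The SDK retries transient failures (rate limits, 5xx errors) with
// exponential backoff; maxRetries tunes how persistent it is.
const client = new Anthropic({ maxRetries: 4 });

async function reviewWithCachedContext(repoGuidelines: string, diff: string) {
  return client.messages.create({
    model: "claude-3-5-sonnet-latest", // illustrative model choice
    max_tokens: 1024,
    // Marking the large, stable prefix (e.g. repo-wide review guidelines)
    // with cache_control lets later calls reuse it from the prompt cache
    // instead of reprocessing it on every PR.
    system: [
      {
        type: "text",
        text: repoGuidelines,
        cache_control: { type: "ephemeral" },
      },
    ],
    messages: [{ role: "user", content: `Review this diff:\n\n${diff}` }],
  });
}
```

Keeping the stable context in a cached prefix and letting the per-PR diff vary is the usual pattern here: only the cheap, changing suffix is reprocessed on each call.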
While Anthropic’s Claude model already understands code at a sophisticated level, we believe that in the coming years, LLMs will take on an even more central role in code creation, review, and maintenance, enabling us to help developers produce higher-quality software more efficiently than ever before.
For more details on how Graphite Reviewer leverages Claude, read the full case study.