
Graphite is currently a thriving company developing a code review platform used by tens of thousands of top developers. But, like many dev tools startups, it didn't start that way.

In this post, I want to focus on the wisdom from a particular book that's been a cornerstone of my learning: "The Mom Test." This book is a short and essential read for anyone venturing into the world of DevTools startups, open-source contributions, or creating developer-focused products.


Note

Greg spends full workdays writing weekly deep dives on engineering practices and dev-tools. This is made possible because these articles help get the word out about Graphite. If you like this post, try Graphite today, and start shipping 30% faster!


Before working at Graphite, I was an infrastructure engineer at Airbnb. There, I learned countless lessons about how to become a great software engineer. However, I would come to learn that when building something from scratch, it hardly matters how well you build it. What matters much more is picking the right thing to build. This realization became particularly evident as I embarked on the startup path and co-founded Graphite.

Our team has always been passionate about software quality and release processes. The first prototype my co-founders and I built was a “bug capture” tool. Despite getting that prototype into the hands of a wide network of friends, LinkedIn connections, and even random engineers online, the reception was immediately underwhelming. We pivoted to our next product, centered around iOS rollbacks, and faced the same challenges.

Frankly, no one (outside of a select few users) cared about what we were building. We started asking why — this is where “The Mom Test” came in.

Had we asked the right questions upfront, we could have saved ourselves years of toil. It might sound silly, but asking good questions can be really, really difficult.

"The Mom Test" explains the solution: frame your questions in such a way that you get truthful, unbiased feedback, even from those who are inherently supportive, like your mother. When I used to share my excitement about our iOS rollbacks concept, the usual response was overwhelmingly positive, but in hindsight it was probably just folks being nice to me.

Of course, my friend would tell me, “iOS rollbacks sound like a great idea!” But what I should have asked was:

  • “When was the last time you googled for a way to roll back your iOS app?”

  • “Did you try the answers online?”

  • “You’re a great engineer, tell me about how you’ve hacked a solution here.”

  • “Given that you’ve hacked a solution, do you still google from time to time for something better?”

The key is to ask questions that dig deeper, inquiring about actual behavior, like whether someone has ever actively sought a solution like yours. Furthermore, software engineers as a user group are some of the most capable people in the world at solving their own problems. If they haven’t already tried scripting a solution, was it that big of a problem in the first place?

In developing a product for developers, it’s crucial to check one key detail: Does a solution already exist?

One of two things must be true if a solution doesn't exist. Either the idea doesn’t solve a big enough pain point (such as in our case with rollbacks and bug-capture), or there’s a fundamental reason why it hasn’t been able to exist yet. In-memory databases have always been a great idea, but it took RAM getting cheap enough for the idea to become unlocked. Chatbots are wonderful, but were blocked on advancements in large language models.

If you want to pursue building a dev tool that has never been built before, be very clear on why it can only be built now, and be ready to fight many competitors who have also just been unlocked.

A great example of being unlocked by the advancement of technology is MemSQL:

On April 23, 2013, SingleStore launched its first generally available version of the database to the public as MemSQL. Early versions only supported row-oriented tables, and were highly optimized for cases where all data can fit within main memory. This design was based on the idea that the cost of RAM would continue to decrease exponentially over time, in a trend similar to Moore's law. This would eventually allow most use cases for database systems to store their data exclusively in memory.

https://en.wikipedia.org/wiki/SingleStore

The alternative (and in my opinion better) scenario is when the tool already exists but it has some glaring flaws. You can often find an open source project, a blog, or an internal big company tool that matches your idea. This can be some of your best validation that folks care about the problem enough to have tried building it before. The only thing left for you to figure out is: do users still feel some pain around the tool? How can you do better?

PagerDuty is a great example of this scenario:

We spent that first month of ‘09 thinking of ideas and doing research. One of the ways we thought of ideas was by thinking of internal tools that bigger companies had built in-house (like Amazon, where all three of us had worked prior) that other companies of all sizes would need… Amazon had built an internal tool to handle on-call scheduling and alerting via pagers. This tool was bolted on top of their internal ticketing and monitoring systems, so when critical issues were detected, the right people were paged… After doing a bit of research, we realized that it wasn’t just Amazon that built an internal tool for going on call—Google and Facebook both built their own versions. It seemed like there was a clear need here.

https://www.pagerduty.com/blog/decade-of-duty/

In the case of Graphite, our third and final pivot, the tool also already existed. Stacked diffs and alternative code review platforms were already invented and beloved at bigger companies like Google and Facebook. Phabricator and Gerrit were open source. Users actively searched for solutions, wrote scripts, tried self-hosting, and more. Each month someone on Twitter would tweet at GitHub asking for them to build stacked diffs natively. All the while, users craved more. It was the perfect opportunity for a new dev tool.

Why did our first two attempts fail to gain traction, while our third succeeded? It certainly wasn't that the quality of our engineering doubled overnight; that remained steady. The answer was that, unlike with our previous two ideas, people actually felt pain from the problem we were solving.

Fundamentally, Graphite passed The Mom Test in a way that our previous ideas hadn’t. Had we clearly and unbiasedly asked the following questions, we could have picked the right product to build from the start:

  • “When was the last time you googled for such a tool?”

  • “Did you try installing what you found?”

  • “Tell me about a script you hacked together here.”

  • “Since hacking together a script, when was the last time you still googled for a better solution?”

What ideas have passed this test historically? PagerDuty. Merge queues. Metrics dashboards. Slack bots. CI test runners. The list goes on. If a tool exists inside a big company but not externally, that's a great starting place. If no company has ever built it internally, you might want to check your rose-tinted glasses.

Years into working on Graphite, the insights from "The Mom Test" remain invaluable to me. The whole team references it weekly when seeking genuine, unfiltered feedback, ensuring that Graphite focuses on developing features that address real needs. I highly recommend this book to anyone in the field of product development, especially in the realm of DevTools. It's not just a guide to asking the right questions – it's a roadmap to understanding and meeting your users' true needs.
