Building developer tools is a meta-problem: Your architecture decisions directly impact how quickly you can build the very tools that make other developers more productive. At Graphite, we've learned that the right monorepo structure isn't just about code organization—it's about enabling a team of 15 engineers to ship over 1,000 pull requests per month while maintaining quality and sanity.
When I started Graphite with my co-founders, we made what many would consider (at the time) a contrarian bet: no microservices, no polyglot architecture, no distributed complexity. Just TypeScript, everywhere, in a single monorepo with a single server image. Years later, as we scale up to 40 engineers and millions of requests, I'm more convinced than ever that this approach was right for us—not because it's perfect or universally applicable, but because it optimizes for what actually matters at our scale: developer velocity.
This wasn't a decision made lightly. Coming from Facebook, we'd seen the power of internal tools like Phabricator operating at massive scale with monorepo principles. We also recognized that good architecture should fundamentally reduce complexity. Specifically, we identified three main sources of software complexity:
Change amplification: How many places must change when you update something.
Cognitive load: How much engineers must keep in their heads.
Unknown unknowns: Untracked dependencies or side effects.
This isn't to say microservices or polyglot architectures are wrong, though. They solve real problems around team autonomy, technology diversity, and independent scaling. But for a team our size building a cohesive product, the coordination overhead often outweighs the benefits.
The philosophy: Composition over complexity
Our architecture philosophy draws heavily from "A Philosophy of Software Design" and clean architecture principles. We've learned that premature distribution is the root of much engineering evil. Instead of breaking apart our system into microservices, we break apart our code into composable modules.
Consider how we handle external integrations. Rather than creating separate services for GitHub sync, AI processing, and email notifications, we use dependency inversion to create interfaces we own:
```typescript
interface GitHubClient {
  getPullRequest(repo: string, number: number): Promise<PullRequest>;
  syncRepository(repo: Repository): Promise<SyncResult>;
}

interface AIClient {
  generateCodeReview(diff: string, context: ReviewContext): Promise<Review>;
}

// In production, we compose these together
const app = createApplication({
  github: new GitHubClientImpl(githubToken),
  ai: createOpenAIClient(openaiConfig), // Using Vercel AI SDK
  logger: new DatadogLogger(),
});

// In tests, we mock them out
const testApp = createApplication({
  github: new MockGitHubClient(),
  ai: new MockAIClient(),
  logger: new NoOpLogger(),
});
```
This pattern has saved us countless times. When we needed to swap from one AI provider to another, replace our logging infrastructure, or add a new feature flag system, the business logic never changed. We just swapped the implementation behind the interface.
The Vercel AI SDK has been particularly helpful here, providing a unified interface across different LLM providers while handling streaming, function calling, and other complexities. Instead of maintaining provider-specific integration code, we can focus on the AI features themselves.
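To make that concrete, here is a minimal sketch (not our production code) of an AIClient built on the SDK. The `generateText` call and the provider helpers are real SDK APIs; the `ReviewContext` and `Review` types and the prompt are stand-ins for illustration:

```typescript
import { generateText, type LanguageModel } from 'ai';
import { openai } from '@ai-sdk/openai';
import { anthropic } from '@ai-sdk/anthropic';

// Placeholder types standing in for the real ones (assumptions)
type ReviewContext = { repository: string };
type Review = { summary: string };

interface AIClient {
  generateCodeReview(diff: string, context: ReviewContext): Promise<Review>;
}

// One factory works for any provider the SDK supports
function createAIClient(model: LanguageModel): AIClient {
  return {
    async generateCodeReview(diff, context) {
      const { text } = await generateText({
        model,
        prompt: `Review this diff from ${context.repository}:\n${diff}`,
      });
      return { summary: text };
    },
  };
}

// Swapping providers is a one-line change at the composition root:
const aiClient = createAIClient(openai('gpt-4o'));
// const aiClient = createAIClient(anthropic('claude-3-5-sonnet-latest'));
```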
TypeScript everywhere: One language, many benefits
In early 2020, we faced the classic startup choice: Python/Ruby for rapid prototyping or something more structured for long-term maintainability. We chose TypeScript not just because it was gaining momentum, but because it eliminated an entire class of problems.
The type system catches bugs at compile time that would otherwise surface in production. More importantly, when half your complexity lives in the frontend and half in the backend, sharing a single language with shared type definitions creates a development experience that feels magical:
```typescript
// Shared types between frontend and backend, validated with Zod
export const CreatePullRequestRequestSchema = z.object({
  title: z.string().min(1),
  description: z.string(),
  baseBranch: z.string(),
  headBranch: z.string(),
});

export type CreatePullRequestRequest = z.infer<typeof CreatePullRequestRequestSchema>;

// Backend route with runtime validation
app.post('/api/pull-requests', async (req: Request, res: Response) => {
  const validatedData = CreatePullRequestRequestSchema.parse(req.body);
  const pr = await createPullRequest(validatedData);
  res.json({ success: true, pullRequest: pr });
});

// Frontend usage - TypeScript knows the exact shape
const response = await fetch('/api/pull-requests', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    title: formData.title, // Type-checked
    description: formData.desc, // Type-checked
    baseBranch: 'main', // Type-checked
    headBranch: currentBranch, // Type-checked
  }),
});
```
When you change an API contract, TypeScript tells you everywhere that needs updating. When you refactor a core type, the compiler becomes your pair programming partner, showing you exactly what broke and where. In an LLM-assisted world, this benefit has only increased.
Zod has been crucial for keeping this type safety consistent between compile-time and runtime. It lets us define schemas once and use them for both TypeScript types and runtime validation, which eliminates the common problem of API contracts drifting between frontend assumptions and backend reality.
Monorepo structure: Libraries as first-class citizens
Our monorepo isn't just a collection of apps thrown together—it's architected as an ecosystem of composable libraries:
```
libs/
├── public/              # Shared between frontend and backend
│   ├── auth/            # Authentication utilities
│   ├── git-diff-parser/
│   └── shared-types/
└── private/             # Backend-only libraries
    ├── db-client/       # Database abstraction layer
    ├── github-client/
    └── ai-client/
apps/
├── private/server/      # Main Node.js API server
└── public/
    ├── graphite-app/    # Next.js frontend
    ├── cli/             # Command-line interface
    └── vscode/          # VS Code extension
```
Each library has a single responsibility and clean interfaces. Our database client wraps TypeORM heavily—we've learned to distrust ORMs' query generation and prefer explicit control:
```typescript
// libs/private/db-client/src/pullRequests.ts
export async function findPullRequestsForUser(
  userId: string,
  filters: PullRequestFilters
): Promise<PullRequest[]> {
  // We own the exact SQL that gets generated
  const query = db
    .select('*')
    .from('pull_requests')
    .where('author_id', userId)
    .where('status', 'in', filters.statuses);

  if (filters.repository) {
    query.where('repository_id', filters.repository);
  }

  return query.execute();
}
```
This approach has prevented countless N+1 query problems and makes performance optimization straightforward.
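To illustrate the pattern (a hypothetical sketch in the same query-builder style as above, not actual Graphite code), batching turns the classic N+1 shape into a single grouped query:

```typescript
// Hypothetical sketch: one grouped query for a whole page of PRs, instead of
// one query per PR (the N+1 shape). The count() helper is assumed to exist
// on our wrapper.
export async function findReviewCountsForPullRequests(
  prIds: string[]
): Promise<Map<string, number>> {
  const rows = await db
    .select('pull_request_id')
    .count('* as review_count')
    .from('reviews')
    .where('pull_request_id', 'in', prIds)
    .groupBy('pull_request_id')
    .execute();

  return new Map(rows.map((r) => [r.pull_request_id, Number(r.review_count)]));
}
```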
Build system: Turbo powers our velocity
With 50+ packages in our monorepo, naive builds would be painfully slow. In our early days, we experienced what internal Slack threads called "rebuild hell"—small changes in the server would trigger lengthy builds for unrelated libraries. This growing pain pushed us toward better modularization and ultimately led us to adopt Turbo.
Turbo transforms our build pipeline into a directed acyclic graph where each library only rebuilds when its dependencies change:
{"tasks": {"build": {"dependsOn": ["^build"],"outputs": ["dist/**"],"inputs": ["src/**", "package.json", "tsconfig.json"]},"test": {"dependsOn": ["^build"],"outputs": [],"inputs": ["src/**", "test/**"]}}}
The result? Running `yarn build` intelligently skips unchanged packages and only rebuilds what's necessary. We've seen 10x+ speedups compared to naively rebuilding everything. Most build operations are now sub-second for unchanged packages, transforming our development experience from frustrating waits to near-instant feedback.
Even better, our testing strategy leverages the same dependency graph. We maintain a shared testing library that re-exports consistent patterns:
```typescript
// libs/public/test-utils/src/index.ts
export {
  describe,
  it,
  expect,
  beforeEach,
  afterEach,
  jest,
} from '@jest/globals';

export { createMockDatabase } from './database';
export { createMockGitHubClient } from './github';
export { setupTestEnvironment } from './environment';
```
Every package in our monorepo uses identical Jest testing patterns, making it trivial for any engineer to write tests anywhere in the codebase.
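As an example, a test in any package reads the same way; the package name and the mock's stubbed behavior here are assumptions:

```typescript
// Hypothetical test file; the import path and stubbed return values are
// assumptions, but the pattern is identical across packages.
import { describe, it, expect, createMockGitHubClient } from '@graphite/test-utils';

describe('GitHub sync', () => {
  it('reads a pull request through the mocked client', async () => {
    const github = createMockGitHubClient();
    const pr = await github.getPullRequest('withgraphite/monorepo', 42);
    expect(pr.number).toBe(42); // assumes the mock echoes the requested number
  });
});
```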
Single server image: Simplicity at scale
While the industry often preaches microservices, we've found a single, stateless Docker container scales surprisingly far. Our deployment strategy is dead simple:
```dockerfile
FROM node:18-alpine

WORKDIR /app

# Copy manifests first so dependency installation is cached between builds
COPY package.json yarn.lock ./
RUN yarn install --frozen-lockfile

COPY . .
RUN yarn build

EXPOSE 8000
CMD ["node", "dist/index.js"]
```
This stateless approach, combined with horizontal scaling through AWS ECS, handles millions of requests with ease. When we need to scale, we add more instances. When we need to deploy, we replace them atomically.
The real win isn't performance—it's operational simplicity. One image to build, one deployment pipeline to maintain, one set of logs to monitor. Our continuous delivery pipeline automatically promotes builds from staging to production based on health checks and performance metrics.
Development experience: One command to rule them all
Great architecture enables great developer experience. Our entire development environment boots with a single command:
```bash
yarn server-stg
```
This command:
Installs all dependencies across 50+ packages.
Builds libraries in dependency order using Turbo.
Starts the development server with hot reload.
Connects to our long-lived staging database.
That last point is controversial but powerful. Instead of maintaining complex local database seeding scripts, we develop against staging data. It's a form of "testing in production," but staging is effectively an isolated production environment used only by employees. New engineers can start contributing meaningful code on day one without wrestling with incomplete local data.
Functional programming: Why we avoid classes
We actively discourage class-based programming, not out of dogma, but because of practical issues we've observed in our codebase and codified in our engineering principles. Classes don't encourage static dependency injection by default—dependencies get hidden in constructors and instance variables. They tend to grow into large files with many methods shoved inside a single class. Most problematically, they indirectly encourage inheritance hierarchies that become difficult to refactor.
This philosophy emerged from early internal debates and is now reinforced in our onboarding and code review practices. Classes aren't evil, but we've found that leaning toward functional composition leads to cleaner, more testable code:
```typescript
// Instead of this OOP approach:
class PullRequestService {
  constructor(private github: GitHubClient, private db: Database) {}

  async createPullRequest(data: CreatePRData): Promise<PullRequest> {
    // Dependencies hidden, harder to test, encourages large classes
  }
}

// We prefer this functional approach:
export async function createPullRequest(
  github: GitHubClient,
  db: Database,
  data: CreatePRData
): Promise<PullRequest> {
  const githubPR = await github.createPullRequest(data);
  const dbPR = await db.pullRequests.create({
    githubId: githubPR.id,
    title: data.title,
    // ... other fields
  });
  return transformToApiResponse(dbPR, githubPR);
}
```
This style makes testing trivial, side effects explicit, and refactoring safer. Dependencies are explicit parameters rather than hidden instance state. Functions naturally stay focused on single responsibilities.
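A hedged sketch of what that buys you in a test: every collaborator is a plain argument, so there is no DI container or class setup to wire up. The mock helpers mirror the shared test utilities above, and the exact CreatePRData fields are assumptions:

```typescript
// Sketch: testing the functional version just means passing mocks as arguments.
import { it, expect, createMockDatabase, createMockGitHubClient } from '@graphite/test-utils';
import { createPullRequest } from './createPullRequest';

it('creates a GitHub PR and persists it', async () => {
  const github = createMockGitHubClient();
  const db = createMockDatabase();

  const pr = await createPullRequest(github, db, {
    title: 'Ship it',
    description: '',
    baseBranch: 'main',
    headBranch: 'feat/ship-it',
  });

  expect(pr.title).toBe('Ship it');
});
```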
Database strategy: Postgres + Redis + S3
Our data layer is deliberately simple: Postgres for transactional data, Redis for caching and pub/sub, S3 for blob storage. This trinity handles everything from user authentication to real-time notifications to storing git diffs.
We lean heavily on Postgres's diverse strengths—JSONB for flexible schemas, full-text search for code search, transactions for consistency. For years, we even powered our entire PR inbox functionality with custom filtering rules implemented directly in Postgres. The combination of JSONB queries, complex WHERE clauses, and careful indexing let us build surprisingly sophisticated filtering without additional infrastructure.
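As a hedged illustration of that pattern (the db.raw helper and the column names are assumptions), an inbox filter can reduce to a JSONB containment check plus ordinary predicates, kept fast by a GIN index:

```typescript
// Hypothetical inbox filter: JSONB containment (@>) plus normal predicates.
// A GIN index on the metadata column keeps this query fast.
const rows = await db.raw(
  `SELECT *
   FROM pull_requests
   WHERE reviewer_id = $1
     AND metadata @> $2::jsonb
   ORDER BY updated_at DESC
   LIMIT 50`,
  [userId, JSON.stringify({ labels: ['needs-review'] })]
);
```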
Our database design also emphasizes idempotent, self-healing write paths and application-level joins rather than foreign key constraints. This approach, documented in our internal design docs, enables easier sharding and scaling while reducing operational incidents through built-in retry logic.
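For instance, here is a sketch of an idempotent write path, with table and column names assumed: keying the write on the external GitHub ID makes a retried sync converge on the same row instead of duplicating it.

```typescript
// Sketch of an idempotent write path: an upsert keyed on the external
// GitHub ID, so a retried sync converges on the same row, not a duplicate.
async function upsertPullRequest(githubPR: { id: string; title: string; status: string }) {
  await db.raw(
    `INSERT INTO pull_requests (github_id, title, status)
     VALUES ($1, $2, $3)
     ON CONFLICT (github_id) DO UPDATE
       SET title = EXCLUDED.title,
           status = EXCLUDED.status`,
    [githubPR.id, githubPR.title, githubPR.status]
  );
}
```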
That said, we're starting to outgrow this approach. As our data volume increases, we're finally moving to dedicated search indices for faster results. Postgres got us remarkably far, but knowing when to graduate to specialized tools is part of scaling wisely.
```typescript
// Example: storing and retrieving PR diffs
async function storePullRequestDiff(prId: string, diff: string): Promise<void> {
  // Store large diffs in S3
  if (diff.length > 1000000) {
    const key = `pr-diffs/${prId}.diff`;
    await s3.upload(key, diff);
    await db.pullRequests.update(prId, { diffLocation: `s3://${key}` });
  } else {
    // Store small diffs directly in Postgres JSONB
    await db.pullRequests.update(prId, { diff });
  }
}
```
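The retrieval path mirrors this. A sketch, assuming a findById helper and an s3.download counterpart to the upload above:

```typescript
// Hypothetical retrieval counterpart: follow the pointer to S3 when the row
// only stores a location, otherwise return the inline diff.
async function getPullRequestDiff(prId: string): Promise<string> {
  const pr = await db.pullRequests.findById(prId);
  if (pr.diffLocation) {
    const key = pr.diffLocation.replace('s3://', '');
    return s3.download(key);
  }
  return pr.diff;
}
```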
Infrastructure as code: Terraform all the way down
Our infrastructure is defined entirely in Terraform via CDKTF, which lets us write it imperatively, from ECS clusters to RDS instances to CloudFront distributions. This means our entire production environment can be recreated from git history.
More importantly, infrastructure changes go through the same code review process as application changes. When we need to add a new service, scale a database, or modify networking rules, it's all code that can be reviewed, tested, and rolled back.
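For flavor, here is a sketch of what a CDKTF stack looks like; the stack, bucket, and region are invented for illustration, but the imports are the real CDKTF packages:

```typescript
// Sketch of a CDKTF stack; the stack, bucket, and region are illustrative.
import { Construct } from 'constructs';
import { App, TerraformStack } from 'cdktf';
import { AwsProvider } from '@cdktf/provider-aws/lib/provider';
import { S3Bucket } from '@cdktf/provider-aws/lib/s3-bucket';

class DiffStorageStack extends TerraformStack {
  constructor(scope: Construct, id: string) {
    super(scope, id);

    new AwsProvider(this, 'aws', { region: 'us-east-1' });

    // Reviewed, tested, and rolled back like any other code change
    new S3Bucket(this, 'pr-diffs', {
      bucket: 'example-pr-diffs',
    });
  }
}

const app = new App();
new DiffStorageStack(app, 'diff-storage');
app.synth();
```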
The contrarian choices that work
Several of our architectural decisions go against conventional wisdom:
We develop against staging data instead of local seed data. This eliminates the maintenance burden of keeping local development databases in sync with production schemas and gives developers realistic data to work with from day one.
We actively discourage classes in favor of pure functions. This naturally steers the codebase toward composition and makes testing significantly easier.
We chose a monorepo with a single deployment artifact instead of microservices. This reduces operational complexity and enables atomic deployments across the entire system. That said, we recognize this approach has limits—as teams grow beyond a certain size, the benefits of service autonomy and independent deployments may outweigh the coordination costs.
We wrapped our ORM heavily instead of trusting its abstractions. This gives us fine-grained control over database queries and prevents performance surprises.
These choices work because they optimize for our constraints: a growing team that needs to ship quickly while maintaining quality.
Results that matter
Our architecture enables the metrics that matter most, which we track obsessively, partly using our own Graphite Insights tools:
15 engineers shipping 1,000+ PRs per month, and scaling to 40 engineers this year
Fast build times for most changes thanks to Turbo's intelligent caching
One-command development environment that new hires can use productively on day one
Zero-downtime deployments through stateless horizontal scaling
Shared code between frontend and backend eliminating entire classes of integration bugs
These aren't vanity metrics—they represent real developer happiness and productivity gains. The real test of any architecture isn't how it looks in a diagram—it's how it feels to work with every day. Our engineers spend their time building features, not fighting the build system. They think in one language all day, not context-switching between different tech stacks. They can confidently refactor across package boundaries because TypeScript catches the breakages.
Lessons for similar teams
These architectural choices work well for us, but they're not universal truths. If you're building developer tools or any product where engineering velocity directly impacts business outcomes, consider these principles while adapting them to your specific context:
Optimize for developer experience over theoretical scalability. You can always add complexity later, but you can't easily remove it.
Choose technologies that compound. TypeScript's benefits multiply when used everywhere. Monorepos pay dividends when tooling is shared across packages.
Libraries beat microservices at small scale. Modular code in a single deployment is often simpler than distributed services.
Functional programming isn't just academic. Pure functions and explicit dependencies make large codebases maintainable.
Testing strategies should be as systematic as your architecture. Shared testing utilities across a monorepo ensure consistent quality everywhere.
Continuous experimentation is key. We're always iterating—refining boundaries, experimenting with build optimizations, and evolving patterns as the codebase and team grow. Architecture is never "done."
The goal isn't to build the most impressive architecture—it's to build the one that best serves your team's ability to ship great software. For us, that's meant embracing simplicity, sharing everything we can, and optimizing relentlessly for the feedback loop between idea and deployed code.
Want to see how these principles work in practice? Graphite is hiring across engineering roles!