
Best practices for testing your code

Kenny DuMez
Graphite software engineer

Automated tests are a powerful tool for building confidence in your code's correctness and functionality. Testing can be helpful at many different stages of the development cycle, including:

  • Developing new features: Automated tests help you assert the functionality of features under development and avoid functional regressions in existing features that have test coverage.
  • Fixing bugs: A test-first approach gives you confidence that the issue is correctly resolved, and the resulting test catches any future regression of that particular bug.
  • Refactoring code: Having test coverage over the code you intend to refactor gives you confidence that the functionality is not degraded after refactoring.

To properly implement testing, you need to understand the different types of tests and when to use them.

There are many differing opinions on what types of tests you should implement and to what degree. Most of these viewpoints agree that the more code you try to test simultaneously, the more expensive and brittle the test becomes.

The basic spread of different types of tests typically looks something like this:

  • Unit tests: Small, cheap, and quick to run. They should focus on a single unit of functionality—ideally as small as possible—and avoid invoking unrelated code, so they tend to rely heavily on test doubles. (More about test doubles below; a minimal unit test example follows this list.)
  • Integration tests: They validate how different components of your system interact. This might be a vertical slice of functionality but should typically not include actual external dependencies.
  • End-to-end (E2E) tests: They validate the functionality of the entire application as a user would perceive it. This typically means minimal use of test doubles. Because so much code and functionality are being tested, E2E tests are more expensive to create and maintain, and they take longer to run.

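To make the unit level concrete, here is a minimal sketch of a unit test, assuming Jest and TypeScript (other test runners follow the same shape). The calculateTotal function is a hypothetical example defined inline so the snippet is self-contained; because it touches no external dependencies, the test stays small and fast.

```typescript
// calculateTotal.test.ts: a minimal unit test (Jest globals: describe, it, expect).
// The function under test is hypothetical and defined inline for self-containment.
type LineItem = { price: number };

function calculateTotal(items: LineItem[]): number {
  return items.reduce((sum, item) => sum + item.price, 0);
}

describe("calculateTotal", () => {
  it("sums the prices of all line items", () => {
    const items = [{ price: 5 }, { price: 10 }, { price: 2.5 }];
    expect(calculateTotal(items)).toBe(17.5);
  });

  it("returns 0 for an empty cart", () => {
    expect(calculateTotal([])).toBe(0);
  });
});
```
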
For lower-level tests like unit and integration tests, at times you'll want to substitute part of your system with a test double so that the test doesn't need to deal with too much functionality that's irrelevant to what it is trying to assert. Test doubles come in several forms, with varying levels of fabrication and functionality:

  • Dummy objects: These are simple placeholder objects that satisfy the required parameters when neither the system under test (SUT) nor the test itself cares about them.
  • Test stubs: Stubs replace a real component and give you a control point to provide indirect inputs that influence the SUT's execution path.
  • Test spies: A spy records the indirect outputs of a SUT, allowing you to make assertions on the state of data at the point of the spied-on component.
  • Fakes: Fakes have working implementations that are not the same as the production ones. They are typically used to shortcut unimportant functionality during a test. An example of this might be a fake email sender that the tests can use to avoid sending real emails.
  • Mocks: Mocks are preprogrammed with expectations that can be used to validate the inputs that the mocked component would have received. Depending on where you place your mocks, you can test indirect inputs and outputs.

Test doubles allow you to decouple units of code from the rest of a well-architected system, making the tests smaller, cheaper, and easier to maintain.

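As a rough sketch of how a stub and a spy look in practice, the example below uses Jest's jest.fn(). The PaymentGateway interface and the checkout function are hypothetical stand-ins for a component that depends on an external service.

```typescript
// checkout.test.ts: stubbing and spying with jest.fn().
// PaymentGateway and checkout are hypothetical; the pattern is what matters.
interface PaymentGateway {
  charge(amountCents: number): Promise<{ ok: boolean }>;
}

// The system under test depends on a gateway we don't want to call for real.
async function checkout(gateway: PaymentGateway, amountCents: number): Promise<string> {
  const result = await gateway.charge(amountCents);
  return result.ok ? "confirmed" : "declined";
}

it("confirms the order when the charge succeeds", async () => {
  // Acts as a stub (controls the indirect input: the gateway's response)
  // and as a spy (records how the SUT called it).
  const charge = jest.fn().mockResolvedValue({ ok: true });
  const gateway: PaymentGateway = { charge };

  await expect(checkout(gateway, 2500)).resolves.toBe("confirmed");
  expect(charge).toHaveBeenCalledWith(2500); // assert on the SUT's indirect output
});
```

Whether you call that test double a stub, a spy, or a mock depends largely on which of those two assertions you care about.
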
When to write tests for the system is up to you, but many developers have found success with test-driven development (TDD) where you write your tests before writing the functionality that they validate.

You write a small test that’s expected to fail but specifies part of the desired behavior of the SUT. Once you confirm that it fails, you write just enough implementation code to make the test pass. After this, you can refactor the test and the implementation before writing the next test, which describes the next piece of functionality, and starting the cycle again.

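Sketched with Jest and a hypothetical slugify function, one turn of that loop might look like this:

```typescript
// Step 1 (red): write a failing test that specifies one piece of behavior.
// It fails until slugify exists and behaves as described.
it("lowercases and hyphenates a title", () => {
  expect(slugify("Hello World")).toBe("hello-world");
});

// Step 2 (green): write just enough implementation to make that test pass.
function slugify(title: string): string {
  return title.toLowerCase().split(/\s+/).join("-");
}

// Step 3 (refactor): tidy the test and the implementation, then write the
// next failing test (e.g. stripping punctuation) and repeat the cycle.
```
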
Code coverage describes what percentage of your application's code is executed when tests are run.

Code coverage is not directly linked to code quality. It only tells you which code is executed during a test run; it doesn’t provide confidence that the code is free of bugs, so it shouldn’t be used as a measure of quality on its own. Instead, use it to find the areas of your application that tests don’t touch.

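Most test runners can produce this report for you. As a sketch, assuming Jest, coverage collection can be switched on in the config (or per run with the --coverage flag):

```typescript
// jest.config.ts: enable coverage reporting (assumes Jest as the test runner).
import type { Config } from "jest";

const config: Config = {
  collectCoverage: true,               // record which lines each test run executes
  coverageReporters: ["text", "lcov"], // print a summary table and write an lcov report
};

export default config;
```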