Debugging with `git bisect`: Identifying the commit that introduced a bug

Greg Foster
Graphite software engineer


Note

This guide explains this concept in vanilla Git. For Graphite documentation, see our CLI docs.



When a previously working feature suddenly breaks, finding which commit introduced the bug can be challenging. Git provides a powerful tool called git bisect to speed up this “bug hunt.” It uses a binary search through your commit history to pinpoint the first bad commit that caused the issue. This guide will explain how git bisect works, walk through an example of tracking down a bug step-by-step (with commands and outputs), and cover advanced tips like automated bisecting with scripts. We’ll also discuss caveats (such as complex histories) and briefly see how the Graphite CLI can complement this workflow.

git bisect is a Git command that helps identify the commit where a bug was introduced by performing a binary search through the commit history. You provide two reference points: one “good” commit (where the code was known to work) and one “bad” commit (where the bug is present). Git then checks out a commit roughly halfway between these points. You test that commit and mark it as “good” or “bad,” which tells Git whether the bug lies in the earlier half or later half of the history. Git continues this process of halving the search space until it pinpoints the first commit that introduced the bug. In essence, git bisect automates the “divide and conquer” approach to debugging. This is much faster than checking every commit sequentially – it dramatically reduces the number of commits you have to test (binary search is O(log N) in complexity). For developers, this means you can find the offending commit in a fraction of the time it would take by manual trial and error.

Now let's walk through the bisect process step by step using a scenario where a CLI tool's report generation feature has stopped working. We'll mark our known good and bad commits and let Git guide us to the culprit.

  1. Start the bisect session: In your repository, check out the branch with the issue (e.g. main) and initiate bisecting by running git bisect start. This puts Git into bisect mode, ready to accept good/bad markers. (If you are in a subdirectory of the repo, navigate to the repo’s root before starting, since bisect may refuse to start from a subdirectory.)

  2. Mark the bad commit: Tell Git which commit is “bad” (contains the bug). Typically, this is the current HEAD if you’re on the latest buggy code. You can simply run:

    Terminal
    $ git bisect bad

    This marks the current commit (HEAD) as bad. Alternatively, you could specify a commit hash or tag after bad if the bad commit isn’t the one currently checked out (e.g., git bisect bad <commit-hash>). In our example, HEAD is the buggy version, so we mark it as bad.

  3. Mark a good commit: Next, identify a commit from before the bug appeared, one where the feature still worked. It should be an ancestor of the bad commit, that is, an earlier commit on the same line of history. Often you might choose a tagged release or a commit from a certain date. Check out that commit and verify the feature (e.g., run myapp generate-report on that commit to confirm it doesn’t error). Once confirmed, mark it as good:

    Terminal
    $ git bisect good <commit-hash>

    Replace <commit-hash> with the identifier of the known good commit (you can also use a tag or branch name). For example, if v2.0 was good, use its commit hash or tag: git bisect good v2.0. After this, Git has two endpoints for the search: one bad commit and one good commit. Git will now automatically choose a midpoint commit between them. It will check out that commit and prompt you to test it. You should see output indicating the progress of the bisect, for example:

    Terminal
    Bisecting: 3 revisions left to test after this (roughly 1 step)
    [0023cdddf42d916bd7e3d0a279c1f36bfc8a051b] Changing page structure

    In the above message (your actual output will vary), Git is telling us it checked out an intermediate commit (with hash 0023cdd...) and that there are 3 more revisions left to test after this one. The commit message “Changing page structure” is shown for context. Now it’s our turn to evaluate this commit.

  4. Test the chosen commit: At this point, you need to determine if the bug is present in the commit Git has checked out. In our scenario, you would run myapp generate-report with the current code. There are two possibilities: either the command runs without error (meaning this commit is good – the bug has not yet manifested), or the command fails in the same way (meaning this commit is bad – the bug is present here). Based on the outcome, inform Git of the result:

    • If the commit does not have the bug (feature works correctly), run git bisect good.
    • If the commit still has the bug, run git bisect bad.

    By marking the current commit good or bad, you’re telling Git which half of the commit range the bug must lie in. Git will then drop all commits on one side of this commit from consideration and automatically check out the next commit halfway through the remaining range. Each time you mark a commit, Git narrows the search space and gives a new “midpoint” commit to test. For example, after one iteration, you might see:

    Terminal
    Bisecting: 1 revision left to test after this (roughly 0 steps)
    [abcd1234ef56789...] Some feature commit

    You would again test this new commit and mark it good or bad. (If you ever realize you marked a commit incorrectly, you can use git bisect log and git bisect replay to undo or redo steps, but that’s an advanced use beyond our main steps here.) Continue this process of testing and marking commits. Git is essentially guiding you through the commit history like a binary search wizard, eliminating half of the commits in each step.

  5. Identify the first bad commit: Eventually, the range will narrow down to a single commit – the point where the status flips from good to bad. When git bisect has enough information, it will stop and output the details of the first bad commit (the culprit). For example, you might see output like this once the bisect completes:

    Terminal
    e4203915d6639fdc7028d69a9cc773c2fc2b584b is the first bad commit
    commit e4203915d6639fdc7028d69a9cc773c2fc2b584b
    Author: Alice Dev <alice@example.com>
    Date: Fri Jul 11 10:15:42 2025 -0400
    Add new output format (introduce bug in report generation)
    src/report_generator.py | 10 +++++++---
    1 file changed, 7 insertions(+), 3 deletions(-)

    Git has identified the first commit where the bug appears. In this example, the commit message hints that a change to the report generator introduced the bug. Now we know exactly which commit caused the regression.

  6. Reset to normal Git mode: After finding the bad commit, end the bisect session by running:

    Terminal
    $ git bisect reset

    This will check out your repository back to the state (branch/commit) you were on before you started the bisect. It cleans up the bisection state so you can resume normal Git operations. If you forget this step, Git will remind you (for instance, git status will show a message that you are still bisecting). You can also optionally provide a commit to git bisect reset (e.g. git bisect reset HEAD to stay on the current commit, or a specific hash), but simply running it with no arguments is the common case to return to your original branch.
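Step 4 mentions git bisect log and git bisect replay as the way to recover from a wrong mark. The sketch below shows one way that recovery might look. It builds a throwaway repository, so every name in it is an assumption made for illustration: the temporary directory, the report.txt file, and the word BROKEN standing in for the real bug.

```shell
#!/bin/sh
# Sketch: undoing a wrong mark with `git bisect log` and `git bisect replay`.
# Assumptions: a throwaway repo; "BROKEN" in report.txt stands in for the bug.
set -eu
tmp=$(mktemp -d)
mkdir "$tmp/repo"
cd "$tmp/repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Ten commits; every commit from number 7 onward contains the "bug".
for i in 1 2 3 4 5 6 7 8 9 10; do
    if [ "$i" -ge 7 ]; then echo "BROKEN $i"; else echo "ok $i"; fi > report.txt
    git add report.txt
    git commit -q -m "commit $i"
done
first=$(git rev-list --max-parents=0 HEAD)

git bisect start > /dev/null
git bisect bad > /dev/null
git bisect good "$first" > /dev/null
midpoint=$(git rev-parse HEAD)   # the commit Git asks us to test first

# Deliberately give the WRONG verdict for the midpoint.
if grep -q BROKEN report.txt; then wrong=good; else wrong=bad; fi
git bisect "$wrong" > /dev/null

# Save the log, drop every line that mentions the mis-marked commit,
# then restart the session from the corrected log.
git bisect log > "$tmp/bisect.log"
grep -v "$midpoint" "$tmp/bisect.log" > "$tmp/bisect.fixed"
git bisect reset > /dev/null 2>&1
git bisect replay "$tmp/bisect.fixed" > /dev/null

echo "Back at: $(git log -1 --format=%s HEAD)"
```

After the replay, Git is waiting for a verdict on the same midpoint again, so the session can continue with the correct mark. In a real session you would more likely open the saved log in an editor and delete the offending lines by hand.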

At this point, we have the exact commit that introduced the bug. You can use commands like git show <bad_commit> or open your diff viewer to inspect what changed in that commit and start working on a fix. The above process might seem like several steps, but in practice git bisect greatly reduces the manual work. If there were 100 commits between the good and bad points, you would need to test only about 7 of them to find the culprit, because each test halves the remaining range and 2^7 = 128 is the first power of two that covers 100. It’s a huge time-saver on large projects.
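The arithmetic behind that estimate can be sketched in a few lines of shell. The function name bisect_steps is hypothetical, and the result is only a rough upper estimate: the count Git reports can differ by one depending on the shape of the history.

```shell
#!/bin/sh
# Sketch: smallest number of halvings k such that 2^k >= n,
# a rough estimate of how many commits a bisect has to test.
bisect_steps() {
    n=$1
    steps=0
    span=1
    while [ "$span" -lt "$n" ]; do
        span=$((span * 2))
        steps=$((steps + 1))
    done
    echo "$steps"
}

bisect_steps 100     # prints 7
bisect_steps 1000    # prints 10
```

Going from 100 to 1,000 commits adds only three more tests, which is why the distance between the good and bad endpoints matters far less than it would for a linear search.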

Manually checking out and testing each commit can be tedious, especially if the test is complex or if there are many bisection steps. That’s where git bisect run comes in. This command lets you plug in an automated test script or command, so Git will perform the bisection for you—no manual intervention needed at each step.

How it works: You provide git bisect run with a command (or path to a script) that can determine whether the current commit is good or bad. Git will then check out each midpoint commit in turn and execute your test command on it. Based on the exit code of that command, Git will mark the commit as good or bad and continue the bisect automatically.

The conventions for the test command’s exit code are:

  • Exit code 0: indicates the commit is good (no bug).
  • Exit code between 1 and 127 (inclusive) except 125: indicates the commit is bad (bug detected).
  • Exit code 125: a special code meaning “skip this commit.” Git will exclude that commit and continue bisecting. This is useful if the commit can’t be tested (for example, it doesn’t compile or the test itself fails to run for reasons unrelated to the bug).
  • Any other exit code (outside 0-127) will abort the bisect process entirely.
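The conventions above can be summarized as a small lookup. The helper below is a hypothetical illustration of how the exit status maps to a verdict. It is not part of Git.

```shell
#!/bin/sh
# Sketch: how `git bisect run` interprets the exit status of the test command.
# classify_exit_code is a hypothetical helper for illustration, not a Git command.
classify_exit_code() {
    code=$1
    if [ "$code" -eq 0 ]; then
        echo good
    elif [ "$code" -eq 125 ]; then
        echo skip
    elif [ "$code" -ge 1 ] && [ "$code" -le 127 ]; then
        echo bad
    else
        echo abort
    fi
}

for code in 0 1 125 127 128 255; do
    echo "$code -> $(classify_exit_code "$code")"
done
```

One practical consequence: a test that can die from a signal or return a status above 127 should be wrapped so that it reports a plain 1 instead, otherwise a crash ends the whole bisect rather than marking the commit bad.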

Writing a test script: Your script can be as simple or complex as needed to detect the bug. It should perform an operation that behaves differently in good vs. bad commits. In our CLI example, we might write a script that runs myapp generate-report and checks if it succeeds. For instance, create a file test_report.sh like:

Terminal
#!/bin/bash
# Test if 'myapp generate-report' runs without error
if myapp generate-report > /dev/null 2>&1; then
    # Command succeeded, assume bug is NOT present
    exit 0 # good commit
else
    # Command failed (non-zero exit), bug is present
    exit 1 # bad commit
fi

This script runs the command and redirects output to avoid clutter. If the command exits successfully, we exit 0 (marking the commit as good). If it errors out, we exit with 1 (marking it bad). Make sure to give your script execute permissions (chmod +x test_report.sh).
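If some commits in the range cannot be built at all, the script can report exit code 125 for them instead of calling them bad. The following is a minimal sketch of that idea. BUILD_CMD and TEST_CMD are hypothetical environment variables, and the defaults assume a Makefile-based build and the myapp CLI from this example. Adjust both to your project.

```shell
#!/bin/sh
# Sketch of a bisect test that distinguishes "cannot test" from "bug present".
# BUILD_CMD and TEST_CMD are hypothetical knobs; the defaults assume a
# Makefile-based build and the myapp CLI used in this guide.
check_commit() {
    build_cmd=${BUILD_CMD:-make}
    test_cmd=${TEST_CMD:-myapp generate-report}

    # If the commit cannot even be built, ask bisect to skip it.
    if ! sh -c "$build_cmd" > /dev/null 2>&1; then
        return 125
    fi
    # Collapse every test failure to 1 so that a crash with a status
    # above 127 cannot abort the whole bisect.
    if sh -c "$test_cmd" > /dev/null 2>&1; then
        return 0
    fi
    return 1
}
```

To use it with git bisect run, save it as a script and call check_commit as the last line, so that the function's return value becomes the script's exit status.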

Now you can run the bisect in one go by doing:

Terminal
$ git bisect start
$ git bisect bad HEAD # mark HEAD as bad
$ git bisect good v2.0 # mark known good commit
$ git bisect run ./test_report.sh

Git will now automatically iterate through the commits, running test_report.sh on each one. You’ll see output for each step, for example:

Terminal
running ./test_report.sh
Bisecting: 4 revisions left to test after this (roughly 2 steps)
[commit1234abcd] Feature X implementation
running ./test_report.sh
Bisecting: 2 revisions left to test after this (roughly 1 step)
[commit5678efgh] Refactor Y
running ./test_report.sh
Bisecting: 0 revisions left to test after this (roughly 0 steps)
[commit9abc0123] Add new output format (introduce bug in report generation)
running ./test_report.sh
commit9abc0123 is the first bad commit

In the above log, Git automatically marked commits good or bad based on our script’s result, and ultimately identified the first bad commit. It then stops, showing the bad commit details just like in the manual process (author, date, message, diff stats). A closing message such as “bisect run success” or “bisect found first bad commit” may appear to indicate the automated bisection completed successfully; the exact wording depends on your Git version. Finally, don’t forget to run git bisect reset to return to the branch you started from.

Using git bisect run is extremely powerful for scenarios where you can programmatically detect the bug (for example, via a test suite, script, or even a curl command to an endpoint if the bug is observable externally). It saves you from having to babysit the bisect process. Just be sure that your test is reliable — it should consistently report the correct status for each commit. If the test itself is flaky, it could mislead the bisect.
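To see the whole automated flow in one place without touching a real project, the sketch below builds a throwaway repository and bisects it. Everything in it is an assumption made for illustration: the temporary directory, the report.txt file, and the word BROKEN standing in for the real bug.

```shell
#!/bin/sh
# Sketch: an end-to-end `git bisect run` on a throwaway repository.
# Assumption: a line containing "BROKEN" in report.txt stands in for the bug.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Twenty commits; the regression lands in commit 13.
i=1
while [ "$i" -le 20 ]; do
    if [ "$i" -ge 13 ]; then echo "BROKEN $i"; else echo "ok $i"; fi > report.txt
    git add report.txt
    git commit -q -m "commit $i"
    i=$((i + 1))
done
good=$(git rev-list --max-parents=0 HEAD)

git bisect start > /dev/null
git bisect bad HEAD > /dev/null
git bisect good "$good" > /dev/null

# The test command exits 0 (good) when report.txt has no BROKEN line
# and 1 (bad) when it does.
git bisect run sh -c '! grep -q BROKEN report.txt' > /dev/null

# After the run, refs/bisect/bad points at the first bad commit.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "First bad commit: $culprit"   # prints "First bad commit: commit 13"
git bisect reset > /dev/null 2>&1
```

Reading refs/bisect/bad before the reset is a convenient way to capture the result in a script, since git bisect reset removes the bisect refs again.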

While git bisect is a robust tool, there are some caveats and tips to keep in mind for best results:

  • Reproducibility of the bug: Bisect assumes that you can reliably determine whether a commit is good or bad. If the bug is intermittent or depends on external factors (like a race condition or a flaky network call), you might mistakenly mark a commit as good when it actually just didn’t manifest the bug in that run. This can lead to incorrect results. If you suspect nondeterministic behavior (e.g. a timing-related bug), run tests multiple times or add instrumentation to be sure. A failure of git bisect to pinpoint a commit might itself hint at an unstable bug or environment issue.

  • External dependencies and environment: If running old commits requires a different environment (for example, older dependencies, database migrations, etc.), setting up each commit for testing can be difficult. In our CLI example, if older commits require an old config or data file format, you might need to adjust your test or environment. In some cases, you may skip certain commits if they can’t be tested at all. Use git bisect skip to tell Git to ignore a commit that is not testable (e.g. git bisect skip on the current commit, or even skip a range with git bisect skip <bad_range_start>..<bad_range_end> if you know a whole set of commits won’t build/run). Skipping too many commits can reduce bisect’s effectiveness, so use this sparingly.

  • Non-linear history (merges): git bisect works best on a linear commit history. If your project uses frequent merges, the bug may have been introduced in a feature branch and only manifested after merging into main. Git bisect can still handle this, but it might point to a merge commit as the first bad commit if the bug appears only when two lines of development combine. This is expected – it means the integration of the branches caused the issue (perhaps a merge conflict resolution or interaction between features). However, debugging a large merge commit can be tricky. Best practice in that case is to identify which branch introduced the problematic change. You might need to run a secondary bisect within the feature branch that was merged, or examine the differences introduced by the merge. Some developers avoid complex merge commits by using rebase workflows, but note that rebasing will rewrite history. If you had a merge commit that would have been flagged as the culprit, and you instead rebased, you effectively bury that information. In a pure linear history created by rebasing, the “first bad commit” might end up being a consolidated commit that’s harder to interpret. In summary: merges can cause the bug to appear at the merge point, and rebasing can make it harder to trace which change introduced a bug. Keep this in mind when analyzing bisect results in a non-linear history.

  • Choosing good and bad commits: Ensure the “good” commit you choose truly doesn’t have the bug and, ideally, is an ancestor of the bad commit. If you pick a good commit that isn’t in the direct history of the bad commit, Git will first ask you to test their merge base, and it stops with an error if that merge base turns out to be bad. A quick way to find a candidate is using git log or tags. If you’re not sure how far back to go, you can do a manual binary search on dates (e.g., test a commit from a month ago to see if the bug was present, then adjust accordingly). The further apart your initial good and bad points are, the more steps bisect might take, but because the step count grows only logarithmically, it will still be much faster than a linear search.

  • Frequency of commits and tests: Using git bisect is even more effective if your commit increments are small and tested. In a codebase where each commit passes a test suite, it’s easier to pinpoint which commit introduced a failing test. This ties into good practices like committing frequently and using continuous integration. Each commit should ideally represent a logical change; then bisect can cleanly identify the faulty change.

  • Logging and debugging info: Sometimes it’s useful to add temporary logging or debugging printouts while bisecting, especially for tricky bugs. Because bisect checks out older commits, a change committed on top of your branch will not be present in them; keep the instrumentation as an uncommitted change or as a patch you reapply at each step, and discard it once you have found the bug. Just be cautious not to alter the code’s behavior in a way that masks the bug.
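For the intermittent bugs described in the first caveat, one option is to wrap the test so that a commit counts as good only if the test passes several times in a row. This is a minimal sketch. The helper name run_repeated and the number of attempts are arbitrary choices, not anything Git prescribes.

```shell
#!/bin/sh
# Sketch: treat a commit as good only if the test passes several times in a row.
# run_repeated is a hypothetical helper; pick an attempt count that fits
# how often the bug actually shows up.
run_repeated() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        # Any single failure is enough to call the commit bad.
        if ! "$@" > /dev/null 2>&1; then
            return 1
        fi
        n=$((n + 1))
    done
    return 0
}
```

In a wrapper script passed to git bisect run, the last line could be something like run_repeated 5 ./test_report.sh. More attempts lower the chance of marking a buggy commit as good, at the cost of a slower bisect.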

By keeping these points in mind, you’ll use git bisect more effectively and interpret its results correctly.

If your team uses Graphite for a stacked Git workflow, it can complement git bisect and sometimes even reduce the need for it. Graphite’s tooling encourages organizing changes into stacked pull requests (small, incremental PRs), which makes it easier to spot where a bug was introduced. For example, if a bug appears after a stack of 5 small commits is merged, you might already have a good idea which commit in that stack is the culprit. Graphite’s gt log command can show a clear view of your stack’s commit ancestry, helping to quickly identify the commit where things went wrong. Each commit in a Graphite stack typically has an associated description and diffs, so you can inspect the stack to find suspect changes.

Graphite also provides automation that can aid in debugging. Notably, organizations using Graphite’s merge queue benefit from an automatic bisect-like mechanism when batching commits: if multiple PRs are merged as a batch and the batch fails CI, Graphite can automatically bisect the batch to find the breaking change. This means Graphite will test subsets of the batch to isolate which PR (and thus which commit) introduced the problem, saving developers time. In effect, Graphite’s infrastructure is doing a form of git bisect on your behalf at the PR level.

In our context of using git bisect for a single repository, Graphite doesn’t replace git bisect—you’d still use git bisect for pinpointing issues within a series of commits. However, Graphite’s emphasis on small, reviewable commits and its tooling (like gt stack and gt log) can make it more apparent where bugs likely entered. Think of it as an enhancement: if you’re already following a stacked workflow with Graphite, you might need fewer guesses to find a bad commit because your changes are well-organized. And when you do need to run git bisect, there’s a good chance the range of commits to search is smaller or limited to a particular stack.

Tip: If you integrate Graphite’s workflow, continue to use commit messages and PR descriptions to document changes. This documentation, combined with bisect, can expedite debugging. For instance, when git bisect surfaces a commit, a good commit message will immediately tell you what changed and perhaps why. Graphite encourages meaningful commit messages (since each commit often goes through code review), which is a best practice for all Git use, not just Graphite.

In summary, Graphite CLI can enhance the debugging workflow by keeping commits organized and leveraging tools that automatically isolate bad commits in a group. Teams already using Graphite should take advantage of those features alongside git bisect for the fastest path to identify and fix regressions.

Using git bisect can be a game-changer when tracking down elusive bugs. By systematically narrowing the commit range, it finds the exact commit that introduced a problem, saving you from guesswork. We walked through a practical example of debugging a CLI tool with bisect, explored how to automate the process with git bisect run for efficiency, and covered important caveats to watch out for. With these skills, you can tackle regressions in your codebase more confidently. Whether you’re working in a large team or solo on a project, knowing how to leverage git bisect is like having a detective’s magnifying glass in your Git toolkit – it zeroes in on the culprit commit so you can focus on crafting a fix.
