Using the git filter-repo tool

Kenny DuMez
Graphite software engineer


Note

This guide explains this concept in vanilla Git. For Graphite documentation, see our CLI docs.


git filter-repo is a Python script that allows for fast and comprehensive rewriting of repository history. The script scans the entire history of a repository and applies modifications such as removing files, replacing text in files, or changing commit author names and emails. It's often used to remove sensitive data, change old commit messages, reduce repository size by excluding unwanted files, or restructure the repository layout.

  • Operates considerably faster than git filter-branch.
  • Simpler syntax and more focused design.
  • Safety features to avoid common pitfalls (for example, it refuses to run outside a fresh clone unless you pass --force).
  • Rewrites all branches and tags in a repository by default; the --refs option limits the rewrite to specific refs.

Before you can use git filter-repo, you need to install it. If you have Python installed, you can easily install git filter-repo using pip:

Terminal
pip install git-filter-repo
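To confirm the installation worked, you can check that the git-filter-repo executable is on your PATH, which is how Git locates the filter-repo subcommand. The snippet below is a minimal sketch; the function name have_filter_repo is made up for illustration.

```shell
# Hypothetical helper: succeeds if the git-filter-repo executable is on PATH.
have_filter_repo() {
  command -v git-filter-repo >/dev/null 2>&1
}

if have_filter_repo; then
  echo "git filter-repo is installed"
else
  echo "git filter-repo was not found on PATH"
fi
```

If the check fails right after a pip install, the directory pip installs scripts into is probably not on your PATH.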

All of the commands below rewrite the history of the local repository. For these changes to take effect on the remote, you will need to run git push --force after each of them. Note that git filter-repo removes the origin remote after a rewrite, so you will need to add it back with git remote add before pushing. Force pushing overwrites the history of the remote repository with the new, altered history of the local repo. This is a potentially destructive operation, so do it with caution; it may also be forbidden on certain repositories.

Before rewriting history in this manner, make sure to contact your Git repo's administrator.
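Because a rewrite followed by a force push is hard to undo, it can help to keep an untouched copy of the repository first. The sketch below uses git clone --mirror to copy every branch and tag into a bare repository; the function name backup_repo and the paths are hypothetical.

```shell
# Hypothetical helper: make a bare mirror copy of a repository
# (all branches and tags) so the original history can be recovered
# if the rewrite goes wrong.
backup_repo() {
  src=$1   # path or URL of the repository to protect
  dest=$2  # where to put the backup, e.g. ../my-repo-backup.git
  git clone --quiet --mirror "$src" "$dest"
}
```

For example, running backup_repo . ../my-repo-backup.git from the repository root leaves a bare copy next to it.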

git filter-repo applies filters across every branch and tag in your repository by default, so no extra flag is needed (it has no --all option). This is useful for global changes, such as completely removing a file from every branch and tag:

Terminal
git filter-repo --path unwanted_file --invert-paths
  • --path unwanted_file: Specifies the path or file that you want to focus on in the repository.
  • --invert-paths: Modifies the behavior of the filter to affect all paths except those specified by the --path option. Essentially, this tells git filter-repo to keep everything except the specified unwanted_file.

This command will delete all traces of unwanted_file from every commit across all branches and tags in the repo.
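To check that the removal worked, you can ask plain Git for every commit on any ref that still touches the path. This is a minimal sketch; the function name commits_touching is made up for illustration.

```shell
# Hypothetical helper: list every commit, on any branch or tag,
# that touches a given path. Empty output means the path no longer
# appears anywhere in the history.
commits_touching() {
  repo=$1  # path to the repository
  path=$2  # file or directory to look for
  git -C "$repo" log --all --oneline -- "$path"
}
```

After the rewrite, commits_touching . unwanted_file should print nothing.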

To rename a directory in the entire history of your repository, you can use:

Terminal
git filter-repo --path-rename old_directory_name:new_directory_name

This command renames old_directory_name to new_directory_name across all commits.
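Before running a rename on a real project, it can be worth trying it on a disposable repository. The sketch below builds a one-commit repository in a temporary directory and renames its directory; it assumes git filter-repo is installed, and the function name demo_rename, the file name, and the commit identity are all made up.

```shell
# Sketch: try a directory rename on a throwaway repository.
# Runs in a subshell so the caller's working directory is untouched.
demo_rename() (
  set -e
  work=$(mktemp -d)
  git init --quiet "$work/demo"
  cd "$work/demo"
  mkdir old_directory_name
  echo "hello" > old_directory_name/file.txt
  git add .
  git -c user.name=Demo -c user.email=demo@example.com commit --quiet -m "Add file"
  # --force is needed here only because this repo is not a fresh clone
  git filter-repo --force --path-rename old_directory_name:new_directory_name >/dev/null
  # List tracked files to see the result of the rename
  git ls-files
)
```

Running demo_rename should end by listing the file under new_directory_name rather than old_directory_name.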

If you need to move a file into a subdirectory, you can also use the --path-rename option:

Terminal
git filter-repo --path-rename "root_file.txt:subdirectory/root_file.txt"

This command moves root_file.txt from the root directory of the repository into subdirectory.

To remove a file from every commit across the history of your repository, run:

Terminal
git filter-repo --invert-paths --path file_to_delete.txt

git filter-branch is an older tool that serves similar purposes to git filter-repo, but it is generally slower and less user-friendly. Git's own documentation for git filter-branch warns against using it and recommends git filter-repo instead, so you should treat git filter-branch as deprecated and use git filter-repo for all of your repository rewriting needs.

For further reading, see the official documentation for git filter-repo.
