What is git filter-repo?
git filter-repo is a Python script for fast, comprehensive rewriting of repository history. It scans the entire history of a repository and applies modifications such as removing files, replacing text in files, or changing author names and email addresses in old commits. It's often used to remove sensitive data, change old commit messages, reduce repository size by excluding unwanted files, or restructure the repository layout.
Key features:
- Operates considerably faster than git filter-branch.
- Simpler syntax and more focused design.
- Safety features to avoid common pitfalls (for example, it refuses to rewrite a repository that is not a fresh clone unless you pass --force).
- Acts on all branches and tags in a repository by default; the --refs option limits a rewrite to specific refs.
Installation
Before you can use git filter-repo, you need to install it. If you have Python 3 installed, you can install git filter-repo using pip:
pip install git-filter-repo
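After installing, it can help to confirm that the script is actually on your PATH, since Git locates subcommands such as filter-repo by looking for an executable named git-filter-repo. The following is a minimal sketch; the helper name check_filter_repo is hypothetical and used only for illustration.

```shell
# Report whether the git-filter-repo executable can be found on PATH.
# check_filter_repo is a hypothetical helper name, not part of Git or filter-repo.
check_filter_repo() {
  if command -v git-filter-repo >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

filter_repo_status=$(check_filter_repo)
echo "git-filter-repo: $filter_repo_status"
```

If this reports missing, the directory that pip installs scripts into may not be on your PATH.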
Basic usage of git filter-repo
All of the commands below rewrite the history of the local repository. For these changes to take effect on the remote, you will need to run git push --force after each of them. This overwrites the history of the remote upstream repository with the new, altered history of the local repo. Note that git filter-repo removes the origin remote after a rewrite, so you may need to add it back with git remote add before pushing. Force-pushing is a potentially destructive operation and may be forbidden on certain repositories, so do it with caution.
Before rewriting history in this manner, make sure to contact your Git repo's administrator.
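The overall workflow (fresh clone, rewrite, restore the remote, force-push) could be sketched as below. This is an illustration under assumptions rather than a definitive procedure: the helper names run and rewrite_and_push, the DRY_RUN variable, and the example URL are all hypothetical. With DRY_RUN=1 the commands are only printed, which is a cautious way to review a destructive sequence before running it.

```shell
# Print each command; execute it only when DRY_RUN is not set to 1.
# (run and DRY_RUN are hypothetical names used for this sketch.)
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# rewrite_and_push URL WORKDIR [filter-repo options...]
# Hypothetical helper: clone fresh, rewrite history, then force-push the result.
rewrite_and_push() {
  rp_url=$1
  rp_dir=$2
  shift 2
  # Work on a fresh clone so the original checkout is left untouched.
  run git clone "$rp_url" "$rp_dir" || return 1
  # Apply whatever filter-repo options the caller passed in.
  run git -C "$rp_dir" filter-repo "$@" || return 1
  # filter-repo removes the "origin" remote after rewriting, so add it back.
  run git -C "$rp_dir" remote add origin "$rp_url" || return 1
  # Branches and tags are pushed separately; --all and --tags cannot be combined.
  run git -C "$rp_dir" push --force --all origin || return 1
  run git -C "$rp_dir" push --force --tags origin
}
```

For example, DRY_RUN=1 rewrite_and_push https://example.com/repo.git /tmp/repo-rewrite --path secret.txt --invert-paths prints the five commands without touching any repository.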
Filtering all branches
git filter-repo rewrites every branch and tag in your repository by default, so no extra flag is needed for global changes such as completely removing a file from every branch and tag:
git filter-repo --path unwanted_file --invert-paths
- --path unwanted_file: Specifies the path or file that you want to focus on in the repository.
- --invert-paths: Inverts the selection so that the filter keeps everything except the paths given with --path. Here, that means everything except unwanted_file is kept.
This command deletes all traces of unwanted_file from every commit across all branches and tags in the repo. To limit a rewrite to particular refs instead, use the --refs option.
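After a removal like this, you may want to confirm that no commit on any ref still touches the path. The sketch below uses only plain git log; the helper name path_in_history is hypothetical, and it assumes you run it from inside the rewritten repository.

```shell
# path_in_history PATH
# Succeeds (exit 0) if at least one commit reachable from any ref touches PATH,
# and fails (exit 1) otherwise. Hypothetical helper for illustration only.
path_in_history() {
  [ -n "$(git log --all -n 1 --format=%H -- "$1")" ]
}

# Example usage inside a repository:
#   if path_in_history unwanted_file; then
#     echo "unwanted_file is still present in history"
#   else
#     echo "unwanted_file no longer appears in any commit"
#   fi
```

Note that this only inspects the local repository; copies held by the remote or by other clones are unaffected until the rewritten history is pushed and re-cloned.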
Renaming a directory
To rename a directory in the entire history of your repository, you can use:
git filter-repo --path-rename old_directory_name:new_directory_name
This command renames old_directory_name to new_directory_name across all commits.
Moving files to a subdirectory
If you need to move files into a subdirectory, you can use the --path-rename option, once per file:
git filter-repo --path-rename "root_file.txt:subdirectory/root_file.txt"
This command moves root_file.txt from the root directory of the repository into subdirectory.
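Because --path-rename can be given more than once, moving several files can be done in a single rewrite. The following sketch only builds and prints the command rather than running it; notes.md is a hypothetical second file added for illustration.

```shell
# Build one --path-rename option per file, all pointing into the same subdirectory.
# notes.md is a hypothetical file name used only for this example.
subdir=subdirectory
set --                       # start from an empty argument list
for f in root_file.txt notes.md; do
  set -- "$@" --path-rename "$f:$subdir/$f"
done

# Assemble the full command as text so it can be reviewed before running.
cmd="git filter-repo $*"
echo "$cmd"
```

Once the printed command looks right, it can be run as-is, provided the file names contain no spaces or other characters that need quoting.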
Examples of git filter-repo
To remove a file from every commit across the history of your repository, run:
git filter-repo --invert-paths --path file_to_delete.txt
git filter-branch vs. filter-repo
git filter-branch is an older tool that serves similar purposes to git filter-repo, but it is generally slower and less user-friendly. The Git project's own documentation for git filter-branch warns about its pitfalls and recommends git filter-repo instead, so prefer git filter-repo for all of your repository rewriting needs.
For further reading, see the official documentation for git filter-repo.