Caching in GitHub Actions

Kenny DuMez
Graphite software engineer

Caching in GitHub Actions helps optimize your CI/CD workflows by saving time and reducing network traffic. By storing and reusing parts of your development environment—like dependencies, build outputs, and Docker layers—you can significantly speed up your workflow execution. This guide will explore how to use caching effectively with GitHub Actions, focusing on Docker, Node.js dependencies, and general cache management.

Caching in GitHub Actions works by storing data between workflow runs, allowing subsequent runs to reuse this data and reduce execution time. GitHub Actions saves and restores caches using a cache key, a string that uniquely identifies the cache.

The cache key is critical because GitHub Actions uses it to check whether a matching cache is available. If a match is found, the cached data is restored to the specified path. If not, the job proceeds without a cache, and the cache action saves a new cache under that key once the job's steps have completed successfully.
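
The lookup order can be made concrete with a small shell sketch. It is a simplified model, not the cache action's actual implementation: it tries the exact key first, then falls back to each prefix given in restore-keys (the input that appears in the workflows in this guide), and it ignores details such as branch scoping. The find_cache function and the STORED_KEYS list are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Simplified model of cache lookup; not the real actions/cache implementation.
# STORED_KEYS is a hypothetical list of existing cache keys, newest first.
STORED_KEYS="Linux-node-abc123 Linux-node-def456 Linux-buildx-999"

# Usage: find_cache <key> [restore-key-prefix ...]
find_cache() {
  key="$1"
  shift
  # 1. An exact match on the key is a full cache hit.
  for stored in $STORED_KEYS; do
    if [ "$stored" = "$key" ]; then
      echo "hit:$stored"
      return 0
    fi
  done
  # 2. Otherwise, try each restore key in order as a prefix.
  for prefix in "$@"; do
    for stored in $STORED_KEYS; do
      case "$stored" in
        "$prefix"*)
          echo "partial:$stored"
          return 0
          ;;
      esac
    done
  done
  # 3. Nothing matched: the job runs without a cache.
  echo "miss"
  return 1
}

find_cache Linux-node-def456 Linux-node-   # prints "hit:Linux-node-def456"
find_cache Linux-node-zzz999 Linux-node-   # prints "partial:Linux-node-abc123"
```

A partial match restores an older cache that is usually close to what the job needs, so only the difference has to be downloaded or rebuilt.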

For example, when using Docker-based workflows, you can cache individual Docker layers. This can significantly speed up build times by reusing Docker layers instead of rebuilding them on every run.

  1. Use Docker Buildx: Docker Buildx extends the native Docker build command with additional features. It's recommended for caching because it can export the layer cache to, and import it from, external storage.

  2. Configure GitHub Actions: Here's a basic setup to cache Docker layers using Buildx and GitHub Actions:

    Terminal
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - name: Set up Docker Buildx
            uses: docker/setup-buildx-action@v2
          - name: Cache Docker layers
            uses: actions/cache@v3
            with:
              path: /tmp/.buildx-cache
              key: ${{ runner.os }}-buildx-${{ hashFiles('**/Dockerfile') }}
              restore-keys: |
                ${{ runner.os }}-buildx-
          - name: Build and cache Docker images
            uses: docker/build-push-action@v3
            with:
              context: .
              file: ./Dockerfile
              push: false
              tags: user/app:latest
              cache-from: type=local,src=/tmp/.buildx-cache
              # Export to a fresh directory so stale layers do not accumulate
              cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max
          # Swap in the fresh cache before actions/cache saves the path
          - name: Move cache
            run: |
              rm -rf /tmp/.buildx-cache
              mv /tmp/.buildx-cache-new /tmp/.buildx-cache

This configuration uses the actions/cache action to persist the Buildx layer cache directory between runs, and the Buildx cache-from and cache-to options to read from and write to that directory. Layers that have not changed are reused instead of being rebuilt on each run, saving time and compute resources.

Caching node_modules can significantly speed up Node.js builds by avoiding redundant dependency downloads.

  1. Basic Node.js Caching: To leverage Node caching, add these steps to your workflow to cache and restore node_modules:

    Terminal
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          - name: Cache node_modules
            uses: actions/cache@v3
            with:
              path: node_modules
              key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
              restore-keys: |
                ${{ runner.os }}-node-
          - name: Install Dependencies
            run: npm install
          - name: Build
            run: npm run build

This setup caches node_modules based on the hash of the package-lock.json, ensuring that the cache is updated whenever dependencies change.
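
To see why a key built from the lock file tracks dependency changes, the following shell sketch builds a similar key by hand. It is an approximation only: hashFiles produces its own digest, so the values will not match what GitHub Actions computes, and the cache_key helper is a hypothetical name introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build a key shaped like "<os>-node-<digest of lock file>".
# Usage: cache_key <os-label> <path-to-lock-file>
cache_key() {
  if command -v sha256sum >/dev/null 2>&1; then
    digest=$(sha256sum "$2" | cut -d' ' -f1)
  else
    # Fallback for systems that ship shasum instead (for example macOS)
    digest=$(shasum -a 256 "$2" | cut -d' ' -f1)
  fi
  echo "$1-node-$digest"
}

# Simulate a dependency change by rewriting a throwaway lock file.
workdir=$(mktemp -d)
printf '{"lockfileVersion": 3}\n' > "$workdir/package-lock.json"
before=$(cache_key Linux "$workdir/package-lock.json")
printf '{"lockfileVersion": 3, "packages": {}}\n' > "$workdir/package-lock.json"
after=$(cache_key Linux "$workdir/package-lock.json")

if [ "$before" != "$after" ]; then
  echo "key changed: a new cache entry would be saved"
fi
rm -rf "$workdir"
```

An unchanged lock file always yields the same key, so the cache is reused. Any edit to the file yields a new key, and the restore-keys prefix then lets the job start from the most recent older cache.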

While caches are automatically deleted, either after they haven't been accessed in over a week or when the total cache size of the repository exceeds the limit set by GitHub, you can also delete them manually using the GitHub Actions cache API.

First, you can list all of the caches for a repository using curl:

Terminal
$ curl \
    -H "Accept: application/vnd.github.v3+json" \
    -H "Authorization: token <YOUR_GITHUB_TOKEN>" \
    https://api.github.com/repos/<REPO_OWNER>/<YOUR_REPO_NAME>/actions/caches

You can then delete an individual cache by its ID, also with curl:

Terminal
$ curl \
    -X DELETE \
    -H "Accept: application/vnd.github.v3+json" \
    -H "Authorization: token <YOUR_GITHUB_TOKEN>" \
    https://api.github.com/repos/<REPO_OWNER>/<YOUR_REPO_NAME>/actions/caches/<SELECTED_CACHE_ID>
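
The same caches endpoint also accepts a key query parameter, which deletes caches by key instead of by ID. As a cautious sketch, the helper below only prints the curl command so it can be reviewed before running. The delete_by_key_command name is hypothetical, and the sketch assumes the key contains only URL-safe characters and that the token is available in a GITHUB_TOKEN environment variable.

```shell
#!/bin/sh
# Dry run: print (do not execute) a curl command that deletes caches by key.
# Usage: delete_by_key_command <owner> <repo> <cache-key>
# Assumption: the cache key is URL-safe; otherwise it must be percent-encoded first.
delete_by_key_command() {
  # Single quotes keep $GITHUB_TOKEN literal so it expands only when the
  # printed command is actually run.
  printf 'curl -X DELETE -H "Accept: application/vnd.github.v3+json" -H "Authorization: token $GITHUB_TOKEN" "https://api.github.com/repos/%s/%s/actions/caches?key=%s"\n' "$1" "$2" "$3"
}

delete_by_key_command octo-org demo-repo Linux-node-abc123
```

Printing the command first is a deliberate choice for a destructive operation: once the owner, repository, and key look right, the output can be pasted into a terminal.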

Specifying the correct path for dependency caching is crucial. This path varies depending on the language and dependency manager used. For instance:

  • Node.js (npm): Typically, the path is simply node_modules.
  • Python (pip): The path might be ~/.cache/pip.
  • Java (Maven): The path is commonly ~/.m2/repository.

For example:

Terminal
- name: Cache Python dependencies
  uses: actions/cache@v3
  with:
    path: ~/.cache/pip
    key: ${{ runner.os }}-pip-${{ hashFiles('**/requirements.txt') }}

This configuration ensures that Python dependencies are cached based on the contents of requirements.txt.

For further reading, see the official GitHub documentation on caching.
