GitHub Actions caching

Written on July 11, 2020

I’m always interested in making builds faster!

If your builds run on self-hosted runners, you can persist files between builds, so caching is of limited value (and may even make builds slower). However, when using a GitHub-hosted runner (build agent), every build gets a brand new VM. It can take a while for dependencies to be restored (e.g. NuGet, NPM or similar), and this has to happen every time a build runs. Being able to cache these dependencies and restore them quickly can potentially make a big difference.

I’ve started adding the Cache task to my Azure Pipelines builds where I can. The equivalent for GitHub Actions is the Cache action.

These both work in a similar way. You indicate a path whose contents you want to cache for future builds, and a key which is used to determine when the cache is stale.
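For comparison, a minimal sketch of the Azure Pipelines side is below. It is not taken from the pipelines mentioned above: the `Cache@2` task version, the `NUGET_PACKAGES` variable, the key segments and the path are all assumptions that follow the pattern in the task's documentation.

```yaml
# Sketch only: variable name, key segments and path are assumptions.
variables:
  NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages

steps:
  - task: Cache@2
    displayName: Cache NuGet packages
    inputs:
      # Key segments are separated by '|'; a file pattern segment is hashed.
      key: 'nuget | "$(Agent.OS)" | **/packages.lock.json'
      # Fallback prefix tried when there is no exact key match.
      restoreKeys: |
        nuget | "$(Agent.OS)"
      # Folder whose contents are saved and restored.
      path: $(NUGET_PACKAGES)
```

The shape is the same in both systems: a path to save, a key that changes when the dependencies change, and an optional looser fallback key.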

Here’s the cache action that I’m using for my Show Missing extension.


    - uses: actions/[email protected]
      with:
        path: ${{ github.workspace }}/.nuget/packages
        key: ${{ runner.os }}-nuget-${{ hashFiles('**/packages.lock.json') }}
        restore-keys: |
          ${{ runner.os }}-nuget-

The first time you run a build with caching enabled, it won’t appear to run any faster. In fact, it might take slightly longer, because just before the build completes, the cache action bundles up all the files underneath the specified path and saves them.

Subsequent builds will then download and restore the dependencies. Because this is done efficiently (one tar.gz file to download and extract, and the cache presumably lives relatively close to the runner VM), it will usually be a lot faster than relying on the normal package restore process.
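To show where the cache step sits relative to the restore, here is a minimal sketch of a whole job. Everything outside the cache step's `with:` block is an assumption rather than the actual Show Missing workflow: the job name, the `@v4` and `@v2` version tags, the `nuget/setup-nuget` step, the `nuget-cache` step id, and the `NUGET_PACKAGES` variable, which is one way of pointing NuGet's global packages folder at the cached path.

```yaml
# Sketch only: job name, version tags, step id and env var are assumptions.
jobs:
  build:
    runs-on: windows-latest
    env:
      # Redirect NuGet's global packages folder to the path being cached.
      NUGET_PACKAGES: ${{ github.workspace }}/.nuget/packages
    steps:
      - uses: actions/checkout@v4

      # Runs before the restore so a hit can pre-populate the folder.
      - uses: actions/cache@v4
        id: nuget-cache
        with:
          path: ${{ github.workspace }}/.nuget/packages
          key: ${{ runner.os }}-nuget-${{ hashFiles('**/packages.lock.json') }}
          restore-keys: |
            ${{ runner.os }}-nuget-

      - uses: nuget/setup-nuget@v2

      # Assumes a single solution in the repository root.
      - name: Restore packages
        run: nuget restore

      # cache-hit is 'true' only when the primary key matched exactly.
      - name: Report cache result
        run: echo "Exact cache hit: ${{ steps.nuget-cache.outputs.cache-hit }}"
```

No separate save step is needed. The action registers its own post-job step, which uploads the folder only when the primary key did not already have an exact match.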

For NuGet packages, the key needs to hash files that change whenever your dependencies change, so that it can indicate when the cache should be updated. Whilst Visual Studio extensions don’t yet support the new ‘SDK-style’ project format, you can still make use of PackageReference, and if you use nuget.exe 4.9 or above, you can create and use packages.lock.json files. If I hadn’t updated to PackageReference, then hashing the old packages.config would probably work just as well.
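One way to produce the lock files from a workflow step is sketched below. The solution name is a placeholder and the step is an illustration, not the restore command used in the build logs that follow; `-UseLockFile` and `-LockedMode` are `nuget restore` options available from nuget.exe 4.9.

```yaml
# Sketch only: MySolution.sln is a placeholder name.
steps:
  # -UseLockFile writes (or reuses) a packages.lock.json next to each project.
  # Commit those files so hashFiles('**/packages.lock.json') has input to hash.
  - name: Restore with lock files
    run: nuget restore MySolution.sln -UseLockFile

  # Optional stricter variant: swap -UseLockFile for -LockedMode to fail the
  # restore when the lock files no longer match the project's references.
```

Because the cache key is a hash of the lock files, any change to a package version changes the key, and the next build saves a fresh cache.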

Here’s a build with no caching. It took 1m 54s.

Build without cache

 Committing restore...
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\VS2019\obj\VS2019.csproj.nuget.g.props.
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\VS2019\obj\VS2019.csproj.nuget.g.targets.
Writing assets file to disk. Path: D:\a\VsShowMissing\VsShowMissing\VS2019\obj\project.assets.json
Restored D:\a\VsShowMissing\VsShowMissing\VS2019\VS2019.csproj (in 8.74 sec).

NuGet Config files used:
    D:\a\VsShowMissing\VsShowMissing\NuGet.Config
    C:\Users\runneradmin\AppData\Roaming\NuGet\NuGet.Config
    C:\Program Files (x86)\NuGet\Config\Microsoft.VisualStudio.Offline.config
    C:\Program Files (x86)\NuGet\Config\Xamarin.Offline.config

Feeds used:
    C:\Program Files (x86)\Microsoft SDKs\NuGetPackages\
    https://api.nuget.org/v3/index.json

Installed:
    45 package(s) to D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\Gardiner.VsShowMissing.csproj
    109 package(s) to D:\a\VsShowMissing\VsShowMissing\VS2019\VS2019.csproj

The first build after adding the cache took 2m.

First build with cache

The cache action logs that there is currently nothing to restore:

Run actions/[email protected]
Cache not found for input keys: Windows-nuget2-3881b0e254e4b0c4e40edd9efa8c26dfd9f5c93c42dad979f6c5869b765a72d0, Windows-nuget2-

But notice there’s a second post-build step for the cache. The files to be cached are added to a tar file, which is then saved.

Post Run actions/[email protected]
Cache saved successfully
Post job cleanup.
C:\windows\System32\tar.exe -z -cf cache.tgz -P -C d:/a/VsShowMissing/VsShowMissing --files-from manifest.txt
Cache saved successfully

And now subsequent builds use the cache.

Second build with cache

You can see the cache action does a restore:

Run actions/[email protected]
Cache Size: ~116 MB (122108071 B)
C:\windows\System32\tar.exe -z -xf d:/a/_temp/50db9096-3e6c-4681-8753-3e12e33854f1/cache.tgz -P -C d:/a/VsShowMissing/VsShowMissing
Cache restored from key: Windows-nuget2-3881b0e254e4b0c4e40edd9efa8c26dfd9f5c93c42dad979f6c5869b765a72d0

and the output from the nuget restore is a bit different:

 MSBuild auto-detection: using msbuild version '16.6.0.22303' from 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\MSBuild\Current\bin'.
Restoring packages for D:\a\VsShowMissing\VsShowMissing\VS2019\VS2019.csproj...
Restoring packages for D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\Gardiner.VsShowMissing.csproj...
Committing restore...
Committing restore...
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\VS2019\obj\VS2019.csproj.nuget.g.props.
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\obj\Gardiner.VsShowMissing.csproj.nuget.g.props.
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\VS2019\obj\VS2019.csproj.nuget.g.targets.
Generating MSBuild file D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\obj\Gardiner.VsShowMissing.csproj.nuget.g.targets.
Writing assets file to disk. Path: D:\a\VsShowMissing\VsShowMissing\VS2019\obj\project.assets.json
Writing assets file to disk. Path: D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\obj\project.assets.json
Restored D:\a\VsShowMissing\VsShowMissing\VS2019\VS2019.csproj (in 844 ms).
Restored D:\a\VsShowMissing\VsShowMissing\Gardiner.VsShowMissing\Gardiner.VsShowMissing.csproj (in 845 ms).

NuGet Config files used:
    D:\a\VsShowMissing\VsShowMissing\NuGet.Config
    C:\Users\runneradmin\AppData\Roaming\NuGet\NuGet.Config
    C:\Program Files (x86)\NuGet\Config\Microsoft.VisualStudio.Offline.config
    C:\Program Files (x86)\NuGet\Config\Xamarin.Offline.config

Feeds used:
    C:\Program Files (x86)\Microsoft SDKs\NuGetPackages\
    https://api.nuget.org/v3/index.json

But wait... that build took 2m 4s! What gives? That’s slower than the first time!

Yeah, that is odd. So a couple of thoughts:

  • Do measure if adding a cache actually makes a difference.
  • The speed of the runner VMs does vary a bit. In that last run, notice that the restore was slightly faster but the build step was quite a bit slower.
  • Possibly you might get different (better?) results from SDK-style projects?

I have a theory that in my case there aren’t a huge number of dependencies, so the time taken to download them separately isn’t dramatically different from the time taken to restore them all from the cache. But when you have a lot of dependencies (and NPM packages are likely to be a good example), or the download speed of all those dependencies is limited, then a single large download versus lots of smaller separate downloads should give a definite advantage.
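For the NPM case, a comparable step might look like the sketch below. It is not from the Show Missing build: the `~/.npm` path, the `node` key segment and the `@v4` tag are assumptions that follow the same pattern as the NuGet step.

```yaml
# Sketch only: caches npm's download cache, keyed on the lock file.
- uses: actions/cache@v4
  with:
    # Default npm cache location on Linux and macOS runners;
    # on Windows the npm cache lives elsewhere, so adjust the path.
    path: ~/.npm
    key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-node-
```

As with NuGet, it is worth timing builds with and without the step before keeping it.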
