Part VI - Caching NuGet packages for improved build times on Azure
This article is part of a series for setting up the single pipeline.
I was having to run a lot of builds, as I had to correct each mistake in the build file, run a build for its pull request, and another for the merge. The amount of time I had to spend sitting around waiting for these to complete was annoying me. Once I got everything working, it was time to look at speeding up the builds. I’d found an article a while back on Azure Pipeline Caching that I’d filed away for future me to look at. Thanks, past me!
Azure Pipeline Caching hashes each packages.lock.json file it finds in the repo and then, I think, combines those hashes into a “superhash” to use as the cache key. That way, if any package anywhere in the solution changes, the superhash changes and there is a cache miss. Otherwise, it downloads the zipped packages from the cache and installs them from that, hopefully speeding up the process.
What packages.lock.json files? #
I’d set up all the projects in this microservice to use package references in the project file, so I didn’t have any lock files. I’m not the biggest fan of lock files, to be honest, as they’re a potential source of annoying merge conflicts. However, given that I am the development team, that shouldn’t be a problem too often. Adding the lock files is a simple matter of adding the following to the PropertyGroup section of each project file:
<RestorePackagesWithLockFile>true</RestorePackagesWithLockFile>
Build the solution, and you’ll now have lock files for each project that has NuGet packages.
The Cache task #
For this, I just took the example from the official page:
- task: Cache@2
  displayName: Cache NuGet packages
  inputs:
    key: 'nuget | "$(Agent.OS)" | $(Build.SourcesDirectory)/**/packages.lock.json'
    restoreKeys: |
      nuget | "$(Agent.OS)"
      nuget
    path: $(NUGET_PACKAGES)
Add this task before the Install Packages task (obviously).
You will need to add the following variable to the variables section of the build file:
NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages
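Put together, the variable, the Cache task, and the restore step might sit in the build file roughly as follows. This is a sketch under assumptions rather than the actual build file: the `dotnet restore` step stands in for the Install Packages task, which isn't shown here, and the `CACHE_RESTORED` variable name and the reporting step are illustrative additions. `--locked-mode` is optional; it makes the restore fail if a lock file is out of date instead of silently updating it.

```yaml
# Sketch only: the restore step is an assumed stand-in for the
# "Install Packages" task, and CACHE_RESTORED is an illustrative name.
variables:
  NUGET_PACKAGES: $(Pipeline.Workspace)/.nuget/packages

steps:
- task: Cache@2
  displayName: Cache NuGet packages
  inputs:
    key: 'nuget | "$(Agent.OS)" | $(Build.SourcesDirectory)/**/packages.lock.json'
    restoreKeys: |
      nuget | "$(Agent.OS)"
      nuget
    path: $(NUGET_PACKAGES)
    # Optional: stores the outcome of the cache lookup in a variable
    # so that it can be logged or used in a step condition.
    cacheHitVar: CACHE_RESTORED

# Logs whether the cache lookup hit, which helps when chasing unexpected misses.
- script: echo "NuGet cache restored: $(CACHE_RESTORED)"
  displayName: Report cache result

# Must run after the Cache task so that it restores into the cached folder.
- script: dotnet restore --locked-mode
  displayName: Install Packages
```

Because pipeline variables are exposed to scripts as environment variables, setting `NUGET_PACKAGES` is what points the restore at the folder the Cache task saves and restores.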
Having got these checked in, the caching task found the lock files, created a superhash, and cached what looked like a half-gigabyte zip file of all the packages.
However, every time it attempted to build, it reported a cache miss. Initially, this baffled me, as I knew that the task was correctly finding all 4 lock files. However, looking at the logs of the auto-generated step near the end of the build that was caching the packages, it was additionally finding a lock file in the built folder.
Checking through the properties of all the lock files, sure enough, one of them was set to copy to the output folder, which meant it was being copied into the built code folder. Hopefully this isn’t a problem you’ll encounter, but I thought I’d mention it in case it happens to anyone else. I changed that to Do not copy, and the caching now worked as expected.
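Another way to guard against stray copies is to exclude the build output folders from the key. The sketch below has not been tested against this pipeline; it relies on the Cache task accepting a comma-separated list of file patterns in a key segment, with a leading `!` marking an exclusion. The `bin` and `obj` folder names are assumptions about where the built code ends up.

```yaml
# Sketch only: the same task as above, with build output folders excluded
# from the files that feed the cache key. Folder names are assumptions.
- task: Cache@2
  displayName: Cache NuGet packages
  inputs:
    key: 'nuget | "$(Agent.OS)" | **/packages.lock.json,!**/bin/**,!**/obj/**'
    restoreKeys: |
      nuget | "$(Agent.OS)"
      nuget
    path: $(NUGET_PACKAGES)
```

With the exclusions in place, a lock file copied into the output folder during the build should no longer change the key between the restore at the start of the job and the save at the end.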
Success, with a caveat #
Or almost as expected! Having got it working with the build pipeline for my pull requests, I added it to the build and release pipeline, and the first time it ran it didn’t find the packages in the cache, even though it had the same cache key as the PR build had used to store them.
It transpires that the cache is per pipeline and per branch, not per Azure DevOps project, which was what I was expecting. This is due to security concerns, which I do get, though I’m not convinced a blanket nixing of this feature is the only solution.
Addendum: Ignore the caveat #
So far, the pull request build doesn’t always get a cache hit, even though the cache key hasn’t changed. I haven’t got to the root of why this happens; I don’t think the cache is timing out (these builds were twenty minutes apart), and whilst PR builds are cached per branch, these were on the same branch. Whilst the latter constraint does hinder the usefulness of the feature quite considerably, at least the main build will regularly benefit from caching.
Interestingly, however, even with a cache miss this task speeds up the builds regardless. Looking through the logs, the difference appears to be that when I have the task in the build, all 4 projects have their packages restored concurrently, whereas only 2 projects are restored concurrently normally. I haven’t been able to find anything about setting the concurrency level for nuget restore in the documentation, so I’m not sure how Azure is setting this, but I’m happy to take a win here whatever the mechanism.
Next, we’ll look at speeding up the unit test task.