As documented in some of my past posts, I’ve been using Docker on CircleCI a lot recently, and I thought it was about time I investigated adding Docker layer caching to the mix. This is one of Docker’s best features, in my view: if a layer does not need building, because the command and the build context are unchanged, then a cached layer will be reused.
This feature comes for free when building images locally, since Docker leaves old intermediate layer images on the disk by default. However, in continuous integration systems, a completely fresh virtual environment is spun up every time, effectively wiping the Docker layer cache for every build. So, how can this be fixed?
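As a quick local illustration of that behaviour (a minimal sketch; the image name is arbitrary):

```sh
# First build: every layer is generated from scratch
docker build --tag demo-app .

# Second build with no changes to the Dockerfile or build context:
# each step should report "Using cache" and complete almost instantly
docker build --tag demo-app .
```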
Considerations
When building images locally, intermediate Docker layers tend to get stored without limit, at least until the user realises they could do with recovering many gigabytes of uselessly consumed disk space! However, having all these layers available means that if a layer has been generated previously, it is likely to be available for reuse.
In a continuous integration context, the engineer has to make a more conscious decision about which images to select for caching. The main options are:
- The most conservative choice is to store just the layers that make up the current image (where the Dockerfile is multi-stage, this should include the layers in non-output builds)
- Store all images in the build cache from the last X days. In this scenario, the existing layers are loaded and a new build is run, which may create new image layers; those new layers are stored in the cache, and old ones are purged
The benefit of the second option is that a sequence of changes made to a repository can, in theory, require old layers. For example:
- State 1 contains layers: A, B, C
- State 2 contains layers: A, D, E
- State 3 contains layers: A, B, F (by virtue of reverting a change)
However, I have taken the view that changes will not often do this, and that as the purge delay increases, there is a decreasing chance that those layers will ever be used. I will therefore stick with just option 1 for now.
Native CircleCI solution
My first attempt at this made use of the CircleCI caching feature, which is available for both free-tier and commercial users. Interestingly, CircleCI do support layer caching, but it’s only available for commercial users, and attracts an extra fee. It seems a bit odd that something with a relatively easy free workaround would be chargeable, but perhaps their approach is intended to reduce the storage overhead for free-tier users, which, as I said earlier, can be substantial with Docker.
CircleCI caching feature
CircleCI provide a file cache for users at all tiers, to help reduce build times. This is often used to store dependency tool temp files to reduce download times, and as it turns out, it can be used to store a Docker layer cache also.
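For comparison, a typical dependency cache looks something like this (a generic sketch assuming a Ruby project with a Gemfile.lock; this is not from the project in this post):

```yaml
# Restore dependencies keyed on the lockfile, install, then save them back
- restore_cache:
    keys:
      - deps-v1-{{ checksum "Gemfile.lock" }}
- run: bundle install --path vendor/bundle
- save_cache:
    key: deps-v1-{{ checksum "Gemfile.lock" }}
    paths:
      - vendor/bundle
```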
I present here the configuration, and then I’ll walk through it:
```yaml
- restore_cache:
    keys:
      - v1-{{ .Branch }}
- run:
    name: Load Docker image layer cache
    command: |
      set +o pipefail
      if [ -f /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz ]; then
        gunzip -c /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz | docker load
        docker images
      fi
- run:
    name: Build application Docker image
    command: |
      docker build --tag ${CIRCLE_PROJECT_REPONAME} .
- run:
    name: Save Docker image layer cache
    command: |
      mkdir -p /caches
      # See here: https://github.com/mozmeao/snippets-service/pull/208/files
      # and here: https://stackoverflow.com/q/49965396
      #
      # The build commands here will be completely cached, and so very quick
      docker build --tag ${CIRCLE_PROJECT_REPONAME} . | grep '\-\-\->' | grep -v 'Using cache' | sed -e 's/[ >-]//g' > /tmp/layers.txt
      docker save $(cat /tmp/layers.txt) | gzip > /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
- save_cache:
    key: v1-{{ .Branch }}
    paths:
      - /caches/
```
The four sections can be thought of as two pairs of actions:
- Restore the file cache and load it into Docker if it exists
- Convert the Docker layer cache into a file and save it in the file cache
In relation to the outer statements, restoring and saving the file cache, you’ll notice this uses a key involving a version number and the name of the branch. This means a cache for v1-master will be ignored if we change to another branch or another version. The exact format doesn’t matter here – for example it could be an arbitrary version number only. The version number is useful to break the cache – if we want to re-generate the image from scratch, just bump up the version number.
The two inner statements will, respectively, load the file cache into Docker, and export the Docker layer cache into CircleCI’s file cache. When loading the cache, a conditional is used, since it is expected that sometimes the file will not exist.
The saving of the cache can be done in a number of ways. The simplest way is to use docker history -q ${CIRCLE_PROJECT_REPONAME}, which would list the layer hashes of the named image. However, I have a multi-stage build, and history won’t list any layers other than those in the output image. For layer caching to work completely, all layer hashes must be found and stored, and a convenient way to do that is to build (again) and parse the output for layer hashes.
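For a single-stage build, that simpler docker history approach would look roughly like this (a sketch, using the same paths as the config above):

```sh
# Single-stage alternative: docker history lists the IDs of the layers in the
# final image, which can be fed straight into docker save
docker history -q ${CIRCLE_PROJECT_REPONAME} | grep -v '<missing>' > /tmp/layers.txt
docker save $(cat /tmp/layers.txt) | gzip > /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
```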
Unfortunately, there are some problems with this solution:
- Builds on a fresh branch would likely be able to re-use some layers, but here those layers will not be loaded, and so the build will not benefit from a cache that has already been generated.
- The save_cache directive does not write anything if the key exists already. This means that if a change occurs that results in some layers needing fresh generation, they will not be added to the cache.
The first problem can, admittedly, be resolved by removing the branch name. However, the second one is rather harder: if a filled CircleCI cache is never updated, then the missing Docker layers will always need regeneration until the cache key is modified (by bumping the version number). As far as I know there is no way to get the save directive to always write the cache contents.
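A variation on removing the branch name would be to add a fallback key, since restore_cache matches keys by prefix and tries them in order; a fresh branch would then start from the most recently saved cache from any branch (a sketch):

```yaml
- restore_cache:
    keys:
      - v1-{{ .Branch }}
      # Fall back to the most recently saved cache from any branch
      - v1-
```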
Remote image cache
So, let’s examine another solution. In this one, I’ll retrieve the layers as images for each build stage, and use a Docker registry as a cache.
```yaml
- run:
    name: Load Docker image layer caches
    command: |
      set +o pipefail
      docker pull ${CACHE_REGISTRY}:base || true
      docker pull ${CACHE_REGISTRY}:output || true
- run:
    name: Pull base images
    command: |
      docker pull alpine:3.6@sha256:3d44fa76c2c83ed9296e4508b436ff583397cac0f4bad85c2b4ecc193ddb5106
- run:
    name: Build application Docker images
    command: |
      # This is the output build
      docker build \
        --cache-from ${CACHE_REGISTRY}:base \
        --cache-from ${CACHE_REGISTRY}:output \
        --tag ${CIRCLE_PROJECT_REPONAME} \
        .
      # This is the first stage build
      docker build \
        --cache-from ${CACHE_REGISTRY}:base \
        --cache-from ${CACHE_REGISTRY}:output \
        --target build \
        --tag docker-cache-base \
        .
- run:
    name: Save Docker image layer cache
    command: |
      # Use a spare registry to contain a layer cache
      #
      # Here's the cache for the base layer:
      docker tag docker-cache-base ${CACHE_REGISTRY}:base
      docker push ${CACHE_REGISTRY}:base
      # Here's the main layer:
      docker tag ${CIRCLE_PROJECT_REPONAME} ${CACHE_REGISTRY}:output
      docker push ${CACHE_REGISTRY}:output
- run:
    name: Tag and push Docker output image to main registry
    command: |
      sh .circleci/push.sh
```
Here are some notes on this approach. Note that the Dockerfile in this case comprises two stages, and both of them need to be cached.
- The images are pulled from the cache to start with. The || true device prevents a missing image from stopping the build
- The two build stages are currently based on Alpine 3.6. Since this will be pulled from Docker Hub for every build, I have specified a fixed hash, to give the image more version stability (since the image is not for a mission-critical task, though, I may yet pull the latest release, in the hope that there are no breaking changes)
- I then build the primary output stage first. Any non-output stages should be built afterwards, since they will benefit from the Docker cache. The --target option here is used to specify which part of the build is required
- If the builds are successful, they can be tagged and pushed to a “cache” repo (which I store in the ${CACHE_REGISTRY} env var)
- Finally, I use a shell script to search the target Docker registry for the latest build hash, and push a new version tag if the hash does not exist; a sketch of what such a script might contain follows this list
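The push script itself isn’t shown in this post, so the following is only a rough sketch of the idea: it assumes the image is tagged with the commit hash and that the target registry exposes the standard Docker Registry HTTP API v2, and the MAIN_REGISTRY* variables are placeholder names.

```sh
#!/bin/sh
# Sketch of a push script along the lines described above. MAIN_REGISTRY and
# MAIN_REGISTRY_HOST are placeholders; CIRCLE_SHA1 is provided by CircleCI.
set -e

TAG=${CIRCLE_SHA1}

# Ask the registry whether this tag already exists
if curl -fs "https://${MAIN_REGISTRY_HOST}/v2/${CIRCLE_PROJECT_REPONAME}/tags/list" | grep -q "\"${TAG}\""; then
  echo "An image for ${TAG} has already been pushed; nothing to do"
else
  docker tag ${CIRCLE_PROJECT_REPONAME} ${MAIN_REGISTRY}:${TAG}
  docker push ${MAIN_REGISTRY}:${TAG}
fi
```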
Potential improvements
The problem with Docker layer caching is that it can insulate a built image against changes one does want. For example, it is worth picking up security updates for the base Linux image, and presently they won’t happen, since I have locked to a specific hash. The image could do with a complete rebuild every so often anyway, which I can do by checking the age of the latest image in the registry. I’ll look at that in another post.
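As a rough idea of how that check might work (a sketch only; the age threshold, the use of the pulled cache image, and GNU date are all assumptions, and the follow-up post may take a different approach):

```sh
# Force a full rebuild if the cached image is more than MAX_AGE_DAYS old
MAX_AGE_DAYS=7
CREATED=$(docker inspect --format '{{ .Created }}' ${CACHE_REGISTRY}:output)
AGE_DAYS=$(( ( $(date +%s) - $(date -d "${CREATED}" +%s) ) / 86400 ))

if [ "${AGE_DAYS}" -gt "${MAX_AGE_DAYS}" ]; then
  docker build --no-cache --tag ${CIRCLE_PROJECT_REPONAME} .
else
  docker build --cache-from ${CACHE_REGISTRY}:output --tag ${CIRCLE_PROJECT_REPONAME} .
fi
```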
Comments

Thanks for the article. Recently I found out our Ruby base image has been updated often enough that it became a nuisance for me to keep changing the Docker layer caching key.
So this is how I went about it: https://anonoz.github.io/tech/2018/06/17/circleci-docker-caching.html
Woop! Nice work, and glad mine was of use.
I think you have a typo: the angle bracket after gzip. Shouldn’t it be:
docker save $(cat /tmp/layers.txt) | gzip /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
Thanks Marc. It’s good to check, but I think that bracket was correct. This command:
* does the docker save, which outputs to stdout
* that is then piped through gzip
* that is then redirected to a file
I think I copied this from somewhere on the web, but checking the man page just now, it looks like “gzip >” is a common usage.
For what it’s worth, I preferred the second solution anyway, so I no longer use gzip when saving a layer cache.
Hi Jon,
I appreciate your post. I think I might have found a typo in your CircleCI Caching Feature solution: the redirection is flipped. I think it should be the following, which will save the layers, pipe them to gzip, and then write the file out to the /caches dir:
docker save $(cat /tmp/layers.txt) | gzip > /caches/${CIRCLE_PROJECT_REPONAME}.tar.gz
Thanks for this post! I had a question – in your circleci config, are you using the setup_remote_docker step that creates a remote Docker environment for the job? If not, do you know if this approach would work if we were using setup_remote_docker?
Thanks!