Introduction
I’ve just made use of a Docker multi-stage build when Dockerising my WordPress blog, and I thought it would be useful to examine the benefits. Multi-stage builds let you do parts of a build in separate stages, and then selectively copy only the pieces you need into the output image.
Here’s a cut-down example:
    FROM alpine:3.6 AS build

    # Required to do Git clone operation
    RUN apk --update add git openssh-client

    WORKDIR /root
    RUN mkdir themes
    RUN git clone git@bitbucket.org:halferacc/threattocreativity-forked.git themes/threattocreativity && \
        git clone git@bitbucket.org:halferacc/jonblog-theme.git themes/jonblog

    FROM alpine:3.6

    # @todo Install PHP, Apache, etc here
    # @todo Install PHP extensions here

    # Copy themes from build stage
    COPY --from=build /root/themes/threattocreativity wp-content/themes/threattocreativity
    COPY --from=build /root/themes/jonblog wp-content/themes/jonblog
There are two build stages here – one called “build” and then an unnamed one, which is the output build. The first one installs Git and the OpenSSH client, which are required for cloning operations, and in the second, I merely copy the files from the clone. This means the final image does not have the extra software installed, which reduces the output image size considerably.
Let’s look at the image that is generated by the full version of this multi-stage approach:
    docker images | grep jonblog
    jonblog    latest    8d589c2fdc7c    13 hours ago    101MB
OK, now I’ll add the apk command that installs the build software to the output image, and re-examine:
    docker images | grep jonblog
    jonblog    latest    6f16614e1a72    About a minute ago    125MB
As you can see, this approach has saved 24MB, or around 19% of the size of the single-stage image, which makes the image quicker and cheaper to transfer (in practice it won’t make much of a difference on my infrastructure, but when applied to busy web properties, this principle is important).
Time saving
One of the other benefits of multi-stage builds is that they can speed up the build-amend-test workflow. This partly stems from the popular desire to build things in a standard order:
- Base image
- OS upgrades
- Languages and environments (Ruby, PHP, Node, Python, etc)
- Servers and other large software packages
- Utility software
- Application code
- Application configuration
This means that if one wants to make a change to PHP (e.g. an additional extension) then all the following Docker image layers will be invalidated, and will need to be rebuilt. In the case of application code, this usually means a checkout from a remote/central repository, which is not a speedy operation.
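To make the invalidation concrete, here is a rough single-stage sketch that follows the standard order. It is not my actual Dockerfile: the PHP and Apache package names are illustrative assumptions, and only one theme is cloned.

```dockerfile
# Single-stage sketch: each instruction creates a layer, and changing one
# instruction invalidates the cache for every instruction after it.
FROM alpine:3.6

# Languages and servers (package names are assumptions for illustration)
RUN apk --update add php7 php7-apache2 apache2

# Utility software, needed only for the clone but kept in the final image
RUN apk --update add git openssh-client

# Application code: this comes after the PHP layer, so editing the PHP
# line above forces the remote clone to run again
RUN git clone git@bitbucket.org:halferacc/jonblog-theme.git wp-content/themes/jonblog
```

Adding a PHP extension to the first RUN line here would mean the Git install and the clone are both repeated on the next build.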
However, in the Dockerfile snippet given above, it is in fact speedy. This is because changes to the PHP configuration do not invalidate any part of the first build stage, and thus the application code does not need to be cloned again. In other words, COPY --from=build is entirely a local build operation.
Admittedly, the time saved on just two small Git clones is modest. However, I also do the same with nine plugins (a mixture of zip files from the official wordpress.org site and self-hosted repos), and the time saved soon adds up.
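A plugin stage might look something like the sketch below. The stage name "plugins" and the plugin slug "example-plugin" are hypothetical placeholders, and the download URL assumes the usual wordpress.org zip pattern, so substitute a real slug before trying it.

```dockerfile
# Hypothetical "plugins" stage: the download tools stay out of the final image
FROM alpine:3.6 AS plugins

# curl and unzip are needed only in this stage
RUN apk --update add curl unzip

WORKDIR /root/plugins

# "example-plugin" is a placeholder slug, not a real plugin
RUN curl -fsSLO https://downloads.wordpress.org/plugin/example-plugin.zip && \
    unzip example-plugin.zip && \
    rm example-plugin.zip

FROM alpine:3.6

# @todo Install PHP, Apache, etc here

# Copy the unpacked plugins from the plugins stage
COPY --from=plugins /root/plugins wp-content/plugins
```

As with the themes, a change to the PHP setup in the final stage leaves the plugins stage cached, so nothing is downloaded again.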