Docker reproducibility problems

It's been a while since the last blog post. Work and parenting leave little time for blogging. After leaving Mozilla, I decided to take some time off to decompress.

In this blog post I would like to record my thoughts about the issues I have run into with Docker over the past few years. Most of them relate to how you build Docker images, not how you run them.

When someone says "Take this docker image and run this thing" it sounds really useful - no dependency hell, no need to install new or old software on your machine. Just run it!

Everything changes if you want to use this approach in production.

Here is one scenario you can hit:

  • Use that Docker image for a while
  • A day/week/month/year later, realize that you want to update something or install a new library on top of the current image
  • Run docker build ... and get a broken build, for any of several reasons

Here are some of those reasons and their possible workarounds. These issues don't happen very often, but if you hit one in the middle of a very important release, firefighting is guaranteed.

FROM ubuntu

The first line in your Dockerfile says:

FROM ubuntu

Which version of Ubuntu, sorry? Ok, we can use ubuntu:20.10 instead. Yay? Not really. Even then, the image behind the tag can (and arguably should) change - security updates and bug fixes are routine these days.

Ok, how about ubuntu:20.10@sha256:xxxxxxx? Better! Now you should pull the same image every single time, without any surprises! Combined with Dependabot's Dockerfile update feature, you get pull requests that update the image, and your tests run on those PRs (you have tests, right?).

The downside of using the SHA is that it depends on the registry you push to - the same image pushed to Docker Hub and to GCR can end up with different SHA256 digests.
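For illustration, a digest-pinned base image line could look like the sketch below. The digest is deliberately left as a placeholder; substitute the value your own registry reports (docker images --digests shows it for images you have already pulled).

```dockerfile
# The tag is kept for readability; the digest is what actually pins the content.
# <digest> is a placeholder, not a real value: the build fails until it is replaced.
FROM ubuntu:20.10@sha256:<digest>
```

When both a tag and a digest are given, the digest decides what gets pulled, so a later re-tag of ubuntu:20.10 does not change your build.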

apt-get update && apt-get upgrade

This command is unpredictable. You may end up with package versions that break your application. In the past I had issues with openssl upgrades, for example. On the other hand, if you never upgrade the OS packages, you may end up vulnerable because of missing security updates.

As a workaround, you can use something like https://snapshot.debian.org/ or your own copy of the apt repo. The latter is a bit painful and adds another layer of complexity. AFAIK, there is no service that provides automatic PRs to bump the repo URLs.
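A rough sketch of the snapshot approach, assuming a Debian-based image rather than Ubuntu. The bullseye release, the timestamp in the URL and the installed package are arbitrary examples, not recommendations.

```dockerfile
FROM debian:bullseye-slim

# Replace the default apt sources with one fixed snapshot.
# - 20220101T000000Z is an example timestamp; bumping it is the "upgrade" step.
# - check-valid-until=no is needed because the Release file of an old snapshot has expired.
# - Plain http is used because the slim image ships without ca-certificates;
#   apt still verifies the repository signatures.
RUN echo "deb [check-valid-until=no] http://snapshot.debian.org/archive/debian/20220101T000000Z/ bullseye main" \
      > /etc/apt/sources.list \
 && apt-get update \
 && apt-get install -y --no-install-recommends openssl \
 && rm -rf /var/lib/apt/lists/*
```

With this in place, apt-get update resolves against the same package index on every build, and changing the timestamp becomes an explicit, reviewable change. A snapshot freezes security fixes too, so the timestamp still has to be moved forward regularly.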

RUN pip install flask

This one is similar to the previous one, but for your application's dependencies.

Using strictly pinned versions (npm/yarn lock files, pinned pip requirements files) is a common approach here. Dependabot and similar services can create PRs with updated versions. Run your tests, merge, back out if needed. No surprises (usually).
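As a minimal sketch of the pinned variant for pip: the python:3.9-slim base image and the requirements.txt file name are assumptions made for this example.

```dockerfile
FROM python:3.9-slim
WORKDIR /app

# requirements.txt is assumed to list exact versions (name==version) for flask
# and all of its transitive dependencies, e.g. as written by `pip freeze`.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```

The versions now live in a file under version control, so an upgrade shows up as a diff that your tests can run against. pip also has a --require-hashes mode that refuses any package whose hash is not listed in the requirements file, if you want to go one step further.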

This is how it usually looks.

Random advice

  • curl | bash is not your friend: you rely on someone else's script, which may be harmful. Don't use it. Worst case, copy the script, audit it, and copy the files it downloads.
  • Use kaniko to build images without a Docker daemon
  • Use skopeo to move images around
  • Make sure you have tests that cover changes in the underlying software stack
  • Never, no, NEVER run anything as root in Docker
  • Use docker-compose. It helps with testing things locally. You can even use it to build the images.
  • Use Nix/NixOS to solve all the issues above ;)
  • Use CI for everything above. You'll have more time for more exciting things ;)
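Two of the points above, sketched in a single Dockerfile: a vendored installer instead of piping curl into bash, and an unprivileged user for the running container. The vendor/install.sh path, the user name and the UID are all hypothetical.

```dockerfile
FROM ubuntu:20.10

# Instead of piping curl into bash: a copy of the installer that was downloaded
# and audited beforehand, kept next to the Dockerfile (hypothetical path).
COPY vendor/install.sh /tmp/install.sh
RUN sh /tmp/install.sh && rm /tmp/install.sh

# Create an unprivileged user; the name "app" and UID 10001 are arbitrary.
RUN useradd --create-home --uid 10001 app

# Everything from here on, including the container's main process, runs as "app".
USER app
WORKDIR /home/app
CMD ["id", "-un"]
```

Build steps that really need root, such as installing packages, go before the USER line; whatever the container runs at start-up comes after it.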