Multi-stage Docker builds for build-time toolchain isolation

May 06, 2021 5 minutes

Dependency hell.

The situation where two components of a computer system depend on separate, and incompatible, versions of a given software or hardware dependency.

It’s well known by now that, thanks to containerization technology, dependency hell is mostly a thing of the past; you can easily isolate a program’s dependencies at runtime and forget about it.

What’s less known, however, is that you can also isolate build-time dependencies from one another just as easily, by using multi-stage Docker builds.

Multi-stage builds TL;DR

Multi-stage builds are basically a concatenation of Dockerfiles, where you can copy artifacts (e.g. compiled binaries) from an earlier stage into the current stage without having to explicitly use the Docker Host’s filesystem as an intermediary.

They have two major use cases:

  1. Producing leaner container images
  2. Simplifying CI/CD setups

The former is achieved by using [relatively] heavy SDK images to build the required artifacts from source code, and copying those artifacts into a later stage that’ll build the final image based on a smaller runtime image.

# Ref.: https://docs.docker.com/samples/dotnetcore/

# === STAGE 1 - COMPILE PROGRAM ===
FROM mcr.microsoft.com/dotnet/sdk:5.0 AS build
WORKDIR /app

# Copy csproj and restore as distinct layers
COPY *.csproj ./
RUN dotnet restore

# Copy everything else and build
COPY . ./
RUN dotnet publish -c Release -o out

# === STAGE 2 - BUILD FINAL IMAGE ===
FROM mcr.microsoft.com/dotnet/aspnet:3.1
WORKDIR /app
COPY --from=build /app/out .
ENTRYPOINT ["dotnet", "aspnetapp.dll"]

The latter is achieved because we no longer need to install every toolchain on our build servers; just install Docker and we’re good*.

* This is of course a gross over-simplification.

Real Use Case

The Dockerfile provided in the previous example is very close to what you could use in a production scenario if you’re building a program whose dependencies are met completely by installing NuGet packages.

That’s the ideal scenario, as NuGet is fantastic: reliable, respects SemVer, and it works the same way now as it used to do 10 years ago.

For web developers, sadly, there’s a problem.

Problem

Yep, NodeJS

Of course it is.

Node breaks often.

“Often”, when using lockfiles which are supposed to provide reproducible environments, means “more than never”.

An even more common occurrence is finding out that one of the thousands of indirect dependencies had its NodeJS engine requirement increased for no particular reason.

The engine "node" is incompatible with this module. Expected version ">=12.13.0". Got "10.24.0"

How do I know that? I just add --ignore-engines and everything works…

TypeError: [(...variantsValue),(...extensions)].flat is not a function

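Except this time it didn’t. Array.prototype.flat was only added in Node 11, so no flag can make code that calls it run on Node 10.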

The Real Problem

We were just delaying the inevitable.

To understand how, here’s the Dockerfile for Inudog (source code):

# === Stage 1 - Build ===
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS build
WORKDIR /source

## Restore .NET dependencies
COPY ./Inudog.Web/*.csproj ./Inudog.Web/
RUN dotnet restore Inudog.Web

## Install client-side build time dependencies
RUN apt-get update && apt-get install yarnpkg -y

## Copy the rest of the project files
COPY . ./

## Build client-side assets
RUN (cd Inudog.Web && yarnpkg install && yarnpkg run build)

## Build the .NET project after
RUN dotnet publish -c Release -o /output Inudog.Web

# === Stage 2 - Run ===
FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS runtime

WORKDIR /app
COPY --from=build /output ./

EXPOSE 80/tcp
VOLUME /app/files
ENTRYPOINT ["dotnet", "Inudog.Web.dll"]

We install yarn and build our client-side assets in the same stage where we build our .NET project, using the .NET Core SDK 3.1 LTS image as the base.

.NET Core 3.1 was released in December 2019, about a year and a half ago: a long time for software, but an eternity for anything web related.

The latest NodeJS version you can install from the package repositories of the official .NET Core 3.1 SDK image is Node 10.

Knowing that, the why becomes apparent: we’re using an image built for one purpose (compiling .NET code) for another purpose it doesn’t really support (building NodeJS frontends). Of course, we could keep doing so; maybe we could…

  • Install Node Version Manager (NVM) by piping some random script to bash, then install the desired Node version.
  • Install Node 10, install n using npm, use n to install an updated Node, and use that Node to finally install and run our packages…
  • or just compile NodeJS from source so we have extra downtime!

Awful options.

Fortunately for us, there’s a much cleaner solution that’s obvious in hindsight, but that I don’t remember anyone suggesting when I was learning about .NET Core containerization.

Solution

Just add another stage!

Here’s the current Dockerfile for PulsoServer, an SSR web application which uses a 3-stage Dockerfile:

# === Stage 1 - Build client-side stuff ===
FROM docker.io/library/node:14 AS frontend
WORKDIR /css

## Copy only frontend files
COPY ./PulsoServer.App/package.json ./PulsoServer.App/package.json
COPY ./PulsoServer.App/yarn.lock ./PulsoServer.App/yarn.lock
COPY ./PulsoServer.App/*.config.js ./PulsoServer.App/
COPY ./PulsoServer.App/Client/ ./PulsoServer.App/Client/

## yarn already ships with the official node image, so there's nothing to install (we don't use NPM)

## Compile frontend assets
RUN cd PulsoServer.App && yarn install && yarn run build

# === Stage 2 - Build .NET ===
FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS dotnet
WORKDIR /source

## Restore .NET dependencies
COPY ./PulsoServer.App/*.csproj ./PulsoServer.App/
RUN dotnet restore PulsoServer.App

## Copy the rest of the project files
COPY . ./

## Copy compiled assets to wwwroot/
COPY --from=frontend /css/PulsoServer.App/wwwroot/ ./PulsoServer.App/wwwroot/

## Build the .NET project after
RUN dotnet publish -c Release -o /output PulsoServer.App

# === Stage 3 - Run ===
FROM mcr.microsoft.com/dotnet/core/aspnet:3.1 AS runtime

WORKDIR /app
COPY --from=dotnet /output ./

EXPOSE 80/tcp
HEALTHCHECK CMD curl --fail http://localhost:5000/health || exit 1
ENTRYPOINT ["dotnet", "PulsoServer.App.dll"]

Everything NodeJS related is isolated in its own frontend stage. We just copy the compiled assets to our wwwroot directory before building our .NET project.

That’s all. Really.
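Since the frontend now lives in its own stage, it’s also easy to tune that stage on its own. The following is only a sketch under assumptions, not PulsoServer’s actual setup: "MyApp" and its directory layout are hypothetical placeholders, and it relies on the Yarn 1.x that comes bundled with the official node image. It copies the manifest and lockfile before the sources so the install layer stays cached, and uses --frozen-lockfile so the build fails rather than quietly resolving different versions than the ones in yarn.lock.

```dockerfile
# Hypothetical sketch: "MyApp" and its layout are placeholders.
FROM docker.io/library/node:14 AS frontend
WORKDIR /src/MyApp

# Copy only the manifest and the lockfile first, so the install layer below
# is reused from cache until one of these two files changes.
COPY ./MyApp/package.json ./MyApp/yarn.lock ./

# Fail the build if yarn.lock would need to be updated (Yarn 1.x flag).
RUN yarn install --frozen-lockfile

# Copy the frontend sources afterwards and build them.
COPY ./MyApp/*.config.js ./
COPY ./MyApp/Client/ ./Client/
RUN yarn run build
```

Where the build output ends up depends on the bundler configuration, so the COPY --from=frontend path in the .NET stage has to match whatever directory your build script writes to.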

Summary

In this post we described how to leverage multi-stage builds to separate compilation of client assets and .NET Core code for a simple server-side rendered (SSR) web application.

Multi-stage builds make it easy to isolate our build-time dependencies from one another, which helps prevent unexpected compatibility issues after updating them.
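One possible way to make that isolation explicit is to pin each toolchain version in its own build argument, so any one of them can be bumped without touching the others. This is a minimal sketch rather than part of the PulsoServer Dockerfile: NODE_VERSION and DOTNET_VERSION are hypothetical names, and the stage bodies are left out.

```dockerfile
# Build arguments declared before the first FROM can be used in every FROM line.
# NODE_VERSION and DOTNET_VERSION are hypothetical names chosen for this sketch.
ARG NODE_VERSION=14
ARG DOTNET_VERSION=3.1

# Frontend toolchain: only this stage is affected by a NodeJS upgrade.
FROM docker.io/library/node:${NODE_VERSION} AS frontend
# (frontend build steps go here)

# .NET toolchain: unaffected by whatever Node version the frontend needs.
FROM mcr.microsoft.com/dotnet/core/sdk:${DOTNET_VERSION} AS dotnet
# (.NET restore and publish steps go here)

# Runtime image: kept on the same .NET version as the SDK stage.
FROM mcr.microsoft.com/dotnet/core/aspnet:${DOTNET_VERSION} AS runtime
# (copy the published output and set the entrypoint here)
```

With this layout, trying a newer Node is a matter of passing --build-arg NODE_VERSION=16 to docker build; the .NET stages keep using the versions they were pinned to.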

In a future post we’ll explore how to use local container image registries to reduce the performance hit that comes with using multiple different images in a single Dockerfile.