Spellbinding Docker - The Art of Multi-Stage Builds

Ana-Ioana Vlad and Kamil Choroba · September 30, 2024 · 16 min read

Tags: Docker, multi-stage builds
In this article, we are about to embark on a two-part journey through the magical world of Docker. In the first part, we will follow a sorceress as she refines her Docker spells with multi-stage builds, learning how to transform an old, inefficient Dockerfile into a streamlined, powerful tool. In the second part, we’ll step into the real world to explore practical examples and applications of these magical techniques in production environments. By the end, you’ll be ready to wield the power of Docker multi-stage builds yourself.

1. Story Time: Spellbinding Docker

In this section, we’ll follow the sorceress as she confronts an outdated Dockerfile, plagued by inefficiency and lack of structure. Through her journey, we’ll explore the power of Docker multi-stage builds and how they can transform a cumbersome process into a sleek, organized solution. We’ll witness the evolution of her Dockerfile, uncovering how separating development, testing, and production into distinct stages improves performance, clarity, and maintainability. Let’s dive into the art of Docker refactoring and reveal the magic of multi-stage builds.

Intro - The Magical Docker Odyssey

Welcome to a tale of magic and mastery. In the realm of software, a sorceress embarks on a quest to transform ancient Docker spells into powerful, modern solutions. Join us as we unravel her journey, discovering the secrets of Docker multi-stage builds.

Chapter 1 - The Ancient Docker Spellbook

Our story begins with a challenge. The sorceress encounters the ancient Docker spellbook - an archaic, (almost) single-stage build. Cumbersome and inefficient, it hinders her magical prowess. Let’s explore how this old code held back her potential.

FROM 0123456789.dkr.ecr.eu-west-1.amazonaws.com/some-node-image AS builder

RUN curl -sqL https://dl.yarnpkg.com/rpm/yarn.repo | tee /etc/yum.repos.d/yarn.repo &&\
    yum groupinstall -y "Development Tools" &&\
    yum install -y yarn git &&\
    yum clean all

WORKDIR /app

COPY package.json yarn.lock .npmrc ./

RUN yarn install --frozen-lockfile

COPY tsconfig.prod.json ./
COPY src ./src/
COPY assets ./assets/

RUN yarn build

# Add prod dependencies into dist bundle
RUN yarn install --frozen-lockfile --production --modules-folder ./dist/node_modules

FROM 0123456789.dkr.ecr.eu-west-1.amazonaws.com/some-node-image

RUN yum install -y https://github.com/krallin/tini/releases/download/v0.18.0/tini_0.18.0.rpm &&\
    yum clean all

WORKDIR /app

ENV NODE_ENV=production
ENV PORT=3000

COPY --from=builder /app/dist ./
COPY --from=builder /app/src/fragments ./fragments
COPY --from=builder /app/assets ./assets
CMD ["node", "--max-http-header-size=20480", "index.js"]

# Node binary was not designed to run as PID=1, tini is a very small init system that is default for Docker
# It ensures that zombie processes are cleaned up and handles SIGTERM, SIGINT and SIGHUP signals.
ENTRYPOINT ["/bin/tini", "--"]

Looking deeper at this ancient Docker spellbook, we can already identify some of the challenges that the sorceress faced:

  • Use of a Single Stage for Building and Dependencies: The dependencies and the build process are handled in the same stage. Separating these into different stages could improve the build efficiency and caching.

  • No Clear Separation of Concerns: The Dockerfile does not clearly separate the concerns of building, testing, and deploying, which can make maintenance and updates more challenging.

  • Lack of Modularity: Important stages such as development and testing are missing, although they would be useful in different situations.
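
The first point can be illustrated with a minimal sketch. It is not the sorceress’s actual file: the stage names deps and build are hypothetical, and the sketch assumes the base image already ships with yarn (the original installs it via yum). It only shows how installing dependencies in their own stage keeps that layer cached while the source code changes.

```dockerfile
# Sketch only: the stage names "deps" and "build" are illustrative assumptions.
# Assumes the base image already provides yarn.
FROM 0123456789.dkr.ecr.eu-west-1.amazonaws.com/some-node-image AS deps
WORKDIR /app
# Only the dependency manifests are copied here, so this layer is rebuilt
# only when package.json, yarn.lock, or .npmrc change.
COPY package.json yarn.lock .npmrc ./
RUN yarn install --frozen-lockfile

FROM deps AS build
# Source changes invalidate the cache from this point on,
# leaving the installed dependencies above untouched.
COPY tsconfig.prod.json ./
COPY src ./src/
COPY assets ./assets/
RUN yarn build
```

With this layout, editing a file under src triggers only the steps in the build stage, while the yarn install layer is reused from the cache.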

The sorceress’s first encounter with Docker revealed many inefficiencies: an ancient Dockerfile with no clear structure, a single-stage build, and a lack of separation between development and production. As she digs deeper into this configuration, she realizes that the inefficiencies can be remedied. Soon, she will discover the power of multi-stage magic.

Chapter 2 - The Revelation of Multi-Stage Magic

In a moment of revelation, our sorceress discovers multi-stage Docker builds. A powerful spell that segments her tasks into development, testing, and production. Each stage, a step closer to efficiency and clarity.

Multi-Stage Magic (After Refactoring):

In Docker terms, multi-stage builds allow her to break down the build process into distinct, optimized stages, each serving a specific purpose.

# ======================================================================
# Base image
# ======================================================================
FROM 0123456789.dkr.ecr.eu-west-1.amazonaws.com/some-node-image AS base

WORKDIR /app

COPY package.json yarn.lock playwright.config.ts .npmrc tsconfig.prod.json tsconfig.json .prettierignore jest.config.js ./
COPY src ./src/
COPY assets ./assets/
COPY smoke-tests ./smoke-tests/
COPY __tests__ ./__tests__/

RUN yarn install --frozen-lockfile

# ======================================================================
# Development stage
# ======================================================================
FROM base AS dev

CMD ["yarn", "dev"]

# ======================================================================
# Unit tests stage
# ======================================================================
FROM base AS unit

COPY . .

RUN yarn lint &&\
    yarn format:test &&\
    yarn test:ci

# ======================================================================
# E2E stage
# ======================================================================
FROM mcr.microsoft.com/playwright:v1.47.0-jammy AS e2e

WORKDIR /app

COPY --from=base /app/ ./

RUN npx playwright install

RUN yarn test:smoke

# ======================================================================
# Build stage
# ======================================================================
FROM base AS build

ENV NODE_ENV=production

COPY . .

RUN yarn build

# Add prod dependencies into dist bundle
RUN yarn install --frozen-lockfile --production --modules-folder ./dist/node_modules

# ======================================================================
# Production stage
# ======================================================================
FROM 0123456789.dkr.ecr.eu-west-1.amazonaws.com/some-node-image

WORKDIR /app

ENV NODE_ENV=production
ENV PORT=3000
ENV ARCH=amd64
# For Mac M1, use following arch
# ENV ARCH=arm64

RUN curl -fsSLO "https://github.com/krallin/tini/releases/download/v0.19.0/tini-${ARCH}" \
    && ln -s /app/tini-${ARCH} /usr/local/bin/tini

RUN chmod +x "tini-${ARCH}"

COPY --from=build /app/dist ./
COPY --from=build /app/src/fragments ./src/fragments
COPY --from=build /app/assets ./assets
RUN rm -rf ./__tests__ ./smoke-tests ./playwright.config.ts

CMD ["node", "--max-http-header-size=20480", "src/index.js"]

# Node binary was not designed to run as PID=1, tini is a very small init system that is default for Docker
# It ensures that zombie processes are cleaned up and handles SIGTERM, SIGINT and SIGHUP signals.
ENTRYPOINT ["tini", "--"]

Let’s reflect on the changes that the sorceress made to her Docker spellbook:

  • Multi-Stage Builds (development, unit tests, E2E tests, build, production): This approach allows for separation of concerns, making the Dockerfile more organized and efficient, with each stage tailored to a specific need.

  • Dedicated Testing Stages: Having separate stages for unit tests and E2E tests ensures that testing is an integral part of the build process.

  • Isolation of Development Environment: The dev stage isolates the development environment, so she can develop and debug without affecting the production build.

  • Improved Modularity: The Dockerfile is now modular, with each stage serving a specific purpose. This makes it easier to maintain and update the codebase.

Through the use of multi-stage builds, the sorceress has transformed her Dockerfile into a powerful tool, streamlining development, testing, and production. By isolating each stage, she has created a Docker environment that is more modular, efficient, and easier to maintain. Her refactoring journey highlights the power of multi-stage builds to optimize processes and improve performance, paving the way for even greater Docker mastery.

Now that her Dockerfile is streamlined, the sorceress seeks a way to orchestrate her Docker spells across different environments. This search leads her to Docker Compose.

Chapter 3 - Docker Compose Symphony

Our sorceress then encounters Docker Compose, a symphony of containers. In this analogy, Docker Compose acts as a conductor, coordinating each stage of her multi-stage build, enabling seamless transitions between development, testing, and production.

Let’s explore how Docker Compose complements the multi-stage Docker builds:

# this will target the dev stage in our Dockerfile
services:
  web:
    build:
      context: .
      target: dev

Running docker-compose up, the sorceress conjures a development environment, where she can weave her magic with ease. The target: dev directive in the Docker Compose file allows her to focus only on the development stage, isolating her environment for optimal productivity. The value dev corresponds to the alias defined in the Dockerfile:

FROM base AS dev

More than this, the sorceress can change the target value to either unit or e2e, running tests in isolation to thoroughly refine her spells before production. Docker Compose allows her to orchestrate containers for different environments—development, testing, and production—ensuring consistency across teams and enhancing collaboration. By isolating each stage, she makes her workflow more agile and efficient. Additionally, leveraging Docker’s caching mechanisms speeds up iterative development, enabling her to focus on refining her spells with greater efficiency. Docker Compose thus integrates seamlessly with multi-stage builds, offering a dynamic and scalable approach for both rapid development and thorough testing.

However, the sorceress knows that even the most powerful Docker spells need a mechanism to ensure they’re deployed efficiently. Enter Jenkins, her ally in continuous integration and deployment.

Chapter 4 - Jenkins Harmony with Targeted Builds

In our sorceress’ journey, Jenkins becomes an ally. Together, they harmonize CI/CD with targeted builds. Each Jenkins stage aligns with the Dockerfile, ensuring efficient resource usage and faster, smoother builds.

Let’s explore how Jenkins orchestrates the Docker builds. The sorceress defines parallel stages for unit and E2E tests before the build stage, like so:

stage('Test') {
  parallel {
    stage('unit') {
      agent { node { label 'build-node20' } }
      steps {
        script {
          sh 'scripts/unit.sh'
        }
      }
    }
    stage('e2e') {
      agent { node { label 'build-node20' } }
      steps {
        script {
          sh 'scripts/e2e.sh'
        }
      }
    }
  }
}

The most relevant parts of the scripts unit.sh and e2e.sh are:

# unit.sh: build only up to the unit stage, which runs linting, format checks, and unit tests
docker build --target unit --pull -t "$SERVICE:$DOCKER_TAG-unit" .

# e2e.sh: build only up to the e2e stage, which runs the smoke tests
docker build --target e2e --pull -t "$SERVICE:$DOCKER_TAG-e2e" .

[Figure: snapshot of the Jenkins pipeline before and after using targeted builds]

ℹ️ By running unit and E2E tests in parallel, Jenkins reduces the overall build time, ensuring faster feedback and quicker deployments. At the same time, the build stage becomes a single responsibility stage, focusing solely on building the production image.

In this scenario, the Jenkins pipeline orchestrates the Docker builds, ensuring that each stage has a clear, well-defined role with a single responsibility and does no more than is actually needed. This alignment between Jenkins and Docker stages streamlines the CI/CD process, making it more efficient and reliable.

With Jenkins in place, the sorceress finds even more benefits beyond just speed and efficiency. These revelations promise to enhance collaboration and maintainability.

Chapter 5 - Beyond the Magic: Additional Benefits

Our journey goes beyond mere speed and efficiency. The sorceress realizes the additional benefits of her multi-stage Docker spells - enhanced readability, easier maintenance, and improved collaboration across her team.

  • Enhanced Readability: The Dockerfile is now more organized and easier to read, with each stage serving a specific purpose. This clarity makes it simpler for the sorceress and her team to understand and maintain the codebase.
  • Separation of Concerns: The Dockerfile now clearly separates the concerns of building, testing, and deploying, making it easier to manage and update the codebase.
  • Improved Collaboration: The modular structure of the Dockerfile encourages collaboration among team members. Each stage can be worked on independently, allowing for parallel development and testing.
  • Potential Speed Improvements: Multi-stage builds can shorten build times through layer caching and keep the final image small, since only the artifacts copied into the last stage are shipped. This can lead to faster deployments and more efficient resource usage. Coupled with Jenkins, the sorceress can further optimize the CI/CD process by running independent stages, such as the unit and E2E tests, in parallel.

Epilogue - The Sorceress’ Triumph

As our tale concludes, we see a kingdom transformed. The sorceress’s journey with Docker multi-stage builds has brought magic to the realm of software, revolutionizing the way builds are crafted and optimized. Through her mastery of Docker Compose and Jenkins, she has orchestrated a powerful symphony of efficiency, flexibility, and scalability. May her story inspire you to embark on your own adventures in optimization.

But her journey doesn’t end here. Armed with her newfound knowledge, the sorceress is prepared to tackle even greater challenges. From scaling microservices across vast infrastructures to integrating seamless CI/CD pipelines, she knows that the power of Docker will allow her to build, test, and deploy more efficiently in any kingdom she encounters.

And so, we bid her farewell as she sets out on her next quest, eager to refine her craft, uncover new spells, and further enhance her Docker magic. With the wisdom gained from her journey, she is ready to face whatever challenges lie ahead, confident in her mastery of Docker’s transformative capabilities.

In the next section we will explore practical applications and examples of Docker stages in real production scenarios.

2. Practical applications and examples

In this section we are going to demonstrate how Docker stages can be used in real production scenarios.

Basic setup

You can find all the examples in the GitHub repository: https://github.com/KamilChoroba/understanding-docker-stages.git

Clone this repository and open it in an editor of your choice. We will use Visual Studio Code:

~ $ git clone git@github.com:KamilChoroba/understanding-docker-stages.git
~ $ cd understanding-docker-stages
understanding-docker-stages $ code .

ℹ️ The command code only works if you have Visual Studio Code installed and its executable is available on the command line. The trailing . opens the current directory.

Scenario 1, basic Dockerfile

In the first scenario we are going to create a simple base Docker image by using a public Docker image (alpine:3.20.2) as the base and creating a work directory /app for all follow-up commands.

Go to scenario-01 inside of understanding-docker-stages/scenarios.

understanding-docker-stages $ cd scenarios/scenario-01

Open the Dockerfile and read the content.
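
If you do not have the repository at hand, the following is a minimal sketch of what such a Dockerfile might look like, based only on the description above. The stage alias base is an assumption, and the file in the repository may differ.

```dockerfile
# Sketch only: the alias "base" is an assumption, not taken from the repository.
# Start from the public Alpine image named in the text.
FROM alpine:3.20.2 AS base

# All following instructions, and any shell started in the container, use /app.
WORKDIR /app
```

WORKDIR creates the directory if it does not exist, which is why the container starts in an empty /app.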

Build and run this Docker image locally and explore its content:

understanding-docker-stages/scenario-01 $ docker build -t docker-stages:1 .
understanding-docker-stages/scenario-01 $ docker run -it docker-stages:1 /bin/sh

ℹ️ Commands explained:

  • docker build executes the Docker build process and produces a Docker image
  • -t docker-stages:1 tags the created image. If no tag is provided, the image is left unnamed and can only be referenced by its image ID. The part after the colon (:1) is the tag; it can be a version or a name, e.g. :main
  • . specifies the build context path. Ideally this is the directory where your Dockerfile is located, since Docker looks for the Dockerfile there by default
  • docker run starts a container from a Docker image locally
  • -it keeps STDIN open (-i) and allocates a pseudo-terminal (-t), so you can interact with the running container much as you would via ssh
  • docker-stages:1 is the image and tag to run
  • /bin/sh is the command to execute in the container. It gives us a shell in which we can run commands.

After running the Docker image we can explore it a bit. You will notice that you are immediately in the directory /app, as it was set in the Dockerfile.

/app # pwd
/app
/app # ls -lah
total 8K
drwxr-xr-x    2 root     root        4.0K Jul 23 06:14 .
drwxr-xr-x    1 root     root        4.0K Aug 23 13:19 ..
/app #
/app # exit

Scenario 2, set up a basic TypeScript project

In the second scenario we are going to set up an npm project in Docker. For this we want to create a new stage and keep the base untouched. We will install npm, copy all source files into the Docker image and install all dependencies.

Go to scenario-02 inside of understanding-docker-stages/scenarios.

understanding-docker-stages/scenario-01 $ cd ../scenario-02

Open the Dockerfile and check the content.
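
As a rough orientation, a Dockerfile for this scenario could look like the sketch below. It is reconstructed from the description only: the stage names base and setup, the apk package names, and the use of npm install are assumptions, and the repository file may differ.

```dockerfile
# Sketch only: stage names and the exact install commands are assumptions.
FROM alpine:3.20.2 AS base
WORKDIR /app

# A new stage on top of base, so the base stage itself stays untouched.
FROM base AS setup

# Alpine ships without Node.js, so install the runtime and npm first.
RUN apk add --no-cache nodejs npm

# Copy the project configuration and install all dependencies.
COPY package.json tsconfig.json ./
RUN npm install

# Copy the TypeScript sources last, so editing them does not rerun npm install.
COPY src ./src/
```

Because setup is the last stage in this file, a plain docker build without --target produces it.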

Build and run this Docker image locally and explore its content:

understanding-docker-stages/scenario-02 $ docker build -t docker-stages:2 .
understanding-docker-stages/scenario-02 $ docker run -it docker-stages:2 /bin/sh

Here, too, we can explore the content of the Docker image a bit:

/app # ls -lah
total 56K
drwxr-xr-x    1 root     root        4.0K Jul 23 06:17 .
drwxr-xr-x    1 root     root        4.0K Sep  6 13:20 ..
drwxr-xr-x   47 root     root        4.0K Jul 23 06:17 node_modules
-rw-r--r--    1 root     root       20.9K Jul 23 06:17 package-lock.json
-rw-r--r--    1 root     root         344 Jul 22 19:32 package.json
drwxr-xr-x    2 root     root        4.0K Jul 23 06:15 src
-rw-r--r--    1 root     root       11.7K Jul 22 19:32 tsconfig.json
/app # exit

Scenario 3, build and test the project

In the next scenario we will run the service’s build command to generate the compiled project assets. Of course, we could also execute other commands here, e.g. test.

Go to scenario-03 inside of understanding-docker-stages/scenarios.

understanding-docker-stages/scenario-02 $ cd ../scenario-03

Open the Dockerfile and check the content.
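
A possible shape for this Dockerfile is sketched below. It assumes that package.json defines a build script that compiles the TypeScript sources into dist. The stage names are hypothetical and the repository file may differ.

```dockerfile
# Sketch only: stage names and the "build" npm script are assumptions.
FROM alpine:3.20.2 AS base
WORKDIR /app

FROM base AS setup
RUN apk add --no-cache nodejs npm
COPY package.json tsconfig.json ./
RUN npm install
COPY src ./src/

# The build stage reuses everything from setup and adds the compiled output.
FROM setup AS build
RUN npm run build
# A test run could be added in the same way, e.g. RUN npm test,
# provided the project defines a test script.
```

A failing RUN instruction aborts the image build, so a broken compilation never results in an image.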

Build and run this Docker image locally and explore its content:

understanding-docker-stages/scenario-03 $ docker build -t docker-stages:3 .
understanding-docker-stages/scenario-03 $ docker run -it docker-stages:3 /bin/sh

While exploring the docker image, you will notice that we now have the generated js file in the dist folder:

/app # ls -lah
total 60K
drwxr-xr-x    1 root     root        4.0K Jul 23 06:18 .
drwxr-xr-x    1 root     root        4.0K Sep  6 13:23 ..
drwxr-xr-x    2 root     root        4.0K Jul 23 06:18 dist
drwxr-xr-x   47 root     root        4.0K Jul 23 06:17 node_modules
-rw-r--r--    1 root     root       20.9K Jul 23 06:17 package-lock.json
-rw-r--r--    1 root     root         344 Jul 22 19:32 package.json
drwxr-xr-x    2 root     root        4.0K Jul 23 06:15 src
-rw-r--r--    1 root     root       11.7K Jul 22 19:32 tsconfig.json
/app # cd dist
/app/dist # ls -lah
total 12K
drwxr-xr-x    2 root     root        4.0K Jul 23 06:18 .
drwxr-xr-x    1 root     root        4.0K Jul 23 06:18 ..
-rw-r--r--    1 root     root          44 Jul 23 06:18 index.js
/app/dist # exit

Scenario 4, create a production image and run the production application

Now we can create a clean production image from the previously executed stages. But first, we want to install only the npm dependencies required for production. In a final stage we can then copy all required files and run the production service.

Go to scenario-04 inside of understanding-docker-stages/scenarios.

understanding-docker-stages/scenario-03 $ cd ../scenario-04

Open the Dockerfile and check the content.
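
The sketch below shows one way such a Dockerfile could be structured. It is an assumption-laden illustration rather than the repository file: the stage names, the use of npm prune --omit=dev to drop development dependencies, and the exact copy paths are hypothetical. Only the node:21 base image and the resulting /app/dist layout are taken from this article.

```dockerfile
# Sketch only: stage names, prune command, and copy paths are assumptions.
FROM alpine:3.20.2 AS base
WORKDIR /app

FROM base AS setup
RUN apk add --no-cache nodejs npm
COPY package.json tsconfig.json ./
RUN npm install
COPY src ./src/

FROM setup AS build
RUN npm run build

# Separate stage that keeps only the dependencies needed at runtime.
FROM setup AS prod-deps
RUN npm prune --omit=dev

# Final stage: a clean image that contains only the compiled code
# and the production dependencies copied from the earlier stages.
FROM node:21
COPY --from=build /app/dist /app/dist
COPY --from=prod-deps /app/node_modules /app/dist/node_modules
CMD ["node", "/app/dist/index.js"]
```

Nothing from the setup or build stages ends up in the final image unless it is copied explicitly with COPY --from. Note that dependencies with native bindings installed on Alpine may not work on the Debian-based node:21 image; in that case the production dependencies would need to be installed in a stage based on the same image as the final one.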

Build and run this Docker image locally and explore its content:

understanding-docker-stages/scenario-04 $ docker build -t docker-stages:4 .
understanding-docker-stages/scenario-04 $ docker run -it docker-stages:4

This should print Hello, world! in your CLI. That means index.ts was successfully compiled and executed in your Docker container.

Explore the Docker image a bit. You will notice that you do not start in the /app directory by default, because the final stage uses a clean base image (node:21).

understanding-docker-stages/scenario-04 $ docker run -it docker-stages:4 /bin/sh
# ls -lah
total 64K
drwxr-xr-x   1 root root 4.0K Sep  6 13:37 .
drwxr-xr-x   1 root root 4.0K Sep  6 13:37 ..
-rwxr-xr-x   1 root root    0 Sep  6 13:37 .dockerenv
drwxr-xr-x   1 root root 4.0K Jul 22 19:40 app
lrwxrwxrwx   1 root root    7 May 13 00:00 bin -> usr/bin
drwxr-xr-x   2 root root 4.0K Jan 28  2024 boot
drwxr-xr-x   5 root root  360 Sep  6 13:37 dev
drwxr-xr-x   1 root root 4.0K Sep  6 13:37 etc
drwxr-xr-x   1 root root 4.0K May 14 09:25 home
lrwxrwxrwx   1 root root    7 May 13 00:00 lib -> usr/lib
drwxr-xr-x   2 root root 4.0K May 13 00:00 media
drwxr-xr-x   2 root root 4.0K May 13 00:00 mnt
drwxr-xr-x   1 root root 4.0K May 16 13:57 opt
dr-xr-xr-x 239 root root    0 Sep  6 13:37 proc
drwx------   1 root root 4.0K May 16 13:57 root
drwxr-xr-x   1 root root 4.0K May 14 01:44 run
lrwxrwxrwx   1 root root    8 May 13 00:00 sbin -> usr/sbin
drwxr-xr-x   2 root root 4.0K May 13 00:00 srv
dr-xr-xr-x  11 root root    0 Sep  6 13:37 sys
drwxrwxrwt   1 root root 4.0K May 16 13:57 tmp
drwxr-xr-x   1 root root 4.0K May 13 00:00 usr
drwxr-xr-x   1 root root 4.0K May 13 00:00 var
# cd app/dist/
# ls -lah
total 20K
drwxr-xr-x 1 root root 4.0K Jul 22 19:57 .
drwxr-xr-x 1 root root 4.0K Jul 22 19:40 ..
-rw-r--r-- 1 root root   44 Jul 22 19:37 index.js
drwxr-xr-x 6 root root 4.0K Jul 22 19:56 node_modules
# exit

Scenario 5, intermediate step to run development environment via docker

Now let’s add a development stage in between, which allows us to make changes to the codebase and immediately see them in the Docker container.

Go to scenario-05 inside of understanding-docker-stages/scenarios.

understanding-docker-stages/scenario-04 $ cd ../scenario-05

Open the Dockerfile and check the content.
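
A development stage of this kind could be sketched as follows. The dev alias and the npm run dev script appear in this article (in the build command and in the nodemon output). The other stage names and the install steps are assumptions, and the later build and production stages are omitted for brevity.

```dockerfile
# Sketch only: apart from the "dev" alias and the dev script,
# the names and commands here are assumptions.
FROM alpine:3.20.2 AS base
WORKDIR /app

FROM base AS setup
RUN apk add --no-cache nodejs npm
COPY package.json tsconfig.json ./
RUN npm install
COPY src ./src/

# Development stage: starts the watcher instead of building for production.
# Mounting the local src directory over /app/src at run time lets the
# watcher pick up changes made on the host.
FROM setup AS dev
CMD ["npm", "run", "dev"]
```

In the full Dockerfile, build and production stages would follow the dev stage, which is why the dev stage has to be selected explicitly when building.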

At this point we need to build and execute this docker image with a few more arguments:

understanding-docker-stages/scenario-05 $ docker build --target dev -t docker-stages:5 .
understanding-docker-stages/scenario-05 $ docker run -v ./src:/app/src docker-stages:5

ℹ️ Commands explained:

  • --target dev builds only up to a specific stage. dev is the alias defined for that stage in the Dockerfile.
  • -v ./src:/app/src mounts the local directory ./src into the Docker container. Changes made on the local computer are then directly visible in the running container. Older Docker versions may require an absolute host path here, e.g. "$(pwd)/src".

Apply changes to the scenario-05/src/index.ts file. You should immediately see the difference in the CLI.

understanding-docker-stages/scenario-05 $ docker run -v ./src:/app/src docker-stages:5
> my-app@1.0.0 dev
> nodemon --exec ts-node src/index.ts

[nodemon] 3.1.4
[nodemon] to restart at any time, enter `rs`
[nodemon] watching path(s): *.*
[nodemon] watching extensions: ts,json
[nodemon] starting `ts-node src/index.ts`
Hello, world!!!
[nodemon] clean exit - waiting for changes before restart
[nodemon] restarting due to changes...
[nodemon] starting `ts-node src/index.ts`
Hello, world!!! I've changed the codebase
[nodemon] clean exit - waiting for changes before restart

That’s it for today. Happy coding!

About the authors

Ana-Ioana Vlad

Ana-Ioana Vlad is an Engineering Lead at AutoScout24, dedicated to mentoring and fostering collaborative team environments while driving performance and growth. Outside of work, she enjoys gaming and is a proud mom.

Kamil Choroba

Kamil Choroba is a seasoned Software Engineer with over 10 years of experience, specializing in full-stack development with a focus on Node.js, TypeScript, and React, and driven by passions for drawing, gaming, music, and sports.

© Copyright by AutoScout24 GmbH. All Rights reserved.