Reducing Docker's image size while creating an offline version of Carbon.now.sh

Creating a docker image of carbon.now.sh that is as small as possible, step by step so we can create beautiful snippets offline.

tl;dr: Starting from a (Node.js) Docker image of 2.43GB, we use a step-by-step approach to get under 100MB for a deployment-ready image!

Note: This is the first article in a series of two. Combined, we will manage to reduce the image size by a factor of 10. You can see part 2 here!

EDIT1: You can follow the discussion on Reddit.

EDIT2: Following this post, the creator of Docker-slim made a demo with a simpler version that is just as efficient. You can see it here.

Disclaimer: It's close to my first time playing around with Docker so you might find the article underwhelming :).

I'm sure most of you are used to those beautiful code snippets you see in presentations or at conference talks. They look just like this one:

An example of code snippet

Well, almost all of them come from carbon.now.sh, which does a great job of making your code look nice.

Unfortunately, I work in a large company that decided to block access to the website to avoid risking any data leaks (which makes a whole lot of sense if you ask me). Luckily for us, Carbon is open source and uses the MIT license, so we can spin up our own internal version of it.

This post describes my journey dockerizing the application and reducing the final image size.

Getting that sweet Docker image working

The first step is to get any kind of Docker image working, straight to the point. Let's do it.

We start by cloning the repo and creating a Dockerfile at the root of the project. The project requires Node 12, so we'll use the official node image as the base image.

FROM node:12

WORKDIR /app

COPY package*.json ./
RUN yarn install
COPY . .
RUN yarn build

CMD [ "yarn", "start" ] 

What we're doing here is very limited:

  • We define a working directory inside the base image
  • We install dependencies using yarn install
  • We build the project
  • We define yarn start as the command to execute when the image is run

What is left to do now is to actually build the image and test it. I am just testing here, but you might want to add the -d option to run in detached mode if you intend to keep the server running for a long time :).

$ docker build -t julienlengrand/carbon.now.sh .
$ docker run -p 3000:3000 julienlengrand/carbon.now.sh:latest
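
For a longer-lived server, the detached mode mentioned above could look roughly like the sketch below. The container name `carbon` and the two helper function names are assumptions made for this example; the image name and the port come from the commands above.

```shell
# Hypothetical container name, chosen for this example only.
CONTAINER_NAME="carbon"
IMAGE="julienlengrand/carbon.now.sh:latest"

# Start the server in the background (-d) and publish port 3000.
start_carbon() {
  docker run -d --name "$CONTAINER_NAME" -p 3000:3000 "$IMAGE"
}

# Stop the container, then remove it so the name can be reused.
stop_carbon() {
  docker stop "$CONTAINER_NAME" && docker rm "$CONTAINER_NAME"
}
```

After calling `start_carbon`, `docker logs carbon` shows the server output, and `stop_carbon` cleans everything up again.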

Now if we go to http://localhost:3000, we should see this:

The landing page of carbon, but on localhost

Great!!!! .... Except for the fact that my image takes 2.43GB of disk space! For something that takes screenshots, that is not acceptable :).

➜  carbon git:(feature/docker) docker images
REPOSITORY                                IMAGE ID            SIZE
julienlengrand/carbon.now.sh              81f97ac3419b        2.43GB

Let's see what more we can do.

Keeping only the app in the image

Thing is, the way we've built the image now works, but it is far from efficient (but we knew that already). We have our whole toolchain in the container, as well as the build and development dependencies and more. We want to get rid of all this, as we don't need it to run our server.
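
To get a feel for where the space goes before changing anything, one option is to list the layers of the image with `docker history`. This is only a sketch: the helper name `show_layers` is made up for this example, and the output will depend on your own build.

```shell
IMAGE="julienlengrand/carbon.now.sh:latest"

# Print every layer of the image with its size and the
# Dockerfile instruction that created it.
show_layers() {
  docker history "$IMAGE"
}
```

Calling `show_layers` should make it fairly obvious which instructions (base image, `yarn install`, `COPY . .`) account for most of the size.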

One of the common ways to do this in the Docker world is called multi-stage builds, and one way to use them is the builder pattern (not to be confused with the other well-known builder pattern). In short, we use a first container to build our application, then copy the result into a second one that becomes our final image.

Let's see what that looks like:

FROM node:12 AS builder

WORKDIR /app
COPY package*.json ./
RUN yarn install
COPY . .
RUN yarn build

FROM node:12

WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD [ "yarn", "start" ]

This Dockerfile contains essentially the very same lines as before except for two major differences:

  • We now split the operations over two stages (one builds, the other will run)
  • We copy the result of the build stage over to the second stage to create the final image.
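
If you want to look at the build stage on its own, `docker build` accepts a `--target` flag that stops at a named stage. The sketch below assumes the two-stage Dockerfile above; the `:builder` tag and the function names are hypothetical and only there for illustration.

```shell
# Build only the first stage (named "builder" in the Dockerfile)
# and tag it separately so it can be inspected.
build_builder_stage() {
  docker build --target builder -t julienlengrand/carbon.now.sh:builder .
}

# Build the complete image, i.e. the last stage of the Dockerfile.
build_final_image() {
  docker build -t julienlengrand/carbon.now.sh .
}
```

Running both and then `docker images` lets you compare the size of the build stage with the size of the final image.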

Just like before, we use the same commands to run and test this new version (surprisingly, it works as expected!).

The nice side effect of the multi-stage build is immediately visible. We nearly halved our final image size:

➜  carbon git:(feature/docker) docker images
REPOSITORY                                IMAGE ID            SIZE
julienlengrand/carbon.now.sh              586a65d1ee4e        1.34GB

1.34GB for a webapp that takes glorified screenshots is still way too much for me, though. Let's dive further.

Using a more efficient image

Using the official Node image has benefits, but given that it's based on a Debian system, it's also very large. The next step for us is to look at a smaller image. One of the well-known 'lighter' distros for containers is Alpine, and luckily there is a supported Node version of it called mhart/alpine-node!

This time our Dockerfile barely changes; we just replace the base image:

FROM mhart/alpine-node:12 AS builder

WORKDIR /app
COPY package*.json ./
RUN yarn install
COPY . .
RUN yarn build

FROM mhart/alpine-node:12

WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
CMD [ "yarn", "start" ]

And again, we build and run with expected success :).

Again, we cut our image size by more than half, and with this version we're just over 500MB!

➜  carbon git:(feature/docker) docker images
REPOSITORY                                IMAGE ID            SIZE
julienlengrand/carbon.now.sh              b79dbcd33de0        502MB

Removing more of the dependencies and things we don't use

We can keep trying to reduce the image size by shipping even less code to the container. Let's use npm prune for that (unfortunately, yarn decided not to offer an exact equivalent). By running npm prune --production right after building, we can get rid of all of our dev dependencies. Rebuilding the image shaves off yet another 100MB.

Here is our final Dockerfile:

FROM mhart/alpine-node:12 AS builder

WORKDIR /app
COPY package*.json ./
RUN yarn install
COPY . .
RUN yarn build
RUN npm prune --production

FROM mhart/alpine-node:12

WORKDIR /app
COPY --from=builder /app .
EXPOSE 3000
# Running the app
CMD [ "yarn", "start" ]
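
One more thing worth trying in the same spirit: since `COPY . .` copies everything in the build context, a `.dockerignore` file can keep local files out of the image. The entries below are assumptions about what sits in the local checkout, not something taken from the Carbon repo, so adapt them to your setup.

```shell
# Write a minimal .dockerignore next to the Dockerfile.
# Both entries are assumptions about the local checkout:
#   .git          repository history, not needed at runtime
#   node_modules  dependencies are installed inside the image by yarn
cat > .dockerignore <<'EOF'
.git
node_modules
EOF
```

I have not measured the effect of this on the sizes listed in this post.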

That's it for now. I'm looking for ways to shave off more megabytes, but we did reduce the size of our deployable by roughly a factor of 6 (from 2.43GB to about 400MB)! For a bit of feel-good, here is the list of images we created so we can see the progress:

List of all images and sizes
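
To put numbers on that progress, here is a small sketch that computes the reduction factor between two image sizes. The sizes are the ones from the `docker images` listings above, converted to MB; the 400 is the approximate figure after pruning, and the function name `reduction_factor` is made up for this example.

```shell
# Print how many times smaller the "after" size is than the "before" size.
# $1 = size before, $2 = size after (both in the same unit, here MB).
reduction_factor() {
  LC_ALL=C awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", before / after }'
}

reduction_factor 2430 1340   # single image -> two-stage build, prints 1.8
reduction_factor 1340 502    # node:12 -> alpine base image, prints 2.7
reduction_factor 2430 400    # first image -> pruned image, prints 6.1
```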

It still feels quite crazy to me that a simple website needs 400MB to run today. I'm sure we can do better :). But let's stop there for now, time for a well-deserved weekend!

Oh, and if you want to use Carbon locally, feel free to pull the image from Docker Hub and run it:

docker run -p 3000:3000 julienlengrand/carbon.now.sh:latest


Alright, this is a start but we're not done just yet! Let's dive further and see what more we can do. Read part 2 here!