How to reduce JVM docker image size

If you’ve been using Kotlin or Scala (like we do at Wolt) or any other JVM-based language for a while, you might have noticed that starting with Java 11, the Java Runtime Environment (JRE) no longer has a separate distribution; it is only distributed as part of the Java Development Kit (JDK). As a result, many official Docker images don’t offer a JRE-only variant, e.g. the official openjdk images and the Amazon Corretto images. In my case, using such an image resulted in an app image of 414MB, where the app itself took only around 60MB. At Wolt we strive to be efficient and sustainable, so such a waste of space shall not be tolerated.

Let’s see how to reduce Java docker image size dramatically.

The problem

Java 9 introduced the Java Platform Module System (JPMS). It lets us create our very own JRE image optimized for our needs. For example, if our app doesn’t use the HTTP client or doesn’t interact with the desktop environment, we can omit the java.net.http and java.desktop modules from the image, saving a few megabytes of space.
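To see which modules a JDK offers in the first place, you can run java --list-modules, which prints one module per line with an @version suffix. The sketch below only strips that suffix so the names can be compared or counted; module_names is a made-up helper name, not something the JDK provides.

```shell
# Strip the "@version" suffix from `java --list-modules` output,
# leaving one bare module name per line.
# `module_names` is a hypothetical helper name, not part of the JDK.
module_names() {
    sed 's/@.*$//'
}

# Usage (requires a JDK on PATH):
#   java --list-modules | module_names
```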

And starting with Java 11, the JRE doesn’t have its own separate distribution: there’s no way to install it without installing the JDK.

The reason for that is the modularity introduced in Java 9. There’s no need to distribute one JRE that fits all; instead, everyone can create a JRE image suited to their own needs.

That’s the philosophy many Docker image maintainers have adopted, dropping JRE-only images and shipping only images with the JDK.

Unfortunately, if you use such images as is, you waste space in your Docker image registry and on your local machine, as well as network bandwidth when downloading and uploading them. The JDK comes with tools, sources and documentation that you don’t need to run your app.

Let’s use this repository as an example. It contains a small app that runs a web server on port 8080 and replies with “Hello, world!” to a GET request.

Here’s what the Dockerfile would look like for a typical JDK-based image:

FROM amazoncorretto:17.0.3-alpine

# Add app user
ARG APPLICATION_USER=appuser
RUN adduser --no-create-home -u 1000 -D $APPLICATION_USER

# Configure working directory
RUN mkdir /app && \
    chown -R $APPLICATION_USER /app

USER 1000

COPY --chown=1000:1000 ./app.jar /app/app.jar
WORKDIR /app

EXPOSE 8080
ENTRYPOINT [ "java", "-jar", "/app/app.jar" ]

It uses an Amazon Corretto JDK image as a base, creates a non-root user to run the app, and then copies the jar file into the image.

Let’s build the image and check its size:

docker build -t jvm-in-docker:jdk -f jdk.dockerfile .
docker image ls | grep -e "jvm-in-docker.*jdk"

In my case this is what the output looks like:

jvm-in-docker jdk 4126e7e5ce37 51 minutes ago 341MB

So the image is 341MB. That’s pretty huge for a 7MB jar file, right? Here’s what we can do about it.

The solution

Along with modularity, Java 9 introduced a new tool called jlink. Its purpose is to build a custom JRE image optimized for your use case. It provides a few options to tune the JRE image and choose the modules to include, but there’s also a way to make it pretty generic (include all modules). First, let’s have a look at the generic example:

# base image to build a JRE
FROM amazoncorretto:17.0.3-alpine as corretto-jdk

# required for strip-debug to work
RUN apk add --no-cache binutils

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
         --verbose \
         --add-modules ALL-MODULE-PATH \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /customjre

# main app image
FROM alpine:latest
ENV JAVA_HOME=/jre
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=corretto-jdk /customjre $JAVA_HOME

# Add app user
ARG APPLICATION_USER=appuser
RUN adduser --no-create-home -u 1000 -D $APPLICATION_USER

# Configure working directory
RUN mkdir /app && \
    chown -R $APPLICATION_USER /app

USER 1000

COPY --chown=1000:1000 ./app.jar /app/app.jar
WORKDIR /app

EXPOSE 8080
ENTRYPOINT [ "/jre/bin/java", "-jar", "/app/app.jar" ]

Let’s walk through this file:

  • Here we use a multi-stage build with two stages.
  • In the first stage we use the same Amazon Corretto image.
  • We install the binutils package (jlink’s --strip-debug option needs it) and then run jlink. You can look up the options in the Oracle documentation, but the most important part for us is this line: --add-modules ALL-MODULE-PATH. It tells jlink to include all available modules in the image.
  • In the second stage we copy our custom JRE image from the first stage and apply the same configuration as in the JDK-based Dockerfile.

Now, let’s build that image and check its size:

docker build -t jvm-in-docker:jre -f jre.dockerfile .
docker image ls | grep -e "jvm-in-docker.*jre "

In my case here’s the output:

jvm-in-docker jre 15522f93ea6c 51 minutes ago 103MB

So the image size is 103MB. That’s more than 3 times smaller than before, and that’s with all modules included! Can we improve on that result? Let’s see!

When size matters

In the previous step we included all Java modules in the image. Let’s see how much smaller we can make it if we exclude the modules we don’t use.

To do that we’re going to use jdeps. It was first introduced in Java 8 and can be used to analyze the dependencies of our app. What we’re most interested in is the Java modules our app depends on. The tricky part is that not all of those modules are required by the app itself; some are required by the libraries we use. Luckily for us, jdeps can detect such dependencies as well.

In our team we use Gradle’s distribution plugin to package the app. Using jdeps in that case is pretty straightforward; just run:

./gradlew installDist # this will assemble and unpack the distribution
jdeps --print-module-deps --ignore-missing-deps --recursive --multi-release 17 --class-path="./app/build/install/app/lib/*" --module-path="./app/build/install/app/lib/*" ./app/build/install/app/lib/app.jar

Here “app” is our Gradle module name.

If you’re using a so-called fat jar (aka uber-jar), jdeps unfortunately can’t analyze the dependencies of jars nested inside jars, so you have to unpack the jar file first. Here’s how to do that:

mkdir app
cd ./app
unzip ../app.jar
cd ..
jdeps --print-module-deps --ignore-missing-deps --recursive --multi-release 17 --class-path="./app/BOOT-INF/lib/*" --module-path="./app/BOOT-INF/lib/*" ./app.jar
rm -Rf ./app

As you can see, first we unpack the jar and then run jdeps with a few arguments (the BOOT-INF/lib path assumes a Spring Boot fat jar layout). You can read more about the arguments in the Oracle documentation, but in short, jdeps will print a list of module dependencies. It should look like this:

java.base,java.management,java.naming,java.net.http,java.security.jgss,java.security.sasl,java.sql,jdk.httpserver,jdk.unsupported

NOTE: there seems to be a bug in jdeps version 17.x.x causing a com.sun.tools.jdeps.MultiReleaseException. If you’re getting this exception, try installing jdeps from JDK 18.

Now we need to take that list and replace ALL-MODULE-PATH with it in the Dockerfile from the previous step. Like this:

# base image to build a JRE
FROM amazoncorretto:17.0.3-alpine as corretto-jdk

# required for strip-debug to work
RUN apk add --no-cache binutils

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
    --verbose \
    --add-modules java.base,java.management,java.naming,java.net.http,java.security.jgss,java.security.sasl,java.sql,jdk.httpserver,jdk.unsupported \
    --strip-debug \
    --no-man-pages \
    --no-header-files \
    --compress=2 \
    --output /customjre

# main app image
FROM alpine:latest
ENV JAVA_HOME=/jre
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=corretto-jdk /customjre $JAVA_HOME

# Add app user
ARG APPLICATION_USER=appuser
RUN adduser --no-create-home -u 1000 -D $APPLICATION_USER

# Configure working directory
RUN mkdir /app && \
    chown -R $APPLICATION_USER /app

USER 1000

COPY --chown=1000:1000 ./app.jar /app/app.jar
WORKDIR /app

EXPOSE 8080
ENTRYPOINT [ "/jre/bin/java", "-jar", "/app/app.jar" ]

Let’s build that image and check its size:

docker build -t jvm-in-docker:jre-slim -f jre-slim.dockerfile .
docker image ls | grep -e "jvm-in-docker.*jre-slim"

Here’s what I got:

jvm-in-docker jre-slim c8513c84b324 58 minutes ago 55.1MB

I.e. the image is only 55MB. It’s 6 times smaller than the original image! Pretty impressive result!
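If you want to double-check the ratios, they follow directly from the sizes docker image ls reported for the three images (341MB, 103MB and 55.1MB). Here is a tiny sketch of that arithmetic; shrink_factor is just an illustrative name.

```shell
# Print how many times smaller the second size is than the first,
# rounded to one decimal place. Both sizes must use the same unit.
# `shrink_factor` is a hypothetical helper name.
shrink_factor() {
    awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", before / after }'
}

shrink_factor 341 103    # JDK image vs. generic JRE image, prints 3.3
shrink_factor 341 55.1   # JDK image vs. slim JRE image, prints 6.2
```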

But there’s a catch. If your app is under active development, at some point you might add a dependency on a library that needs a Java module not included in the image. In that case you’d have to analyze the dependencies again to build a working image. Ideally, that could even be automated, but it’s up to you to decide whether it’s worth the hassle. A JRE image with all modules included can be reused by multiple projects, which saves some space in the image registry, while highly use-case-specific images can only be used by a single project.
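One cheap safeguard against that catch is to compare the modules jdeps reports with the ones baked into the image, for example as a CI step. Here is a minimal sketch of such a check. The name missing_modules is hypothetical, and both arguments are assumed to be comma-separated module lists like the one jdeps prints.

```shell
# Print every module from the first list that is absent from the second.
# $1: modules the app needs, e.g. output of `jdeps --print-module-deps`
# $2: modules passed to `jlink --add-modules` when the image was built
# `missing_modules` is a hypothetical helper name.
missing_modules() {
    for module in $(printf '%s' "$1" | tr ',' ' '); do
        case ",$2," in
            *",$module,"*) ;;                 # already in the image
            *) printf '%s\n' "$module" ;;     # needs to be added
        esac
    done
}

# Usage: an empty result means the image still covers the app.
#   missing_modules "$(jdeps --print-module-deps ...)" "java.base,java.sql"
```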

If you’re interested in automating the build of small use-case-specific JRE images, you can find an example in a similar article on my personal blog.
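To give a rough idea of the simplest building block of such automation (this is not taken from that article, just a sketch), the manual “replace ALL-MODULE-PATH with the jdeps list” step can be scripted with sed. The helper name with_modules is hypothetical, and the file names in the usage comment are assumed from the build commands shown earlier.

```shell
# Replace the ALL-MODULE-PATH placeholder in a Dockerfile with a
# comma-separated module list and write the result to stdout.
# $1: module list, $2: path to the generic Dockerfile.
# Module names only contain letters, digits, dots and commas, so they
# are safe to use in the sed replacement as is.
# `with_modules` is a hypothetical helper name.
with_modules() {
    modules="$1"
    dockerfile="$2"
    sed "s/ALL-MODULE-PATH/${modules}/" "$dockerfile"
}

# Usage, assuming the file names used in this article:
#   with_modules "java.base,java.sql" jre.dockerfile > jre-slim.dockerfile
```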

Summary

As you can see, with just a little effort we can make the image at least 3 times smaller.

You have 2 options:

  • build a generic JRE image that includes all modules and can be used universally by any app;
  • build a use-case-specific JRE image that will take less space, but will be less universal.

It’s up to you to decide which path suits you best, but any option will be a win in comparison to the default JDK image. Thanks to Docker images being organized in layers, multiple images based on the default JDK image shouldn’t take too much space, as all app images will be reusing the same base. But even in that case having smaller images might save you some bandwidth.

In our project at Wolt we decided to go with the more generic option so that the layer containing the JRE image can be reused by other projects as well.

[Figure: image size comparison]

That’s it. Now you know how to reduce JVM docker image size.

Docker files from the examples are available here: monosoul/jvm-in-docker.

Bonus

At Wolt we also include a couple of private CA certificates into the JRE’s certificate store. Here’s how to do that using the Dockerfile from above:

# base image to build a JRE
FROM amazoncorretto:17.0.3-alpine as corretto-jdk

# required for strip-debug to work
RUN apk add --no-cache binutils

# Build small JRE image
RUN $JAVA_HOME/bin/jlink \
         --verbose \
         --add-modules ALL-MODULE-PATH \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /customjre

# main app image
FROM alpine:latest
ENV JAVA_HOME=/jre
ENV PATH="${JAVA_HOME}/bin:${PATH}"

# copy JRE from the base image
COPY --from=corretto-jdk /customjre $JAVA_HOME

# Add an extra root CA certificate
ADD https://example.com/extra-ca.pem $JAVA_HOME/lib/security/extra-ca.pem
RUN echo "<sha256 sum of the certificate>  $JAVA_HOME/lib/security/extra-ca.pem" | sha256sum -c - && \
    cd $JAVA_HOME/lib/security && \
    keytool -cacerts -storepass changeit -noprompt -trustcacerts -importcert -alias extra-ca -file extra-ca.pem

# Add app user
ARG APPLICATION_USER=appuser
RUN adduser --no-create-home -u 1000 -D $APPLICATION_USER

# Configure working directory
RUN mkdir /app && \
    chown -R $APPLICATION_USER /app

USER 1000

COPY --chown=1000:1000 ./app.jar /app/app.jar
WORKDIR /app

EXPOSE 8080
ENTRYPOINT [ "/jre/bin/java", "-jar", "/app/app.jar" ]

Let’s go through what happens after the “# Add an extra root CA certificate” comment:

  • (the ADD instruction) First we download the certificate and put it into a directory in the image.
  • (the sha256sum -c part of the RUN instruction) Then we check the certificate’s checksum to make sure it hasn’t been tampered with.
  • (the keytool part of the RUN instruction) After that we import the certificate into the JRE’s certificate store.
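The checksum step is easy to try out locally before putting it into a Dockerfile. Here is a small sketch of the same check wrapped in a function; verify_sha256 is a hypothetical name, and it assumes sha256sum is available (it is on Alpine and most Linux distributions).

```shell
# Succeed only if the file's SHA-256 digest equals the expected value.
# $1: expected hex digest, $2: path to the file.
# `verify_sha256` is a hypothetical helper name.
verify_sha256() {
    expected="$1"
    file="$2"
    # sha256sum -c reads "<digest>  <path>" lines (note the two spaces)
    echo "${expected}  ${file}" | sha256sum -c - >/dev/null 2>&1
}

# Usage:
#   verify_sha256 "<sha256 sum of the certificate>" extra-ca.pem && echo "certificate OK"
```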

Simple! Happy hacking!
