
Going Distroless

Distroless images are very small images. They contain just your application and its runtime dependencies. What are the benefits? Let’s dive in!

You probably know this saying:

There is no such thing as “cloud” - The cloud is just somebody else’s computer.

And I must admit that’s exactly what I thought when I first heard about distroless images. But what is distroless and what are the pros and cons? Let’s dive in!

# What is distroless

“Distroless” images contain only your application and its runtime dependencies. They do not contain package managers, shells or any other programs you would expect to find in a standard Linux distribution.

In other words, distroless images are very small because they leave out extras like shells and package managers, which make up most of what you would typically find in a Linux distribution.

That being said, distroless images are still built on a distribution: the image names used in this post, such as static-debian11, show that Google’s images are based on Debian. In the end, there’s no such thing as no distro.

# Why distroless

When I first heard of distroless, I was investigating the high vulnerability count in one of my projects. The client uses multiple tools to check for issues like code smells, missing tests and vulnerabilities in container images.

I reached out to the team that deploys these tools to see what could be done. Since we were using the official Python images, most vulnerabilities didn’t affect our code or application, but they still blocked our CI/CD pipelines. The team introduced me to distroless images, something that was fairly new to me. These would probably solve most of our issues, but there’s got to be a catch, right?

Google, one of the contributors to distroless, says the following about distroless:

Restricting what’s in your runtime container to precisely what’s necessary for your app is a best practice employed by Google and other tech giants that have used containers in production for many years. It improves the signal to noise of scanners (e.g. CVE) and reduces the burden of establishing provenance to just what you need.

Distroless images are very small. The smallest distroless image, gcr.io/distroless/static-debian11, is around 2 MiB. That’s less than half the size of alpine (~5 MiB), and less than 2% of the size of debian (124 MiB).
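
The ratios are quick to check with a few lines of Python. This is only a sketch using the rounded sizes quoted above, so the results are approximate too; the dictionary keys are labels chosen for this illustration.

```python
# Approximate image sizes in MiB, as quoted in the text above.
SIZES_MIB = {"distroless-static": 2, "alpine": 5, "debian": 124}


def percent_of(small, large):
    """Return `small` as a percentage of `large`."""
    return 100 * small / large


alpine_pct = percent_of(SIZES_MIB["distroless-static"], SIZES_MIB["alpine"])
debian_pct = percent_of(SIZES_MIB["distroless-static"], SIZES_MIB["debian"])

print(f"distroless static vs alpine: {alpine_pct:.1f}%")
print(f"distroless static vs debian: {debian_pct:.1f}%")
```

With these rounded inputs the static image comes out at well under half of alpine and under 2% of debian.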

# Examples

Time to check out some distroless examples. The tests below were run locally. After building each image, Trivy is used to run a security scan.

# Rust (Rocket)

Let’s start with benchmarking a normal Rust 1.77 image. It’s a basic Rocket application with just a single endpoint. The base image used is rust:1.77, and the OS is fully updated. This gives the following results:

rust-fat
==================
Total: 1056 (UNKNOWN: 5, LOW: 501, MEDIUM: 438, HIGH: 107, CRITICAL: 5)

Let’s build the same application, but now with a distroless image:

rust-distroless
=============================
Total: 21 (UNKNOWN: 0, LOW: 13, MEDIUM: 7, HIGH: 1, CRITICAL: 0)
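
Summary lines like these are easy to compare by eye, but a small script helps once there are more images to track. The following is a minimal sketch and not part of the original tests: it parses the two summary lines above with a regular expression and prints the counts per severity. The function name `parse_trivy_summary` is made up for this illustration, and the line format is assumed to be exactly what is shown above.

```python
import re

# Matches e.g. "Total: 21 (UNKNOWN: 0, LOW: 13, ...)".
SUMMARY_RE = re.compile(r"Total: (\d+) \((.*)\)")


def parse_trivy_summary(line):
    """Parse a summary line into (total, {severity: count})."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    total = int(match.group(1))
    counts = {}
    for part in match.group(2).split(","):
        severity, count = part.split(":")
        counts[severity.strip()] = int(count)
    return total, counts


fat_total, fat = parse_trivy_summary(
    "Total: 1056 (UNKNOWN: 5, LOW: 501, MEDIUM: 438, HIGH: 107, CRITICAL: 5)"
)
slim_total, slim = parse_trivy_summary(
    "Total: 21 (UNKNOWN: 0, LOW: 13, MEDIUM: 7, HIGH: 1, CRITICAL: 0)"
)

# Print one row per severity: count in rust-fat -> count in rust-distroless.
for severity in fat:
    print(f"{severity:<8} {fat[severity]:>4} -> {slim[severity]:>3}")
```

In a real pipeline, Trivy’s JSON output (`--format json`) is probably a sturdier thing to parse than the human-readable summary.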

Not zero, but definitely better. Let’s compare the image sizes:

| Name | Size |
| --- | --- |
| rust-fat | 2.45 GB |
| rust-distroless | 39.1 MB |

That’s quite a difference! Let’s see some more examples.

# Java (Micronaut)

A simple Maven image:

mvn-fat
======================
Total: 75 (UNKNOWN: 0, LOW: 55, MEDIUM: 20, HIGH: 0, CRITICAL: 0)

Versus a distroless image:

mvn-distroless
============================
Total: 24 (UNKNOWN: 0, LOW: 18, MEDIUM: 2, HIGH: 3, CRITICAL: 1)

| Name | Size |
| --- | --- |
| mvn-fat | 414 MB |
| mvn-distroless | 236 MB |

# Python (Flask)

And then a bit of a tricky one. So far we’ve used languages that compile or package their dependencies into the executable they ship. Python works differently, so it needs a few more steps.

Let’s start with a simple Python API (Flask with Gunicorn):

python-fat (debian 12.5)
==================
Total: 124 (UNKNOWN: 0, LOW: 71, MEDIUM: 39, HIGH: 13, CRITICAL: 1)

And now the distroless image:

python-distroless (debian 12.5)
==================
Total: 70 (UNKNOWN: 0, LOW: 23, MEDIUM: 30, HIGH: 16, CRITICAL: 1)

That’s definitely fewer, but still quite a lot (and the critical one is still present). To build this image, we disable the creation of a virtual environment and copy the entire /usr/local/lib/python3.11/site-packages directory over to the distroless image, which is how the dependencies get “installed” there. We also need to set the PYTHONPATH variable explicitly, to tell Python where to look for packages and modules.

Lastly, sizes. The resulting image is about half the size of the original.
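
The role of PYTHONPATH can be tried out locally, without any container. The sketch below is an illustration under stated assumptions, not the build used for the images above: it writes a throwaway module (the hypothetical `fake_dependency`) into a temporary directory that stands in for the copied site-packages folder, then tries to import it from a fresh interpreter with and without PYTHONPATH set.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_child(code, pythonpath=None):
    """Run `code` in a fresh interpreter, optionally with PYTHONPATH set."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # start from a clean slate
    if pythonpath is not None:
        env["PYTHONPATH"] = pythonpath
    return subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
    )


with tempfile.TemporaryDirectory() as site_packages:
    # Hypothetical dependency, standing in for the copied packages.
    Path(site_packages, "fake_dependency.py").write_text("VALUE = 42\n")
    code = "import fake_dependency; print(fake_dependency.VALUE)"
    without = run_child(code)
    with_path = run_child(code, pythonpath=site_packages)

print("without PYTHONPATH: exit code", without.returncode)
print("with PYTHONPATH:    output", with_path.stdout.strip())
```

Without the variable the import fails because the directory is not on the module search path; with it, the interpreter finds the module. The same mechanism lets the distroless image find the packages that were copied over.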

| Name | Size |
| --- | --- |
| python-fat | 259 MB |
| python-distroless | 134 MB |
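
To put the three examples side by side, the scan totals and image sizes can be turned into percentage reductions. This is a small sketch using only the numbers reported above. It assumes that 2.45GB means 2450 MB (decimal units); the structure and names are chosen for this illustration.

```python
# (scan total, image size in MB) per image, copied from the results above.
# Assumption: 2.45GB is treated as 2450 MB (decimal units).
RESULTS = {
    "rust": {"fat": (1056, 2450.0), "distroless": (21, 39.1)},
    "mvn": {"fat": (75, 414.0), "distroless": (24, 236.0)},
    "python": {"fat": (124, 259.0), "distroless": (70, 134.0)},
}


def reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100 * (before - after) / before


for name, images in RESULTS.items():
    fat_total, fat_mb = images["fat"]
    slim_total, slim_mb = images["distroless"]
    print(
        f"{name:<7} findings -{reduction(fat_total, slim_total):.0f}%"
        f"  size -{reduction(fat_mb, slim_mb):.0f}%"
    )
```

Totals hide the severity breakdown, so a lower total does not by itself mean fewer high or critical findings; the per-severity counts in the scan output remain the thing to check.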

# Conclusion

Distroless images can greatly reduce image size. Because they drop a lot of OS-level tooling that is often not needed, they can also remove a good number of vulnerabilities.

However, removing so much from the image also makes it harder to debug. Since there is no shell, you can’t exec into the container to inspect its contents or troubleshoot a process. Furthermore, if you need an OS-level dependency installed, things get considerably harder: you can’t run sudo apt install somepackage, since there’s no package manager installed either.

Distroless images are not suitable for every piece of software or every workload. They might require a bit more configuration and maintenance. But, as with many things in IT, they are a great tool for the right job.

Google’s distroless images are built with Bazel. If you’d like to apply more customizations or automate the process of creating images, I highly encourage you to check out the docs.

In a future blog post we’ll discuss other images that can be used to create safe and clean Docker images.
