
Docker about to announce Docker Model Runner

A big new step in the history of Docker: running AI large language models locally on your laptop

Docker is about to launch Docker Model Runner (DMR), which allows users to run AI large language models on their local system. Although it has not been officially announced yet, several preview users have shared their first impressions, like Nigel Poulton (author of The Kubernetes Book) did on LinkedIn. Meanwhile, a DMR landing page has appeared on the Docker website.

During KubeCon / CloudNativeCon Europe 2025 (this week in London), Docker will introduce DMR at one of the side events.

Big change

This is definitely a big change for Docker. Docker is the oldest runtime for OCI containers: after its release in 2013 it exploded in popularity, and containerization became the next big thing in compute. It was the only runtime available in the early versions of Kubernetes. In Kubernetes version 1.24, however, native Docker support was dropped from the codebase, and most of the community moved to containerd as its new favourite runtime for Kubernetes.

Meanwhile, Docker moved further onto developer workstations with Docker Desktop. This product has become very popular for running code locally in containers, whether as a local test environment for apps under development or for running containerized tools.

In February, Docker got a new CEO: Don Johnson (no, not that guy from Miami Vice), who came over from Oracle. Now DMR is about to see the light of day, probably first in Docker Desktop for Apple Silicon. Yes, Docker is on the move again.

Not a container

DMR will probably be introduced in Docker Desktop 4.40 for Mac. That would be a smart choice, since competitor Ollama has shown some issues supporting Apple Silicon GPUs. Docker promises to start with support for Apple Silicon and NVIDIA GPUs, which could be a game changer, since developers love to use MacBooks.

If you think these models run like any other container in Docker, you're wrong. Docker introduces a new runtime that lets models run natively on the system, so not inside a container! So far, a small set of supported models has been mentioned, including Llama, Gemma and Mistral. Although they don't run as containers, models are still stored as OCI artifacts, allowing you to use your existing container registry to store and cache models locally.

New subcommand and API

To use DMR from the command line, a new subcommand will be introduced: docker model. It lets you pull, run and manage large language models, much like handling docker images, so it is easy to pick up. Using docker model pull <model> you can download an LLM to your system. Once downloaded, you can start it with docker model run <model>. There are also docker model list and docker model rm to keep things organized.
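As a rough sketch, a typical session might look like this (the model name and exact behaviour are assumptions, since DMR is not generally available yet):

    docker model pull ai/llama3          # download a model (hypothetical name) to the local store
    docker model list                    # show which models are available locally
    docker model run ai/llama3 "Hello!"  # send a one-off prompt to the model
    docker model rm ai/llama3            # remove the model when you no longer need it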

Docker will also provide an API to access models programmatically. The API is compatible with the OpenAI API, which allows developers to migrate their code by simply changing the endpoint to localhost. This is another difference from Ollama, which implemented its own API.
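Because the API is OpenAI-compatible, existing code based on the openai Python package should only need a different base URL. A minimal sketch, assuming a local endpoint (the port, path and model name here are placeholders, as Docker has not published the final details yet):

    from openai import OpenAI

    # Point the standard OpenAI client at the local DMR endpoint instead of api.openai.com.
    # Port and path are assumptions; check the DMR documentation for the real endpoint.
    client = OpenAI(base_url="http://localhost:12434/v1", api_key="not-needed-locally")

    response = client.chat.completions.create(
        model="ai/llama3",  # hypothetical model name, pulled earlier with docker model pull
        messages=[{"role": "user", "content": "Explain OCI artifacts in one sentence."}],
    )
    print(response.choices[0].message.content)

The appeal of this design is that nothing else in your application changes: the same client library, request shape and response parsing work against both the cloud and your laptop.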

What about Ollama?

So what does this mean for Ollama? One of the good things about free and open source software is that you can always choose the tool or technology that best fits your purpose, so there will be a place for both solutions. For now, DMR is still in development, while Ollama is already quite mature. Ollama has also built an impressive library of LLMs, while DMR is still working on this. Personally, I would stick with Ollama for now, but keep an eye on how DMR evolves. I expect Docker Model Runner will need another 6 to 12 months to become a serious competitor.

More information

Qstars IT will attend KubeCon / CloudNativeCon Europe 2025, and in April we will publish the highlights on this blog. Meanwhile, you can always contact us to help you on your journey into containerization: to prepare your code for containers, build CI pipelines for automation, design and build your hosting platform, or develop your next cloud-ready Python application.
