Not every container has an operating system inside

...but every one of them needs your Linux kernel.

The majority of Docker examples out there explicitly or implicitly rely on some flavor of the Linux operating system running inside a container. I tried to quickly compile a list of the most prominent samples:

Running an interactive shell in the debian jessie distribution:

$ docker run -it debian:jessie

Running an nginx web server in a container and examining its config using the cat utility:

$ docker run -d -P --name nginx nginx:latest
$ docker exec -it nginx cat /etc/nginx/nginx.conf

Building an image based on Alpine Linux:

$ cat <<EOF > Dockerfile
FROM alpine:3.7
RUN apk add --no-cache mysql-client
ENTRYPOINT ["mysql"]
EOF

$ docker build -t mysql-alpine .
$ docker run mysql-alpine

And so forth and so on...

For newcomers learning containerization through hands-on experience, this may create the false impression that containers are indistinguishable from full-fledged operating systems and that they are always based on well-known, widespread Linux distributions like debian, centos, or alpine.

At the same time, approaching containerization from the theoretical side (1, 2, 3) may lead to the opposite impression: that containers (unlike traditional virtual machines) are supposed to pack only the application (i.e. your code) and its dependencies (i.e. some libraries) instead of trying to ship a full operating system.

As usual, the truth lies somewhere in between. From the implementation standpoint, a container is indeed just a process (or a bunch of processes) running on the Linux host. The container process is isolated from the rest of the system (namespaces) and restricted in terms of both resource consumption (cgroups) and security (capabilities, AppArmor, Seccomp). But in the end, it is still a regular process, same as any other process on the host system.
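To make the cgroups part a bit more tangible, here is a minimal Go sketch (not part of the original walkthrough) that prints the cgroup membership of a process by reading /proc/<pid>/cgroup. It assumes a Linux host with procfs mounted; the cgroupEntry type and the parseCgroupLine helper are names made up for this illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// cgroupEntry mirrors one line of /proc/<pid>/cgroup, which has the form
// hierarchy-ID:controller-list:cgroup-path (hypothetical type name).
type cgroupEntry struct {
	Hierarchy   string
	Controllers string // empty on cgroup v2
	Path        string
}

// parseCgroupLine splits a single line into its three colon-separated fields.
func parseCgroupLine(line string) (cgroupEntry, error) {
	parts := strings.SplitN(strings.TrimSpace(line), ":", 3)
	if len(parts) != 3 {
		return cgroupEntry{}, fmt.Errorf("malformed cgroup line: %q", line)
	}
	return cgroupEntry{Hierarchy: parts[0], Controllers: parts[1], Path: parts[2]}, nil
}

func main() {
	// Inspect the current process by default, or the PID given as the first argument.
	pid := "self"
	if len(os.Args) > 1 {
		pid = os.Args[1]
	}

	f, err := os.Open("/proc/" + pid + "/cgroup")
	if err != nil {
		fmt.Println("cannot read cgroup info (is this a Linux host?):", err)
		return
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		e, err := parseCgroupLine(sc.Text())
		if err != nil {
			continue // skip lines that do not match the expected format
		}
		fmt.Printf("hierarchy=%s controllers=%q path=%s\n", e.Hierarchy, e.Controllers, e.Path)
	}
}
```

Passing the host PID of a containerized process (for example, the one reported by docker inspect --format '{{.State.Pid}}' for a running container) should show a cgroup path that differs from the one of your login shell. The exact path layout depends on the cgroup version and driver in use.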

Just run docker run -d nginx and conduct your own investigation:

ps axf output (excerpt)

systemctl status output (excerpt)

sudo lsns output
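If lsns is not at hand, a similar comparison can be sketched in Go by reading the symlinks under /proc/<pid>/ns. This is only an illustration under the assumption of a Linux host with procfs; parseNSLink and readNamespaces are hypothetical helper names, and reading another user's process usually requires root.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strconv"
	"strings"
)

// parseNSLink parses a namespace symlink target such as "pid:[4026531836]"
// into the namespace type and its inode number.
func parseNSLink(target string) (string, uint64, error) {
	kind, rest, ok := strings.Cut(target, ":[")
	if !ok || !strings.HasSuffix(rest, "]") {
		return "", 0, fmt.Errorf("unexpected namespace link: %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(rest, "]"), 10, 64)
	if err != nil {
		return "", 0, err
	}
	return kind, inode, nil
}

// readNamespaces maps every entry of /proc/<pid>/ns to its namespace inode.
func readNamespaces(pid string) (map[string]uint64, error) {
	dir := filepath.Join("/proc", pid, "ns")
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	result := make(map[string]uint64)
	for _, e := range entries {
		target, err := os.Readlink(filepath.Join(dir, e.Name()))
		if err != nil {
			continue // typically a permission problem
		}
		if _, inode, err := parseNSLink(target); err == nil {
			result[e.Name()] = inode
		}
	}
	return result, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: nsdiff <pid>")
		return
	}
	self, err := readNamespaces("self")
	if err != nil {
		fmt.Println("cannot read own namespaces:", err)
		return
	}
	other, err := readNamespaces(os.Args[1])
	if err != nil {
		fmt.Println("cannot read target namespaces:", err)
		return
	}

	names := make([]string, 0, len(other))
	for name := range other {
		names = append(names, name)
	}
	sort.Strings(names)

	// Two processes share a namespace when the inode numbers match.
	for _, name := range names {
		status := "same as this process"
		if self[name] != other[name] {
			status = "different"
		}
		fmt.Printf("%-20s %d (%s)\n", name, other[name], status)
	}
}
```

Pointing it at the host PID of the nginx process should report several namespaces as different, while an ordinary host process would typically share all of them with your shell.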

Well, if a container is just a regular Linux process, we could try to run a single executable file inside of a container. That is, instead of putting our application into a fully-featured Linux distribution, we will try to build a container image consisting of a folder with a single file inside. At launch, this folder becomes the root directory of the containerized environment.

If you have Go installed on your system, you can utilize its handy cross-compilation abilities:

// main.go
package main

import "fmt"

func main() {
    fmt.Println("Hello from OS-less container (Go edition)")
}

Build the program from above using:

$ GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o hello
$ file hello
> hello: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked

Alternatively, you can use Docker to compile a similar C program:

// main.c
#include <stdio.h>

int main(void) {
    printf("Hello from OS-less container (C edition)\n");
    return 0;
}

Compile it using the following builder container:

# Dockerfile.builder
FROM gcc:4.9
COPY main.c /main.c
CMD ["gcc", "-std=c99", "-static", "-o", "/out/hello", "/main.c"]

$ docker build -t builder -f Dockerfile.builder .
$ docker run -v `pwd`:/out builder
$ file hello
> hello: ELF 64-bit LSB executable, x86-64, version 1 (GNU/Linux), statically linked, for GNU/Linux 2.6.32

Finally, let's build the target container using the following trivial Dockerfile:

FROM scratch
COPY hello /
CMD ["/hello"]

$ docker build -t hello .
$ docker run hello
> Hello from OS-less container (Go edition)

If we now inspect the hello image with the wonderful dive tool, we will notice that it consists of a directory with a single executable file in it:

dive hello

This exercise is roughly what the Docker hello-world example does. There are two key points here. First, we based our image on the so-called scratch image. It is just an empty image, i.e. the build starts from an empty folder and simply copies the executable file hello into it. Second, we used a statically linked binary, i.e. it has no dependency on shared libraries from the system. So a bare Linux kernel is enough to execute it.
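Whether a binary really is statically linked can also be checked programmatically. The following Go sketch uses the standard debug/elf package and treats a file as static when it has no PT_INTERP program header (no dynamic loader requested) and lists no needed shared libraries. The function name isStaticELF and the default file name hello are assumptions made for this example; it is a rough heuristic, not a replacement for file or ldd.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// isStaticELF reports whether the ELF file at path can be started by the
// kernel alone: no PT_INTERP segment and no imported shared libraries.
func isStaticELF(path string) (bool, error) {
	f, err := elf.Open(path)
	if err != nil {
		return false, err // missing file or not an ELF binary
	}
	defer f.Close()

	// A PT_INTERP segment names the dynamic loader, e.g. ld-linux-x86-64.so.2.
	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			return false, nil
		}
	}

	// DT_NEEDED entries list the shared libraries the binary depends on.
	libs, err := f.ImportedLibraries()
	if err != nil {
		return false, err
	}
	return len(libs) == 0, nil
}

func main() {
	path := "hello" // the binary built earlier; override via the first argument
	if len(os.Args) > 1 {
		path = os.Args[1]
	}

	static, err := isStaticELF(path)
	if err != nil {
		fmt.Println("cannot inspect binary:", err)
		return
	}
	if static {
		fmt.Println(path, "looks statically linked; it should run in a scratch image")
	} else {
		fmt.Println(path, "needs a dynamic loader or shared libraries in the image")
	}
}
```

A dynamically linked binary copied into a scratch image would fail to start, because neither the loader nor the libraries it asks for exist in the otherwise empty root filesystem.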

Now, what if we inspect the nginx image which we used at the beginning of this article?

dive nginx

Well, the directory tree looks like the root filesystem of some Linux distribution. If we take a look at the corresponding Dockerfile, we can see that the nginx image is based on debian:

FROM debian:buster-slim

LABEL maintainer="NGINX Docker Maintainers <docker-maint@nginx.com>"

ENV NGINX_VERSION   1.17.10
ENV NJS_VERSION     0.3.9
ENV PKG_RELEASE     1~buster

...

And if we dig deeper and examine the debian:buster-slim Dockerfile, we will see that it just copies a root filesystem into an empty folder:

FROM scratch
ADD rootfs.tar.xz /
CMD ["bash"]

By combining Debian's user-land with the host's kernel, containers start to resemble fully-featured operating systems. With the nginx image, we can use a shell to interact with the container:

Interactive shell with running nginx container.

Can we do the same for our slim hello container? Obviously not: there is no bash executable inside.
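The difference between the two images boils down to which files exist in their root filesystems. As a rough illustration, the Go sketch below takes a list of file paths (such a list could be obtained, for instance, from docker export piped into tar -t) and reports which commonly used shell locations are present. The knownShells list and the findShells helper are assumptions made for this example, and the two sample listings in main are shortened, hand-written stand-ins rather than real image contents.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// knownShells lists locations where a shell usually lives.
// The list is illustrative, not exhaustive.
var knownShells = []string{"bin/sh", "bin/bash", "bin/ash", "usr/bin/sh", "usr/bin/bash"}

// findShells returns the known shell paths that occur in a file listing.
// Paths may be written as "bin/sh", "./bin/sh", or "/bin/sh".
func findShells(files []string) []string {
	present := make(map[string]bool, len(files))
	for _, f := range files {
		present[strings.TrimPrefix(path.Clean(f), "/")] = true
	}

	var found []string
	for _, sh := range knownShells {
		if present[sh] {
			found = append(found, "/"+sh)
		}
	}
	return found
}

func main() {
	// Hand-written sample listings, not actual image dumps.
	helloImage := []string{"hello"}
	debianLikeImage := []string{"bin/bash", "bin/sh", "etc/nginx/nginx.conf", "usr/sbin/nginx"}

	fmt.Println("shells in hello image:      ", findShells(helloImage))
	fmt.Println("shells in debian-like image:", findShells(debianLikeImage))
}
```

With an empty result there is nothing for docker exec to launch as an interactive shell, which is exactly the situation of the scratch-based hello image.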

So, what should be the conclusion here? The virtualization capabilities of containers turned out to be so powerful that people started packing fully-featured user-lands like debian (or more lightweight alternatives like alpine or busybox) into containers. By virtue of this ability:

  • We can play with various Linux distributions using a simple docker run -it fedora bash.
  • We can use OS commands including package managers like yum or apt while building our images.
  • We can interact with running containers using various OS utilities.

But with great power comes great responsibility. Huge containers carrying lots of unnecessary tools slow down deployments and increase the attack surface.

Make code, not war!
