At the OS level, a Docker image is just a directory tree — a collection of files and folders that looks like a Linux root filesystem (/bin, /lib, /etc, /usr, etc.). Nothing magic about it.

What makes it a Docker image is how that directory tree is structured, stored, and composed.


Layers — Tarballs Stacked on Top of Each Other

Each layer in an image is a tar archive of filesystem changes relative to the layer below it, usually gzip-compressed when pushed to or pulled from a registry. When Docker builds or pulls an image, it:

  1. Unpacks each layer tarball in order
  2. Applies them as overlays: later layers can add, modify, or delete files from earlier ones (a deletion is recorded as a whiteout marker that hides the file below)
  3. Presents the result as a single merged filesystem view

The kernel feature doing this merging is OverlayFS (overlay filesystem). It presents stacked layers as one unified directory without copying any files.
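The ordering rule in the steps above can be approximated with plain tar, without OverlayFS. This is a minimal sketch, not what Docker actually runs: the layer contents and file names are made up for illustration, and it assumes the `.wh.<name>` whiteout convention from the OCI image layer format to mark a deletion.

```shell
#!/bin/sh
# Sketch: apply two hypothetical layer tarballs in order into one directory.
set -eu
work=$(mktemp -d)
mkdir -p "$work/l1/etc" "$work/l2/etc" "$work/rootfs"

# Layer 1 (base): two config files.
echo "v1" > "$work/l1/etc/app.conf"
echo "tmp" > "$work/l1/etc/old.conf"

# Layer 2: modifies app.conf and deletes old.conf via a whiteout marker.
echo "v2" > "$work/l2/etc/app.conf"
: > "$work/l2/etc/.wh.old.conf"

tar -C "$work/l1" -cf "$work/layer1.tar" .
tar -C "$work/l2" -cf "$work/layer2.tar" .

# Unpack in order; later layers overwrite earlier files.
for layer in "$work/layer1.tar" "$work/layer2.tar"; do
  tar -C "$work/rootfs" -xf "$layer"
  # Collect whiteout markers first, then remove each marker and its target.
  find "$work/rootfs" -name '.wh.*' -print > "$work/whiteouts.txt"
  while IFS= read -r marker; do
    dir=$(dirname "$marker")
    name=$(basename "$marker")
    rm -rf "$dir/${name#.wh.}" "$marker"
  done < "$work/whiteouts.txt"
done

cat "$work/rootfs/etc/app.conf"   # content from layer 2
ls "$work/rootfs/etc"             # old.conf is gone
```

Unlike this sketch, OverlayFS does not flatten the layers into one directory. It keeps each layer separate and merges them at lookup time.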


OverlayFS — How the Kernel Sees It

When a container runs, Docker mounts an OverlayFS filesystem assembled from three kinds of directory:

lower dirs  (read-only) → the image layers, stacked
upper dir   (read-write) → a fresh empty layer just for this container
merged dir               → what the container process actually sees

Reads resolve through the stack from top to bottom: the upper dir first, then the image layers from the most recent down to the base. Writes go to the upper dir only, so the image is never modified.
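The lookup order can be imitated in user space with ordinary directories. This is a rough sketch: the directory names and the `resolve` helper are hypothetical, and the real merge happens inside the kernel. The comment shows the shape of a real OverlayFS mount, which needs root.

```shell
#!/bin/sh
# Sketch: emulate OverlayFS read resolution with plain directories.
# A real mount (root required) looks roughly like this; the leftmost lowerdir
# is the topmost layer, and workdir is scratch space OverlayFS requires:
#   mount -t overlay overlay \
#     -o lowerdir=/lower2:/lower1,upperdir=/upper,workdir=/work /merged
set -eu
work=$(mktemp -d)
mkdir -p "$work/upper" "$work/lower2" "$work/lower1"

echo "from base layer" > "$work/lower1/motd"
echo "base hosts" > "$work/lower1/hosts"
echo "from top image layer" > "$work/lower2/motd"

# Search the upper dir first, then image layers from topmost to base.
resolve() {
  for dir in "$work/upper" "$work/lower2" "$work/lower1"; do
    if [ -e "$dir/$1" ]; then
      printf '%s\n' "$dir/$1"
      return 0
    fi
  done
  return 1
}

first=$(cat "$(resolve motd)")        # top image layer wins over the base
echo "container edit" > "$work/upper/motd"
second=$(cat "$(resolve motd)")       # upper dir now shadows both layers

echo "$first"
echo "$second"
```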

This is copy-on-write (CoW): a file is only copied to the upper layer the moment it’s first written to. That’s why:

  • Containers start quickly: nothing is copied up front, Docker just creates a new empty upper layer
  • Ten containers from the same image share the image’s disk blocks on the host
  • Removing a container (docker rm) discards the upper layer and everything written inside it; a stopped container keeps its upper layer until it is removed
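Copy-up on first write can be sketched the same way. The `cow_append` helper below is a made-up illustration of the idea, not how OverlayFS is implemented: the file is copied from the read-only lower directory to the upper directory only when it is first modified, and the lower copy is left alone.

```shell
#!/bin/sh
# Sketch: copy-on-write with a read-only "lower" and a writable "upper" dir.
set -eu
work=$(mktemp -d)
mkdir -p "$work/lower" "$work/upper"
echo "original" > "$work/lower/config"

# cow_append <relative-path> <text>: copy up on first write, then modify.
cow_append() {
  if [ ! -e "$work/upper/$1" ] && [ -e "$work/lower/$1" ]; then
    cp -p "$work/lower/$1" "$work/upper/$1"
  fi
  printf '%s\n' "$2" >> "$work/upper/$1"
}

cow_append config "changed by container"

cat "$work/lower/config"   # image layer is untouched
cat "$work/upper/config"   # copied-up file plus the new line

# "Removing the container" amounts to discarding the upper dir.
```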

What the “OS Image” Actually Is

When you write FROM ubuntu:22.04, you’re not getting a kernel. You’re getting Ubuntu’s userspace files: apt, bash, libc, coreutils, and so on. The container shares the host’s kernel. There’s no guest OS, no bootloader, no hardware emulation, just a process running in its own set of namespaces (with cgroups limiting its resources), pointed at a different root filesystem.
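One way to check the shared kernel is to compare `uname -r` on the host and inside a container. This sketch assumes a Linux host with a running Docker daemon and access to the ubuntu:22.04 image, and it skips the comparison otherwise. On Docker Desktop for macOS or Windows the "host" kernel the container sees is that of Docker's Linux VM, so the two values will differ.

```shell
#!/bin/sh
# Sketch: the container reports the host's kernel, but its own userspace.
set -eu
host_kernel=$(uname -r)
echo "host kernel:      $host_kernel"

if command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1; then
  # Same kernel release as the host on a native Linux install.
  container_kernel=$(docker run --rm ubuntu:22.04 uname -r) \
    || container_kernel="unavailable"
  echo "container kernel: $container_kernel"
  # The distribution name comes from the image's files, not from the kernel.
  docker run --rm ubuntu:22.04 grep PRETTY_NAME /etc/os-release || true
else
  echo "docker not available; skipping container comparison"
fi
```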

This is the fundamental difference from a VM. See containers vs virtual machines.


Content-Addressable Storage

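Docker identifies each layer by the SHA-256 digest of its content, which makes layer storage content-addressable. With the overlay2 storage driver, the unpacked layer data lives under /var/lib/docker/overlay2/; the directory names there are internal IDs rather than content hashes, and Docker's layer metadata maps each digest to its directory. If two different images share an identical layer (e.g. both are built FROM python:3.12-slim), Docker stores that layer once and references it from both images' layer stacks. The data is not duplicated on disk.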
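The deduplication idea is easy to sketch with a toy store keyed by SHA-256. The `put` and `hash_file` helpers and the file names are hypothetical, and this is not Docker's on-disk layout. It only shows why identical content ends up stored once.

```shell
#!/bin/sh
# Sketch: a toy content-addressable store; blobs are named by their digest.
set -eu
store=$(mktemp -d)
src=$(mktemp -d)

printf 'layer bytes' > "$src/a.tar"
printf 'layer bytes' > "$src/b.tar"   # identical content, different name

# Print the SHA-256 hex digest of a file (sha256sum, or shasum as fallback).
hash_file() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# put <file>: store the file under its digest unless already present.
put() {
  digest=$(hash_file "$1")
  [ -e "$store/$digest" ] || cp "$1" "$store/$digest"
  printf '%s\n' "$digest"
}

id_a=$(put "$src/a.tar")
id_b=$(put "$src/b.tar")
count=$(ls "$store" | wc -l)

echo "a -> $id_a"
echo "b -> $id_b"
echo "blobs stored: $count"
```

Both inputs map to the same digest, so the second `put` finds the blob already present and stores nothing new.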

docker system df          # show disk used by images, containers, volumes, and build cache
docker image prune        # remove dangling images (untagged, unreferenced)
docker image prune -a     # remove all images not used by any container

In one sentence: a Docker image is a stack of read-only filesystem layers, shipped as tarballs and presented as a single merged view by OverlayFS; a container adds a writable upper layer on top without touching the image beneath.


See Also