At the OS level, a Docker image is just a directory tree — a collection of files and folders that looks like a Linux root filesystem (/bin, /lib, /etc, /usr, etc.). Nothing magic about it.
What makes it a Docker image is how that directory tree is structured, stored, and composed.
Layers — Tarballs Stacked on Top of Each Other
Each layer in an image is a tar archive (usually gzip-compressed for distribution) that records the filesystem changes relative to the layer below it. When Docker builds or pulls an image, it:
- Unpacks each layer tarball in order
- Applies them as overlays — later layers can add, modify, or delete files from earlier ones
- Presents the result as a single merged filesystem view
The kernel feature doing this merging is OverlayFS (overlay filesystem). It presents stacked layers as one unified directory without copying any files.
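To make the "unpack in order, later layers win" rule concrete, here is a minimal Python sketch that applies layer tarballs to an in-memory file map. It is not how Docker itself is implemented. It assumes the OCI image-format convention that a file named `.wh.<name>` (a "whiteout") in a later layer deletes `<name>` from earlier layers. The helper names `make_layer` and `apply_layers` and the sample paths are made up for illustration, and directory whiteouts are left out.

```python
import io
import tarfile

WHITEOUT_PREFIX = ".wh."  # OCI image-format marker meaning "delete this path"


def make_layer(files):
    """Build an in-memory tar archive from a {path: bytes} mapping."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def apply_layers(layers):
    """Apply layer tarballs oldest-first and return the merged {path: bytes} view."""
    merged = {}
    for blob in layers:
        with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
            for member in tar:
                if not member.isfile():
                    continue
                directory, _, name = member.name.rpartition("/")
                if name.startswith(WHITEOUT_PREFIX):
                    # A whiteout entry removes the named file from earlier layers.
                    target = name[len(WHITEOUT_PREFIX):]
                    victim = f"{directory}/{target}" if directory else target
                    merged.pop(victim, None)
                else:
                    # A later layer adds the file or replaces the earlier copy.
                    merged[member.name] = tar.extractfile(member).read()
    return merged


# Hypothetical two-layer image: the second layer modifies, deletes, and adds.
base = make_layer({"etc/motd": b"hello\n", "bin/tool": b"v1"})
patch = make_layer({"bin/tool": b"v2", "etc/.wh.motd": b"", "etc/extra": b"new"})
merged = apply_layers([base, patch])
print(sorted(merged))  # ['bin/tool', 'etc/extra']
```

The real merge is done lazily by OverlayFS at mount time rather than by building one flat copy; the sketch only shows the ordering and deletion semantics.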
OverlayFS — How the Kernel Sees It
When a container runs, Docker asks the kernel to mount an OverlayFS instance built from three directories:
lower dirs (read-only) → the image layers, stacked
upper dir (read-write) → a fresh empty layer just for this container
merged dir → what the container process actually sees
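These three directories map directly onto the options of an overlay mount. The sketch below only builds the option string and mounts nothing. The paths are hypothetical, and `workdir` is an extra scratch directory that OverlayFS requires next to the upper dir, which the list above does not mention.

```python
def overlay_mount_options(lower_dirs_top_first, upper_dir, work_dir):
    """Return the -o option string for `mount -t overlay`."""
    # In lowerdir, the leftmost directory is the topmost (most recent) layer.
    lower = ":".join(lower_dirs_top_first)
    return f"lowerdir={lower},upperdir={upper_dir},workdir={work_dir}"


# Hypothetical paths, for illustration only.
opts = overlay_mount_options(
    ["/layers/app", "/layers/base"], "/ctr/upper", "/ctr/work"
)
print(f"mount -t overlay overlay -o {opts} /ctr/merged")
```

Running the printed command for real needs root and existing directories; Docker does the equivalent for each container it starts.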
Reads resolve through the stack from top to bottom: the upper dir first, then the image layers from the most recent down to the base. Writes go to the upper layer only. The image is never modified.
This is copy-on-write (CoW): a file is only copied to the upper layer the moment it’s first written to. That’s why:
- Containers start fast: nothing is copied, Docker only creates a new empty upper layer
- Ten containers from the same image share the image’s disk blocks on the host
- Removing a container discards the upper layer and everything written inside it (a container that is only stopped keeps its upper layer until it is removed)
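The read path and the copy-on-write behaviour can be modelled with a few dictionaries. This is a toy model, not the kernel's implementation: the class name `OverlayView` and the sample paths are invented, and real OverlayFS works on files and directories, not Python dicts.

```python
class OverlayView:
    """Toy model of one container's merged view over shared image layers."""

    def __init__(self, lowers_top_first):
        self.lowers = lowers_top_first  # shared image layers, never mutated
        self.upper = {}                 # private writable layer for this container

    def read(self, path):
        # Look in the upper layer first, then walk the image layers top-down.
        if path in self.upper:
            return self.upper[path]
        for layer in self.lowers:
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # Copy-up: the first modification copies the file into the upper layer.
        if path not in self.upper:
            self.upper[path] = self.read(path)
        self.upper[path] += data


# Hypothetical image with two layers, topmost first.
image = [
    {"/app/config": b"debug=0\n"},
    {"/etc/hostname": b"base\n", "/app/config": b"old\n"},
]
a = OverlayView(image)
b = OverlayView(image)  # a second container sharing the same layers

a.append("/app/config", b"trace=1\n")
print(a.read("/app/config"))  # b'debug=0\ntrace=1\n'
print(b.read("/app/config"))  # b'debug=0\n'
```

Container `a` now holds a private copy of one file, while `b` and the image layers are unchanged. Dropping `a.upper` corresponds to removing the container.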
What the “OS Image” Actually Is
When you write FROM ubuntu:22.04, you’re not getting a kernel. You’re getting Ubuntu’s userspace files: apt, bash, libc, coreutils, and so on. The container shares the host’s kernel. There’s no guest OS, no bootloader, no hardware emulation, just a process running in its own set of namespaces and pointed at a different root filesystem.
This is the fundamental difference from a VM. See containers vs virtual machines.
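One way to see the split between image userspace and host kernel is to compare what the root filesystem says about itself with what the kernel reports. The sketch below assumes a Linux system with the standard `/etc/os-release` file; the function names are illustrative.

```python
import platform


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip("\"'")
    return fields


def describe_environment(os_release_path="/etc/os-release"):
    """Return (userspace name from the root filesystem, kernel release)."""
    try:
        with open(os_release_path, encoding="utf-8") as fh:
            userspace = parse_os_release(fh.read()).get("PRETTY_NAME", "unknown")
    except OSError:
        userspace = "unknown"  # file missing, e.g. on a non-Linux system
    return userspace, platform.release()


print(describe_environment())
```

Run inside a container, the first value comes from the image's files and the second from the host's kernel, so the two can name different distributions and versions.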
Content-Addressable Storage
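Docker identifies each layer by a SHA-256 digest of its content, which makes the layer store content-addressable. With the overlay2 storage driver, the unpacked layers live under /var/lib/docker/overlay2/. The directory names there are internal IDs that Docker's metadata maps to the layer digests, not the content hashes themselves. If two different images share an identical layer (e.g. both FROM python:3.12-slim), Docker stores that layer once and both images’ layer stacks reference it. Disk is not duplicated.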
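The deduplication follows directly from keying storage by content hash. Here is a minimal sketch of the idea, with a hypothetical `LayerStore` class and placeholder byte strings standing in for layer tarballs; Docker's actual on-disk layout is more involved.

```python
import hashlib


class LayerStore:
    """Toy content-addressable store: blobs are keyed by their SHA-256 digest."""

    def __init__(self):
        self.blobs = {}

    def put(self, data):
        digest = "sha256:" + hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self.blobs.setdefault(digest, data)
        return digest


store = LayerStore()
base = b"placeholder bytes for a shared base layer"

# Two hypothetical images that share the same base layer.
image_a = [store.put(base), store.put(b"app A files")]
image_b = [store.put(base), store.put(b"app B files")]

print(image_a[0] == image_b[0])  # True: both images reference the same blob
print(len(store.blobs))          # 3 blobs stored for 4 layer references
```

An image in this model is just an ordered list of digests, which is why sharing a layer costs nothing beyond one more reference.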
docker system df # see how much disk images are using
docker image prune # remove dangling images (untagged, unreferenced)
docker image prune -a # remove all unused images
One sentence: a Docker image is a stack of read-only filesystem tarballs presented as a single merged view by OverlayFS; a container adds a writable upper layer on top without touching the image beneath.