Docker storage

Microservices are all about stateless and ephemeral workloads, and containers are a great fit for microservices. This may suggest that Docker is all about ephemeral storage. In fact, Docker supports both non-persistent and persistent storage, the latter being essential for stateful services such as databases and Kafka.

Non-persistent storage is created automatically alongside the container and is tied to the container's lifecycle. On Linux systems it lives under /var/lib/docker/ as part of the container. This is referred to as local storage.

Docker has the concept of a volume, which is essentially a file or a directory. Volumes are for persistent data: they are decoupled from containers and are not tied to the lifecycle of any container. A volume allows a process in a Docker container to bypass the default union file system (UnionFS) and store files or directories directly on the host machine. It also allows different containers to share data. You may mount a volume into a container; even if the container is deleted, the volume persists.
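As a minimal sketch (the container and volume names below are arbitrary examples):

```shell
# Create a named volume and mount it into a container
docker volume create logdata
docker run -d --name app1 --mount source=logdata,target=/var/log/app nginx

# Removing the container does not remove the volume or its data
docker rm -f app1
docker volume ls   # logdata is still listed
```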

By default, Docker creates new volumes with the built-in local driver. Local volumes are only available to containers on the node they’re created on. There are also third-party drivers, delivered as plugins, that provide advanced options to integrate external storage systems (NAS, SAN, etc.) with Docker.

There are more than 25 volume plugins, which you can specify with the -d (--driver) switch, covering all three categories of storage:

  • Block storage tends to be high performance and good for small-block random access workloads.
  • File storage is high performance and can be shared among multiple containers over NFS or SMB protocols.
  • Object storage is good for long term storage of large data blobs that do not change frequently. It is often content addressable and relatively low performance.

Note that if you share a volume among multiple containers, the application needs to guard against data collisions from concurrent writes.

You may use the docker volume create command to create a volume. Note that there is no quota management within Docker, so capacity needs to be managed at the operating-system level, e.g. with a dedicated partition.
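The basic volume lifecycle looks like this (the volume name is an example):

```shell
# Create a volume with the default local driver
docker volume create myvolume

# List and inspect volumes
docker volume ls
docker volume inspect myvolume

# Remove the volume once no container uses it
docker volume rm myvolume

# Remove all unused volumes
docker volume prune
```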

Implementation of Volume

Remember that a Docker image is built on a multi-layer file system. When we run a container, Docker places a read-write layer on top of the image, so that the active files in the running container all live in this read-write layer. When the container is deleted, so are those files. The file system in Docker is a pseudo file system implemented with UnionFS. Volumes bypass the UnionFS and directly access the host file system.

When we create a Docker volume, Docker places the volume data under /var/lib/docker/volumes and, inside each directory named after a volume, creates a _data directory, which is what gets mounted into the corresponding container.
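You can confirm this layout with docker volume inspect (volume name is an example):

```shell
docker volume create mydata

# Show where the volume's data lives on the host
docker volume inspect --format '{{ .Mountpoint }}' mydata
# Typically: /var/lib/docker/volumes/mydata/_data
```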

You can even mount an NFS volume into a container. Reference here.
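A sketch using the built-in local driver's NFS support (the server address and export path below are made-up examples):

```shell
# Create a volume backed by an NFS export
docker volume create --driver local \
  --opt type=nfs \
  --opt o=addr=192.168.1.100,rw \
  --opt device=:/exports/data \
  nfsvolume

# Mount it like any other volume
docker run -d --name web --mount source=nfsvolume,target=/data nginx
```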

We have mentioned UnionFS a couple of times so far. UnionFS is a lightweight, layered file system. It can mount the contents of multiple directories onto the same mount point to form a single file system, which the user can then use like an ordinary directory. It is the foundation of Docker images and containers and enables significant space savings.

Figure: Container and UnionFS (Enqueue Zero)

There are three common storage drivers built on this layering idea: AUFS, Devicemapper, and OverlayFS. (Strictly speaking, Devicemapper is block-based rather than a true union file system, as discussed below.)

AUFS file system

AUFS is the earliest storage driver that Docker used, and is most common on Ubuntu and Debian. To check whether your system supports AUFS, see the documentation here.
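A quick way to check is to look for aufs in the kernel's registered file systems:

```shell
# If the running kernel supports AUFS, aufs appears in /proc/filesystems
grep aufs /proc/filesystems
```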

AUFS is recommended on Ubuntu and Debian. On CentOS and Red Hat it needs to be installed separately, and you should verify that the kernel actually supports it (aufs should appear in /proc/filesystems). To configure AUFS, create the file /etc/docker/daemon.json and add:

{
  "storage-driver":"aufs"
}

Then restart the Docker service. Run docker info and examine the Storage Driver section, as documented here.

AUFS layers multiple directories on a single Linux host and presents them as a single directory. These directories are called branches in AUFS terminology, and layers in Docker terminology. The unification process is referred to as a union mount.

Figure: Layers of an Ubuntu container

This section of the documentation describes how the layers work, and this section describes how AUFS reads and writes files using a copy-on-write (CoW) strategy to maximize storage efficiency and minimize overhead. CoW is the defining characteristic of AUFS.

AUFS has never been accepted into the mainline Linux kernel because of maintainability concerns. So on CentOS, the recommended storage driver is devicemapper.

Devicemapper file system

Devicemapper is a kernel framework, available since Linux 2.6.9, that maps physical block devices onto virtual block devices. It is therefore fundamentally different from AUFS. The Logical Volume Manager (LVM) in Linux is also built on devicemapper.

The three critical components in devicemapper are:

  • mapped device: a virtual block device that devicemapper exposes to clients
  • target device: the underlying physical device, or a section of it
  • map table: records the offset, range, etc. that relate mapped devices to target devices

Devicemapper uses target drivers to block, filter, and forward I/O requests (e.g. RAID, encryption, thin provisioning). With thin provisioning, the storage driver only allocates space as it is actually needed; Docker combines thin provisioning with snapshots. This part of the documentation provides further details as to how devicemapper works.
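You can inspect these mappings on the host with the dmsetup tool (requires root):

```shell
# List device-mapper mapped devices on the host
dmsetup ls

# Print each device's map table: start sector, length, target type
# (linear, thin, crypt, ...) and the underlying target device
dmsetup table
```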

Figure: ubuntu and busybox image layers

Devicemapper has two modes:

  • loop-lvm: for development and test environments
  • direct-lvm: recommended for production

The performance best practices are documented here. To configure devicemapper, create the /etc/docker/daemon.json file and add:

{
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.directlvm_device=/dev/xdf",
    "dm.thinp_percent=95",
    "dm.thinp_metapercent=1",
    "dm.thinp_autoextend_threshold=80",
    "dm.thinp_autoextend_percent=20",
    "dm.directlvm_device_force=false"
  ]
}

Then restart the Docker service. Run docker info and examine the Storage Driver section to ensure direct-lvm mode is on.

Since devicemapper stores data on block devices, it works below the file-system level rather than on top of one. It was the default storage driver on Red Hat and CentOS for a long time, delivering stable performance there.

OverlayFS file system

Earlier versions of the OverlayFS driver (known as overlay) were not stable. The later version, known as overlay2, is very stable and is the recommended driver. It requires:

  1. Docker version 17.06.02 or higher;
  2. Kernel version 3.10.0-514 or higher for CentOS and RHEL, or 4.0 or higher for other Linux distributions;
  3. A backing xfs file system with d_type support turned on.

In production environments, it is recommended to mount /var/lib/docker on a separate disk or partition, so that the directory filling up cannot impact the host OS. The pquota mount option is recommended in /etc/fstab.
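A hypothetical /etc/fstab entry for such a dedicated partition might look like this (/dev/sdb1 is an example device name):

```
# Dedicated xfs partition for Docker, mounted with project quotas enabled
/dev/sdb1  /var/lib/docker  xfs  defaults,pquota  0 0
```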

To configure the storage driver, create the file /etc/docker/daemon.json with the following content:

{
  "storage-driver":"overlay2",
  "storage-opts":[
    "overlay2.size=20G",
    "overlay2.override_kernel_check=true"
  ]
}

Then restart the Docker service. Run docker info and examine the Storage Driver section to ensure the storage driver is overlay2 and d_type is true.
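These checks can be scripted roughly as follows (assuming /var/lib/docker sits on xfs; d_type support corresponds to ftype=1):

```shell
# Verify the backing xfs file system has d_type support (look for ftype=1)
xfs_info /var/lib/docker | grep ftype

# Verify the active storage driver and its options
docker info | grep -A 7 'Storage Driver'
```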

The way overlay2 works is similar to AUFS: it performs a union mount that combines a lowerdir and an upperdir into a single merged view. More details are here, including how overlay2 handles file reads and writes (e.g. copy-on-write).
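The union mount can be reproduced by hand to see the lowerdir/upperdir/merged relationship (requires root and a kernel with overlayfs; all paths below are throwaway examples):

```shell
# Set up the overlay directories
mkdir -p /tmp/ovl/{lower,upper,work,merged}
echo "from lower" > /tmp/ovl/lower/a.txt

# Union-mount lower and upper into merged
mount -t overlay overlay \
  -o lowerdir=/tmp/ovl/lower,upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work \
  /tmp/ovl/merged

# Writing through the merged view triggers copy-up into upperdir,
# while the original file in lowerdir stays untouched
echo "changed" > /tmp/ovl/merged/a.txt
cat /tmp/ovl/upper/a.txt
cat /tmp/ovl/lower/a.txt

umount /tmp/ovl/merged
```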

Today, the overlay2 driver is officially recommended by Docker for its stability and performance; it should be used whenever the conditions above are met.