What are containers anyway?
The official Docker resources site says:
A container is a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another.
You can imagine containers as some sort of virtualization. But what’s the difference?
When talking about virtualization, you may think of hypervisors. However, there are multiple ways of implementing virtualization. Given the topic of this post, I’m going to focus on “hypervisors” and “containers”:
- Hardware-based virtualization: the virtualization of computers as complete hardware platforms, certain logical abstractions of their components, or only the functionality required to run various operating systems.
- Operating system-based virtualization: an operating system paradigm in which the kernel allows the existence of multiple isolated user-space instances. Such instances go by different names: containers (LXC, Docker), zones (Solaris containers), virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels (DragonFly BSD), or jails (FreeBSD jail or chroot jail).
Note that virtualization can be classified in multiple ways, but this one best illustrates what containers are.
Now that we understand the differences: hypervisors make sense when we want to virtualize a full computer, with all the advantages and the resource overhead that implies. Containers, on the other hand, are much more lightweight: they store little more than a file structure, and shared layers are stored only once. Furthermore, since containers share the kernel with your OS, you only have one kernel to manage. Finally, don’t forget the speed difference between running an “isolated process” and a fully virtualized computer.
Since containers allow applications to be deployed, patched, and scaled more rapidly, they are increasingly used to accelerate development, test, and production cycles.
Something to keep in mind is that container images need to be built for a specific target; for instance, you can’t run Linux containers on Windows.
Actually, there is Docker Desktop, which uses WSL2 or Hyper-V to create a VM and connect its kernel to Docker, but if it weren’t for that VM it wouldn’t work.
How to create and download images
An image is a read-only template that contains a set of instructions for creating a container. For now, let’s say that each instruction generates a new layer with its modifications.
Images are created using Dockerfiles. Here is an example of a simple one:
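The original snippet is not reproduced here, but a minimal Dockerfile reconstructed from the build output shown further below would look like this:

```dockerfile
# Base image: the small Alpine Linux distribution
FROM alpine
# Each instruction creates a new layer; this one adds an empty file
RUN ["touch", "/new"]
# Default command executed when a container starts from this image
CMD ["echo", "Hey!"]
```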
We are going to dive deeper into container images later.
You can build your image using `docker build .`; you can also specify a different folder.
Here is the result:

```
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM alpine
latest: Pulling from library/alpine
Digest: sha256:21a3deaa0d32a8057914f36584b5288d2e5ecc984380bc0118285c70fa8c9300
Status: Downloaded newer image for alpine:latest
 ---> c059bfaa849c
Step 2/3 : RUN ["touch", "/new"]
 ---> Using cache
 ---> 6b28fc1e2272
Step 3/3 : CMD ["echo", "Hey!"]
 ---> Using cache
 ---> cd86016de0a5
Successfully built cd86016de0a5
```
You can see how it generates a layer in each step. There are two interesting details to note in the output:
- If the given image is not available locally (which was the case), it is downloaded from the registry. For now, you can imagine the registry as a giant database that contains images with all their releases.
- It didn’t download the `alpine` image, it used `alpine:latest`. The container image format may contain optional information such as the registry or the tag; if you don’t specify the release, it defaults to `latest`.
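As an illustration of the reference format (the daemon must be running for these commands; the tags are just examples), a fully qualified image reference spells out registry, repository and tag:

```shell
# <registry>/<repository>:<tag>
docker pull docker.io/library/alpine:latest

# Equivalent short form: registry and tag are optional, "latest" is assumed
docker pull alpine
```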
If for some reason you want to download an image from the registry without using it immediately, you can use `docker pull`. Note that other commands, like `docker build` or `docker run`, also pull images when they are not found locally.
Creating a container is actually simple: just run `docker run <image>`. Depending on the image, that command by itself may not do much.
Here are some useful options:
- `-p <host port>:<container port>`: Maps a container port to the host so it is accessible from the outside.
- `-it`: Actually a combination of `--interactive` and `--tty`. It gives you access to an interactive shell.
- `-d`: Runs the container in detached mode.
- `--rm`: Removes the container automatically after it exits.
- `-v /path/host:/path/container`: Bind mounts a local folder inside the container.
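Putting a few of these options together (a sketch; the image, port and paths are just examples, and a running Docker daemon is assumed):

```shell
# Interactive throwaway Alpine shell, with a port mapping and a bind mount
docker run -it --rm -p 8080:80 -v /tmp/data:/data alpine sh
```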
By default, Docker uses a built-in bridge network; containers within that network can only communicate with each other using IP addresses.
If you want containers to communicate using names, you have to create a bridge network and attach the containers to it:
```shell
docker network create --driver bridge alpine-net
docker run -dit --name alpine1 --network alpine-net alpine ash
docker run -dit --name alpine2 --network alpine-net alpine ash
```
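On a user-defined bridge network, containers resolve each other by name through Docker’s embedded DNS. For example, assuming the two containers above are running:

```shell
# alpine1 can reach alpine2 by its container name
docker exec alpine1 ping -c 2 alpine2
```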
You can also use the `--link` option, but it’s deprecated, so I do not recommend it.
Here is a diagram of how the bridge driver works:
There are other networking drivers available:
- `host`: The container shares the network stack with the host.
- `overlay`: Allows multi-host networks (used in Docker Swarm environments).
- `macvlan`: Allows you to assign a MAC address to a container, making it appear as a physical device on your network. The Docker daemon routes traffic to containers by their MAC addresses.
- `none`: Disables all networking.
This may vary between container implementations; for instance, Podman supports
How to handle storage
Containers are meant to be stateless, so state shouldn’t live inside the container. Otherwise, when the container gets removed, all that data gets erased and there is no way to get it back. There are 3 ways of managing storage:
- Volumes
  - Managed by Docker (stored under a directory controlled by Docker on the host).
  - Other processes shouldn’t modify data in that folder.
  - Volumes may have a name.
  - The best solution to persist data for containers.
- Bind mounts
  - Any folder on the host can be mounted anywhere in the container.
  - Other processes could modify files inside the mount.
  - Containers could modify files on the host filesystem.
- tmpfs mounts
  - Don’t write to disk; data is stored in memory.
  - Provide higher performance.
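A tmpfs mount can be added like this (a sketch; the target path is an example, and a Docker daemon on Linux is assumed):

```shell
# /cache lives in memory only and disappears when the container stops
docker run --rm --mount type=tmpfs,target=/cache alpine df -h /cache
```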
The lifecycle of a volume is completely independent from the container’s. You can add a volume to a container (which also creates it if it doesn’t exist):
```shell
docker run --mount type=volume,source=<volume name>,target=<container path> ...
```
Volumes can also be managed with the `docker volume` subcommands (`create`, `ls`, `inspect`, `rm`).
Multiple nodes with Docker-Compose
We already know how to connect containers. Nevertheless, there is a better approach: using `docker-compose`. With Compose, you use a YAML file to configure multi-container services, including volumes, networks and containers. Then, with a single command, you create and start all the services from your configuration. Additionally, these configuration files are almost equivalent to Docker Swarm stack files.
`docker compose` may not come installed with older Docker versions; check the documentation on how to install it.
The most important top-level objects are services, volumes and networks. By default, Compose creates a network for the stack, so you can access services by name.
Almost all Docker parameters have an equivalent with similar naming. Here is an example setting up WordPress:
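The original file is not reproduced here; a minimal sketch of such a Compose file, loosely based on the official WordPress quickstart (image tags, passwords and volume names are placeholders), might look like this:

```yaml
services:
  db:
    image: mariadb
    environment:
      MYSQL_ROOT_PASSWORD: example      # placeholder, use a secret in production
      MYSQL_DATABASE: wordpress
    volumes:
      - db_data:/var/lib/mysql          # named volume so data survives the container

  wordpress:
    image: wordpress
    ports:
      - "8080:80"                       # equivalent of docker run -p 8080:80
    environment:
      WORDPRESS_DB_HOST: db             # the service name resolves via the stack network
      WORDPRESS_DB_USER: root
      WORDPRESS_DB_PASSWORD: example
      WORDPRESS_DB_NAME: wordpress
    depends_on:
      - db

volumes:
  db_data:
```

Note how the database is reached simply as `db`: that is the per-stack network described above doing name resolution.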
Here are some of the best practices when creating Docker images:
Create ephemeral containers: a container should be able to be stopped and destroyed, then rebuilt and replaced with an absolute minimum of setup and configuration. For instance, any data that needs to persist must be stored in a stateful backing service, typically a database.
Switch to a non-root user: if a service can run without privileges, use `USER` to change to a non-root user. Start by creating the user and group in the Dockerfile.
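A sketch of what that might look like in a Dockerfile (the user and group names are examples; the `addgroup`/`adduser` flags are the BusyBox-style ones found in Alpine images):

```dockerfile
FROM alpine
# Create an unprivileged system group and user (names are examples)
RUN addgroup -S app && adduser -S -G app app
# Drop privileges for every instruction and process that follows
USER app
CMD ["sh"]
```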
Do not leak sensitive info into Docker images: even if it’s only in an intermediate layer, it can be recovered.
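For example, with BuildKit you can mount a secret during a single build step so it is never written to any layer (a sketch; the secret id and file are examples):

```dockerfile
# syntax=docker/dockerfile:1
FROM alpine
# The secret is only visible during this RUN step, under /run/secrets/<id>,
# and leaves no trace in the resulting image layers
RUN --mount=type=secret,id=api_token \
    sh -c 'TOKEN=$(cat /run/secrets/api_token) && echo "token loaded"'
```

You would then build with something like `docker build --secret id=api_token,src=token.txt .`.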