James's Ramblings

Docker

Created: March 20, 2020

Fundamental commands

docker --version

docker info Detailed information on the Docker install.

docker run IMAGE[:TAG] [COMMAND] Run a container. TAG is usually a version number. Also run an optional COMMAND within the container.

  -d                    Detached mode. (Run in the background.)
  --name NAME           Specify a name for the container. Otherwise the name is random.
  --restart OPTION      Restart a container when it stops, based on conditions.
                        OPTION: {no|on-failure|always|unless-stopped}
                        on-failure: non-zero exit codes.
                        unless-stopped: like always but excludes manual stops.
  -p HOST_PORT:CONTAINER_PORT
                        Expose a port from the container as a port on the host.
  --rm                  Automatically delete the container when it stops.
  --memory NUMBER       Put a hard limit on container memory usage. Allows for swap usage.
                        NUMBER units: {b|k|m|g}
  --memory-reservation NUMBER
                        Put a soft limit on memory usage. Only activates when memory is running low.
                        NUMBER units: {b|k|m|g}
  -v SOURCE:DESTINATION[:OPTIONS]
                        Mount a volume or bind mount. See the Docker Volumes section for full information.

  --mount \                    
  [type={bind|volume|tmpfs}],\
  source={VOLUME_NAME|PATH},\
  destination=PATH,\
  [RO]
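
As a sketch, several of these flags combined in one docker run (the image, name, and ports are illustrative):

```
# Run nginx detached, with a name, a restart policy, a memory cap,
# and container port 80 published on host port 8080.
docker run -d \
  --name web \
  --restart unless-stopped \
  --memory 512m \
  -p 8080:80 \
  nginx:latest
```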

docker service

  --mount \                    
  [type={bind|volume|tmpfs}],\
  source={VOLUME_NAME|PATH},\
  destination=PATH,\
  [RO]

docker ps List running containers.

-a List all containers.

docker container

  stop {NAME|ID}   Stop a container.
  start {NAME|ID}  Start a container.
  rm {NAME|ID}     Delete a container. Must be stopped first.

docker inspect {OBJECT_ID|NAME} Get JSON config for an object. Containers, images, and services are (some) objects.

--pretty Make the output more like YAML. Images and services only.

--format='{{.<FIELD_NAME>}}' Retrieve a specific field value by itself.

docker volume

  create VOLUME_NAME   Create a volume.
  ls                   List volumes.
  rm VOLUME_NAME       Delete a volume.
  inspect VOLUME_NAME  Detailed volume information in JSON format.
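
A typical volume lifecycle using the subcommands above (the names mydata and db are illustrative):

```
docker volume create mydata
docker volume ls
docker volume inspect mydata

# Mount it into a container, then clean up.
docker run -d --name db -v mydata:/var/lib/mysql mysql:5.7
docker rm -f db
docker volume rm mydata
```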

Docker Volumes

https://docs.docker.com/engine/reference/commandline/
service_create/#add-bind-mounts-volumes-or-memory-filesystems

Bind mounts:

  • Mount a specific path on the host machine to the container. Not portable.

Volumes:

  • Docker Engine manages the storage of data on the host’s filesystem. Portable.
  • Volumes can be mounted to multiple containers.

Using Docker volumes:

  • Both bind mounts and volumes can be used with either --mount or -v.
  • Only --mount can be used with the service command.

--mount:

--mount [type={bind|volume|tmpfs}],\
source={VOLUME_NAME|PATH},\
destination=PATH,\
[RO]

  type                    Optional because the type can be extrapolated from the source.
                          bind (mount), volume, or tmpfs (temporary in-memory storage).
  source                  Volume name or bind-mount path.
  destination|target|dst  Path to mount inside the container.
  readonly|ro             Make the mount read-only.

-v:

-v SOURCE:DESTINATION[:OPTIONS]

  SOURCE       Use a name for volumes and a path for bind mounts.
  DESTINATION  Path to mount inside the container.
  OPTIONS      Comma-separated list of options. For example, ro for read-only.
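
The two forms express the same mount. A sketch of equivalent read-only bind mounts (the paths and the image name myapp are illustrative):

```
# -v form:
docker run -d -v /srv/config:/etc/app:ro myapp

# --mount form:
docker run -d --mount type=bind,source=/srv/config,destination=/etc/app,readonly myapp
```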

Volume drivers and storage in clusters

https://docs.docker.com/engine/extend/legacy_plugins/

Options:

  • Application logic to store data in external object storage.
  • A volume driver that is external to any specific machine in the cluster.

Simple example with sshfs:

  • To install, on all nodes in the cluster: docker plugin install --grant-all-permissions vieux/sshfs

  • Create a service with the mount option:
    --mount volume-driver=vieux/sshfs,\
    source=VOLUME_NAME,\
    destination=PATH,\
    volume-opt=sshcmd=USERNAME@HOST:DEST_PATH,\
    volume-opt=password=PASSWORD
    
  • Don’t create the volume on one node then create the service. The options don’t get propagated to other nodes if the volume driver is specified in a docker create command.

Docker images and Dockerfiles

To create a new image:

  • Create a Dockerfile within a new directory.
  • Populate the Dockerfile with build instructions.
  • docker build -t NEW_IMAGE_NAME PATH_TO_DIRECTORY

Dockerfile Syntax

https://docs.docker.com/engine/reference/builder/

Basic Dockerfile syntax:

```
# Single-line comment.
FROM {scratch|IMAGE_NAME[:TAG]} [AS STAGE_NAME]
ENV IDENTIFIER STRING
RUN COMMAND
CMD ["COMMAND","ARG1","ARG2",...,"ARGN"]
```

FROM Starts a new build stage and sets the base image. Usually must be the first directive in the Dockerfile. However, the ARG directive can be placed before FROM. scratch: from nothing.

ENV Set environment variables. These variables can be referenced in Dockerfile itself. Referenced using $IDENTIFIER.

RUN Run a command on top of the previous layers and commit the changes as a new layer.

CMD A default command that is used if no other is given in docker run. Ex: CMD ["nginx","-g","daemon off;"].

More Dockerfile syntax:

COPY LOCAL_PATH TARGET_PATH Copy files from the local machine to the image. Paths can be relative to WORKDIR, if that directive is set.

--from={NAME|INDEX} Copy files from a previous build stage. NAME is the STAGE_NAME of a FROM section. INDEX is the FROM section index (zero-based).

EXPOSE PORT Expose a PORT. For documentation purposes only.

ADD SOURCE TARGET_PATH Like COPY but more features. SOURCE is an archive, file path, or URL.

WORKDIR PATH Set PATH as the current working directory. Subsequent directives will respect the working directory. Can be used multiple times. PATH can be relative to the current working directory. The final WORKDIR determines the run-time working directory.

STOPSIGNAL SIGNAL Specify the signal that will be used to stop the container. Location in the file doesn’t matter. Ex: STOPSIGNAL SIGTERM.

HEALTHCHECK CMD COMMAND Create a custom health check. By default, Docker just uses exit codes. Ex: HEALTHCHECK CMD curl localhost:80

Simple Dockerfile example:

```
# Simple Nginx image
FROM ubuntu:bionic

ENV NGINX_VERSION 1.14.0-0ubuntu1.3

RUN apt-get update && apt-get install -y curl
RUN apt-get update && apt-get install -y nginx=$NGINX_VERSION

CMD ["nginx","-g","daemon off;"]
```

Important notes on Dockerfiles

Multiple RUN statements in Dockerfiles:

  • RUN statements create new layers.
  • Every time an image is re-built, only the layers that have changed are re-built.
  • If a command, for example apt update, needs to run consistently, it should be included in every RUN statement.
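
To illustrate the caching point, keeping apt-get update and the install in the same RUN means the package index is refreshed whenever that layer is rebuilt (the package names are illustrative):

```
FROM ubuntu:bionic

# One layer: changing the package list invalidates this layer,
# so apt-get update re-runs alongside the install.
RUN apt-get update && apt-get install -y curl nginx

# By contrast, a separate cached "RUN apt-get update" layer could be
# stale when only a later install line changes.
```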

Efficiency:

  • Put things that are less likely to change on lower-level layers.
  • Don’t create unnecessary layers.
  • Avoid including any unnecessary files or packages in the image.

Multi-stage builds:

  • Multi-stage builds can make for more efficient resulting images.
  • A multi-stage build is a Dockerfile with multiple FROM sections.
  • For example, compiler binaries may be included in a single-stage build.
  • The last stage alone creates the resulting image.
  • The --from flag of COPY can be used to copy files from a previous stage.

Flattening images:

  • Images with fewer layers can have better performance.
  • Flattening images should be avoided unless it has tangible benefits.
  • To flatten an image:
    • Run a container from the image.
    • docker export CONTAINER_NAME > ARCHIVE_NAME.tar
    • cat ARCHIVE_NAME.tar | docker import - NEW_IMAGE_NAME:TAG
    • TAG is a label, for example, latest.

Docker Installation

https://docs.docker.com/engine/install/

Uninstall old versions first

List all available versions:

  apt list docker-ce -a
  yum list docker-ce --showduplicates

Install a specific version:

  {yum|apt} install docker-ce-VERSION_STRING \
      docker-ce-cli-VERSION_STRING containerd.io

Enable a non-root user to use Docker:

  usermod -aG docker USER
  newgrp docker  # activate the group without logging out

Upgrading

  • Use the package manager install command with a newer version specified.
  • Existing containers will not be impacted.

Downgrading

  • Stop Docker.
  • Remove docker-ce and docker-ce-cli.
  • yum check-update, apt update, or equivalent.
  • Install the new version of Docker.
  • Existing containers will not be impacted.

Bash completion installation

https://docs.docker.com/compose/completion/

Command as of the time of writing:

  sudo curl -L "https://raw.githubusercontent.com/docker/compose/1.25.5/contrib/completion/bash/docker-compose" \
      -o /etc/bash_completion.d/docker-compose

Key concepts

Dockerfiles Define new images using a plain-text instruction syntax.
New images can be built on top of existing images.

Docker Compose Define multi-container applications using YAML files. Useful for development, testing, and single-host environments.

Docker Swarm A distributed cluster of Docker machines. Swarm provides orchestration, HA, and scaling features.

Docker Services Run an application image on one or more nodes within a Docker Swarm.

Docker Stacks A collection of interrelated Docker Services that can be deployed and scaled as a single unit. Defined using YAML files.

Docker Registries Repositories of Docker images.

Docker Compose

https://docs.docker.com/compose

Run multi-container applications using a declarative format.

Installation

sudo curl -L \
"https://github.com/docker/compose/releases/download/1.25.5/docker-compose-"\
"$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
sudo ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose

Basic Usage

  • Create a docker-compose.yml in a project directory.

  • Example docker-compose.yml:

version: '3'
services:
  web:
    image: nginx
    ports:
    - "8080:80"
  redis:
    image: redis:alpine
  • docker-compose up -d: create and run the containers defined in the YAML config file. -d is detached mode.

  • docker-compose ps: list containers/services that are currently running under Docker Compose.

  • docker-compose down: remove the services/containers defined by docker-compose.yml.

Docker Networking

Commands

  • docker run --net={NETWORK_NAME|host|none}: run a Docker container with a specific network driver.

  • docker network create --driver {bridge|overlay|macvlan} <NETWORK_NAME>: create a new network with a specific network driver.

  • docker network create --driver overlay --attachable <NETWORK_NAME>: create an attachable overlay network.

  • docker network connect <NETWORK> <CONTAINER>: connect a container to an existing network.

  • Create a network with the MACVLAN network driver:

docker network create --driver macvlan --subnet <CIDR_BLOCK> --gateway <GATEWAY_IP>\
-o parent=<NETWORK_DEVICE> <MACVLAN_NETWORK_NAME>
  • docker service create --network OVERLAY_NETWORK_NAME: create a service using an overlay network.

  • docker network ls: list networks.

  • docker network inspect: see JSON configuration for a network.

  • docker network disconnect <NETWORK> <CONTAINER>: disconnect a network from a container.

  • docker network rm <NETWORK>: remove a network when there are no containers connected.
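
A small end-to-end sketch using the commands above (the network and container names are illustrative):

```
docker network create --driver bridge appnet
docker run -d --name web --net=appnet nginx
docker run -d --name cache --net=appnet redis:alpine
docker network inspect appnet

docker network disconnect appnet cache
docker network disconnect appnet web
docker network rm appnet   # only succeeds once no containers are connected
```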

Theory

Basic Theory

  • Docker uses an architecture called the Container Networking Model (CNM) to manage networking for Docker containers.

  • Sandbox: An isolated unit containing all the networking components associated with a single container. Usually a Linux network namespace.

  • Endpoint: Connects a sandbox to a network. Each sandbox/container can have any number of endpoints, but has exactly one endpoint for each network it is connected to.

  • Network: A collection of endpoints connected to one another.

  • Network Driver: Handles the actual implementation of the CNM concepts.

  • IPAM Driver: Automatically allocates subnets and IP addresses for networks and endpoints.

  • Bridge Network: Connects endpoints on a single host.

  • Overlay Network: Connects endpoints across multiple Swarm hosts.

Native Network Drivers

  • These are the built-in network drivers:

    Host
    Bridge
    Overlay
    MACVLAN
    None

  • A specific network driver can be used by specifying --net=<DRIVER> in docker run.

Host: --net=host
  • Containers use the host’s networking resource directly.
  • No sandboxes.
  • Containers cannot use the same port(s).
  • Use case: simple setups, with only a couple of containers on a single host.
Bridge: --net=<BRIDGE_NETWORK_NAME>
  • A bridge network provides connectivity between containers on the same host.
  • The default driver for containers when Swarm is not in use.
  • Creates a Linux Bridge for each Docker network.
  • The default Linux bridge, which is created if a network is not specified, is docker0. ip link can be used to view bridges.
  • Use case: isolated networking on a single host.
Overlay: --net=<OVERLAY_NETWORK_NAME>
  • Connectivity across multiple Docker hosts.
  • The default overlay network is called ingress and is used by services if no other network is specified.
  • Uses a VXLAN data plane. Routing is transparent to the containers.
  • Automatically configures interfaces, bridges, etc.
  • Use case: networking between Swarm hosts.
MACVLAN: --net=<MACVLAN_NETWORK_NAME>
  • Endpoint interfaces are directly connected to host interfaces, reducing overhead and improving latency.
  • Harder to configure.
  • Greater dependency between MACVLAN and the external network.
  • Use cases: low latency is required or containers need IP addresses in the external subnet.
None: --net=none
  • Containers are completely isolated.
  • If networking is required, it must be done manually.
  • None creates a separate networking namespace for each container but no interfaces or endpoints.
  • Use cases: when no networking is required or everything needs to be done manually.

DNS

  • Docker has in-built DNS.
  • Containers on the same network can be resolved using just their hostname.
  • Aliases can be created to specify an alternative name for a container on a network.
  • Aliases can be created using --network-alias <ALIAS> in docker run or: docker network connect --alias <ALIAS> <NETWORK> <CONTAINER>.
Custom DNS
  • Globally in /etc/docker/daemon.json:

    {
        "dns": ["1.1.1.1"]
    }
    

    systemctl restart docker

  • Per host: docker run --dns <DNS_ADDRESS>

Exposing ports

  • There is a -p HOST_PORT:CONTAINER_PORT option for both docker run and docker service.

  • docker port <CONTAINER>: list the published ports for a container.

  • docker ps: also shows published ports.

  • Docker Swarm supports two modes for published ports for services: ingress and host.

docker service create -p mode=host,published=<HOST_PORT>,\
target=<CONTAINER_PORT> --name <SERVICE_NAME> <IMAGE_NAME>
Ingress
  • The default.

  • Uses a routing mesh. The published port listens on every node in the cluster. Ingress transparently directs incoming traffic to any task that is part of the service.

Host
  • Publishes the port directly on the host where a task is running.

  • Cannot have multiple replicas on the same node if this mode is used.

  • Traffic to the published port on the node goes directly to the task running on that specific node.

  • Host mode publishing is best suited to global services, where the one-task-per-node limitation does not matter.

Networking troubleshooting

  • docker logs <CONTAINER>: get container logs.

  • docker service logs <SERVICE>: get service logs.

  • journalctl -u docker: Docker daemon logs.

Netshoot
  • Netshoot is an image that comes with a variety of network troubleshooting tools.

  • docker run --rm --network <NETWORK> nicolaka/netshoot [COMMAND]: run Netshoot and use [COMMAND], for example, curl.

  • Netshoot can also be used in interactive mode.

  • docker run --rm --network container:<CONTAINER_NAME>: run Netshoot inside the networking namespace for a container. This includes containers with --network none set.

Docker Registries

https://docs.docker.com/registry/

Deploy a Private Registry

https://docs.docker.com/registry/deploying/

  • docker run -d -p 5000:5000 --restart=always --name registry registry:2

    Create a registry.

Override specific configuration options

  • Use -e REGISTRY_<OPTION_NAME_CAPS>=<VALUE> within the run command.

  • Ex:

    docker run -d -p 5000:5000 --restart=always --name registry \
         	-e REGISTRY_LOG_LEVEL=debug registry:2
    

Override the entire configuration file

  • If the default configuration is not a sound basis for your usage, or if you are having issues overriding keys from the environment, you can specify an alternate YAML configuration file by mounting it as a volume in the container.

  • Ex:

    docker run -d -p 5000:5000 --restart=always --name registry \
          -v `pwd`/config.yml:/etc/docker/registry/config.yml \
          registry:2
    
  • Basic YAML file:

    version: 0.1
    log:
      fields:
        service: registry
    storage:
      cache:
        blobdescriptor: inmemory
      filesystem:
        rootdirectory: /var/lib/registry
    http:
      addr: :5000
      headers:
        X-Content-Type-Options: [nosniff]
    auth:
      htpasswd:
        realm: basic-realm
        path: /etc/registry
    health:
      storagedriver:
        enabled: true
        interval: 10s
        threshold: 3
    
  • Complete list of options: https://docs.docker.com/registry/configuration/

Registry Logs

  • docker logs registry: access the registry logs.

    The default level is info.

Basic Security

  • By default, the registry is completely unsecured. It does not use TLS and does not require authentication.

  • Basic auth:

mkdir -p ~/registry/auth && cd ~/registry
docker run --entrypoint htpasswd registry:2 -Bbn <USERNAME> <PASSWORD> \
	> auth/htpasswd
  • Self-signed certificate to enable TLS:
mkdir certs
openssl req \
	-newkey rsa:4096 \
	-nodes \
	-sha256 \
	-keyout certs/domain.key \
	-x509 -days 365 -out certs/domain.crt 
  • Stand-up the registry with the basic auth:
docker run -d -p 443:443 --restart=always --name registry \
	-v /home/docker/registry/certs:/certs \
	-v /home/docker/registry/auth:/auth \
	-e REGISTRY_HTTP_ADDR=0.0.0.0:443 \
	-e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
	-e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
	-e REGISTRY_AUTH=htpasswd \
	-e "REGISTRY_AUTH_HTTPASSWD_REALM=Registry Realm" \
	-e REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd \
	registry:2
  • Line by line explanation.

    Mount the certs directory on the host as /certs in the container. Ditto for /auth. Listen on port 443 rather than the default 5000. REGISTRY_AUTH=htpasswd: use htpasswd mode. Set up the realm for htpasswd. Point to the auth file that contains a username and a hashed password.

  • Quick test: curl -k https://localhost:443. No output is expected, which shows that the registry is responsive. -k skips the trust check because the cert is self-signed.

Interacting with a Registry

  • docker pull [REGISTRY_FQDN/]<IMAGE>[:TAG]: download an image to your local machine.

  • docker search <TEXT>: search the images that are available on a registry.

  • When pushing to a private registry an image must be tagged with the registry’s FQDN.

    docker tag <IMAGE> <REGISTRY_FQDN>/<IMAGE_NAME_PRIVATE_REGISTRY>
    docker push <REGISTRY_FQDN>/<IMAGE> 
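
For example, pushing a local nginx image to a private registry at a hypothetical FQDN:

```
docker tag nginx registry.example.com/my-nginx
docker push registry.example.com/my-nginx

# From another machine:
docker pull registry.example.com/my-nginx
```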
    

Logging in to a Registry

  • docker login <REGISTRY_FQDN>: log in to a registry.

    Will throw an error if the registry uses a self-signed certificate.

  • docker logout: log out of a registry.

  • There are two methods to overcome errors involving self-signed certificates.

  • Certificate verification can be turned off, which is insecure and not recommended. In /etc/docker/daemon.json add:

    {
      "insecure-registries": ["DOCKER_REGISTRY_FQDN1"]
    }
    

    Restart Docker: systemctl restart docker.

  • Alternatively, the public certificate can be provided to the Docker engine. This is the preferred method of resolving the self-signed certificate problem.

mkdir -p /etc/docker/certs.d/<REGISTRY_FQDN>
## scp or cp the domain public cert (domain.crt above) to this folder as ca.crt.

Logging Drivers

https://docs.docker.com/config/containers/logging/configure/

Check the default logging driver

docker info | grep Logging

  • The default logging driver is json-file.
  • The default setting can be overridden per container.

Changing the default logging driver

  • Edit /etc/docker/daemon.json.

    Add or edit the following lines:

    {
      "log-driver": "json-file",
      "log-opts": {
        "max-size": "15m"
      }
    }
    
  • sudo systemctl restart docker

Overriding the default logging driver for each container

docker run --log-driver <DRIVER> [--log-opt OPTION_1] \
	[--log-opt OPTION_2] [...] <IMAGE>

Ex: docker run --log-opt max-size=10m --log-opt max-file=3 alpine

Logging drivers

  none
  local: custom format for minimal overhead.
  json-file
  syslog
  journald
  gelf: Graylog Extended Log Format.
  fluentd: forward logs using a unified logging layer.
  awslogs: CloudWatch.
  splunk

Other options are available.

Log options

  max-size=<NUMBER>{k|m|g}: unlimited by default.
  max-file=<NUMBER>: number of log files that can be present.
  compress={true|false}

Additional options for log tags: labels, env, and env-regex.

Managing Docker Images

https://docs.docker.com/engine/reference/commandline/image/

  • docker image pull <IMAGE_NAME>[:TAG]

    Download an image from a remote registry. Will only pull an image if it doesn’t exist locally. docker pull is exactly the same.

  • docker image ls: list the images that exist locally.

    -a: list intermediate images as well.

  • docker image inspect <IMAGE_NAME>: inspect image metadata.

    Output is JSON. --format "{{.ARG1}} {{.ARG2}} ...": template for getting a subset of info. Ex: --format "{{.Architecture}} {{.Os}}"

    Output: amd64 linux

  • docker image rm <IMAGE>[:TAG]: delete an image.

    Alternative: docker rmi <IMAGE>[:TAG]
    -f: force. Use carefully when there is a running container. Doesn’t delete the underlying files if they are currently in use; to fully remove an image, also remove any containers that use it.

  • docker image prune: delete images not referenced by tags or containers.

    Called dangling images. No tags or repos in docker image ls -a.

  • docker image prune -a: delete all unused images (not used by a container).

  • docker image history {IMAGE_NAME}: view layers of an image.

Disk usage

  • docker system df: get information about disk usage on a system.

  • docker system df -v: get detailed information about disk usage on a system.

Docker Images

  • Containers and images use a layered filesystem. For example, an OS layer, Python and then a web app.

  • An image consists of one or more read-only layers, while the container adds one additional writable layer.

  • The layered approach allows us to share layers between containers and images themselves.

  • When building a new image, only the layers that have changed, and the layers above them, need to be rebuilt.

  • docker image pull <IMAGE>: pull an image from Docker Hub.

  • docker image history <IMAGE>: shows all the layers in an image.

    Zero-byte layers are no-operation (no-op) layers; they don’t contain any new data, or have any differences, from the previous layer.

    Layers with a size greater than zero are the layers that are relevant to the image.

Creating a Docker Image

Docker Security

Signing Images

  • Docker Content Trust (DCT) provides a secure way to verify the integrity of images before they are pulled or run on systems.

  • Image creators sign images with a cert.

Create and use a trust key

  • A Docker Hub account is necessary to generate a trust key for that registry.

  • Sign-in with docker login.

  • docker trust key generate <SIGNER_NAME>: a public key is generated in the current directory. A password needs to be entered and should be retained.

  • Use a key to add a signer on a new Docker image:

    docker trust signer add --key <PUBLIC_KEY> <SIGNER_NAME> \
    <REPOSITORY_NAME>
    

    It will be necessary to create passwords for the root signing key and repository key, which should be retained.

Enabling DCT

  • export DOCKER_CONTENT_TRUST=1

  • In Docker Enterprise Edition, it can be enabled in daemon.json.

  • When DCT is enabled, attempting to pull or run an unsigned image will result in an error message.

  • docker trust sign <IMAGE>:<TAG>: sign an image and push it to the registry.

  • When DOCKER_CONTENT_TRUST=1 is set, docker push automatically signs images.
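
A sketch of the whole signing flow against Docker Hub (the signer, account, and repository names are illustrative):

```
docker trust key generate alice                       # creates alice.pub
docker trust signer add --key alice.pub alice exampleuser/exampleapp
export DOCKER_CONTENT_TRUST=1
docker trust sign exampleuser/exampleapp:1.0
docker pull exampleuser/exampleapp:1.0                # verified against the signature
```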

Namespaces and Cgroups

  • Namespaces and Control Groups provide isolation to containers.
  • This limits exploits or priv esc attacks.

Docker daemon attack surface

  • The Docker daemon itself requires root privileges. Hence, we need to consider the attack surface presented by the Docker daemon.

  • Only allow trusted users access to the daemon.

  • This also needs to be considered for automation that accesses the Docker daemon.

Linux kernel capabilities

  • Docker uses a feature of the Linux kernel called capabilities to fine-tune what a process can access.

  • This means that a process can run as root inside a container, but does not have access to do everything root could normally do on the host.

  • For example, Docker uses net_bind_service capability to allow container processes to bind to a port below 1024 without running as root.
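
Capabilities can be tuned per container with --cap-drop and --cap-add. A sketch that drops everything and adds back a minimal set (the exact set an image needs varies):

```
# Drop all capabilities, then re-add only the ones the workload needs.
docker run -d --cap-drop ALL \
  --cap-add NET_BIND_SERVICE --cap-add CHOWN \
  --cap-add SETUID --cap-add SETGID \
  -p 80:80 nginx
```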

Encrypting Overlay Networks

  • Overlay networks can be encrypted using the --opt encrypted option when they are created with the docker network create command.
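
For example, on a Swarm manager (the network and service names are illustrative):

```
docker network create --driver overlay --opt encrypted secure-net
docker service create --network secure-net --name web nginx
```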

MTLS in Docker Swarm

  • Mutually Authenticated Transport Layer Security.
  • Certs are exchanged.
  • All communication is authenticated and encrypted.
  • A root CA is created when a Swarm is initialized.
  • Worker and manager node certificates are generated using the CA.
  • MTLS is used for all cluster-level communication between swarm nodes.
  • Enabled by default.

Securing the Docker Daemon HTTP Socket

  • Note: this potentially needs review using official documentation. The tutorial was rushed.

  • Docker uses a socket that is not exposed to the network by default.

  • Docker can be configured to listen on an HTTP port, to allow remote management.

  • In order to do this securely, we need to:

    • Create a CA.
    • Create server and client certs.
    • extendedKeyUsage = serverAuth for the server cert.
    • extendedKeyUsage = clientAuth for the client cert.
    • Configure the Docker daemon to use tlsverify mode.
    • Configure the client to connect securely using the client cert.
  • Basic openssl CA with the usual cert format for web servers.

  • In /etc/docker/daemon.json:

{
    "tlsverify": true,
    "tlscacert": "/path/to/ca.pem",
    "tlscert": "/path/to/server/public_key.pem",
    "tlskey": "/path/to/server/secret_key.pem"
}
  • Edit the Docker Daemon unit file /lib/systemd/system/docker.service.

  • Edit the ExecStart line.

  • Change the value of -H to tcp://0.0.0.0:2376.

  • systemctl daemon-reload

  • systemctl restart docker

  • On the client:

    • Copy the cert.pem, client_pub.pem, and client_priv.pem to the client.
    • Move everything to ~/.docker. Create it if it doesn’t exist.
    • export DOCKER_HOST=tcp://IP_OF_DOCKER_SERVICE:PORT
    • export DOCKER_TLS_VERIFY=1
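
With those variables set, the client works as usual. Alternatively, the certs can be passed explicitly; a sketch using the default ~/.docker file names (the address is illustrative):

```
docker --tlsverify \
  --tlscacert ~/.docker/ca.pem \
  --tlscert ~/.docker/cert.pem \
  --tlskey ~/.docker/key.pem \
  -H tcp://203.0.113.10:2376 ps
```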

Docker Storage Drivers

Also known as Graph drivers.

Storage Drivers

  • Use docker info to ascertain which Storage Driver is currently being used. This includes the devicemapper subtypes.

  • overlay2 is file-based storage and the default for Ubuntu and CentOS 8+. Preferred; no extra config.

  • devicemapper is block storage and the default for CentOS 7 and earlier. More efficient for write-heavy workloads. Use direct-lvm for production; loopback-lvm is the default mode but has very poor performance.

  • btrfs or zfs storage drivers are used if the underlying filesystem is either of these.

  • aufs is legacy for Ubuntu 14.04 on kernel 3.13.

  • vfs is for testing purposes and for use on non-CoW filesystems. Performance is poor. Not recommended.

Select a Storage Driver:

  • There are two methods to set the storage driver.
  • Don’t use both methods as it prevents Docker from starting.
  • Edit /usr/lib/systemd/system/docker.service.
  • Look for the ExecStart line.
  • Add the argument --storage-driver <DRIVER>.
  • Reload daemon config and restart docker:
    sudo systemctl daemon-reload
    sudo systemctl restart docker
    
  • Create or edit /etc/docker/daemon.json.

  • Edit or add:
    {
      	"storage-driver": "devicemapper"
    }
    
  • sudo systemctl restart docker

Change Storage Model

systemctl disable docker
rm -rf /var/lib/docker
vi /etc/docker/daemon.json # edit JSON
systemctl start docker
systemctl enable docker

Configuring DeviceMapper for direct-lvm

https://docs.docker.com/storage/storagedriver/device-mapper-driver/

  • direct-lvm should be used in production.

/etc/docker/daemon.json:

{  
  "storage-driver": "devicemapper",
  "storage-opts": [
    "dm.direct_lvm_device=/dev/xvd<a-z>",
    "dm.thinp_percent=95",
    "dm.thinp_metapercent=1",
    "dm.thinp_autoextend_threshold=80",
    "dm.thinp_autoextend_percent=20",
    "dm.directlvm_device_force=true"
  ]
}
  • dm.directlvm_device: The path to the block device to configure for direct-lvm. Required.

  • dm.thinp_percent: The percentage of space to use for storage from the passed in block device. Default: 95. Not required.

  • dm.thinp_metapercent: The percentage of space to use for metadata storage from the passed-in block device. Default: 1. Not required.

  • dm.thinp_autoextend_threshold: The threshold for when lvm should automatically extend the thin pool as a percentage of the total storage space. Default: 80. Not required.

  • dm.thinp_autoextend_percent: The percentage to increase the thin pool by when an autoextend is triggered. Default: 20. Not required.

  • dm.directlvm_device_force: Whether to format the block device even if a filesystem already exists on it. If set to false and a filesystem is present, an error is logged and the filesystem is left intact. Default: false. Not required.

Storage Models

Filesystem Storage

  • Data is stored in the form of a filesystem.
  • Used by overlay2 and aufs.
  • Efficient use of memory.
  • Inefficient with write-heavy workloads.

Block Storage

  • Stores data in blocks.
  • Used by devicemapper.
  • Efficient with write-heavy workloads.

Object Storage

  • Stores data in an external object-based store.
  • Application must be designed to use object-based storage.
  • Flexible and scalable.

Storage Layers

Both containers and images have layers. Layers are stacked on top of each other. Each layer contains only the differences from the previous.

The container has all the layers of an image, plus one additional “Writable Container Layer”.

The location of the layered data on disk can be ascertained with docker inspect.

In docker inspect output, within the GraphDriver -> Driver section, there is a path to the location of the layers on disk.

Underlying Technology

Namespaces

  • Namespaces isolate processes.
  • Docker uses namespaces to isolate containers.

Types of namespaces:

  pid: process isolation.
  net: network interfaces.
  ipc: inter-process communication.
  mnt: filesystem mounts.
  uts: kernel and version identifiers.
  user: requires configuration. Allows container processes to run as root inside the container while mapping that user to an unprivileged user on the host.

Control groups (cgroups)

Control groups limit what resources a process can use.

Docker Stacks

  • Services are capable of running a single replicated application across nodes in the cluster.

  • Stacks are a collection of interrelated services that can be deployed and scaled as a unit.

  • Docker stacks are similar to the multi-container applications created using Docker Compose. However, they can be scaled and executed across the swarm just like normal Swarm services.

  • Docker stacks use compose files.

  • Example docker-compose.yml:

version: '3'
services:
  web:
     image: nginx
     ports:
     - "8080:80"
  busybox:
    image: radial/busyboxplus:curl
    command: /bin/sh -c "while true; do echo $$MESSAGE; curl web:80; sleep 10;
done"
    environment:
    - MESSAGE=Hello!
  • docker stack deploy -c <COMPOSE_FILE> <STACK_NAME>: deploy a stack.

  • docker stack ls: list current stacks.

  • docker stack services <STACK_NAME>: list the services associated with a stack.

  • docker stack rm <STACK_NAME>: delete a stack.

  • Environment variables can be created and used in the compose file, as per the example.

  • Other containers in the stack can be referenced via their name in YAML, for example, curl web:80 in the example.
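Putting the commands above together, a minimal sketch that writes a trimmed version of the example compose file and shows the deploy cycle (the path /tmp/demo-stack and the stack name demo are assumptions; the deploy commands need an initialised swarm, so they are shown commented):

```shell
# Write a trimmed version of the example compose file.
mkdir -p /tmp/demo-stack
cat > /tmp/demo-stack/docker-compose.yml <<'EOF'
version: '3'
services:
  web:
    image: nginx
    ports:
    - "8080:80"
EOF

# Deploy, inspect, and eventually remove the stack (requires a swarm):
# docker stack deploy -c /tmp/demo-stack/docker-compose.yml demo
# docker stack services demo
# docker stack rm demo
grep -q 'image: nginx' /tmp/demo-stack/docker-compose.yml && echo "compose file written"
```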

Docker Swarm

https://docs.docker.com/engine/swarm/

A swarm is a distributed cluster of Docker machines. A swarm has many features that can help provide orchestration, HA, and scaling.

  • docker swarm init --advertise-addr <IP_ADDRESS>: initialise the swarm.

    Automatically makes the machine a swarm manager. The advertise address is the address other nodes will see.

  • docker node ls: list all nodes in the swarm.

  • docker swarm join-token {manager|worker}

    Gets a new token and command to join a machine to the swarm.
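A bootstrap sketch tying these together (10.0.0.1 is an example address; each command requires Docker on the machine where it runs):

```shell
# Skip gracefully on machines without a running Docker daemon.
docker info >/dev/null 2>&1 || exit 0

# On the first machine (becomes a manager automatically):
docker swarm init --advertise-addr 10.0.0.1
# Print the full join command, including a token, for workers:
docker swarm join-token worker
# On each worker, paste the printed command, e.g.:
# docker swarm join --token <TOKEN> 10.0.0.1:2377
# Back on the manager, confirm the nodes joined:
docker node ls
```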

Back-up and Restore a Docker Swarm

Back-up:

  • systemctl stop docker
  • Back-up everything in /var/lib/docker/swarm.
  • systemctl start docker

Restore:

  • systemctl stop docker
  • rm -rf /var/lib/docker/swarm/*
  • Extract the back-up to /var/lib/docker/swarm/.
  • systemctl start docker
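The back-up and restore steps can be sketched with tar. SWARM_DIR would be /var/lib/docker/swarm on a real manager; a stand-in directory is used here so the commands are safe to try:

```shell
# Stand-in for /var/lib/docker/swarm so this sketch can run anywhere.
SWARM_DIR=$(mktemp -d)
echo "raft-state" > "$SWARM_DIR/state"

# Back-up (on a real manager: systemctl stop docker first).
tar -czf /tmp/swarm-backup.tar.gz -C "$SWARM_DIR" .

# Restore (on a real manager: systemctl stop docker first, then start after).
rm -rf "$SWARM_DIR"/*
tar -xzf /tmp/swarm-backup.tar.gz -C "$SWARM_DIR"
cat "$SWARM_DIR/state"   # prints: raft-state
```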

Autolock

Docker Swarm encrypts raft logs and TLS communication. By default Docker manages the keys to enable this, however, they are left unencrypted on the managers’ disks.

Autolock locks the swarm, allowing for management of keys by the user.

Every time Docker is restarted, it is then necessary to unlock the swarm.

When Autolock is enabled, a key will print to stdout that will need to be saved in a secure place and used to unlock the swarm.

  • docker swarm init --autolock: initialise a Swarm with autolock enabled.

  • docker swarm update --autolock=true: enable autolock on an existing Swarm.

  • docker swarm update --autolock=false: disable autolock on an existing Swarm.

  • docker swarm unlock-key: output the key on an unlocked Swarm manager.

  • docker swarm unlock-key --rotate: rotate the key. May take some time to propagate.
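A sketch of the autolock workflow on an existing swarm manager (requires an initialised swarm; docker swarm unlock prompts for the key interactively):

```shell
# Skip gracefully on machines without a running Docker daemon/swarm.
docker info >/dev/null 2>&1 || exit 0

# Enable autolock; the unlock key is printed to stdout -- save it securely.
docker swarm update --autolock=true
# After the Docker daemon restarts, the manager is locked; unlock it:
docker swarm unlock            # prompts for the saved key
# The current key can be retrieved later from any unlocked manager:
docker swarm unlock-key
```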

High Availability

  • Docker uses the Raft consensus algorithm to maintain a consistent cluster state across multiple managers.

  • For HA, it is a good idea to have multiple Swarm managers.

  • However, more managers can increase network load.

  • A quorum is the majority (more than half) of the managers in the swarm.

  • A quorum must be maintained in order to make changes to the cluster state.

  • Recommended to have an odd number of managers to satisfy the “more than half” requirement.

  • Recommended to have managers in multiple AZs.

Docker Services

  • A service is used to run an application on a Docker Swarm.

  • A service specifies a set of one or more replicated tasks.

  • Tasks are distributed automatically across nodes.

  • docker service create <IMAGE>

    --name <NAME>: specify a name for the service.
    -p <HOST_PORT>:<CONTAINER_PORT>: publish a port.

  • The values of some flags can be dynamic using a feature called templates. hostname, mount, and env can use templates.

    For example, to add the hostname of a node to a service as an environment variable:

    --env NODE_HOSTNAME="{{.Node.Hostname}}"

  • docker service ls: list current services.

  • docker service ps <SERVICE>: list a service’s tasks.

  • docker service inspect <SERVICE>: see JSON config for a service.

  • docker service inspect <SERVICE> --pretty: YAML-like output.

  • docker service update <FLAG_1> [FLAG_2] <SERVICE>: update settings on a service.

    Ex: docker service update --replicas 2 nginx_service

  • docker service rm <SERVICE>: removes a service. No confirmation or prior stop is required.

  • docker service create --replicas <INT> <SERVICE>: run replicas of the task across the cluster.

  • docker service create --mode global <SERVICE>: run one task on each node in the cluster.

  • docker service logs <SERVICE>: view logs for a service.
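The service lifecycle above can be sketched end to end (requires an initialised swarm; "web" is an example service name):

```shell
# Skip gracefully on machines without a running Docker daemon/swarm.
docker info >/dev/null 2>&1 || exit 0

docker service create --name web -p 8080:80 --replicas 2 nginx
docker service ps web                     # one line per task
docker service update --replicas 3 web    # scale up
docker service logs web
docker service rm web
```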

Node Labels

  • Node labels are metadata.

  • docker node update --label-add <LABEL>=<VALUE> <NODE_NAME>: add a label.

  • docker node inspect --pretty <NODE>: output contains node labels.

  • Node labels can be used to add constraints when a service is created. Multiple constraints can be added with multiple --constraint flags. All constraints must be satisfied for a task to run on a node.

    ex: --constraint node.labels.availability_zone==east: run on nodes with an availability_zone label value of east.

  • --placement-pref spread=node.labels.availability_zone: can be used to spread services evenly across nodes by label. Note that null counts as a label.
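A label-and-constrain sketch combining the commands above (requires a swarm; the node name, label value, and service name are examples):

```shell
# Skip gracefully on machines without a running Docker daemon/swarm.
docker info >/dev/null 2>&1 || exit 0

# Label an example node, then pin a service to that label.
docker node update --label-add availability_zone=east node1
docker service create --name web \
  --constraint node.labels.availability_zone==east nginx
# Or spread tasks evenly across the label's values instead:
# docker service create --name web \
#   --placement-pref 'spread=node.labels.availability_zone' nginx
```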

Docker Enterprise Edition

EE Features

  • Universal Control Plane (UCP): a web interface for Docker Swarm.

  • Docker Trusted Registry (DTR): an enterprise-grade private registry with more features, including a web interface.

  • Vulnerability scanning.

  • Federated application management. Manage all applications in one UI, usually UCP.

Installation

  • You get a link when buying or trialing Docker EE.

  • Ubuntu installation:

DOCKER_EE_URL=<URL>
DOCKER_EE_VERSION=<VERSION>
sudo apt install -y \
  apt-transport-https \
  ca-certificates \
  curl \
  software-properties-common
curl -fsSL "${DOCKER_EE_URL}/ubuntu/gpg" | sudo apt-key add -
sudo add-apt-repository \
  "deb [arch=$(dpkg --print-architecture)] $DOCKER_EE_URL/ubuntu \
  $(lsb_release -cs) \
  stable-$DOCKER_EE_VERSION"
sudo apt update
sudo apt install docker-ee=<VERSION>
sudo usermod -aG docker <USER>
newgrp docker # activate the new group

Bash Completion

On Debian 10, Bash completion was not installed by default.

To install it:


sudo curl -L https://raw.githubusercontent.com/docker/compose/1.25.5/contrib/completion/bash/docker-compose -o /etc/bash_completion.d/docker-compose

Source: https://docs.docker.com/compose/completion/

Docker Volumes

  • Two types: Bind mounts and Volumes.

  • Both collectively, somewhat confusingly, called Docker Volumes.

Bind mounts

  • Mount a specific path on the host machine to the container.

  • Not portable; dependent on the host machine's filesystem and directory structure.

Volumes

  • Stores data on the host filesystem, but the storage location is managed by Docker.

  • More portable.

  • Can mount the same volume to multiple containers.

Mount Docker Volumes


https://docs.docker.com/engine/reference/commandline/service_create/#add-bind-mounts-volumes-or-memory-filesystems

  • Either syntax can be used with each type of Docker Volume.

  • For services, only --mount can be used.

--mount


docker run --mount [type={bind|volume|tmpfs}],source={VOLUME_NAME|PATH},destination=PATH,[RO]

  • type: bind (bind mount), volume, or tmpfs (temporary in-memory storage).

    Does not necessarily need to be specified because it can be inferred from source.

  • source: volume name or bind mount path.

  • destination (aliases: target, dst): path to mount inside the container.
  • readonly (alias: ro): make the Docker Volume read-only.

-v

  • docker run -v <SOURCE>:<DESTINATION>:[OPTIONS]

  • If SOURCE is a volume name, a volume is created. If SOURCE is a path, a bind mount will be created.

  • DESTINATION: path to mount inside the container.

  • OPTIONS: comma-separated list of options. For example, ro for read-only.
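The two syntaxes are equivalent for simple cases. A sketch mounting the same volume read-only both ways (requires a Docker daemon; "mydata" and /data are example names):

```shell
# Skip gracefully on machines without a running Docker daemon.
docker info >/dev/null 2>&1 || exit 0

docker volume create mydata
# -v shorthand:
docker run --rm -v mydata:/data:ro alpine ls /data
# Equivalent --mount form:
docker run --rm --mount type=volume,source=mydata,destination=/data,readonly \
  alpine ls /data
docker volume rm mydata
```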

Create Volumes

  • docker volume create <VOLUME_NAME>: create a volume.

  • docker volume ls: list current volumes.

  • docker volume inspect <VOLUME_NAME>: detailed volume information.

  • docker volume rm <VOLUME_NAME>: delete a volume.

Storage in a Cluster

There are two options for sharing data between nodes in a cluster:

  • Application logic to store data in external object storage.

  • Use a volume driver to create a volume that is external to any specific machine in a cluster.

  • A list of volume drivers is available at: https://docs.docker.com/engine/extend/legacy_plugins/

Simple Volume Driver Example with sshfs

On all nodes on the Swarm:

docker plugin install --grant-all-permissions vieux/sshfs

The Correct Way

docker service create \
  --replicas <INT> \
  --name <NAME> \
  --mount volume-driver=vieux/sshfs,source=<VOLUME_NAME>,destination=<PATH>,volume-opt=sshcmd=<USERNAME>@<HOST>:<DEST_PATH>,volume-opt=password=<PASSWORD> \
  <IMAGE>

Gotcha

On the server where the external volume is located:


docker volume create --driver vieux/sshfs \
  -o sshcmd=<USERNAME>@<HOST>:<PATH> \
  -o password=<PASSWORD> \
  sshvolume

docker service create --mount source=sshvolume <IMAGE>

Although this looks like it may work, the driver options are only passed on the node where docker volume create occurs. Any additional tasks on other nodes will not have the options and, therefore, will not write to external storage.

Creating Docker Images

A Dockerfile is a set of instructions used to construct a Docker image. These instructions are called directives.

Steps

  • Create a directory.

  • Create a file Dockerfile inside the directory.

  • Populate the Dockerfile.

  • Run docker build -t <IMAGE_NAME> <PATH>.

  • <PATH> is the path of the directory where the Dockerfile resides.

Basic Dockerfile Syntax

https://docs.docker.com/engine/reference/builder/

  • #: single-line comment.

  • FROM {scratch|IMAGE_NAME}: starts a new build stage and sets the base image.

    Usually must be the first directive in the Dockerfile. ARG can be placed before FROM though. scratch means from nothing. Build stages can be named by appending AS {NAME}.

  • ENV {IDENTIFIER} {STRING}: set environment variables.

    These can be referenced in the Dockerfile itself. The variables are visible to the container at runtime. Reference using $IDENTIFIER.

  • RUN {COMMAND}: creates a new layer on top of the previous layer by running a command inside that new layer and committing the changes.

  • CMD ["COMMAND","ARG1","ARG2"]: specify a default command used to run a container at execution time.

    When no commands are passed to docker run, this command is run. Ex: CMD ["nginx","-g","daemon off;"]

More Dockerfile Syntax

  • EXPOSE {PORT}: documents which port(s) are intended to be published when running a container.

    For documentation purposes. No direct impact with docker run.

  • WORKDIR {PATH}: sets the current working directory for subsequent directives.

    Can be used multiple times in a single file. Can use relative paths. For use with directives such as ADD, COPY, CMD and ENTRYPOINT. The final WORKDIR determines the working directory at runtime.

  • COPY {LOCAL_PATH} {TARGET_PATH}: copy files from the local machine to the image.

    Relative paths, based off of the current WORKDIR, can be used. There are some things that COPY can do that ADD cannot. COPY --from={NAME|INDEX} {TARGET} {DESTINATION} can copy files from a previous build stage. See the Multi-Stage Builds section.

  • ADD {URL|FILE_PATH|ARCHIVE_PATH} {TARGET_PATH}: like COPY but with more features.

    Can copy from a URL or extract files from an archive.

  • STOPSIGNAL {SIGNAL}: specify the signal that will be used to stop the container.

    Location in the file doesn't matter. Ex: STOPSIGNAL SIGTERM.

  • HEALTHCHECK CMD {COMMAND}: create a custom health check.

    By default Docker monitoring is insufficient because it just uses exit codes. Ex: HEALTHCHECK CMD curl localhost:80
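A sketch combining several of the directives above into one illustrative Dockerfile (the contents are assumptions for illustration; it is written to a temporary directory here so the snippet is self-contained, and would be built with docker build -t demo <dir>):

```shell
# Write an illustrative Dockerfile exercising WORKDIR, EXPOSE, STOPSIGNAL,
# and HEALTHCHECK; base image and file names are examples.
DIR=$(mktemp -d)
cat > "$DIR/Dockerfile" <<'EOF'
FROM nginx:1.17
WORKDIR /usr/share/nginx/html
COPY index.html .
EXPOSE 80
STOPSIGNAL SIGTERM
HEALTHCHECK CMD curl -f localhost:80 || exit 1
EOF
grep -c '^' "$DIR/Dockerfile"   # prints: 6
```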

Example 1 (nearly bare minimum)


## Simple Nginx image
FROM ubuntu:bionic

ENV NGINX_VERSION 1.14.0-0ubuntu1.3

RUN apt-get update && apt-get install -y curl
RUN apt-get update && apt-get install -y nginx=$NGINX_VERSION

CMD ["nginx","-g","daemon off;"]

Gotchas

Every time an image is re-built, only the layers that have changed are re-built. Consider the following:

RUN apt-get update && apt-get install -y curl
RUN apt-get update && apt-get install -y nginx

If we re-built the image with a different version of Nginx, the layer created to install curl would not re-build; hence the need to include apt-get update in both RUN statements.

Efficiency

  • Put things that are less likely to change on lower-level layers.

  • Don’t create unnecessary layers.

  • Avoid including any unnecessary files, packages, etc, in the image.

Multi-Stage Builds

A multi-stage build uses multiple FROM directives. Multi-stage builds can help in building more efficient images.

For example, if we were to compile within a single-stage build, the compiler binaries would be included in the resulting image.

To avoid this, we can copy files from one stage to another using COPY --from.

Example:


FROM golang:1.12.4 AS compiler
WORKDIR /helloworld
COPY helloworld.go .
RUN GOOS=linux go build -a -installsuffix cgo -o helloworld .

FROM alpine:3.9.3
WORKDIR /root
COPY --from=compiler /helloworld/helloworld .
## --from=0 could also be used
CMD ["./helloworld"]

The resulting Docker image is 7MB, rather than 700MB using a single-stage build.

Flattening an Image

Images with fewer layers can have better performance. Images can be flattened using the following process:

  • Run a container from the image.

  • docker export {CONTAINER_NAME} > {FILE_NAME}.tar

  • cat {FILE_NAME}.tar | docker import - {NEW_IMAGE_NAME:TAG}

  • An appropriate TAG might be latest.
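The steps above can be sketched as a pipeline (requires a Docker daemon; the container and image names are examples, with alpine standing in for the image being flattened):

```shell
# Skip gracefully on machines without a running Docker daemon.
docker info >/dev/null 2>&1 || exit 0

# Run a container from the image, export its filesystem, re-import it.
docker run --name flat-src alpine true
docker export flat-src > /tmp/flat.tar
cat /tmp/flat.tar | docker import - alpine-flat:latest
docker history alpine-flat:latest   # shows a single imported layer
docker rm flat-src
```

Note that docker export drops image metadata such as CMD and ENV; docker import --change can re-apply directives to the flattened image if needed.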

Docker Installation on Debian

Uninstall old versions


sudo apt-get remove docker docker-engine docker.io containerd runc

Set up repos

Dependencies:


sudo apt-get update

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common

GPG key:

curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -

Verify that you now have the key with the fingerprint 9DC8 5822 9FC7 DD38 854A E2D8 8D81 803C 0EBF CD88:

sudo apt-key fingerprint 0EBFCD88

Add repos:


sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/debian \
   $(lsb_release -cs) \
   stable"

Install docker:


sudo apt-get update

sudo apt-get install docker-ce docker-ce-cli containerd.io

Enable a user to use Docker:


sudo usermod -aG docker james

newgrp docker # activate the new group

Changing Docker Versions

Upgrading Docker


apt-get install -y docker-ce=5:18.09.5~3-0~ubuntu-bionic \
	docker-ce-cli=5:18.09.5~3-0~ubuntu-bionic

docker version

  • No need to worry about existing containers.

Downgrading Docker


systemctl stop docker

apt remove -y docker-ce docker-ce-cli

apt update

apt install -y docker-ce=5:18.09.4~3-0~ubuntu-bionic \
	docker-ce-cli=5:18.09.4~3-0~ubuntu-bionic

docker version

  • This will not impact containers.

Docker Installation on CentOS

sudo yum install device-mapper-persistent-data lvm2

CentOS 7:


sudo yum-config-manager \
    --add-repo https://download.docker.com/linux/centos/docker-ce.repo

CentOS 8:


dnf config-manager \
	--add-repo https://download.docker.com/linux/centos/docker-ce.repo

List versions: yum list docker-ce --showduplicates | sort -r

Install a specific version:


sudo yum install docker-ce-<VERSION_STRING> docker-ce-cli-<VERSION_STRING> containerd.io

  • On CentOS 8, --nobest was necessary because the optimal version of containerd wasn't available.

Add a user to the docker group to enable the use of Docker commands:


sudo usermod -aG docker james

newgrp docker # activate the new group