Data Persistence and Networking in Docker


  One of the key features of Docker containers is their ephemeral nature: they are designed to be disposable. Any data written to a container's own writable layer is lost when the container is removed. For real applications that need to store information (databases, logs, user-uploaded files), data persistence is crucial. Furthermore, services in a multi-container application need to communicate with each other and with the outside world, which requires an understanding of networking in Docker. In this lesson, we will explore the mechanisms for managing data and configuring networks in Docker.


Data Persistence in Docker


There are two main mechanisms for persisting data with Docker: Volumes and Bind Mounts.


1. Volumes

Volumes are Docker's preferred mechanism for persisting data. They are managed by Docker and reside in a part of the host's file system that is controlled by Docker.

  • Advantages: Managed by Docker and independent of the host's directory layout, which makes them more robust for production; easy to back up and migrate; often faster than bind mounts on Docker Desktop (macOS and Windows).
  • Usage: Ideal for databases or services that require persistent storage.

Examples with Docker CLI:

# Create a named volume
docker volume create my-app-data

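# Use the volume when starting a container (e.g., MongoDB)
# -v <volume>:<container path>: my-app-data is the named volume, /data/db is the path in the container
# (comments cannot follow a trailing backslash, so they are placed above the command)
docker run -d \
  --name my-mongo \
  -p 27017:27017 \
  -v my-app-data:/data/db \
  mongo:latest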

# Inspect a volume
docker volume inspect my-app-data

# List volumes
docker volume ls

Examples with Docker Compose:

# docker-compose.yml
version: '3.8'
services:
  db:
    image: mongo:latest
    volumes:
      - db-data:/data/db # Reference to the volume defined below
volumes:
  db-data: # Named volume definition
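
Because a named volume is a directory that Docker manages, one common way to back it up is to mount it into a throwaway container alongside a host directory and archive it with `tar`. The following is a minimal sketch of that pattern, not an official tool: the function names `backup_volume` and `restore_volume`, the `/source`, `/target`, and `/backup` mount points, and the choice of the `alpine` image are assumptions made for illustration.

```shell
# backup_volume VOLUME DEST_DIR
# Archives the contents of a named volume into DEST_DIR/VOLUME.tar.gz
# using a temporary Alpine container. Helper name and paths are illustrative.
backup_volume() {
  volume="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir"
  # Bind mounts given with -v need an absolute host path.
  dest_abs="$(cd "$dest_dir" && pwd)"
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$dest_abs":/backup \
    alpine tar czf "/backup/$volume.tar.gz" -C /source .
}

# restore_volume VOLUME ARCHIVE
# Unpacks ARCHIVE (a .tar.gz made by backup_volume) into VOLUME.
# Docker creates the named volume if it does not exist yet.
restore_volume() {
  volume="$1"
  archive="$2"
  archive_dir="$(cd "$(dirname "$archive")" && pwd)"
  archive_name="$(basename "$archive")"
  docker run --rm \
    -v "$volume":/target \
    -v "$archive_dir":/backup:ro \
    alpine tar xzf "/backup/$archive_name" -C /target
}

# Example usage (assumes the my-app-data volume created above):
#   backup_volume my-app-data ./backups
#   restore_volume my-app-data-copy ./backups/my-app-data.tar.gz
```

For a database volume, it is usually safer to stop the container that writes to the volume before archiving, so the backup does not capture files mid-write.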

2. Bind Mounts

Bind mounts allow you to directly mount a directory or file from the host's file system into a container. Docker does not manage the location of the file or directory on the host.

  • Advantages: Very useful for development (e.g., hot-reloading code), easy access to files from the host.
  • Usage: Local development where code changes frequently.

Examples with Docker CLI:

# Mount the current host directory into /app inside the container.
# "$(pwd)" expands to the current directory on Linux/macOS.
# The anonymous volume on /app/node_modules keeps the container's node_modules separate from the host's.
# -w /app makes npm start run inside the mounted project directory.
docker run -d \
  --name my-nodejs-dev \
  -p 3000:3000 \
  -v "$(pwd)":/app \
  -v /app/node_modules \
  -w /app \
  node:18-alpine npm start

Examples with Docker Compose:

# docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
    volumes:
      - .:/usr/src/app # Mounts the project directory
      - /usr/src/app/node_modules # Prevents overwriting container's node_modules

When to use each?

  • Volumes: For persistent data that must outlive a container (e.g., databases, user data), especially in production.
  • Bind Mounts: For working with source code during development (hot-reloading) or when you need the container to access specific host files (e.g., configuration files).
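
For the configuration-file case, it often helps to mount the file read-only so the container cannot modify the host copy. The sketch below uses a hypothetical helper name, `run_with_config`, together with the `--mount` syntax and its `readonly` option. It also checks that the file exists first, because `-v` creates a missing host path as an empty directory instead of reporting an error.

```shell
# run_with_config HOST_FILE CONTAINER_PATH IMAGE
# Starts IMAGE detached with HOST_FILE bind-mounted read-only at CONTAINER_PATH.
# The helper name is hypothetical; only the docker flags are standard.
run_with_config() {
  host_file="$1"
  container_path="$2"
  image="$3"

  # Fail early on a mistyped path rather than mounting the wrong thing.
  if [ ! -f "$host_file" ]; then
    echo "config file not found: $host_file" >&2
    return 1
  fi

  # Bind mounts require an absolute source path.
  host_abs="$(cd "$(dirname "$host_file")" && pwd)/$(basename "$host_file")"

  docker run -d \
    --mount "type=bind,source=$host_abs,target=$container_path,readonly" \
    "$image"
}

# Example usage (an nginx.conf in the current directory is an assumption):
#   run_with_config ./nginx.conf /etc/nginx/nginx.conf nginx
```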

Networking in Docker


Networking allows communication between containers, with the host, and with the outside world. Docker provides different network drivers.


Common Network Types:

  • Bridge:

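    This is the default network driver. A container started without `--network` joins the default `bridge` network, where containers can reach each other only by IP address. On a user-defined bridge network (created with `docker network create` or by Docker Compose), Docker's embedded DNS also lets containers reach each other by container name or Compose service name. Containers are only accessible from the outside if their ports are mapped to the host.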

    # A container on the default bridge network
    docker run -d --name my-nginx -p 80:80 nginx
    
    # Create a custom bridge network
    docker network create my-app-network
    
    # Connect containers to a custom network
    docker run -d --name serviceA --network my-app-network my-image-a
    docker run -d --name serviceB --network my-app-network my-image-b
  • Host:

    The container shares the host's network stack. It does not get its own isolated IP address and instead uses the host's interfaces directly. There is no network isolation between the host and the container, and port mappings (`-p`) are neither required nor applied in this mode.

    docker run -d --network host my-app-image
  • None:

    The container has no external network interface, only the loopback interface, so it cannot communicate with other containers, the host, or the outside world.
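
Name resolution is the main practical difference between the default bridge and a user-defined one: Docker's embedded DNS resolves container names only on user-defined networks. A quick way to check this is to start a throwaway container on a network and ping a peer by name. The helper name `can_reach` and the `alpine` image are assumptions for illustration; `serviceA`, `my-nginx`, and `my-app-network` are the names used in the examples above.

```shell
# can_reach NETWORK TARGET
# Returns 0 if a throwaway container attached to NETWORK can ping TARGET
# (a container name or IP address) within 2 seconds, non-zero otherwise.
can_reach() {
  docker run --rm --network "$1" alpine ping -c 1 -W 2 "$2" >/dev/null 2>&1
}

# Example usage:
#   can_reach my-app-network serviceA && echo "serviceA resolves by name"
#   can_reach bridge my-nginx || echo "no name lookup on the default bridge"
```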


Networking with Docker Compose


Docker Compose greatly simplifies network configuration. By default, Compose creates a single `bridge` network for the whole project (named `<project>_default`), and every service in your `docker-compose.yml` joins it unless you configure otherwise. This allows services to communicate with each other using their service names as hostnames.


Example of an explicit custom network in Compose (without the `networks` keys, both services would join the default project network):

# docker-compose.yml
version: '3.8'
services:
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      DB_HOST: db # Communication with the 'db' service using its name
    networks:
      - app-tier # Assign to a custom network

  db:
    image: mongo:latest
    volumes:
      - db-data:/data/db
    networks:
      - app-tier # Assign to the same custom network

networks: # Custom network definition
  app-tier:
    driver: bridge # Optional, it's the default value

volumes: # Required because the db service references db-data
  db-data:

Port Mapping (`ports`):

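The `EXPOSE` instruction in a `Dockerfile` is only a declaration of intent: it documents which port the application listens on but does not publish it. For a service in a container to be accessible from the host or from the outside, you must map the container's port to a port on the host using the `-p` option (or `ports` in Docker Compose). Containers on the same Docker network can reach each other on their container ports without any mapping.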

# docker run -p <HOST_PORT>:<CONTAINER_PORT>
docker run -p 8080:3000 my-nodejs-app

# In docker-compose.yml
services:
  web:
    ports:
      - "8080:3000"
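
By default, a mapping such as `-p 8080:3000` publishes the port on all of the host's network interfaces. During development it can be safer to bind only to the loopback address, so the service is reachable from the host itself but not from other machines. The sketch below wraps that in a hypothetical helper, `run_local_only`; `my-nodejs-app` is the image name used above.

```shell
# run_local_only HOST_PORT CONTAINER_PORT IMAGE
# Starts IMAGE detached and publishes CONTAINER_PORT on 127.0.0.1:HOST_PORT only.
run_local_only() {
  docker run -d -p "127.0.0.1:$1:$2" "$3"
}

# Example usage:
#   run_local_only 8080 3000 my-nodejs-app
#   docker port <container>   # lists that container's active port mappings
```

The same address prefix works in Compose, where the entry under `ports` would read `"127.0.0.1:8080:3000"`.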

  Understanding and correctly applying data persistence and networking is vital for building robust and production-ready applications with Docker. Volumes ensure that your critical data is not lost, while proper network configuration ensures that your services can communicate seamlessly and that your application is accessible to users. These are the cornerstones for successfully deploying containerized applications.
