Since its inception 16 years ago, Bio::Neos has provided life science researchers with custom-designed software. Recently, we have spent much of our time developing web applications. For efficiency, we typically have our developers work on a local copy of the codebase instead of in the web server environment. This means that we need to write a lot of portable code.
So, what is a container, and what is containerization?
A container is any receptacle or enclosure for holding a product, used in storage, packaging, and shipping. This is a very broad definition that applies to many aspects of life. In software development, containerization is an alternative to virtual machines that involves packaging an application, along with everything it needs to run, into an isolated container. The primary difference between virtual machines and containers is that containers virtualize the operating system, so multiple workloads can run on a single OS instance, while virtual machines virtualize the hardware to run multiple OS instances. The speed, agility, and portability of containers make them another tool to help streamline software development.
I primarily focus on Docker because it simplifies and accelerates our workflow while giving developers the freedom to innovate with their choice of tools, application stacks, and deployment environments for each project. Docker describes a container as a “standardized unit of software”: an application is packaged into standardized units for development, shipment, and deployment. Using Docker Compose alongside Docker allows these “standardized units” to communicate with each other just as easily as if they were in one container. For example, keeping a database and a web server in separate containers is good practice; it keeps the containers small, modular, and easy to update when needed.
So what does this all look like? Let’s walk through a fairly common use case using a Node.js web server and MySQL database.
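Here is roughly how the files we are about to create are laid out (the paths match the comments in each snippet below):
~/ProjectRoot
├── .env
├── docker-compose.yml
├── docker
│   ├── database
│   │   ├── Dockerfile
│   │   └── seed.sql
│   └── webserver
│       └── Dockerfile
└── src
    ├── app.js
    └── package.json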
The Database
First, let’s start by defining the database container with our Dockerfile. This file defines which Docker image we are using as a base and copies the files we need into the container. In this case, the base image we are using, mariadb:10.4.12, takes the files copied into /docker-entrypoint-initdb.d and runs the database scripts alphabetically to create our database schema and add seed data used for testing.
# ~/ProjectRoot/docker/database/Dockerfile
FROM mariadb:10.4.12
# Copying files to show how you would copy from the host machine into the container.
COPY ./docker/database/seed.sql /docker-entrypoint-initdb.d/1_schema.sql
Now let’s define a simple SQL script containing a table definition and some information to be stored.
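-- ~/ProjectRoot/docker/database/seed.sql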
DROP TABLE IF EXISTS users;
CREATE TABLE users (
  id INTEGER NOT NULL AUTO_INCREMENT,
  name VARCHAR(60) NOT NULL,
  password_hash VARCHAR(70),
  salt VARCHAR(20),
  PRIMARY KEY(id)
);
INSERT INTO users (name, password_hash, salt) VALUES ('Test User', '258ee75ab394cbedf7d2505cbdb7d01f4015f11159a305c139f8bfb46468b15d', 'salt123');
The Node Server
We will need three files to get our Node server up and running. First is the Docker image definition for the Node server.
# ~/ProjectRoot/docker/webserver/Dockerfile
FROM node:12.18.1
# Switch to our app directory
WORKDIR /app
# Copy in our package.json for our app
COPY ./src/package.json package.json
# Install our dependencies
RUN npm install
# Start it up
CMD [ "node", "app.js" ]
Next, we need our simple “Hello World” Express app.
// ~/ProjectRoot/src/app.js
const express = require('express');
const app = express();
const port = 8080;
app.get('/', (req, res) => {
  res.send('Hello World!');
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
And finally, our package file, located at ~/ProjectRoot/src/package.json:
{
  "name": "src",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "author": "",
  "license": "ISC",
  "dependencies": {
    "express": "^4.17.1"
  }
}
Docker Compose
To run our containers so that they work together, we use Docker Compose. Docker Compose makes building and running containers a breeze. The docker-compose.yml file below shows what that looks like, with comments calling out some essential aspects.
# ~/ProjectRoot/docker-compose.yml
version: '3.5'
services:
  server:
    # Create an image from the given Dockerfile.
    build:
      context: .
      dockerfile: ./docker/webserver/Dockerfile
    # Give the image a name
    image: test-docker-server
    # Command to run on startup
    command: [ 'node', 'app.js' ]
    # Define the volumes
    volumes:
      - type: bind
        source: ./src
        target: /app
      - type: volume
        source: test-node-modules
        target: /app/node_modules
    # Port mapping (HOST:CONTAINER)
    ports:
      - "127.0.0.1:${LOCAL_SERVER_PORT}:8080"
    depends_on:
      - db
    container_name: test-webserver
  db:
    build:
      context: .
      dockerfile: ./docker/database/Dockerfile
    image: test-docker-db
    # Define the environment variables to be used in the test-db container
    # Docker Compose looks for these variables in the .env file
    environment:
      - MYSQL_ROOT_PASSWORD=${MYSQL_ROOT_PASS}
      - MYSQL_USER=${MYSQL_USER}
      - MYSQL_PASSWORD=${MYSQL_PASS}
      - MYSQL_DATABASE=${MYSQL_DB}
    volumes:
      - type: volume
        source: test-data
        target: /var/lib/mysql
    ports:
      - "127.0.0.1:${LOCAL_DB_PORT}:3306"
    container_name: test-db
volumes:
  test-data:
  test-node-modules:
We can define environment variables, denoted by ${VAR_NAME}, in a .env file. This allows us to check the docker-compose.yml file into version control, while each developer can create their own passwords and decide which ports the application runs on.
# ~/ProjectRoot/.env
# Environment variables for use by Docker Compose
MYSQL_ROOT_PASS=rootpassword123
MYSQL_USER=testuser
MYSQL_PASS=testuserpassword!
MYSQL_DB=test-database
LOCAL_DB_PORT=3306
LOCAL_SERVER_PORT=8080
There is a decent amount to unpack here, and I urge you to read the Docker Compose documentation to understand it fully, but I want to point out a few things from the docker-compose.yml file that I believe are critical to understand.
- The volume definitions with the volume type allow files written to those locations to persist even when a container is stopped and recreated. Otherwise that data would be lost: anything stored in the database would need to be recreated, and the node modules would need to be reinstalled each time the containers are started.
- The volume definitions with the bind type take the source directory and mount it directly into the container. Any changes that you make in that directory are reflected in the Docker container and vice versa, so you can access any data created in the container directly on the host machine. It is also great for development, since the image doesn't need to be rebuilt just to copy in the source files each time a change is made.
- The port mapping definitions that we set normally follow the convention "127.0.0.1:<HOST_PORT>:<CONTAINER_PORT>". This way the running application is only accessible from the host machine, not to anyone else sharing the network, or even around the world if you are working on a machine reachable via public DNS. This is important because passwords set for development tend to be on the weaker side.
- A container network is set up automatically, every container is added to it, and each container can reach the others by a hostname that matches its service name. For example, the database is accessible to the web server container at mysql://db:3306/test-database because db is the name of the service (see the sketch after this list).
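To make that last point concrete, here is a minimal sketch of how the Express app could query the database across the Compose network. It is not one of the files defined above: it assumes you add the mysql2 package to package.json, and it hardcodes the credentials from the .env file for brevity; in practice you would pass them into the server container as environment variables.
// Hypothetical ~/ProjectRoot/src/db.js -- not part of the files above
const mysql = require('mysql2/promise'); // assumes mysql2 was added to package.json

async function getUsers() {
  // 'db' resolves to the database container because it is the Compose service name
  const connection = await mysql.createConnection({
    host: 'db',
    port: 3306,                    // the container port, not LOCAL_DB_PORT
    user: 'testuser',              // MYSQL_USER from .env
    password: 'testuserpassword!', // MYSQL_PASS from .env
    database: 'test-database',     // MYSQL_DB from .env
  });
  const [rows] = await connection.query('SELECT * FROM users');
  await connection.end();
  return rows;
}

module.exports = { getUsers };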
Running the Containers
Now that it is all configured, running the containers with the web server and database is as simple as navigating to the directory containing the docker-compose.yml file and running the command:
docker-compose up
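If you prefer to keep your terminal free, docker-compose up -d runs the containers in the background, and docker-compose down stops and removes them when you are done (the named volumes stick around, so your data persists).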
And that’s it! You should now be able to:
- Navigate to http://localhost:8080 in the browser and see the running web server.
- See that the database is up and running with:
  - The command docker exec -it test-db mysql -utestuser -p test-database
  - Typing in the password testuserpassword!
  - And querying for the user in the users table: SELECT * FROM users;
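If everything worked, that query returns the single Test User row that was seeded from 1_schema.sql.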
Final Note
Containerization is important because it standardizes projects for all developers and opens up better possibilities for DevOps: setting up a full environment can be as simple as executing a single command. It also reduces workstation requirements, letting us develop, execute, and test a web application locally without needing the internet. While containerization has some disadvantages, and not all projects can benefit from it, the number of tasks that gain from using containers more than makes up for the time spent learning how to use them correctly.