Open source Java projects: Docker

03.11.2015
Docker is an open platform for building, shipping, and running distributed applications. Dockerized applications can run locally on a developer's machine, and they can be deployed to production across a cloud-based infrastructure. Docker lends itself to rapid development and enables continuous integration and continuous deployment like almost no other technology does. In short, Docker is a platform that every developer should be familiar with.

This installment of Open source Java projects introduces Java developers to Docker. I'll explain why it's important to developers, walk you through setting up and deploying a Java application to Docker, and show you how to integrate Docker into your build process.

A little over a decade ago, software applications were large and complex things, deployed to large machines. In the Java world, we developed enterprise archives (EARs) that contained both Enterprise JavaBeans (EJB) and web components (WARs), then we deployed them to large application servers. We did everything that we could to design our applications to run optimally on large machines, maximizing all of the resources available to us.

In the early 2000s, with the advent of the cloud, developers began using virtual machines and server clusters to scale out applications to meet user demand. Applications deployed virtually had to be designed quite differently from the monoliths of years past. Lighter weight, service-oriented applications were the new standard. We learned to design software as a collection of interconnected services, with each component being as stateless as possible. The concept and implementation of scalable infrastructure transformed; rather than depend on the vertical scalability of a single large machine, developers and architects started thinking in terms of horizontal scalability: how to deploy a single application across numerous lightweight machines.

Docker takes this virtualization a step further, providing a lightweight layer that sits between the application and the underlying hardware. Docker runs the application as a process on the host operating system. Figure 1 compares a traditional virtual machine to Docker.

In a traditional virtual machine setup, a hypervisor runs on the host operating system. The hypervisor, in turn, runs a full guest operating system inside each virtual machine. The guest operating system then hosts the binaries and libraries required to run an application.

Docker, on the other hand, provides a Docker engine, which is a daemon that runs on the host operating system. Containers share the host's kernel, so operating system calls made inside a Docker container are serviced directly by the host operating system rather than by a guest operating system. A Docker image, which is the template from which Docker containers are created, contains a bare-bones operating system layer, and only the binaries and libraries required to run an application.

The differences might seem subtle, but in practice they are profound.

When we look at the operating system in a virtual machine, we see the virtual machine's resources, such as its CPU and memory. When we run a Docker container, we directly see the host machine's resources. I liken Docker to a process virtualization platform rather than a machine virtualization platform. Essentially, your application is running as a self-contained and isolated process on the host machine. Docker achieves isolation by leveraging a handful of Linux constructs, such as cgroups and namespaces, to ensure that each process runs as an independent unit on the operating system.

Because Dockerized applications run much like processes on the host machine, their design differs from that of applications that run on a virtual machine. To illustrate, we might normally run Tomcat and a MySQL database on a single virtual machine, but Docker would have us run the app server and database in their own, respective Docker containers. This allows Docker to better manage the individual processes as self-contained units on the host operating system. It also means that in order to effectively use Docker, we need to design our applications as fine-grained services, such as microservices.

In a nutshell, microservices is a software architectural style that facilitates a modular approach to system building. In a microservices architecture, complex applications are composed of smaller, independent processes. Each process performs one or more specific tasks, communicating with other processes via language-independent APIs.

Microservices are very fine-grained, highly decoupled services that perform a single function, or a collection of related functions, very well. For example, if you are managing a user's profile and shopping cart, rather than packaging them together as a set of user services, you might opt to define them separately, as user profile services and user shopping cart services. In practical terms, building microservices means building web services, most commonly RESTful web services, and grouping them by functionality. In Java, we will package these as WAR files and deploy them to a container, such as Tomcat, then run Tomcat and our services inside a Docker container.

Before we dive into Docker, let's get your local environment set up. If you are running Linux, you can install Docker directly and start running it. For those of us using Windows or a Mac, Docker is available through a tool called the Docker Toolbox, which installs a virtual machine (using Oracle's VirtualBox technology) that runs Linux with the Docker daemon. We can then use the Docker client to execute commands that are sent to the daemon for processing. Note that you won't be managing the virtual machine; you'll just be installing the toolbox and executing the docker command-line tool.

Start by downloading Docker with the appropriate Mac, Windows, or Linux setup instructions.

I use a Mac, so I downloaded the Mac version of Docker Toolbox and ran the installation file. Once the install completed, I ran the Docker Quickstart Terminal, which started the VirtualBox image and provided a command shell. The setup should be more or less the same for Windows users, but see the Windows instructions for more information.

Before we start using Docker, take a minute to visit DockerHub, the official repository for Docker images. Explore DockerHub and you'll find that it hosts thousands of images, both official ones and many built by independent developers. You'll find base operating systems like CentOS, Ubuntu, and Fedora as well as configured images for Java, Tomcat, Jetty, and more. You can also find almost any popular application out-of-the-box, including MySQL, MongoDB, Neo4j, Redis, Couchbase, Cassandra, Memcached, Postgres, Nginx, Node.js, WordPress, Joomla, PHP, Perl, Ruby, and so on. Before you build an image, make sure it's not already on DockerHub!

As an exercise, try running a simple CentOS image. Enter the following command in your Docker Toolbox command prompt:

$ docker run -it centos

The docker command is your primary interface for communicating with the Docker daemon. The run directive tells Docker to download the specified image (if it isn't already on your computer) and run it. Alternatively, you can download an image without running it by using the pull directive. The command takes two flags: -i tells Docker to run the container in interactive mode, and -t tells it to allocate a TTY for the shell. Note that unofficial images are named with the convention username/image-name, while official images are named without a username, which is why we only need to specify "centos" as the image we want to run. Finally, you can specify a version of the image to run by appending :version-number to the end of the image name, such as centos:7. Each image defines a latest version that is used by default; in the case of CentOS, the latest version at the time of writing is 7.
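As a minimal sketch of these naming rules, the shell functions below wrap the three variants just described: running the default version, pulling without running, and running a pinned version. The function names are my own illustration and are not part of Docker; only the docker run and docker pull subcommands and the -i and -t flags come from the Docker command line.

```shell
# Hypothetical helper names; the docker subcommands and flags are real.

# Run the official CentOS image interactively (-i) with a terminal (-t).
# No version is given, so Docker uses the image's "latest" version.
run_centos() {
  docker run -it centos
}

# Download an image without starting a container from it.
pull_image() {
  docker pull "$1"
}

# Run a specific version by appending :version to the image name.
run_centos_version() {
  docker run -it "centos:$1"
}
```

For example, run_centos_version 7 is equivalent to typing docker run -it centos:7.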

After running $ docker run -it centos you should see Docker download the image, after which it will present a shell prompt inside the new container.

Because we ran this container in interactive mode, it presented us with a root command shell. Browse around the operating system for a little bit and then exit by executing the exit command.

You can see all images that you have downloaded by executing docker images:

$ docker images

You can see that I have the latest versions of CentOS and Tomcat, as well as Java 8.

Starting a Tomcat instance inside Docker is a little more complex than starting the CentOS image. Issue the command:

$ docker run -d -p 8080:8080 tomcat

In this example, we're running Tomcat as a daemon process, using the -d argument. The -p argument exposes port 8080 on our Docker container as port 8080 on our Docker host (the VirtualBox virtual machine). When you run this command, Docker prints a long hexadecimal string.

This horribly long hexadecimal number is the container ID, which we will use in a minute. You can see all the running containers by executing docker ps:

$ docker ps

You'll notice that the container ID column shows the first 12 characters of the ID above. This ID is painful to type, so Docker allows you to specify just enough of it in commands to uniquely identify the container. For example, you could specify "bdb" and that would be enough for Docker to uniquely identify this instance. In order to see when Tomcat has finished loading, you would typically tail the catalina.out file. The alternative in the Docker world is to view the standard output using the docker logs command. Specify the -f argument to follow the logs:

$ docker logs -f bdb
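To make the relationship between the full and short container IDs concrete, here is a small sketch. The function names are hypothetical, and the sample ID in the usage note is made up for illustration (it merely starts with the "bdb" prefix used above); only docker logs -f is an actual Docker command.

```shell
# Print the 12-character short form of a full container ID,
# which is the form that docker ps displays.
short_id() {
  printf '%s\n' "$1" | cut -c1-12
}

# Follow a container's standard output. Any prefix of the ID
# that is unique among your containers is accepted.
follow_logs() {
  docker logs -f "$1"
}
```

Calling short_id bdb0123456789abcdef0123 prints bdb012345678, and follow_logs bdb follows the logs of the container whose ID starts with "bdb".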

Exit the log tailing by pressing Ctrl-C.

To test Tomcat, you need to find the address of your VirtualBox host. When you started the Docker Quickstart Terminal, you should have seen a line reporting the IP address of the Docker machine.

Alternatively, you can look at the DOCKER_HOST environment variable to find the machine's IP address:

$ echo $DOCKER_HOST
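If you would rather not pick the address out by eye, the following sketch extracts it. It assumes DOCKER_HOST has the form tcp://IP:PORT, which is how Docker Toolbox normally sets it; the function names and the sample address are my own and purely illustrative.

```shell
# Extract the IP address from a value of the form tcp://IP:PORT.
docker_host_ip() {
  host_port="${1#tcp://}"            # strip the leading scheme
  printf '%s\n' "${host_port%%:*}"   # strip the colon and port
}

# Build the URL for a service published on port 8080 of the Docker host.
tomcat_url() {
  printf 'http://%s:8080/\n' "$(docker_host_ip "$1")"
}

# 192.0.2.10 is a made-up documentation address; pass "$DOCKER_HOST" in practice.
tomcat_url "tcp://192.0.2.10:2376"   # prints http://192.0.2.10:8080/
```

In a Docker Quickstart Terminal you would call tomcat_url "$DOCKER_HOST" and paste the result into your browser.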

Open a browser window to port 8080 on the Docker host, substituting your host's IP address: http://DOCKER_HOST_IP:8080/

You should see the standard Tomcat homepage.

Before we stop our instance, use the docker command line tool to learn a little more about our processes:
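The original commands for this step are not reproduced here, so the following is only a sketch of three standard docker subcommands that report on a running container. Whether these are the ones originally shown is an assumption on my part, and the wrapper function names are hypothetical.

```shell
# Stream live CPU, memory, and network usage for a container.
container_stats() {
  docker stats "$1"
}

# List the processes running inside a container.
container_processes() {
  docker top "$1"
}

# Print the container's full configuration and state as JSON.
container_details() {
  docker inspect "$1"
}
```

Each function takes a container ID or unique prefix, for example container_stats bdb. Press Ctrl-C to leave the live statistics view.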

When you're finished, you can stop Tomcat by executing the docker stop command with your container ID: $ docker stop bdb.

You can confirm that Tomcat is no longer running by executing docker ps again to verify that it is no longer listed. You can see the history of all the containers you have run, including stopped ones, by executing docker ps -a:

$ docker ps -a

At this point you should understand how to find images on DockerHub, how to download and run instances, how to view running instances, how to view an instance's runtime statistics and logs, and how to stop an instance. Now let's turn our attention to how Docker images are defined.

A Dockerfile is a set of instructions that tells Docker how to build an image. It defines the starting point and what specific things to do to configure the image. Let's take a look at a sample Dockerfile and walk through what it is doing. Listing 1 shows the Dockerfile for the base CentOS image.

You'll note that most of this Dockerfile is comments. It has four specific instructions:
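Listing 1 is not reproduced here. As a rough sketch, a base image Dockerfile with four instructions could look like the one written by the script below. The maintainer and the root filesystem archive name are hypothetical placeholders, not the values from the real CentOS Dockerfile, and the choice of these four instructions is my assumption.

```shell
# Write the sketch into a scratch directory so no existing file is overwritten.
centos_sketch_dir=$(mktemp -d)

cat > "$centos_sketch_dir/Dockerfile" <<'EOF'
# Start from the empty base image.
FROM scratch
# Hypothetical maintainer; the real file names the image's maintainers.
MAINTAINER Example Maintainer <maintainer@example.com>
# Hypothetical archive name; ADD unpacks a local tar archive into the image root.
ADD rootfs.tar.xz /
# Default command to run when a container starts.
CMD ["/bin/bash"]
EOF
```

Note that MAINTAINER has since been deprecated in favor of LABEL, though it was standard when this article was written.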

Now that you have a sense of what a Dockerfile looks like, let's explore the official Tomcat Dockerfile. Figure 2 shows this file's hierarchy.

This hierarchy might not be as simple as you would have anticipated, but when we take it apart piece-by-piece it is actually quite logical. You already know that the root of all Dockerfiles is scratch, so seeing that as the root is no surprise. The first meaningful Dockerfile image is debian:jessie. The Docker official images are built from build packs, or standard images. This means that Docker does not need to reinvent the wheel every time it creates a new image, but rather it has a solid base from which to start building new images. In this case, debian:jessie is simply a Debian Linux image installed just like the CentOS image that we just looked at. It includes the three lines in Listing 2.

On top of that we see two additional installations: CURL and Source Code Management. The Dockerfile for buildpack-deps:jessie-curl is shown in Listing 3.

This Dockerfile uses apt-get to install curl and wget so that the image will be able to download software from other servers. The RUN directive tells Docker to execute the specified command while building the image. In this case it refreshes the package index (apt-get update) and then executes apt-get install to download and install curl and wget.

The Dockerfile for buildpack-deps:jessie-scm is shown in Listing 4.

This Dockerfile installs source code management tools, such as Git, Mercurial, and Subversion, following the same model as the jessie-curl Dockerfile we just looked at.

The Java Dockerfile is a little more complicated; it is shown in Listing 5.

Essentially, this Dockerfile runs apt-get install -y openjdk-8-jdk to download and install Java, with some added configuration to install it securely. The ENV directive sets system environment variables that will be needed during the installation.

Finally, Listing 6 shows the Tomcat Dockerfile.

Technically, Tomcat uses a Java 7 parent Dockerfile (the default or "latest" version of Java is 8, which is shown in Listing 5). This Dockerfile sets up the CATALINA_HOME and PATH environment variables using the ENV directive. It then creates the CATALINA_HOME directory by running the mkdir command. The WORKDIR directive changes the working directory to CATALINA_HOME. The RUN command executes a host of different commands in a single line:

Defining all of these instructions in one command means that Docker sees a single instruction, and caches the resulting image as such. Docker has a strategy for detecting when images need to be rebuilt, and each instruction is verified during the build process. It caches the result of each step as an optimization, so that if the last step in a Dockerfile changes, Docker is able to start from a mostly complete image and just apply that last step. The downside of specifying all of these commands in one line is that if any one of the commands changes, Docker must re-execute the entire instruction and rebuild everything that follows it.
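To make the trade-off visible, the sketch below writes two small Dockerfiles that install the same tools. They are my own illustration built on the debian:jessie base and the curl and wget packages mentioned earlier, not copies of any official Dockerfile.

```shell
# Write both sketches into a scratch directory.
cache_sketch_dir=$(mktemp -d)

# Style 1: one RUN instruction chaining several commands.
# Docker caches the result as a single step.
cat > "$cache_sketch_dir/Dockerfile.chained" <<'EOF'
FROM debian:jessie
RUN apt-get update \
    && apt-get install -y curl wget \
    && rm -rf /var/lib/apt/lists/*
EOF

# Style 2: one RUN instruction per command.
# Each step is cached separately, so changing the last line
# reuses the cached results of the lines above it.
cat > "$cache_sketch_dir/Dockerfile.split" <<'EOF'
FROM debian:jessie
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y wget
EOF
```

The split style is shown only to illustrate caching. Keeping apt-get update in its own cached step means later installs may run against a stale package index, which is one reason official Dockerfiles tend to chain these commands.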

The EXPOSE directive tells Docker that the container listens on a particular port when it starts. As you saw when we launched this image earlier, we needed to tell Docker which host port to map to the container port (the -p argument), and the EXPOSE directive is the mechanism for declaring the container ports that are available for mapping. Finally, the Dockerfile starts Tomcat by executing the catalina.sh command (which is in the PATH) with the run argument.

Building the Dockerfile from inception to Tomcat was a long process, so let me summarize all the steps so far:

At this point you should be a Dockerfile expert -- or at least somewhat dangerous! Next we'll try building a custom Docker image that contains our application.

Because this tutorial is more about deploying a Java application to Docker and less about the actual Java application, I have built a very simple Hello World servlet. You can access the project on GitHub. The source code is nothing special; it's just a servlet that outputs "Hello, World!" What's more interesting is the accompanying Dockerfile, shown in Listing 7.

It might not look like much, but you should already understand what the code in Listing 7 is doing:
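Listing 7 is not reproduced here. Based on the steps described in this section, a Dockerfile for this project could be as short as the sketch below. Treat it as an assumption rather than the project's actual file; the only facts it relies on are that the parent image is tomcat and that the official Tomcat image keeps its web applications in /usr/local/tomcat/webapps.

```shell
# Write the sketch into a scratch directory so nothing in your project is overwritten.
hello_sketch_dir=$(mktemp -d)

cat > "$hello_sketch_dir/Dockerfile" <<'EOF'
# Start from the official Tomcat image on DockerHub.
FROM tomcat
# Copy the contents of the local deploy directory (our WAR file)
# into Tomcat's webapps directory.
ADD deploy /usr/local/tomcat/webapps/
EOF
```

Because Tomcat automatically deploys any WAR file it finds in webapps, no further instructions are needed: the parent image already exposes port 8080 and starts Tomcat.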

Clone the project locally and build it using the Maven command:

This will create the file target/helloworld.war. Copy that file to the project's docker/deploy directory (which you will need to create). Finally, you need to build the Docker image from the Dockerfile. Execute the following command from the project's docker directory (which already contains the Dockerfile in Listing 7):

$ docker build -t lygado/docker-tomcat .

This command tells Docker to build a new image from the current working directory, which is indicated by the dot (.) with the tag (-t): lygado/docker-tomcat. In this case, lygado is my DockerHub username and docker-tomcat is the image name (you'll want to specify your own username). To see that the image has been built you can execute the docker images command:
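The copy and build steps can be sketched as two small shell functions, meant to be run from the project's root directory after the Maven build. The function names and the IMAGE_NAME variable are hypothetical; the paths and the docker build arguments are the ones described above.

```shell
# Image name from the text; replace "lygado" with your own DockerHub username.
IMAGE_NAME="lygado/docker-tomcat"

# Copy the built WAR file into docker/deploy, creating the directory if needed.
stage_war() {
  mkdir -p docker/deploy
  cp target/helloworld.war docker/deploy/
}

# Build the image from the docker directory; "." is the build context.
# The subshell keeps the directory change from leaking into your session.
build_image() {
  ( cd docker && docker build -t "$IMAGE_NAME" . )
}
```

Running stage_war && build_image repeats both steps each time you rebuild the WAR file.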

Finally, you can launch the image with the docker run command:

$ docker run -d -p 8080:8080 lygado/docker-tomcat

After the instance starts, you can access it through the following URL (be sure to substitute the IP Address of the VirtualBox instance on your machine):

Again, you can stop this Docker container instance with the docker stop INSTANCE_ID command.

Once you have built and tested your Docker image, you can push that image to your DockerHub account with the following command:

$ docker push lygado/docker-tomcat

This makes your image available to the world (or your private DockerHub repository if you opt for privacy), as well as to any automated provisioning tools you will ultimately use to deploy containers to production.

Next we'll integrate Docker into our build process, so that it can produce a Docker image that includes our application.

In the previous section we created a custom Dockerfile and deployed our WAR file to it. This meant copying the WAR file from our project's target directory to the docker/deploy directory and running docker from the command line. It wasn't much work, but if you are actively developing and want to modify your code and test it immediately afterwards, you might find this process tedious. Furthermore, if you want to run your build from a continuous integration (CI) server and output a runnable Docker image, then you will need to figure out how to integrate Docker with your CI tool.

So let's explore a more efficient process, where we build (and optionally run) a Docker image from Maven using a Maven Docker plug-in.

If you search the Internet you will find several Maven plug-ins that integrate with Docker, but for the purpose of this article I chose to use the following plug-in: rhuss/docker-maven-plugin. While it isn't an exhaustive comparison of all scenarios and plug-in providers, author Roland Huss provides a pretty good comparison of several contenders. I encourage you to read it, to help guide your decision about the best Maven Docker plug-in for your use case.

My use cases were:

The docker-maven-plugin accomplished these tasks, was very easy to use, and was easy to understand.

The plug-in itself is well documented, but as an overview it really consists of two main configuration components:

The build and run image configuration is defined in the plugins section of your POM file, shown in Listing 8.

As you can see, this configuration is pretty simple and consists of the following categories of elements:

Plug-in definition The groupId, artifactId, and version identify the plug-in to use.

Global configuration The dockerHost and certPath elements define the location of your Docker host, which is emitted when you start Docker, and the location of your Docker certificates. The Docker certificate file is available in your DOCKER_CERT_PATH environment variable.

Image configuration All images in your build are defined as image child elements under your images element. An image element has image-specific configuration, as well as build and run configuration (explained below). The core image-specific element is the name of the image you want to build. In our case this is my DockerHub username (lygado), the name of the image (tomcat-with-my-app), and the version of the Docker image (0.1). Note that you can use a Maven property for any of these values.

Image build configuration When you build an image, as we did with the docker build command, you need a Dockerfile that defines how to build it. The Maven Docker plug-in allows you to use a Dockerfile, but in this example we are going to build the Docker image from a Dockerfile that is built on-the-fly and resident in memory. Therefore, we specify the parent image in the from element, which in this case is tomcat, then we reference an assembly configuration.

The maven-assembly-plugin, provided by Maven, defines a common structure for aggregating a project's output with its dependencies, modules, site documentation, and other files into a single distributable archive, and the docker-maven-plugin leverages this standard. In this example, we opt to use the dir mode, which means that the files defined in the src/main/docker/assembly.xml file should be copied directly to the basedir on the Docker image. Other modes include Tar (tar), GZipped Tar (tgz), and Zipped (zip). The basedir specifies where the files should be placed in the Docker image, which in this case is Tomcat's webapps directory.

Finally, the descriptor tells the plug-in the name of the assembly descriptor, which will be located in the src/main/docker directory. This is a very simplistic example, so I encourage you to read through the documentation. In particular, you will want to familiarize yourself with the entrypoint and cmd elements, which allow you to specify the command to start the Docker image, the env element to specify environment variables, the runCmds element to execute commands just as you would in a proper Dockerfile, workdir to change the working directory, and volumes if you want to mount external volumes. In short, this plug-in exposes everything that you need to be able to build a Dockerfile, which means that all of the Dockerfile directives described earlier in the article are available through the plug-in.

Image run configuration When you run a Docker image using the docker run command, you may pass a collection of arguments to Docker. In this example we start our Docker instance with a command like: docker run -d -p 8080:8080 lygado/tomcat-with-my-app:0.1, so essentially we only need to specify our port mapping.

The run element allows us to specify all of our runtime arguments, so we specify that we should map port 8080 on our Docker container to port 8080 on our Docker host. Additionally, the run section allows us to specify volumes to bind (using the volumes element) and containers to link together (using the links element). Because the docker:start command is often used with integration tests, the run section includes the wait argument, which allows us to wait for a specified period of time, for a log entry, or for a URL to become available before continuing execution. This ensures that our image is running before we launch any integration tests.

The src/main/docker/assembly.xml file defines the files (or file in this case) that we want to copy to the Docker image. It is shown in Listing 9.

In Listing 9 we see a dependency set that includes our hello-world-servlet-example artifact, and we see that we want to output it to the . directory. Recall that in our POM file we defined a basedir, which specified the location of Tomcat's webapps directory; the outputDirectory is relative to that base directory. So in other words we want to deploy the hello-world-servlet-example artifact to Tomcat's webapps directory.
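Listing 9 is not reproduced here, so the script below writes a sketch of what such an assembly descriptor could look like, as consumed by the docker-maven-plugin. The com.example groupId is a hypothetical placeholder for the project's real groupId; the artifactId and the "." output directory come from the description above.

```shell
# Write the sketch into a scratch directory rather than into a real project.
assembly_sketch_dir=$(mktemp -d)

cat > "$assembly_sketch_dir/assembly.xml" <<'EOF'
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
  <dependencySets>
    <dependencySet>
      <includes>
        <!-- groupId:artifactId; com.example is a hypothetical groupId -->
        <include>com.example:hello-world-servlet-example</include>
      </includes>
      <!-- Relative to the basedir from the POM (Tomcat's webapps directory) -->
      <outputDirectory>.</outputDirectory>
    </dependencySet>
  </dependencySets>
</assembly>
EOF
```

In a real project this file would live at src/main/docker/assembly.xml, where the descriptor element in the POM expects to find it.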

The plug-in defines a set of Maven goals:

You can access the source code for the example application from GitHub, then build it as follows:

To build a Docker image you can execute the following command:

Once the image is built, you can see it in Docker using the docker images command:

You can run the container with the following command:

Now you should see it running with the docker ps command. It should be accessible through the same URL defined above:

Finally, you can stop the container with the following command:
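The Maven steps in this section can be sketched as the shell functions below. The docker:build, docker:start, and docker:stop goals belong to the docker-maven-plugin; the function names are hypothetical, and running package together with docker:build in a single invocation is my assumption about a sensible way to make the WAR file available to the image build.

```shell
# Compile and package the project, then build the Docker image from it.
build_docker_image() {
  mvn clean package docker:build
}

# Start a container from the image, applying the run configuration in the POM.
start_container() {
  mvn docker:start
}

# Stop the container that docker:start launched.
stop_container() {
  mvn docker:stop
}
```

A typical development loop is build_docker_image && start_container, followed by stop_container once you have finished testing.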

Docker is a container technology that virtualizes processes more than it does machines. It consists of a Docker client process that communicates with a Docker daemon running on a Docker host. On Linux, the Docker daemon runs directly as a process on the Linux operating system, whereas on Windows and Mac it starts a Linux virtual machine in VirtualBox and runs the Docker daemon there. Docker images contain a very lightweight operating system footprint, in addition to whatever libraries and binaries you need to run your application. Docker images are driven by a Dockerfile, which defines the instructions necessary to configure an image.

In this Open source Java projects tutorial I've introduced Docker fundamentals, reviewed the details of the Dockerfiles for CentOS, Java, and Tomcat, and showed you how to build a custom Docker image on top of Tomcat. Finally, we integrated the docker-maven-plugin into a build process and we built a Docker image from Maven. Being able to build and run a Docker container as part of our build makes testing easier, and it also allows us to build Docker images from a CI server that are ready for production deployment.

The example application for this article was very simple, but the steps you've learned are applicable to more complex enterprise applications. Happy docking!

(www.javaworld.com)

Steven Haines