Docker Part II - How to Build Dockerfiles and Images

Hi, I am Malathi Boggavarapu and welcome to my blog. This course is a continuation of my Docker series. Before going through it, please take a look at my previous post 'What is a Docker', which introduces you to the building blocks of Docker. This course will teach you the most important concepts, such as the Dockerfile and images, with examples, pictures and real-world use cases.

So let's get started and learn what a Dockerfile and a Docker image are.

A Dockerfile is a text file in which we put a set of instructions (commands) that Docker uses to create an image. See the picture below for the instructions that are available.

We build a Dockerfile by combining the instructions shown in the picture above.

So let's look at FROM in the picture above. The FROM instruction names the base image to use when building an image. For example, it lets you say 'I need the user space for Ubuntu', 'I need the user space for CentOS' or 'I need a Java image'. Other instructions do the work on top of that base: COPY handles requests like 'copy my jar file from here', RUN handles requests like 'do a yum install', and CMD sets the command that runs when a container starts from the image.
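
As a rough sketch of how these instructions divide the work, here is a small Dockerfile. The ubuntu:22.04 tag, the openjdk-17-jre-headless package and the target/app.jar path are assumptions chosen for illustration; none of them come from the picture above.

```dockerfile
# Base image: the Ubuntu user space (the 22.04 tag is an assumption).
FROM ubuntu:22.04

# RUN executes a command while the image is being built.
# apt-get is Ubuntu's counterpart to yum; the package is illustrative.
RUN apt-get update && apt-get install -y openjdk-17-jre-headless

# COPY brings a file from the build context into the image.
# target/app.jar is a hypothetical path.
COPY target/app.jar /opt/app/app.jar

# CMD is the default command a container runs when it starts.
CMD java -jar /opt/app/app.jar
```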

Now think about real-world use cases. Let's say you are running a DevOps organization and you want everyone to use a consistent development or deployment environment. You can create an image that comprises the base operating system image and the required packages, and then tell the team 'within our team we are going to use this image'. This reduces the impedance mismatch: there is no confusion among team members about which versions of software should be installed, because they are all in the image that was created.

We can have multiple versions of an image. We can say java:8 or java:9, and if you depend on Java 7 you can fall back to the older release with java:7. In Docker terms, this is called a tag, so we can ask 'what is the tag of my image?'. By default the tag is latest, so java means java:latest. You can use a different tag if you want. If you go to hub.docker.com, you will see the variety of tags that are assigned to each image.
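
In a Dockerfile, the tag is written after the image name in FROM. Here is a small sketch, assuming the ubuntu image and its 22.04 tag as the example; other images and tags published on hub.docker.com are referenced the same way.

```dockerfile
# With no tag, Docker would use the default tag, i.e. ubuntu:latest:
#   FROM ubuntu

# With an explicit tag, every build starts from the same release.
FROM ubuntu:22.04
CMD echo "Hello world"
```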

Now let's see how we build an image. We will build the classic "Hello World" image, which has only two instructions in its Dockerfile.

FROM ubuntu
CMD echo "Hello world"

In the above Dockerfile we say FROM ubuntu, so I am starting with a base operating system image. It means that the "Hello World" image requires the ubuntu image to be pulled (downloaded).
CMD is the command that the Docker container is going to run. If a Dockerfile has multiple CMD instructions, Docker ignores the earlier ones and uses only the last CMD.
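
To make the last-CMD rule concrete, here is a variation of the Hello World file with two CMD instructions. The second message is there only for illustration.

```dockerfile
FROM ubuntu

# Ignored: a later CMD in the same Dockerfile overrides this one.
CMD echo "Hello world"

# Used: this is the last CMD, so this is what the container runs.
CMD echo "Hello world again"
```

A container started from this image should print only "Hello world again".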

Suppose that over time you want to change "Hello world" to something else, such as "Hello world again". Then you need to build a new image, and you can tag it appropriately. If you don't tag it, the tag defaults to latest. The idea is that you can always push the latest version to Docker Hub so that users have access to it. Think of it like a SNAPSHOT version in Maven.

So let's take a look at a slightly more advanced sample Dockerfile.

FROM java
COPY target/hello.jar /usr/src/hello.jar
CMD java -cp /usr/src/hello.jar org.example.App

Here I am saying FROM java, so instead of a bare operating system image we are using java as the base image, which itself uses Debian as its operating system. Then we use the COPY command to copy hello.jar into the image. The last command is CMD. Because I am using FROM java, I can assume that the JDK is already installed and that java is already on the path. So I can simply say java -cp /usr/src/hello.jar org.example.App.

Dockerfile Best Practices

Some of the best practices to consider are as follows.

Containers should be ephemeral

When you are building an image, make sure that no state is stored in it. There are certain cases where you have to store state; databases, for example, are stateful by nature, and there are ways to deal with that. But when you are building a web application, make sure that it is stateless, because that way, if a container dies, we can bring it up anywhere else.

Use a .dockerignore file

Just like .gitignore, there is a .dockerignore file. It is used to exclude files that are not relevant to the build.

Avoid installing unnecessary packages

Install only the packages you really need, and keep the image lean.
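
One way to keep an install lean on Debian or Ubuntu based images is sketched below. It assumes the ubuntu:22.04 base image and uses curl as a stand-in for whatever package your application really needs.

```dockerfile
FROM ubuntu:22.04

# --no-install-recommends skips optional packages that apt would
# otherwise pull in alongside curl (curl is only a stand-in here).
# Removing /var/lib/apt/lists drops the package index after use.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```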

Run only one process per container

A very strongly recommended practice is to run only one process per container. People do all kinds of funky things with CMD, such as invoking a script that fires up both a database and an app server. If that container goes down, everything in it goes down together, so it becomes a single point of failure. It is therefore highly recommended to run only one process per container.

Minimize the number of layers

When Docker builds an image it creates multiple layers; each instruction in the Dockerfile essentially becomes a layer. Try to minimize the number of layers: the fewer layers there are, the faster the image loads and the more efficiently it runs.
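
A common way to reduce layers is to chain related commands into a single RUN. The sketch below assumes the ubuntu:22.04 base image, and the /opt/app/message.txt file is made up for the example.

```dockerfile
FROM ubuntu:22.04

# Written as separate instructions, each RUN would add its own layer:
#   RUN mkdir -p /opt/app
#   RUN echo "Hello world" > /opt/app/message.txt
#   RUN chmod 444 /opt/app/message.txt

# Chained with &&, the same three steps produce a single layer.
RUN mkdir -p /opt/app \
 && echo "Hello world" > /opt/app/message.txt \
 && chmod 444 /opt/app/message.txt

CMD ["cat", "/opt/app/message.txt"]
```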

These are some of the best practices, but you can always follow the link below to learn more.

https://docs.docker.com/develop/develop-images/dockerfile_best-practices/

Union File System

The union file system is another important thing to understand from a Docker perspective. It is a standard Linux concept. To see what I mean, look at the java 8 image in the picture below; java:8 uses the tag that we discussed earlier in this session. If we look at the Dockerfile of the java 8 image, it has FROM buildpack-deps:jessie-scm. If we then look at the Dockerfile for buildpack-deps:jessie-scm, it has another FROM, and eventually you will realize that the java 8 image is actually built on the debian:jessie release. So you can literally track back to your parent Dockerfile, much like a parent POM, and see exactly what is happening. All of these layers are read-only, and they are combined to give a single union view of the entire file system. That is what makes Docker really fast to boot up and start.
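
The union view can be seen on a small scale with a sketch like the one below, where two layers write the same path. The ubuntu:22.04 base and the /greeting.txt file are assumptions made up for this example.

```dockerfile
FROM ubuntu:22.04

# Layer A: creates /greeting.txt.
RUN echo "Hello world" > /greeting.txt

# Layer B: writes the same path again. Layer A is not modified;
# the union view simply shows the version from the upper layer.
RUN echo "Hello world again" > /greeting.txt

CMD ["cat", "/greeting.txt"]
```

A container started from this image sees a single /greeting.txt containing "Hello world again"; the earlier version still exists in the lower layer but is hidden by the union view.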

Now let's say you have downloaded the Couchbase image. See the picture below. Let's discuss it a bit so that you have a clear understanding of Docker images and Dockerfiles.

Using docker images couchbase, you can see the Couchbase images that are available. If you want to see the history of the image, docker history shows the commands that were executed to build it under the CREATED BY column, and on the right side it shows the size of each layer, for example 135MB for the base operating system, 200MB for the database, and so on.

Let's also take a look at the Docker image for Java. In the picture below, you can see all the java images and the history of the commands that were executed to build the java image. One important point to understand is that images are stored on the Docker host. None of the images are on the client. For development purposes I might have the client and host on the same machine, which is OK; that is the flexibility you have because the client and host are decoupled. Today my host is here and tomorrow it can be running on some other machine; it doesn't matter. It scales very well that way.

So that's all about the Dockerfile and Docker images. I hope you enjoyed learning about them. Please post if you have any questions. Interactive sessions are a great way to learn and explore more details.

Happy Learning!

In my next post, I am going to discuss the following Docker concepts.

- Installing Docker, Building an image and Running containers
- Docker Option and --help
- Docker Machine and Docker Toolbox
- Docker workflow for a developer
- Docker Maven Plugin

A Blog by Malathi Boggavarapu