C1 - Docker Image for Python

Mar 23th, 2024 at Canberra, ACT, Australia

Recently I packed up a python program into a Docker image. It took me roughly 3hrs to set up a fully functional docker image for distribution (thanks to docker cached layers!!!).

Here, I’m listing the necessary steps that might help some newbies like me.

Directory Hierarchy

As an example, suppose we have a python program prog consists python scripts under src/, libraries under lib/, together with requirements.txt for pip (or environment.yml for conda), we can format prog as the following.

    prog/
        |-src/
            |-main.py
        |-lib/
        |-environment.yml
        |-Dockerfile

Dockerfile

Dockerfile is a configuration file to build a docker image, where we can find the reference here. Basically, here are several keywords might be helpful for creating the Dockerfile.

An example of Dockerfile is given as below.

FROM ubuntu:latest
MAINTAINER WHOAM I <whoami@where.com>

RUN apt-get -y update
RUN apt-get -y install curl

# check archs
RUN echo $(uname -a)
RUN mkdir -p /opt/miniconda3

# miniconda
RUN curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o /opt/miniconda3/miniconda.sh

RUN ln -s md5sum /bin/md5
RUN bash /opt/miniconda3/miniconda.sh -b -u -p /opt/conda;
RUN echo "export PATH=/opt/conda/bin:$PATH" > /etc/profile.d/conda.sh
ENV PATH "/opt/conda/bin:${PATH}"

COPY environment.yml /opt/environment.yml

# install all dependencies
RUN conda env create -f /opt/environment.yml

COPY ./prog /opt/prog

# clean up
RUN rm -rf /opt/miniconda3/
RUN apt-get remove -y --purge curl
RUN apt-get autoremove -y
RUN apt-get clean -y

ENTRYPOINT conda run -n venv python /opt/prog/src/main.py

Build the image

After creating the Dockerfile, we can build the image via docker build. Before doing so, make sure docker application is started already (use docker ps to check). To build the image, we can traverse to the directory prog/ and execute the command.

$ docker build --platform=<platform> -t <image_name> .

Where <platform> should be the destination platform, such as linux/amd64, linux/arm64, or darwin/amd64. We can supply multiple and connected by comma as well. <image_name> is optional, and a random string will be provided if <image_name> is not given. We can check the image name and id via docker image ls.

Run the image

Right now, we have created a docker image. we can run docker buildx prune with y to cleanup all the caches generated during the image build step. Then, we can run docker run to create a container and load the image, as follows.

docker run -w /opt -v $(pwd)/src:/opt/dst -it --rm <image_name> args

Here, -w/--workdir specifies the docker working directory, -i/--interactive to keep STDIN open even if not attached, -t/--tty to allocate a pseudo-TTY, and --rm to delete the created container after completion.

-v/--volume is the option to bind mount the volume from local directory to a directory within the docker. As an example, -v $(pwd)/src:/opt/dst bind mount the local directory $(pwd)/src to the docker directory /opt/dst within the container. When we create/modify/delete a file within one of them, it will be reflected/shared by the other. In this case, when our program generates output files, we can place them into /opt/dst. After program completion, they can be found under $(pwd)/src. We can also put input files into $(pwd)/src before docker run as well. args are the arguments passing to the command specified after ENTRYPOINT in Dockerfile, which are served as command-line arguments.

Be mindful that the filesystem in docker is independent to the local machine. Thus, any shared files or directories must be bind mounted via -v.

Now, you should have a portable docker image, enjoy it!


Built with Jekyll, using the Researcher theme.