A full example repository can be found on my GitHub here.


I often find myself exploring data using Python, Jupyter Lab, Pandas, and various graphing libraries. This has caused me to want a lightweight, easy-to-use, and portable approach to standing up these tools. In the past, I have found myself using tools like pyenv to install a desired version of Python for each project. While there is nothing wrong with this approach, I generally prefer to have my project packaged up with Docker and Docker Compose. This blog post shows you how I’ve achieved just that.

Right out of the gate, we have to decide whether to build our own base Docker image or leverage one that someone has already built and published. Fortunately, we do not have to create our own from scratch, as the Jupyter project has created a number of public images that you can choose from here. In this example, I’ve decided to use the image titled jupyter/minimal-notebook, which contains:

  • Everything in jupyter/base-notebook
  • Common useful utilities like curl, git, nano (actually nano-tiny), tzdata, unzip, and vi (actually vim-tiny)
  • TeX Live for notebook document conversion

If I were going to be leveraging a GPU, I’d look at the CUDA-enabled variants, as they’ll have CUDA preinstalled.

From there, we’re able to start writing our docker-compose.yaml file, which will be the meat of our configuration. Let’s look at the entire file and then break it down:

version: "3"

services:
    jupyterlab:
        image: jupyter/minimal-notebook
        volumes:
        - .:/home/jovyan/work
        - ./configure_environment.sh:/usr/local/bin/before-notebook.d/configure_environment.sh
        ports:
        - 8888:8888

In this simple file, we define a single service called jupyterlab, and point to the very image we described above. We take advantage of two volumes entries:

  1. To ensure that the files we’re working on from our host OS are accessible in the container, we volume the entire project directory into the image’s default user’s home directory.
  2. To tie into this image’s Startup Hooks, we define and volume in a bash script called configure_environment.sh. This shell script will automatically be executed by the container before the Jupyter Lab process is started, allowing us to perform some pre-execution steps, like installing our dependencies and preparing environment variables. That bash script looks like this:
#!/bin/bash
set -eo pipefail
IFS=$'\n\t'
# From: http://redsymbol.net/articles/unofficial-bash-strict-mode/
# Not using `set -u`, as this will not play nice with the image's usage of `JUPYTER_ENV_VARS_TO_UNSET`

# To be run inside the Docker container automatically.
# Ensures that the dependencies are installed when the container spins up

pip install -r ~/work/requirements.txt
export PYTHONPATH="${PYTHONPATH}:/home/jovyan/work"
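
Because the work directory is now on PYTHONPATH, modules saved next to your notebooks can be imported directly. As a minimal sketch, assuming a hypothetical helpers.py with a load_data function saved in the work directory, a notebook cell could do:

# helpers.py is a hypothetical module saved in the work directory; the import
# resolves because the startup hook appended /home/jovyan/work to PYTHONPATH
import helpers

df = helpers.load_data()  # load_data is likewise a hypothetical function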

Finally, we describe the ports we want to use, mapping port 8888 on the host OS to port 8888 in the container, which is the port Jupyter Lab listens on.

We also expect to have a requirements.txt file describing the dependencies we’ll leverage in our notebook. For example:

pandas==2.2.1
python-dotenv==1.0.1
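
As a quick sanity check that these dependencies were installed, a first notebook cell might look like the sketch below; the data.csv file and the APP_ENV variable are hypothetical placeholders for whatever your project actually uses:

import os

import pandas as pd
from dotenv import load_dotenv

# Read variables from a hypothetical .env file in the work directory
load_dotenv()

# Load a hypothetical data.csv sitting next to the notebook
df = pd.read_csv("data.csv")
print(df.head())

# APP_ENV is a hypothetical variable defined in that .env file
print(os.environ.get("APP_ENV"))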

To start our container, we can just run $ docker compose up, which will emit a message containing a URL with an embedded token to authenticate with our service. For example, it might look like this:

jupyterlab-1  |     To access the server, open this file in a browser:
jupyterlab-1  |         file:///home/jovyan/.local/share/jupyter/runtime/jpserver-7-open.html
jupyterlab-1  |     Or copy and paste one of these URLs:
jupyterlab-1  |         http://c6bcfb18cb93:8888/lab?token=somefaketoken
jupyterlab-1  |         http://127.0.0.1:8888/lab?token=somefaketoken

Then you’d navigate to http://127.0.0.1:8888/lab?token=somefaketoken or http://localhost:8888/lab?token=somefaketoken in your web browser.

Disclaimers

  1. You’ll notice that you have one directory labeled work. Please be aware that only the files you write into this directory will actually be saved, as this is the only directory that is volumed into the container by our docker-compose.yaml file.
  2. If you want to install a new dependency, the easiest way is to modify requirements.txt and restart the container. If that is too heavy-handed, you can modify requirements.txt, which sets you up for future runs, and then install the dependency for the current session from inside the running Jupyter Lab instance, as shown below.
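
For example, you could run pip install from a terminal inside Jupyter Lab; the same can be done from a notebook cell with IPython’s %pip magic (requests here is just a stand-in package):

# Installs into the environment the running kernel uses; this lasts only for
# the current container session, so also add the package to requirements.txt
%pip install requests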

Conclusion

Overall, this should provide a simple, non-production approach to using Jupyter Lab and Python for data exploration.


A full example repository can be found on my GitHub here.