
Last updated: June 5, 2025
When building Docker images for Python applications, waiting for packages to reinstall on every Docker image build can become frustrating. For instance, in regions with slower internet connections, it can derail productivity. To resolve this and avoid unnecessary package reinstallation, we need to understand layer caching in Docker.
In this tutorial, we’ll first discuss the problem, create a simple project to demonstrate how Docker rebuilds images, and finally, improve our Dockerfile.
We want to prevent the reinstallation of Python packages on every Docker image build when the packages haven’t changed.
Here’s a Dockerfile structure that leads to redundant installations:
FROM python:3.10-slim
WORKDIR /app
ADD . /app
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
When we make a code change and rebuild the image, Docker re-executes all steps starting from ADD . /app. As a consequence, the cache for pip install -r requirements.txt is invalidated, leading to the reinstallation of all packages — even if they haven’t changed.
The combination of this Docker structure and iterative development results in slower builds because of repeated pip install, wasted bandwidth and CPU resources, and poor caching behavior.
Let’s explore how Docker builds images to understand why separating COPY instructions is crucial.
Docker builds images in layers: each instruction, such as COPY, RUN, or ADD, creates a new layer. If a layer hasn't changed, Docker reuses the cached version of that layer instead of rebuilding it. However, if an instruction's input changes (for instance, the files being copied), Docker invalidates that layer and every layer after it.
That’s why ADD . /app forces the reinstallation of packages since it includes all the source code and files. Any small change to a source file invalidates the cache for the pip install step, forcing package reinstallation. Even renaming a Python file or updating a comment inside a file can break the cache for subsequent layers.
With this in mind, developers can structure Dockerfiles more intentionally for maximum build efficiency. Thus, the instruction COPY requirements.txt ./ helps Docker avoid unnecessary rebuilds unless the dependency file itself changes.
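To make the invalidation rule concrete, here's a small Python sketch that models how layer cache keys cascade. This is not Docker's actual implementation; it's a simplified model where each layer's key depends on its parent layer, the instruction text, and a digest of any copied files:

```python
import hashlib

def layer_id(parent_id: str, instruction: str, input_digest: str = "") -> str:
    """A cache key depends on the parent layer, the instruction text,
    and a digest of any files the instruction copies in."""
    data = f"{parent_id}|{instruction}|{input_digest}".encode()
    return hashlib.sha256(data).hexdigest()[:12]

def build(layers):
    """Compute a cache key per layer; a change anywhere
    changes every key after it."""
    ids, parent = [], "base"
    for instruction, input_digest in layers:
        parent = layer_id(parent, instruction, input_digest)
        ids.append(parent)
    return ids

# First build: ADD copies all source files, digest "v1"
first = build([
    ("WORKDIR /app", ""),
    ("ADD . /app", "v1"),
    ("RUN pip install -r requirements.txt", ""),
])

# Second build: one source file changed, so the ADD digest differs
second = build([
    ("WORKDIR /app", ""),
    ("ADD . /app", "v2"),
    ("RUN pip install -r requirements.txt", ""),
])

print(first[0] == second[0])  # True: the WORKDIR layer is reused
print(first[1] == second[1])  # False: the ADD layer is invalidated
print(first[2] == second[2])  # False: the pip install layer reruns too
```

Because the pip install layer's key includes the (now changed) ADD layer as its parent, it's invalidated even though its own instruction text is identical.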
Let’s set up a simple Python project to walk through the problem and its solution:
$ tree flask-demo
flask-demo
├── app.py
├── Dockerfile
└── requirements.txt
0 directories, 3 files
The tree command above displays the structure of our project.
First, let’s create the file app.py:
$ cat app.py
from flask import Flask
app = Flask(__name__)
@app.route("/")
def home():
    return "Hello, Docker!"

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
After this, let’s create the file Dockerfile:
$ cat Dockerfile
FROM python:3.10-slim
WORKDIR /app
# Add source code and requirements in one step
ADD . /app
# Install dependencies
RUN pip install -r requirements.txt
CMD ["python", "app.py"]
Finally, let’s create requirements.txt:
$ cat requirements.txt
flask==2.3.2
Once we create the project files and paste the necessary content, let’s build the image:
$ docker build -t flask-demo .
[+] Building 72.5s (9/9) FINISHED docker:desktop-linux
...
=> [1/4] FROM docker.io/library/python:3.10-slim@sha256:49454d2bf78a48f217eb25ecbcb4b5face313fea6a6e82706465a6990303ada2 47.0s
...
=> [2/4] WORKDIR /app 1.4s
=> [3/4] ADD . /app 1.1s
=> [4/4] RUN pip install -r requirements.txt 14.6s
...
The command above adds the entire directory (requirements.txt and app.py files) into the image and installs the dependencies listed in requirements.txt.
Now, let’s slightly modify app.py, for instance by changing the return string in the / route to “Hello, World!”, and rebuild the image:
$ docker build -t flask-demo .
[+] Building 24.1s (9/9) FINISHED docker:desktop-linux
...
=> [1/4] FROM docker.io/library/python:3.10-slim@sha256:49454d2bf78a48f217eb25ecbcb4b5face313fea6a6e82706465a6990303ada2 0.0s
...
=> CACHED [2/4] WORKDIR /app 0.0s
=> [3/4] ADD . /app 1.0s
=> [4/4] RUN pip install -r requirements.txt 15.3s
...
During the image rebuild process, pip install runs again even though requirements.txt hasn’t changed.
To clarify, Docker builds images in steps and usually attempts to reuse previous results to save time. However, when we use ADD . /app, it copies all our files at once. So, even if we modify only one file, Docker assumes the entire directory was updated. Thus, the cache becomes invalid, forcing Docker to repeat all the steps that come after, such as reinstalling packages.
To prevent unnecessary reinstallations of packages, we need to separate the addition of requirements.txt and the installation step from the rest of the application code. This modification ensures Docker caches the pip install layer as long as requirements.txt doesn’t change.
Let’s begin by optimizing our Dockerfile:
$ cat Dockerfile
FROM python:3.10-slim
WORKDIR /app
# Install dependencies early for caching
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Copy only the rest of the app after dependencies are installed
COPY . .
CMD ["python", "app.py"]
Here’s a breakdown of the modifications: the COPY requirements.txt ./ instruction copies only the dependency file, so its layer changes only when requirements.txt changes; the pip install step runs immediately afterward, so its layer stays cached as long as the dependency file is untouched; finally, COPY . . brings in the rest of the application code last, after the dependencies are already installed.
Now, if we change app.py but not requirements.txt, Docker skips the package installation and uses the cached layer.
Once we modify the Dockerfile and optimize it, let’s build the image:
$ docker build -t flask-demo .
[+] Building 25.5s (10/10) FINISHED docker:desktop-linux
...
=> [1/5] FROM docker.io/library/python:3.10-slim@sha256:49454d2bf78a48f217eb25ecbcb4b5face313fea6a6e82706465a6990303ada2 0.0s
...
=> CACHED [2/5] WORKDIR /app 0.1s
=> [3/5] COPY requirements.txt ./ 0.9s
=> [4/5] RUN pip install --no-cache-dir -r requirements.txt 15.7s
=> [5/5] COPY . . 1.4s
...
Since we modified the Dockerfile, the cache for the changed instructions is invalidated, so pip install executes once during this build.
Next, let’s modify app.py and then rebuild:
$ docker build -t flask-demo .
[+] Building 6.7s (10/10) FINISHED docker:desktop-linux
...
=> [1/5] FROM docker.io/library/python:3.10-slim@sha256:49454d2bf78a48f217eb25ecbcb4b5face313fea6a6e82706465a6990303ada2 0.0s
...
=> CACHED [2/5] WORKDIR /app 0.0s
=> CACHED [3/5] COPY requirements.txt ./ 0.0s
=> CACHED [4/5] RUN pip install --no-cache-dir -r requirements.txt 0.0s
=> [5/5] COPY . . 0.9s
...
Here, Docker skips the pip install step.
Finally, let’s modify requirements.txt and then rebuild:
$ echo "requests==2.31.0" >> requirements.txt && docker build -t flask-demo .
[+] Building 30.2s (10/10) FINISHED docker:desktop-linux
...
=> [1/5] FROM docker.io/library/python:3.10-slim@sha256:49454d2bf78a48f217eb25ecbcb4b5face313fea6a6e82706465a6990303ada2 0.0s
...
=> CACHED [2/5] WORKDIR /app 0.0s
=> [3/5] COPY requirements.txt ./ 0.8s
=> [4/5] RUN pip install --no-cache-dir -r requirements.txt 21.4s
=> [5/5] COPY . . 1.1s
...
Due to an update in the requirements.txt file, Docker reruns the pip install step as expected.
Notably, we added --no-cache-dir to pip install:
RUN pip install --no-cache-dir -r requirements.txt
The addition prevents pip from saving downloaded packages, thereby reducing image size. Since Docker already caches the entire layer, --no-cache-dir helps keep the image small without impacting performance.
When building the Docker image, Docker sends the whole project directory to the Docker engine as the build context. Among the files sent, we can find files that aren’t needed in the image such as temporary files or version control data. To avoid this, we can create a .dockerignore file to exclude unnecessary files and directories:
$ cat .dockerignore
__pycache__/
*.pyc
*.pyo
*.pyd
.env
.git
The .dockerignore file instructs Docker to skip these files during the build. From this addition, we get faster build times and a smaller build context, and we also avoid accidental cache invalidation caused by unrelated file changes. Thus, .dockerignore helps us make the Docker builds more efficient.
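To see how ignore patterns shrink the build context, here's a rough Python sketch that filters a file list with patterns similar to the ones above. Docker's real matcher follows its own, more elaborate rules (directory entries like __pycache__/ exclude whole trees), so the patterns below are adapted for Python's fnmatch:

```python
from fnmatch import fnmatch

# Patterns adapted from the .dockerignore above for fnmatch semantics
IGNORE_PATTERNS = ["__pycache__/*", "*.pyc", "*.pyo", "*.pyd", ".env", ".git/*"]

def is_ignored(path: str) -> bool:
    """Return True if the path matches any ignore pattern."""
    return any(fnmatch(path, pattern) for pattern in IGNORE_PATTERNS)

context = [
    "app.py",
    "requirements.txt",
    "__pycache__/app.cpython-310.pyc",
    "notes.pyc",
    ".env",
    ".git/HEAD",
]

kept = [f for f in context if not is_ignored(f)]
print(kept)  # ['app.py', 'requirements.txt']
```

Only app.py and requirements.txt reach the Docker engine; caches, bytecode, and version-control data never enter the build context, so changes to them can't invalidate any layer.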
Caching can still misbehave if requirements.txt changes often, or if it lists packages without exact versions.
To make Docker caching even more effective, we can pin our Python package versions in the requirements.txt file. To clarify, we lock each dependency to a specific version:
$ cat requirements.txt
flask==2.3.2
requests==2.31.0
On the other hand, let’s consider unpinned versions:
$ cat requirements.txt
flask
requests
In this case, requirements.txt stays byte-for-byte identical even when newer package versions are published, so Docker keeps reusing the cached pip install layer, and the image silently sticks with whatever versions were installed when that layer was first built. Conversely, whenever the cache is invalidated, pip resolves the latest available versions, which may differ from one build to the next.
So, when we pin versions, we get consistent caching and predictable builds, since our application always runs with the same package versions.
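As a quick sanity check, a few lines of Python can flag any dependency that isn't pinned to an exact version. This is a simple sketch that only looks for the == operator, ignoring comments and blank lines (real requirement lines can also carry extras, markers, or other operators):

```python
def unpinned(requirements_text: str) -> list[str]:
    """Return requirement lines that don't pin an exact version with ==."""
    loose = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if "==" not in line:
            loose.append(line)
    return loose

pinned = "flask==2.3.2\nrequests==2.31.0\n"
mixed = "flask==2.3.2\nrequests\n"

print(unpinned(pinned))  # []
print(unpinned(mixed))   # ['requests']
```

In practice, running pip freeze > requirements.txt inside a working environment is the usual way to pin every installed package to its exact version.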
In this article, we explored how to avoid unnecessary reinstallation of Python packages when building Docker images. First, we started with a simple Dockerfile that leads to repeated package reinstallations and slow builds. Then, we demonstrated how Docker’s layer caching works and how it can be optimized.
By separating requirements.txt from the rest of the code, we enable Docker to cache the pip install step. As a result, the image build time is reduced during iterative development.
Additionally, we can use a .dockerignore file to reduce build context size and enhance caching reliability. Finally, we can lock each dependency to a specific version for consistent and predictable builds.