1. Introduction

Most basic features of the Kubernetes orchestration platform can be imitated with scripts and specifications applied directly against the configured container runtime. The benefit of Kubernetes is the convenient orchestration and automation of these activities. For instance, many deployments rely on the different methods the platform offers for redundancy and maintenance.

In this tutorial, we explore why Kubernetes pods often get recreated when deleted and the mechanism behind this feature. First, we briefly refresh our knowledge about pods. After that, we turn to restart policies and their function. Next, we go over the interaction between pods and controllers. Then, we check how and why Kubernetes can recover from pod deletion. Finally, we see some ways to reduce or disable redundancy in Kubernetes.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.

2. Kubernetes Pods

A pod in Kubernetes is the minimal computing unit. It can comprise one or more containers, which share network, storage, and other resources.

Let’s run a basic pod with two containers:

$ kubectl apply --filename=- <<< '
apiVersion: v1
kind: Pod
metadata:
  name: compod
spec:
  containers:
  - name: deb
    image: debian:latest
    command: ["sh", "-c"]
    args: ["echo Initialized... && sleep 666"]
    tty: true
  - name: nginx
    image: nginx:latest
    command: ["sh", "-c"]
    args: ["echo Initialized... && nginx -g \"daemon off;\""]
    tty: true
'

Here, we use kubectl to apply a relatively simple Pod definition. Specifically, we create and run a Pod named compod with two containers (deb and nginx) running the debian:latest and nginx:latest images, respectively.

Let’s check on the pod status:

$ kubectl get pod compod
NAME     READY   STATUS    RESTARTS   AGE
compod   2/2     Running   0          12s

Effectively, by running this pod, we execute two tasks. Often, each task is a server, service, or microservice of some kind. Thus, we usually want the relevant process to be available.
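
To verify that both tasks actually started, we can check the log of each container separately. For example, for the deb container (output shown for illustration):

$ kubectl logs pod/compod --container=deb
Initialized...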

3. Pod Restart Policies

Because of the common requirement to keep a container, and by extension its pod, running, Kubernetes supports several restart policies (see the example below):

  • Always: immediately restart the container (default)
  • OnFailure: restart the container only if it exited because of a failure
  • Never: no restart on exit
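
For instance, a minimal sketch of a pod that should only be restarted when its container fails sets the field explicitly (the onfail-pod name and its command are just examples):

$ kubectl apply --filename=- <<< '
apiVersion: v1
kind: Pod
metadata:
  name: onfail-pod
spec:
  restartPolicy: OnFailure
  containers:
  - name: deb
    image: debian:latest
    command: ["sh", "-c"]
    args: ["echo Trying... && exit 1"]
'

With OnFailure, the non-zero exit code triggers a restart (with an increasing backoff), while a clean exit would leave the pod in the Completed state.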

The restart policy is enforced via the Kubernetes pod monitoring facilities. If a container exits, its restart policy is checked and acted upon.

Using the pod definition from earlier as an example, a slight modification demonstrates the effect of restarts:

$ kubectl apply --filename=- <<< '
apiVersion: v1
kind: Pod
metadata:
  name: compod
spec:
  containers:
  - name: deb
    image: debian:latest
    command: ["sh", "-c"]
    args: ["echo Initialized... && sleep 5"]
    tty: true
  - name: nginx
    image: nginx:latest
    command: ["sh", "-c"]
    args: ["echo Initialized... && sleep 5"]
    tty: true
'

Now, both containers output a string and sleep for five seconds. Due to the short running period, the containers exit and then restart according to the (default) policy in effect: Always. Thus, we can see the restart counter increasing every few seconds:

$ kubectl get pod/compod
NAME     READY   STATUS    RESTARTS        AGE
compod   2/2     Running   4 (5s ago)      36m
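
To confirm why the containers keep restarting, we can also check the reason for their last termination (output shown for illustration):

$ kubectl get pod compod --output=jsonpath='{.status.containerStatuses[*].lastState.terminated.reason}'
Completed Completed

Since both containers exit cleanly after the sleep, the reason is Completed rather than Error.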

Importantly, restart policies only work for existing pods that terminate. Simply deleting the pod won’t activate a restart policy.
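
To verify this, we could create a throwaway copy of the pod under a hypothetical name such as compod-solo and delete it; since no controller references it, nothing brings it back:

$ kubectl delete pod compod-solo
pod "compod-solo" deleted
$ kubectl get pod compod-solo
Error from server (NotFound): pods "compod-solo" not found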

In any case, to change the restart policy, we’d have to recreate the pod or include it in a Deployment or another controller-managed workload.

4. Pods and Controllers

Kubernetes works with pods as its minimal deployment unit. However, sometimes a single pod definition isn’t enough to support the expected application load.

In such cases, we combine pods with a controller such as ReplicaSet, Deployment, or others. This combination is usually called a workload.

For instance, Kubernetes provides a ReplicaSet controller. In short, it supports a mechanism to ensure a certain number of replica pods of a given kind is always running.

Further, many of the Kubernetes controllers have a replicas field that indicates how many pods of the same kind we want running and a selector that uses labels to map pods to controllers.

Let’s see a fairly minimal ReplicaSet example:

$ kubectl apply --filename=- <<< '
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: compod-rs
  labels:
    name: compod
spec:
  replicas: 3
  selector:
    matchLabels:
      name: compod
  template:
    metadata:
      labels:
        name: compod
    spec:
      containers:
      - name: deb
        image: debian:latest
        command: ["sh", "-c"]
        args: ["echo Initialized... && sleep 666"]
        tty: true
      - name: nginx
        image: nginx:latest
        command: ["sh", "-c"]
        args: ["echo Initialized... && nginx -g \"daemon off;\""]
        tty: true
'

In this case, the selector matches the name label with the value compod, and the pod template carrying that label is embedded in the ReplicaSet definition.

As a result, we see more than one pod:

$ kubectl get rs compod-rs
NAME        DESIRED   CURRENT   READY   AGE
compod-rs   3         3         3       30s
$ kubectl get pods --selector=name=compod
NAME              READY   STATUS    RESTARTS   AGE
compod-rs-dlh4d   2/2     Running   0          31s
compod-rs-m7r5b   2/2     Running   0          31s
compod-rs-vrmff   2/2     Running   0          31s

First, we get the compod-rs ReplicaSet information via its name. After that, we use a --selector to filter the pods by the same name label equal to compod. Notably, there are three pods, each with a random unique five-character identifier at the end.

This mechanism is similar for other controllers such as Deployment.
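
As a sketch, an equivalent Deployment mainly changes the kind, while the selector and pod template keep the same structure (the compod-deploy name and label are just examples):

$ kubectl apply --filename=- <<< '
apiVersion: apps/v1
kind: Deployment
metadata:
  name: compod-deploy
spec:
  replicas: 3
  selector:
    matchLabels:
      name: compod-deploy
  template:
    metadata:
      labels:
        name: compod-deploy
    spec:
      containers:
      - name: nginx
        image: nginx:latest
        command: ["sh", "-c"]
        args: ["echo Initialized... && nginx -g \"daemon off;\""]
'

In addition, the Deployment creates and manages its own ReplicaSet, which in turn maintains the pods.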

5. Pod Deletion

Knowing how controllers watch over pods, let’s understand what may happen to a pod when it gets deleted after being defined as part of a controller.

As an example, we use the ReplicaSet we created earlier:

$ kubectl get pods --selector=name=compod
NAME              READY   STATUS    RESTARTS   AGE
compod-rs-dlh4d   2/2     Running   0          51s
compod-rs-m7r5b   2/2     Running   0          51s
compod-rs-vrmff   2/2     Running   0          51s

At this point, we delete one of the pods from the compod-rs ReplicaSet:

$ kubectl delete pod compod-rs-dlh4d
pod "compod-rs-dlh4d" deleted

As soon as we get the shell prompt back after the deletion, we can rerun the listing from above:

$ kubectl get pods --selector=name=compod
NAME              READY   STATUS    RESTARTS        AGE
compod-rs-m7r5b   2/2     Running   0               1m
compod-rs-ntskj   2/2     Running   0               5s
compod-rs-vrmff   2/2     Running   0               1m

Already, we see three pods are again in place within the compod-rs ReplicaSet. In particular, compod-rs-ntskj replaced the deleted compod-rs-dlh4d.

This way, Kubernetes maintains the status quo as expected by the administrators, clients, deployment, application, and any other involved parties and components.

6. Reduce or Disable Redundancy

In some cases, we might want to reduce or completely prevent the automatic respawning of pods in Kubernetes. There are several considerations in this regard.

6.1. Change Restart Policy

To begin with, a single pod doesn’t usually have any redundancy beyond its basic restart policy:

$ kubectl get pod/compod --output=jsonpath='{..restartPolicy}'
Always

Here, we use the --output option with a jsonpath to extract the restartPolicy.

If we try to edit it for a pod, Kubernetes won’t let us:

$ kubectl edit pod/compod
[...]
error: pods "compod" is invalid

To change the restartPolicy, we can only recreate the pod or pair it with a controller that has an alternative policy preconfigured. Even if we change the default policy system-wide, we’d still have to recreate the pod.

Naturally, if a pod already has a controller, we can attempt to change the restartPolicy of that controller. Yet, Kubernetes doesn’t allow this either. Thus, to modify the policy of a controller, we’d have to recreate it as well.
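
For example, we can read the policy that the compod-rs ReplicaSet stamps onto its pods directly from its pod template; the field is typically defaulted to Always when the object is created (output shown for illustration):

$ kubectl get rs compod-rs --output=jsonpath='{.spec.template.spec.restartPolicy}'
Always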

6.2. Edit Controller Replicas

With many controller types (CONTROLLER_TYPE), we can set the NUMBER of --replicas for a given WORKLOAD_NAME via the scale subcommand of kubectl:

$ kubectl scale <CONTROLLER_TYPE> <WORKLOAD_NAME> --replicas=<NUMBER>

For instance, let’s do this for the compod-rs ReplicaSet:

$ kubectl scale replicaset compod-rs --replicas=1
replicaset.apps/compod-rs scaled

Thus, we’re left with a single pod:

$ kubectl get rs compod-rs
NAME        DESIRED   CURRENT   READY   AGE
compod-rs   1         1         1       12h

Notably, we can even use 0 as the --replicas number. Doing so results in no running pods, but doesn’t delete the definitions, so we can always scale up again.
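
For instance, scaling the same ReplicaSet to zero keeps the object around but terminates all of its pods (output shown for illustration):

$ kubectl scale replicaset compod-rs --replicas=0
replicaset.apps/compod-rs scaled
$ kubectl get rs compod-rs
NAME        DESIRED   CURRENT   READY   AGE
compod-rs   0         0         0       12h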

Yet, critically, just like the restartPolicy, the selectors are immutable after creation.

6.3. Delete or Detach From Controller

If we decouple a pod from a controller, it should act like any other stand-alone entity and disappear as soon as we delete it. Further, deleting the controller stops the pod monitoring as well.
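
As a sketch, overwriting the label that the selector matches detaches a pod from its ReplicaSet (which then spawns a replacement to keep the replica count), while the --cascade=orphan flag deletes the controller without deleting its pods; the placeholders follow the convention from earlier:

$ kubectl label pod <POD_NAME> <LABEL_KEY>=<NEW_VALUE> --overwrite
$ kubectl delete replicaset <REPLICASET_NAME> --cascade=orphan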

Of course, without a controller watching over it, we also expect a pod to act according to its configured restartPolicy.

7. Summary

In this article, we talked about redundancy at the pod level in Kubernetes and understood why pods may get recreated even after being manually deleted.

In conclusion, when we delete a pod in Kubernetes, its controller acts according to its definition, which in many cases means the pod is recreated automatically.
