1. Overview

Have you ever wondered what goes on behind the scenes when we start or kill a process? In this tutorial, we’ll learn how Linux generates PIDs for processes.

2. Linux’s Process Table

The Linux kernel keeps track of every process in a data structure commonly called the process table, which it relies on for tasks such as process scheduling. Whenever we start a process, the kernel adds an entry to the table that holds the following information:

  • PID
  • Parent Process
  • Environment Variables
  • Elapsed Time
  • Status – one of D (Uninterruptible), R (Running), S (Sleeping), T (Stopped), or Z (Zombie)
  • Memory Usage

The kernel exposes this information through the procfs file system, which is mounted at the /proc directory. Resource monitoring tools like top read it from there.

Let’s take a look at some of the information by running the top command:

Mem: 4241112K used, 12106916K free, 360040K shrd, 20K buff, 1772160K cached
CPU:  0.8% usr  0.8% sys  0.0% nic 98.3% idle  0.0% io  0.0% irq  0.0% sirq
Load average: 0.26 0.33 0.35 1/489 21888
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
    1     0 root     S     2552  0.0  10  0.0 init
30405 30404 baeldung S     2552  0.0   7  0.0 /bin/sh
  219     1 root     S     2552  0.0  10  0.0 /bin/getty 38400 tty2
 1309  1308 baeldung S     2552  0.0   9  0.0 /bin/sh
21873   640 baeldung S     2552  0.0   5  0.0 [sleep]
21874 21758 baeldung R     2552  0.0   8  0.0 top
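
Since top only presents what procfs already exposes, we can also read a process's entry ourselves. The following is a minimal sketch, assuming a Linux system with /proc mounted and awk installed. The helper name proc_field is hypothetical and introduced only for illustration:

```shell
#!/bin/sh
# proc_field PID FIELD
# Prints the first value of FIELD from /proc/PID/status.
# (proc_field is a hypothetical helper, not a standard command.)
proc_field() {
    awk -v key="$2:" '$1 == key { print $2 }' "/proc/$1/status"
}

# $$ expands to the PID of the shell running this script.
echo "PID:    $(proc_field $$ Pid)"
echo "Parent: $(proc_field $$ PPid)"
echo "State:  $(proc_field $$ State)"
```

The State value is one of the single-letter codes from the list above, for example S or R.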

3. PID Generation

Linux allocates process IDs in sequence, starting at 0 and staying below a maximum limit.

PID 0 is reserved for the kernel's idle task, which ensures that a runnable task is always available for scheduling. PID 1 goes to the init system, since it's the first process to be started.

We can check the limit on a system by looking at the /proc/sys/kernel/pid_max file. The kernel's default is 32768, although some 64-bit systems are configured with a higher value:

$ cat /proc/sys/kernel/pid_max 
32768

On 64-bit systems, we can raise the limit up to 2^22 (4,194,304) by writing the desired number to the file as root:

# desired=4194304
# echo $desired > /proc/sys/kernel/pid_max
# desired=9999999
# echo $desired > /proc/sys/kernel/pid_max
sh: write error: Invalid argument

Our second attempt fails because 9999999 is greater than 2^22.
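
To avoid the write error, we could check a candidate value before writing it. Here's a rough sketch that tests only the upper bound discussed above and assumes a 64-bit system. The function name check_pid_max is made up for this example, and the kernel may reject values for other reasons as well:

```shell
#!/bin/sh
# Upper bound for pid_max on 64-bit systems: 2^22 = 4194304.
limit=$((1 << 22))

# check_pid_max VALUE
# Returns 0 if VALUE is a whole number that doesn't exceed the limit.
# (check_pid_max is a hypothetical helper; it never writes to /proc.)
check_pid_max() {
    case "$1" in
        ''|*[!0-9]*) echo "$1 is not a whole number"; return 1 ;;
    esac
    if [ "$1" -gt "$limit" ]; then
        echo "$1 exceeds the limit of $limit"
        return 1
    fi
    echo "$1 is within the limit of $limit"
}

check_pid_max 4194304
check_pid_max 9999999 || true
```

Only the first value passes the check, matching what we saw when writing to the file directly.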

When we launch a process, the kernel assigns it a PID that uniquely identifies it. Until the limit is reached, it does this by taking the next number after the most recently assigned PID, so new PIDs keep climbing above the highest one handed out so far.

Let’s confirm this with the help of a trivial shell script:

#!/bin/sh -e
# This script assumes that printf is a shell builtin and hence doesn't take up extra PIDs.

highest=0

for pid in /proc/[0-9]*; do
    pid="${pid##*/}" # Extract PID
    [ "$pid" -gt "$highest" ] && highest="$pid" # -gt means "greater than"
done

printf "Highest PID is %d\n" "$highest"

for _ in $(seq 4); do
    printf "Launched new process with PID %d\n" "$(readlink /proc/self)"
done

First, the script finds the highest PID currently in use on the system. Next, it launches four readlink processes, each of which reports the PID assigned to it by resolving the /proc/self symbolic link.

Let’s look at the output of the script:

Highest PID is 10522
Launched new process with PID 10524
Launched new process with PID 10525
Launched new process with PID 10526
Launched new process with PID 10527

There's a gap of 1 PID between the highest PID we found and the first PID printed because the seq command itself runs as a separate process and takes a PID. In the same way, any other process that starts on the system while the script runs consumes a PID, so the results of such tests can vary.
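
Another way to observe the sequence is to start background jobs and read their PIDs from the shell's $! parameter. This is only a sketch: on a quiet system the difference is usually 1, but that's not guaranteed, because other processes may be started in between:

```shell
#!/bin/sh
# Start two short-lived background jobs and record the PID of each.
sleep 0 &
first=$!
sleep 0 &
second=$!

# Collect both jobs so that they don't linger.
wait

echo "First PID:  $first"
echo "Second PID: $second"
echo "Difference: $((second - first))"
```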

4. Can We Run out of PIDs?

We discussed the maximum PID limit in the previous section, so what happens when we hit that limit?

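If we reach the maximum PID limit, allocation wraps around: the kernel goes back to the low end of the range and searches upward for PIDs that are free again because the processes that held them have finished. Mainline kernels don't restart at 1 after wrapping but skip the lowest PIDs (those below 300), which are set aside for early system processes.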
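
To make the wrap-around easier to picture, here's a simplified model of the search written as a shell function. It's an illustration only and not the kernel's actual algorithm: the function name next_pid and its arguments are invented for this sketch, and it wraps to 1 for simplicity, whereas a real kernel may restart at a higher value:

```shell
#!/bin/sh
# next_pid LAST MAX "USED"
# LAST = most recently assigned PID, MAX = limit (valid PIDs stay below it),
# USED = space-separated list of PIDs that are still taken.
# Prints the next free PID, or returns 1 if every PID is taken.
# (Simplified model for illustration, not the kernel's algorithm.)
next_pid() {
    last=$1
    max=$2
    used=" $3 "
    candidate=$((last + 1))
    tries=0
    while [ "$tries" -lt "$max" ]; do
        # Wrap around once the candidate reaches the limit.
        if [ "$candidate" -ge "$max" ]; then
            candidate=1
        fi
        case "$used" in
            *" $candidate "*) ;;                    # still taken, keep looking
            *) echo "$candidate"; return 0 ;;       # free, hand it out
        esac
        candidate=$((candidate + 1))
        tries=$((tries + 1))
    done
    return 1
}

next_pid 3 8 "1 2 3"      # no wrap needed, prints 4
next_pid 7 8 "1 2 4 7"    # wraps around and skips 1 and 2, prints 3
```

When every PID below the limit is taken, the function fails instead of printing a number, which mirrors the situation where no new process can be started.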

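We should note that a PID becomes free only once the parent has collected the termination status of the process that held it. Until then, the terminated process lingers as a zombie and keeps its PID. So, a buggy or maliciously coded program that keeps spawning children without ever collecting them can starve our system of PIDs, at which point new processes can no longer be started.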
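
We can watch a terminated process hold on to its PID with a small experiment. This is a sketch, assuming a Linux system with /proc mounted and the mktemp and awk utilities available. The parent shell starts a short-lived child and then replaces itself with sleep, which never collects the child's status:

```shell
#!/bin/sh
# Temporary file used to pass the child's PID back to this script.
tmp=$(mktemp)

# The inner shell starts "sleep 0", prints that child's PID, and then
# replaces itself with "sleep 3", which never waits for the child.
sh -c 'sleep 0 & echo $!; exec sleep 3' > "$tmp" &
parent=$!

# Give the child time to exit while its parent is still running.
sleep 1
child=$(cat "$tmp")

# The child has terminated but hasn't been collected, so it's a zombie.
state=$(awk '/^State:/ { print $2 }' "/proc/$child/status")
echo "Child $child is in state $state"

# Once the parent exits, the orphaned zombie is handed over to init
# (or another reaper process), which collects it and frees the PID.
wait "$parent"
rm -f "$tmp"
```

While the parent is alive, the child should show up in state Z, and its PID stays unavailable for reuse.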

5. Conclusion

In this article, we learned about process IDs in Linux – how they are generated, how high they can go, and what happens when the limit is hit.