1. Introduction
In this tutorial, we look at reasons kill -9 can leave a target process intact. First, we discuss why a process might need to be killed. Next, we go into reasons the standard kill command could appear to have no effect. Finally, we explore workarounds for such situations.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.
2. Killing in the Name
Processes can become unresponsive or obsolete for many reasons. Whether a process simply hangs, hogs resources, wastes time, or lingers in the background, we might need solutions to terminate it.
Naturally, we already have some, depending on the situation:
- kill hung terminals
- kill respawning processes
- kill background processes
- kill socket-waiting processes
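All of these rely on some form of the kill command. Its most forceful variant sends SIGKILL, which we can specify by name or by number (9):
# kill -SIGKILL 666
# kill -9 166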
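As a minimal sketch of the normal case, we can start a throwaway process, send it SIGKILL, and inspect the result. The sleep 60 target and the variable names are illustrative assumptions, not part of the original examples:

```shell
# Translate signal number 9 into its name.
sig_name=$(kill -l 9)
echo "Signal 9 is SIG$sig_name"

# Start a throwaway process (hypothetical target) and send it SIGKILL.
sleep 60 &
pid=$!
kill -9 "$pid"

# wait returns 128 + the signal number when a signal ended the process.
status=0
wait "$pid" || status=$?
echo "Exit status: $status"
```

When the signal is delivered, the shell reports an exit status of 128 plus the signal number for the killed process.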
However, kill -9 might still not do the job even when run as a superuser. Let’s see why.
3. When kill -SIGKILL Appears to Fail
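A process can appear immune to SIGKILL in two main states: uninterruptible sleep and the zombie state. Let’s look at both.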
3.1. Uninterruptible Sleep
Sleeping states usually mean a process is waiting on resources. In particular, a process in uninterruptible sleep wakes up only when the awaited resource becomes available, and it doesn’t act on any signal until then.
For example, faulty drivers, failing hardware, or an unreachable remote filesystem can block a process indefinitely:
$ umount /mnt/smb
[...]
While blocked like this, the process doesn’t react to any signal, including SIGKILL.
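To spot such processes, one option is to filter the output of ps by its state column. The sketch below assumes a ps that accepts the stat output keyword (as procps on Debian does), and filter_by_state is a hypothetical helper name:

```shell
# Print rows whose STAT column (field 2) starts with the given state letter.
# Expects a header row followed by "PID STAT COMMAND" rows on stdin.
filter_by_state() {
  awk -v s="$1" 'NR > 1 && substr($2, 1, 1) == s'
}

# List every process currently in uninterruptible sleep (state D).
ps -eo pid,stat,comm | filter_by_state D
```

On a healthy system, this usually prints nothing or only short-lived entries, since most D states last for a brief I/O wait.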
3.2. Zombie Processes
Another potentially problematic process type is the zombie. In the zombie state, a terminated or completed process remains in the process table because its parent hasn’t yet called wait() to acknowledge the child’s end, which the kernel announces to the parent via the SIGCHLD signal:
$ (sleep 1 & exec /bin/sleep 11) &
$ ps
100 pts/1 00:00:00 bash
660 pts/1 00:00:00 sleep
666 pts/1 00:00:00 sleep <defunct>
Here, we spawn a subshell that runs sleep in the background and follows it with an exec sleep. The latter prevents the acknowledgment of the former’s death, thus creating a zombie process for 10 (11–1) seconds, as confirmed by its status in the output of ps.
In essence, the behavior of zombies is similar to that of a process in uninterruptible sleep, but they can be considered to wait for the parent instead of a resource. Hence, not even SIGKILL can change the state of a zombie process. Of course, the states aren’t equivalent since zombies are actually terminated processes.
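Importantly, kill commands that target zombie or uninterruptible-sleep processes report success, but the signal has no visible effect: a zombie has already terminated, and a process in uninterruptible sleep can’t act on the signal until it wakes up.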
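We can observe this behavior with a short experiment. The sketch below is Linux-specific, since it reads the process state from /proc instead of ps. The proc_state helper, the temporary file, and the shortened sleep durations are assumptions made for illustration:

```shell
# Report the one-letter state of a PID from /proc (Linux-specific).
proc_state() {
  awk '/^State:/ { print $2 }' "/proc/$1/status"
}

pidfile=$(mktemp)

# The inner sleep exits after 1 second. Its parent has been replaced by
# "sleep 4" via exec and never calls wait(), so the child stays a zombie
# until that parent exits.
( sleep 1 & echo "$!" > "$pidfile"; exec sleep 4 ) &
sleep 2

child=$(cat "$pidfile")
state_before=$(proc_state "$child")
echo "PID $child state before kill: $state_before"

# Sending SIGKILL to the zombie succeeds but changes nothing.
kill_status=0
kill -9 "$child" || kill_status=$?
state_after=$(proc_state "$child")
echo "kill exit status: $kill_status, state after kill: $state_after"

rm -f "$pidfile"
```

Here, kill exits successfully, yet the state read before and after the signal is the same zombie state.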
4. What We Can Do About It
Now that we know the two main reasons a process can be unresponsive to kill -SIGKILL, we can explore our options.
4.1. Work Around Uninterruptible Processes
Although we can’t influence them directly, we still have ways to prevent uninterruptible processes or get them to terminate:
- avoid their creation by maintaining hardware, drivers, networking, and the system in general
- make any awaited resources available
- in some cases, killing the parent can lead to an uninterruptible child terminating
For example, let’s use ps to show a process in uninterruptible sleep due to network connectivity issues:
$ ps 666
PID TTY STAT TIME COMMAND
666 ?   D    0:00 [cifsd]
In this example, after experiencing problems with CIFS (SMB), we verify the CIFS client daemon cifsd is in the uninterruptible sleep state. As discussed, using kill -9 at this point doesn’t have an effect:
$ ps 666
PID TTY STAT TIME COMMAND
666 ?   D    0:00 [cifsd]
$ kill -9 666
$ ps 666
PID TTY STAT TIME COMMAND
666 ?   D    0:00 [cifsd]
So, we restore network connectivity and check again:
$ ps 666
PID TTY STAT TIME COMMAND
666 ?   S    0:00 [cifsd]
Now, our process continues as usual. In cases when it doesn’t, we still have the option to reboot.
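Instead of rerunning ps by hand while we restore a resource, we could poll the process state. The following is only a sketch: it assumes a Linux /proc filesystem, and both proc_state and wait_while_blocked are hypothetical helper names:

```shell
# Report the one-letter state of a PID from /proc (Linux-specific).
# Prints nothing if the process no longer exists.
proc_state() {
  awk '/^State:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# Poll once per second while the process is in state D, up to a number of
# tries (default 10), then print the last state seen.
wait_while_blocked() {
  pid=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ] && [ "$(proc_state "$pid")" = "D" ]; do
    sleep 1
    tries=$((tries - 1))
  done
  proc_state "$pid"
}

# Example: check the current shell, which isn't expected to be blocked.
wait_while_blocked "$$" 3
```

If the helper still prints D after all tries, the awaited resource hasn’t come back yet.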
4.2. Work Around Zombie Processes
There are two standard ways to clean zombie processes:
- send SIGCHLD to the parent, triggering a wait()
- kill the parent
Of course, doing the latter might affect all child processes:
$ ps -H
PID TTY   TIME     CMD
10  pts/1 00:00:00 su
100 pts/1 00:00:00 bash
660 pts/1 00:00:00 bash
666 pts/1 00:00:00 proc1
667 pts/1 00:00:00 sleep
$ kill -9 660
$ ps -H
PID TTY   TIME     CMD
10  pts/1 00:00:00 su
100 pts/1 00:00:00 bash
667 pts/1 00:00:00 sleep
Here, we see killing parent 660 terminates the child process proc1 with PID 666 but leaves the other child (PID 667) running.
Still, the SIGCHLD approach is preferable. However, simply sending the signal doesn’t force the parent to handle SIGCHLD and acknowledge a child.
Naturally, a system reboot resolves the issue as a last resort.
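Both approaches require the parent’s PID. As a hedged sketch, we can extract zombie and parent PID pairs from ps output. It assumes a ps that supports the stat keyword, zombie_parents is a hypothetical helper, and the signal itself is left commented out so the snippet doesn’t touch any running process:

```shell
# Print "zombie_pid parent_pid" for each row whose STAT column starts with Z.
# Expects a header row followed by "PID PPID STAT COMMAND" rows on stdin.
zombie_parents() {
  awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2 }'
}

ps -eo pid,ppid,stat,comm | zombie_parents |
while read -r zpid ppid; do
  echo "zombie $zpid has parent $ppid"
  # Nudge the parent to reap its child (only helps if it handles SIGCHLD):
  # kill -s CHLD "$ppid"
done
```

If the parent doesn’t react to SIGCHLD, the same parent PID is the target for the more drastic kill.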
5. Summary
In this article, we explored why attempting to kill a process might not always succeed and what we can do about it.
In conclusion, while kill -SIGKILL is the most lethal and direct way to terminate a process, some pitfalls may need consideration.