1. Introduction
Dealing with processes is a necessity when administering a Linux system. Tasks range from checking process resource usage, through streamlining running applications and services, to preventing memory leaks and beyond.
In this tutorial, we look into ways of killing a process that keeps restarting. First, we do a brief refresher on process hierarchy. Next, we define persistent processes and how they come to be. After that, we check out ways to detect and identify such processes. Finally, we explore ways to deal with a process that keeps restarting.
We tested the code in this tutorial on Debian 11 (Bullseye) with GNU Bash 5.1.4. It should work in most POSIX-compliant environments.
2. Process Hierarchy
In Linux, each process is a fork of another. Because of this, one process is the ancestor of all other user-space processes: PID 1, traditionally called init and, on most modern distributions, systemd.
In turn, parenting continues downstream, meaning each process has its own direct parent. In fact, we can view these relationships via ps -AH:
$ ps -AH
  PID TTY          TIME CMD
    1 ?        00:06:56 systemd
  238 ?        00:03:28   systemd-journal
  269 ?        00:00:18   systemd-udevd
[...]
Showing all processes (-A) as a hierarchy (-H), we see systemd and its direct descendants. Direct means that systemd itself spawned them at one point.
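To trace the hierarchy upward from a single process, we can repeatedly ask ps for the parent PID. The following is only a minimal sketch: the ancestors function name is our own invention, and we assume a ps that supports the -o ppid= and -p options:

```shell
# Print the chain of PIDs from the given process up to PID 1.
# "ancestors" is a hypothetical helper name, not a standard tool.
ancestors() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        echo "$pid"
        [ "$pid" -eq 1 ] && break
        # -o ppid= prints only the parent PID, without a header line
        pid=$(ps -o ppid= -p "$pid" | tr -d ' ')
    done
}

# Example: show the ancestry of the current shell
ancestors "$$"
```

Each printed PID is the parent of the one before it, so the last line is normally 1.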
3. Persistent Processes
No process can revive itself once killed. Only a different one can execute the binary of the original. Even then, that’s a different process and PID, albeit with the same code:
$ sleep 10 &
[1] 666
$ kill -9 666
$
[1]+  Killed                  sleep 10
$ sleep 10 &
[1] 667
Here, we used the kill command to terminate a background job. Next, we restarted it with the same code, but it got a new process ID.
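The same experiment can be scripted. This sketch captures both PIDs through the $! variable so that we can compare them; the variable names are arbitrary:

```shell
# Start a background job, kill it, and start the same command again
sleep 10 &
first_pid=$!
kill -9 "$first_pid"
wait "$first_pid" 2>/dev/null || true   # reap the killed job

sleep 10 &
second_pid=$!
echo "first: $first_pid, second: $second_pid"

# Clean up the second job as well
kill "$second_pid"
wait "$second_pid" 2>/dev/null || true
```

In practice, the two numbers differ, since the kernel assigns a fresh PID to every new process.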
There can be many reasons for a process restarting right after it’s killed.
3.1. Watchdog Services
The concept of a watchdog is universal and easily guessable from the name. Watchdog processes monitor the system for a given event and react to it.
For example, events can be the creation of a file, resource usage around a given threshold, or the termination of a process. A very simple watchdog can be a script:
for SERVICE in ssh apache2; do
    # Discard both stdout and stderr; only the exit status matters
    if ! service "$SERVICE" status >/dev/null 2>&1; then
        service "$SERVICE" restart
    fi
done
This script checks whether the given services (ssh and apache2) are up and restarts them if not. To truly make it work like a watchdog, it’s usually best to schedule it with cron.
Another typical example is Docker restart policies. They can ensure the restart of containers that stop abnormally.
Of course, both cases above also result in process restarts.
3.2. Scheduled Restarts
Sometimes we may want a service to periodically restart regardless of its state. This may include a startup procedure or a simple time-based restart.
Either way, we end up with regular process restarts.
3.3. Malicious Processes
Critically, malicious processes may have a keep-alive mechanism similar to a watchdog. Moreover, such a process can employ other means to protect itself from termination:
- multiple instances of the same process or a separate watchdog
- binary replication, ensuring multiple different executables exist
- binary infection, whereby standard tool executables are replaced or infected, running the malicious code
These cases often make the detection of a restarting process much harder. In fact, the signature of the process becomes hard to pinpoint.
4. Identifying a Persistent or Haywire Process
Indeed, the first and most critical steps of any operation with a process are identification and PID acquisition. However, these can be very difficult since the process:
- restarts with a different PID
- may be resource-intensive enough to slow down the system and our attempts at detection
- may run with higher privileges or behind restrictive permissions, particularly when it's a daemon or service
- may restart under many names, especially when malicious
- could be elusive if restarting very frequently
With this in mind, we should try to get through any delays and (cautiously) get higher privileges or more permissions, if necessary.
Once this is done, we can try to detect the process via top (for example, top -d 0.5) or even watch with ps (for example, watch -n 0.5 ps -AH), both with a small refresh interval to catch short-lived processes:
top - 04:05:27 up 3 days, 12:02,  1 user,  load average: 0.55, 1.06, 1.27
Tasks: 362 total,   2 running, 290 sleeping,   0 stopped,   0 zombie
%Cpu(s): 35.8 us, 10.7 sy,  0.0 ni, 52.4 id,  0.3 wa,  0.0 hi,  0.7 si,  0.0 st
KiB Mem :  8060436 total,   150704 free,  4438276 used,  3471456 buff/cache
KiB Swap:  2097148 total,  1656152 free,   440996 used.  2557604 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
32081 baeldung+ 20   0  879676 198164 106096 S 102.6  2.5   0:10.16 firefox
  582 baeldung+ 20   0   51448   4088   3372 R  15.8  0.1   0:00.04 top
  875 message+  20   0   53120   5900   3204 S   5.3  0.1  10:10.14 dbus-daemon
    1 root      20   0  225840   7200   4720 S   0.0  0.1   4:51.28 systemd
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.20 kthreadd
    4 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 kworker/0:0H
    6 root       0 -20       0      0      0 I   0.0  0.0   0:00.00 mm_percpu_wq
For example, the TIME+ column from the top output above shows the CPU time a process has accumulated, so a low value on an otherwise busy process hints at a recent launch, while %MEM, %CPU, and others can show heavy processes. Of course, we can determine the USER as well, which is a hint for the privileges.
Critically, once we manage to detect and identify it, we might be able to get a stable piece of information about the offending process via ps -AH: its parent. As discussed, with one exception, for any process to exist, another one must fork it.
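To spot restarts, it can help to list the most recently started processes together with their parents. This is a rough sketch: newest_procs is a hypothetical helper name, and we assume the procps-ng version of ps, since etimes (elapsed seconds since start) isn't available everywhere:

```shell
# List the N most recently started processes with their parent PIDs.
# "newest_procs" is a hypothetical helper, not a standard tool.
# Output columns: elapsed seconds, PID, PPID, command name
newest_procs() {
    count=${1:-10}
    ps -eo etimes=,pid=,ppid=,comm= | sort -n | head -n "$count"
}

newest_procs 5
```

The commands of the pipeline itself show up at the top of the list. A name that keeps reappearing there with a small elapsed time, a new PID, and the same PPID is a likely candidate, and that PPID points to whatever keeps restarting it.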
With this information at hand, let’s see how we can prevent the process from restarting.
5. Blocking a Persistent or Haywire Process
Now, once we have some unique information about the problematic process, we can use it to either kill the process or prevent it from starting.
5.1. Manual Termination
Naturally, we can try to just kill a process by hand:
$ kill -9 666
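Since SIGKILL (-9) gives the process no chance to clean up, a gentler variant is to send SIGTERM first and escalate only if needed. The sketch below assumes a grace period of a few seconds is acceptable; terminate is a hypothetical helper name:

```shell
# Ask a process to exit (SIGTERM), then force it (SIGKILL) if it is still
# around after a grace period. "terminate" is a hypothetical helper name.
terminate() {
    pid=$1
    grace=${2:-5}    # seconds to wait before escalating to SIGKILL
    # Fails if there is no such process or we lack permission to signal it
    kill -TERM "$pid" 2>/dev/null || return 1
    while [ "$grace" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the PID can be signaled
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done
    kill -KILL "$pid" 2>/dev/null
}

# Usage, with the example PID from above:
#   terminate 666 10
```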
Sometimes, a process is revived only a limited number of times or only within a given period, so killing it repeatedly can be enough. The problem with this approach is that it may take time to detect whether the offending process has stopped restarting. In essence, it's a loop of:
- Identify the process and its PID
- Terminate the process
Looking through the results of top or ps can be painstaking and tedious. Doing this by hand is probably not the best approach with persistent processes.
Still, this might be the only choice with a malicious process, as its behavior may be erratic and irregular. Further, it can infect innocent-looking binaries, making detection even harder.
5.2. Automatic Termination
Watchdogs can be used to terminate processes as well as restart them. Automating the steps we discussed in a simple script may free us from detecting and terminating by hand:
while true
do
  bppid=$(pgrep badproc)
  [ $? -eq 0 ] && kill -9 $bppid
  sleep 60
done
Here, we use several commands to monitor and manipulate the process:
- sleep to wait
- pgrep for checking (by name) whether the process exists
- kill to terminate the matching process or processes
In this watchdog, we use the name of the process to find and kill it. Of course, we can detect processes by resource, by port, or use other criteria as needed.
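As one example of narrower criteria, we can require an exact name match and a specific owner. This is a sketch that assumes the procps version of pgrep; strict_pids is a hypothetical helper name, and badproc and baeldung are just the placeholder names used in this tutorial:

```shell
# Print the PIDs of processes whose name matches exactly and that run
# as the given user. "strict_pids" is a hypothetical helper name.
strict_pids() {
    # -x: the process name must match exactly (no substring matches)
    # -u: restrict the match to the given effective user (name or numeric ID)
    pgrep -x -u "$2" "$1"
}

# Usage:
#   strict_pids badproc baeldung
```

Without -x, a pattern such as badproc would also match a process named badproc-helper.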
While it should work in theory, there are many issues with this approach as well:
- hard to determine the optimal sleep time
- pgrep may match the wrong processes, which kill then terminates
- waste of resources in the contest between check-start and check-kill
To circumvent these problems, we can try to attack the source.
5.3. Binary Manipulation
While some are scripts run by an interpreter, many processes start from their own binary executable file. To find it, we can again use ps, but with the PID of our process:
$ ps 666
  PID TTY      STAT   TIME COMMAND
  666 ?        Ss     0:02 /home/baeldung/badproc
The file path is usually in the default COMMAND or CMD column (/home/baeldung/badproc). Having this information, we can do one of two things:
- delete the executable file
- rename the executable file
By doing so, we stop new instances from being executed, although an instance that's already running continues until we kill it. Of course, this doesn't prevent attempts to start the process.
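Both steps can be sketched in a few lines. Here, we assume a Linux /proc filesystem to resolve the executable, and we rename instead of deleting so that the file remains available for inspection. The function names and the .disabled suffix are our own choices:

```shell
# Resolve the executable behind a PID via /proc (Linux-specific).
# Both function names are hypothetical helpers.
exe_of() {
    readlink "/proc/$1/exe"
}

# Strip the execute permission and move the file aside instead of
# deleting it, so that it stays available for later inspection.
disable_binary() {
    chmod a-x "$1" && mv "$1" "$1.disabled"
}

# Usage, with the example PID from above:
#   disable_binary "$(exe_of 666)"
```

Reading /proc/PID/exe usually requires owning the process or having root privileges.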
5.4. Kill Parent Process
Unless they are malicious, persistent processes are commonly run and revived by a single parent process. Targeting that parent, we can apply any of the actions already discussed for the process being restarted:
$ ps -AH
  PID TTY          TIME CMD
[...]
  660 ?        00:05:56 parent-badproc
  666 ?        00:01:28   badproc
[...]
$ ps 666
  PID TTY      STAT   TIME COMMAND
  666 ?        Ss     0:02 /home/baeldung/badproc
$ kill -9 660
$ ps 666
  PID TTY      STAT   TIME COMMAND
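Here, we see a process (badproc, 666) and its parent (parent-badproc, 660). While the parent is alive, it keeps the child process running. In this example, the child is gone after we kill the parent. However, that isn't guaranteed in general: an orphaned child is normally adopted by PID 1 and keeps running, so we may have to kill it separately. Either way, nothing remains to restart it.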
Clearly, an important drawback is that other processes may depend on the same parent.
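A special case is a process whose parent is systemd itself: killing PID 1 isn't an option, so we stop the owning service unit instead. The sketch below assumes a systemd-based Linux system, where /proc/PID/cgroup names the unit that a process belongs to. Both function names are hypothetical helpers:

```shell
# Extract a systemd service name from one line of /proc/PID/cgroup,
# e.g. "0::/system.slice/ssh.service" -> "ssh.service".
# Both function names are hypothetical helpers.
unit_from_cgroup_line() {
    last=${1##*/}    # keep only the last path component
    case $last in
        *.service) echo "$last" ;;
        *) return 1 ;;
    esac
}

# Print the service unit that owns the given PID, if there is one
unit_of_pid() {
    while IFS= read -r line; do
        unit_from_cgroup_line "$line" && return 0
    done < "/proc/$1/cgroup"
    return 1
}

# Usage, with the example PID from above (system services need root):
#   unit=$(unit_of_pid 666) && systemctl stop "$unit" && systemctl disable "$unit"
```

Stopping a unit via systemctl stop doesn't trigger its automatic restart setting, and systemctl disable keeps it from starting at boot.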
6. Summary
In this article, we looked at ways to kill constantly restarting processes. Specifically, we went through manual handling, scripts, and attacking the root of the problem.
In conclusion, there are many ways to handle persistent, haywire, and malicious processes that keep restarting, but we first need to identify them and their behavior.