Last updated: January 23, 2024
Restarting a running process involves shutting it down, usually by sending a termination signal, and then starting it again. Processes often perform some cleanup before exiting, which may take some time, so we start the program or service again only after its former process has terminated.
In this tutorial, we’ll examine different ways to restart a running process, thus ensuring that it’s always up and running.
To begin with, let’s create a short shell script that we’ll use for testing purposes:
$ cat aprogram.sh
#!/bin/bash
trap "echo 'the program is exiting';sleep 3;exit 0" SIGINT;
echo "the program is running...";
while true; do echo '' > /dev/null; done;
Here, we use the trap command at the start of the script, right after the shebang line, to define how to handle the SIGINT signal. Upon receiving SIGINT, the script prints a message on the screen, sleeps for 3 seconds to simulate cleanup, and exits with a 0 status.
Next, the script prints a message when it starts. Finally, in the last line, the script enters an infinite while loop that simulates the execution of a program.
We can ensure that a program keeps running using a loop command like while or until.
Let’s see an example loop that restarts our program when it exits:
$ while ./aprogram.sh ; do : ; done
the program is running...
^Cthe program is exiting
the program is running...
Indeed, we can see that, when we send the SIGINT signal by pressing the Ctrl + c keys, the program exits and starts over again.
The while loop calls aprogram.sh as part of its condition evaluation, so it waits until aprogram.sh finishes before deciding whether to iterate again. If the script returns an exit status of 0, the while loop continues to the next iteration and starts the script again; any other exit status ends the loop.
Also, since we don’t want to execute anything in the while loop’s body but can’t leave it empty, we use the null command (:) as the body.
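To see why the exit status matters, here’s a minimal sketch that replaces aprogram.sh with a hypothetical stand-in function, fake_program, which exits with 0 on its first two runs and with 1 on the third. The function name and the run counter are assumptions made for illustration only:

```shell
#!/bin/bash
# Hypothetical stand-in for aprogram.sh:
# exits with status 0 on the first two runs and with status 1 on the third.
runs=0
fake_program() {
    runs=$((runs + 1))
    [ "$runs" -lt 3 ]
}

# Same shape as the loop above: start the command again while it exits with 0.
while fake_program; do :; done

echo "ran $runs times"
```

The loop stops after the third run, because a non-zero exit status makes the while condition false. In other words, this form restarts the program only after a clean exit.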
Alternatively, instead of a while loop, we can use the until loop command. In contrast to while, until keeps iterating as long as the condition command returns an exit status that isn’t 0. As a result, since aprogram.sh returns a 0 exit status upon receiving the SIGINT signal, we have to add the negation operator (!) so that the until loop continues its execution:
$ until ! ./aprogram.sh ; do : ; done
the program is running...
^Cthe program is exiting
the program is running...
As expected, the program restarted when we sent the SIGINT signal.
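If we want the program to restart regardless of its exit status, one possible approach is to ignore the status and pause briefly between runs, so that a program that fails immediately doesn’t spin in a tight loop. The following is only a sketch: keep_running, max_runs, and delay are hypothetical names, and the run limit exists just to keep the demonstration finite:

```shell
#!/bin/bash
# Sketch: run a command repeatedly, whatever its exit status.
# max_runs and delay are illustrative parameters, not part of the original setup.
keep_running() {
    local max_runs=$1 delay=$2
    shift 2
    local i
    for ((i = 1; i <= max_runs; i++)); do
        "$@"
        echo "run $i exited with status $?"
        # Pause before the next start to avoid a tight restart loop.
        sleep "$delay"
    done
}

# Demonstration with a command that always fails.
keep_running 3 0 false
```

With our test script, a call could look like keep_running 100 1 ./aprogram.sh, which would start the script up to 100 times with a 1-second pause between runs.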
Another option for ensuring that a process restarts is to create a systemd service and use the systemctl command to control it. In this setup, the service manager acts as a watchdog, since it monitors the process and restarts it according to a policy.
First, we create a systemd service unit file and save it to the /etc/systemd/system directory:
$ cat /etc/systemd/system/aprogram.service
[Unit]
Description=Service to run aprogram.sh
[Service]
ExecStart=/usr/local/bin/aprogram.sh
Restart=on-success
StandardOutput=append:/var/log/aprogram.log
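In the above example, we defined a system service using the bare minimum properties: Description gives the unit a human-readable description, ExecStart sets the command to run, Restart sets the restart policy, and StandardOutput appends the program’s output to the /var/log/aprogram.log file.
The Restart property is where we set the restart policy. We can set different values here: no, on-success, on-failure, on-abnormal, on-watchdog, on-abort, or always.
Notably, systemd considers an exit clean in one of three cases: the process returns an exit status of 0; the process is terminated by one of the SIGHUP, SIGINT, SIGTERM, or SIGPIPE signals (for service types other than oneshot); or the exit status or signal is listed in the SuccessExitStatus property.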
In our example, we simulate a clean exit. As a result, we’re using the on-success policy.
Finally, it’s also common to use the on-failure policy to handle failures and ensure that a process is always running.
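As an illustration of the on-failure policy, the following sketch writes a variant of our unit file to a temporary directory, so that we can review it before copying it to /etc/systemd/system. The 5-second RestartSec delay is an arbitrary value chosen for this example and isn’t part of the original setup:

```shell
#!/bin/bash
# Write a variant of the unit file to a scratch directory for review.
unit_dir=$(mktemp -d)
unit_file="$unit_dir/aprogram.service"

cat > "$unit_file" <<'EOF'
[Unit]
Description=Service to run aprogram.sh

[Service]
ExecStart=/usr/local/bin/aprogram.sh
# Restart only after an unclean exit (non-zero status, crash signal, timeout).
Restart=on-failure
# Wait 5 seconds before each restart attempt (example value).
RestartSec=5
StandardOutput=append:/var/log/aprogram.log
EOF

cat "$unit_file"
```

Note that with this policy, our test script wouldn’t be restarted after SIGINT, since its trap handler exits with a 0 status.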
Before starting the service, we should call the systemctl daemon-reload command to reload all unit files:
$ sudo systemctl daemon-reload
Now, we’re ready to start our service using the systemctl command:
$ sudo systemctl start aprogram
$ sudo systemctl status aprogram
● aprogram.service - Service to run aprogram.sh
Loaded: loaded (/etc/systemd/system/aprogram.service; static)
Active: active (running) since Sat 2023-12-23 12:13:08 EET; 1s ago
Main PID: 1029579 (aprogram.sh)
Tasks: 1 (limit: 1024)
Memory: 444.0K
CPU: 659ms
CGroup: /system.slice/aprogram.service
└─1029579 /bin/bash /usr/local/bin/aprogram.sh
Indeed, the service has started.
Next, let’s send a SIGINT signal using the pkill command and see if the service manager restarts the program:
$ sudo pkill -2 aprogram
$ cat /var/log/aprogram.log
the program is running...
the program is exiting
the program is running...
Indeed, we can see in the log that the program received a SIGINT signal, exited, and was restarted by the service manager.
Furthermore, let’s again check the status of the service:
$ sudo systemctl status aprogram
● aprogram.service - Service to run aprogram.sh
Loaded: loaded (/etc/systemd/system/aprogram.service; static)
Active: active (running) since Sat 2023-12-23 12:21:39 EET; 2min 24s ago
Main PID: 1029623 (aprogram.sh)
...
Indeed, we can see that our service is still running. Notably, the main PID number is 1029623. Considering the command reported PID 1029579 when starting the service, it’s evident that the service manager started another process when the initial one terminated.
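Instead of comparing PIDs by eye, we can also ask systemd for the values it tracks. The sketch below wraps systemctl show in a hypothetical helper named service_restart_info, and it assumes a systemd version recent enough to expose the NRestarts property:

```shell
#!/bin/bash
# Print the main PID and the restart counter that systemd keeps for a unit.
# service_restart_info is an illustrative helper name.
service_restart_info() {
    local unit=$1
    systemctl show "$unit" --property=MainPID --property=NRestarts
}

if command -v systemctl >/dev/null 2>&1; then
    service_restart_info aprogram.service || echo "could not query systemd" >&2
else
    echo "systemctl not found; run this on a systemd host" >&2
fi
```

A MainPID value that changes between two calls, together with a growing NRestarts value, would indicate that the service manager restarted the program.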
Docker also has a mechanism to ensure the continuous execution of a program that runs in a container.
To keep things simple, we’ll use the docker run command to create a new process that sleeps for a certain amount of time:
$ sudo docker run -d --rm --name aprocess ubuntu:latest sleep 10000
e2aca63336be97c7c1ea77e9cb82b09a2d657ad86fadf9bee8f10f99995642c3
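Here, we started a new container from the latest Ubuntu image. The container runs the sleep command for 10000 seconds. Let’s explore the options we use: -d runs the container in the background (detached mode), --rm removes the container automatically once it exits, and --name assigns the name aprocess to the container.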
Following this, let’s view the container’s process PID using the pgrep command:
$ pgrep -a sleep
7575 sleep 10000
Indeed, there’s a process with PID 7575 that runs the sleep command.
Next, let’s kill the process:
$ sudo kill -9 7575
$ pgrep -a sleep
$ sudo docker container inspect aprocess
[]
Error response from daemon: No such container: aprocess
As we expected, the process was terminated successfully. To verify this, we ran the pgrep command, which reported no processes running the sleep command anymore. Next, we ran the docker container inspect command, which failed to find the aprocess container, because the --rm option removed the container as soon as its process exited.
The --restart option enables the Docker daemon to restart a container if it stops running. Let’s add this option to the command that creates our container:
$ sudo docker run -d --restart always --name aprocess ubuntu:latest sleep 10000
071362dac05bb39af81b0d0a83b08efffadf090563ceedbc8363b741d6b0373b
$ pgrep -a sleep
8340 sleep 10000
As can be seen, a new container was created successfully. Notably, we removed the --rm option, since it’s incompatible with the --restart option.
Next, let’s kill the process with PID 8340:
$ sudo kill -9 8340
$ pgrep -a sleep
8460 sleep 10000
Indeed, we terminated process 8340. Nevertheless, in contrast to the previous example, pgrep found another process with PID 8460, running the sleep command.
Furthermore, we can verify our result with the docker container ls command:
$ sudo docker container ls -a | grep aprocess
bd280dc21418 ubuntu:latest "sleep 10000" About a minute ago Up 2 seconds aprocess
As we expected, the aprocess container is still running. In addition, the notably short uptime of 2 seconds suggests that the Docker daemon restarted the container.
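Docker also records how often it has restarted a container. As a hedged sketch, the hypothetical helper below reads the RestartCount field and the configured restart policy through docker inspect. It assumes that the Docker CLI is installed and that the current user is allowed to talk to the daemon (otherwise, we’d prefix the command with sudo, as in the examples above):

```shell
#!/bin/bash
# Print the restart count and the restart policy of a container.
# container_restart_info is an illustrative helper name.
container_restart_info() {
    local name=$1
    docker inspect \
        --format 'restarts={{.RestartCount}} policy={{.HostConfig.RestartPolicy.Name}}' \
        "$name"
}

if command -v docker >/dev/null 2>&1; then
    container_restart_info aprocess || echo "could not inspect the container" >&2
else
    echo "docker not found; run this on a Docker host" >&2
fi
```

A restart count above zero, combined with the always policy, would be consistent with the short uptime we saw in the container listing.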
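In this article, we examined three methods to restart a process when it exits: a while or until loop in the shell, a systemd service with a restart policy, and a Docker container started with the --restart option.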
In conclusion, based on our specific case, we can select the most appropriate method for restarting a process.