1. Overview

In Linux systems, when a process terminates, it returns an exit code to its parent process. We can use this code, for example, to find out whether an error occurred during execution. By convention, a process returns 0 on success and a non-zero value on error.
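As a quick illustration of this convention, here is a minimal sketch. The helper name report and the exit code 3 are chosen only for this example.

```shell
# Run a command and print the exit code it returned
# (report is a hypothetical helper, not a standard command)
report() {
    local code=0
    # If the command fails, $? on the right-hand side holds its exit code
    "$@" || code=$?
    echo "'$*' exited with code $code"
}

report true              # a command that always succeeds
report false             # a command that always fails
report sh -c 'exit 3'    # a child shell that picks its own exit code
```

The first call reports 0, while the other two report non-zero values (1 and 3).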

In this tutorial, we’ll discuss how we can get the exit code of a process running in the background. To accomplish this, we’ll use the Bash built-in command wait.

First, we’ll focus on only one process in the background. Then, we’ll see how to monitor several processes at once.

2. Using wait to Get the Exit Code

We can get the exit code of a process running in the background using wait:

$ wait <pid>

When wait is executed, it receives the process ID as a parameter and waits until that process terminates. Then, wait itself returns the exit code of that process.

Bash provides two special variables we can use:

  • $! is the PID of the last process sent to the background
  • $? is the exit code of the last executed command
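The following sketch shows both variables together. The names bg_pid and bg_code and the exit code 7 are arbitrary choices for this example.

```shell
# Start a background subshell that exits with code 7
(sleep 1; exit 7) &

# Save $! right away: the next background job would overwrite it
bg_pid=$!

# wait returns the child's exit code. We capture it immediately,
# because every later command resets $?
bg_code=0
wait "$bg_pid" || bg_code=$?

echo "PID $bg_pid exited with code $bg_code"
```

Writing the capture as wait ... || bg_code=$? is equivalent to reading $? on the next line, and it also keeps working in scripts that run with set -e.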

Let’s write a function to print the exit code of a PID passed as the first argument:

$ wait_and_echo() {
    PID=$1
    echo Waiting for PID $PID to terminate
    wait $PID
    CODE=$?
    echo PID $PID terminated with exit code $CODE
    return $CODE
}

Note our function also returns the exit code.

Now, we can wait for a process running in the background and get its exit code. Let’s run a subshell that exits with code 42, and then call our function wait_and_echo() with its PID:

$ (sleep 20; exit 42) & 
$ wait_and_echo $!
Waiting for PID 13160 to terminate
PID 13160 terminated with exit code 42
$ echo $?
42
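Because the function returns the exit code, we can also use it directly in a condition. The sketch below uses a compact variant of wait_and_echo(). The variable result is introduced only for this example.

```shell
# Compact variant of wait_and_echo, written for this sketch
wait_and_echo() {
    local pid=$1 code=0
    wait "$pid" || code=$?
    echo "PID $pid terminated with exit code $code"
    return "$code"
}

(sleep 1; exit 42) &

# The function's return value drives the branch. Inside the else
# branch, $? still holds the status returned by the condition
if wait_and_echo "$!"; then
    result="succeeded"
else
    result="failed with code $?"
fi
echo "Background job $result"
```

Here, the last line reports that the job failed with code 42.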

3. Monitor a Process Inside a Loop

Sometimes, we don’t want to block the execution to wait for the process. In that case, we can use a loop to check if the process is still running on each iteration. This allows us to do other things while waiting.

When a background process terminates, Bash remembers its exit code. So, we can use wait to get the exit code of a process that has already finished.
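A minimal sketch of this behavior follows. The exit code 5 and the one-second pause are arbitrary values for the example.

```shell
# Background job that finishes almost immediately with code 5
(exit 5) &
pid=$!

# Give the job ample time to terminate before we ask for its status
sleep 1

# The shell has remembered the status, so wait returns it at once
code=0
wait "$pid" || code=$?
echo "PID $pid had already exited with code $code"
```

Note that the status can be collected only once: a second wait on the same PID fails because the shell no longer tracks that process.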

One way to determine whether a process is still running is to check whether the directory /proc/<PID> exists.

Let’s write a function that does a non-blocking wait:

$ non_blocking_wait() {
    PID=$1
    if [ ! -d "/proc/$PID" ]; then
        wait $PID
        CODE=$?
    else
        CODE=127
    fi
    return $CODE
}

In this function, we only call wait when the directory /proc/$PID doesn’t exist, which means the process is no longer running. If it’s still running, we return the value 127. The wait command itself returns 127 when it’s given a PID that isn’t a child of the current shell, so our function signals a missing exit code in a similar way.
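To see where the value 127 comes from, we can ask wait about a PID that is not a child of the shell. This sketch uses $$, the PID of the current shell, purely as a convenient example of such a PID.

```shell
# $$ is the PID of the current shell, which is never its own child,
# so wait has no exit status to look up for it
code=0
wait "$$" 2>/dev/null || code=$?
echo "wait returned $code"
```

This prints wait returned 127. One consequence of reusing this value is that a background process which genuinely exits with code 127 (the status a shell reports for a command it cannot find) looks the same as a process that is still running.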

Now, we can run a process in the background and get its exit code without blocking. Let’s write a script called wait_in_loop.sh:

#!/bin/bash

non_blocking_wait() {
    PID=$1
    if [ ! -d "/proc/$PID" ]; then
        wait $PID
        CODE=$?
    else
        CODE=127
    fi
    return $CODE
}

(sleep 5; exit 42) &
PID=$!
while /bin/true; do
    date
    non_blocking_wait $PID
    CODE=$?
    if [ $CODE -ne 127 ]; then
        echo "PID $PID terminated with exit code $CODE"
        break
    fi
    sleep 2
done

Finally, let’s run it:

$ ./wait_in_loop.sh
Tue May  4 15:11:17 2021
Tue May  4 15:11:19 2021
Tue May  4 15:11:21 2021
Tue May  4 15:11:23 2021
PID 20664 terminated with exit code 42

The process in the background exits with code 42, as we did in the previous section. However, with this method, we can still run the date command while waiting for the process to terminate.

4. Handling SIGCHLD

Another way to know that the process has terminated is to handle the SIGCHLD signal. This signal arrives whenever a child process of the current shell terminates.

Then, we can execute wait when the signal arrives, without blocking the execution or constantly checking whether the process has terminated.

Let’s write a function to act as the SIGCHLD handler:

$ handle_sigchld() {
    if [ -n "$PID" -a ! -d "/proc/$PID" ]; then
        wait $PID
        CODE=$?
        echo PID $PID terminated with exit code $CODE
        unset PID
    fi
}

As SIGCHLD is delivered when any child terminates, we should first save the PID of the process we need to monitor. So, in this handler, we’ll call wait only when both conditions hold: the variable $PID has a value, and the directory /proc/$PID doesn’t exist. This means there is a process that needs to be checked, and that it has terminated.

Once we get the exit code, we unset the $PID variable so the handler will ignore SIGCHLD until there is a new PID to monitor.

To have Bash call handle_sigchld() whenever SIGCHLD arrives, we register it with trap handle_sigchld SIGCHLD.

Let’s write a script called wait_in_sigchld.sh to launch a process in the background while also performing another task:

#!/bin/bash

handle_sigchld() {
    if [ -n "$PID" -a ! -d "/proc/$PID" ]; then
        wait $PID
        CODE=$?
        echo PID $PID terminated with exit code $CODE
        unset PID
    fi
}

unset PID
trap handle_sigchld SIGCHLD

(sleep 5; exit 42) &
PID=$!
echo Starting background process with PID $PID
echo Starting dd
timeout 7s dd if=/dev/zero of=/dev/null
echo dd terminated

In this example, we use the timeout command to execute dd only for 7 seconds.

And now, let’s run it:

$ ./wait_in_sigchld.sh 
Starting background process with PID 28173
Starting dd
PID 28173 terminated with exit code 42
dd terminated

As we can see, we got the exit code without needing to block the execution. Also, we ran dd in the foreground without requiring a loop.

However, this method has some drawbacks. The handler needs to know the child’s PID, so if the process terminates before PID=$! is executed, the handler may run while $PID is still empty and ignore the signal. Also, if there is a process running in the foreground (dd in our example), Bash won’t run the handler until that foreground process terminates, so the exit code may be reported late.

5. How to Monitor Several Processes at the Same Time

So far, we discussed how to get the exit code when there is only one process in the background. What if we want to monitor more than one process at the same time?

To do this, we can use an array to store all the PIDs we need to monitor. Then, we can use wait on each PID.

After a background process terminates, we need to remove its PID from the array. To make that easy, we’ll use the PID itself as the array index and remove it with unset PIDS[$PID].
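The sketch below shows these array operations in isolation. The PIDs 100, 200, and 300 are made-up values used only to demonstrate the syntax.

```shell
# Hypothetical PIDs, stored as indices of a sparse Bash array
PIDS=()
PIDS[300]=1
PIDS[100]=1
PIDS[200]=1

# ${!PIDS[@]} expands to the indices (our PIDs),
# ${#PIDS[@]} to the number of elements
echo "Monitoring ${#PIDS[@]} PIDs: ${!PIDS[@]}"

# Removing a PID is a single unset. The quotes keep the shell
# from treating the brackets as a filename pattern
unset 'PIDS[200]'
echo "Remaining ${#PIDS[@]} PIDs: ${!PIDS[@]}"
```

The first line lists 100 200 300 and the second 100 300, since Bash expands the indices of an indexed array in ascending order regardless of insertion order.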

5.1. Using wait

So far, we’ve passed wait a single PID, but it accepts several. To know which process has terminated and what its exit code was, we need the options -n (available since Bash 4.3) and -p (available since Bash 5.1).

With -n, we tell wait to return as soon as any of the PIDs has terminated, without waiting for all of them. With -p, we specify the name of a variable where wait stores the PID of the process that terminated.
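On a Bash version where -p is not available, one possible fallback is to call wait on each PID in turn. This is a sketch under that assumption, and the function name wait_each is hypothetical. Unlike wait -n, it reports the processes in argument order rather than in the order they finish.

```shell
# Fallback sketch: wait for each PID in the order given
wait_each() {
    local pid code
    for pid in "$@"; do
        code=0
        wait "$pid" || code=$?
        echo "PID $pid terminated with exit code $code"
    done
}

(sleep 2; exit 42) &
pid1=$!
(sleep 1; exit 43) &
pid2=$!

# pid2 finishes first, but its status is reported second
wait_each "$pid1" "$pid2"
```

The exit codes are still correct for every process, because the shell remembers the status of a child that finished while we were waiting for another one.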

Let’s modify the example in section 2, adding support for multiple processes in the function wait_and_echo:

$ wait_and_echo() {
    PIDS=()
    for PID in $@; do
        PIDS[$PID]=1
    done
    while [ ${#PIDS[@]} -ne 0 ]; do
        wait -n -p PID ${!PIDS[@]}
        CODE=$?
        echo PID $PID terminated with exit code $CODE
        unset PIDS[$PID]
    done
}

Let’s try it to monitor 3 background processes:

$ (sleep 20; exit 42) &
$ PID1=$!
$ (sleep 22; exit 43) &
$ PID2=$!
$ (sleep 24; exit 44) &
$ PID3=$!
$ wait_and_echo $PID1 $PID2 $PID3
PID 31759 terminated with exit code 42
PID 31902 terminated with exit code 43
PID 32014 terminated with exit code 44

5.2. Using Our Non-Blocking wait Inside a Loop

To monitor several processes, we’ll iterate over the array of PIDs, checking if any of them has terminated. Let’s rewrite our wait_in_loop.sh script:

#!/bin/bash

non_blocking_wait() {
    PID=$1
    if [ ! -d "/proc/$PID" ]; then
        wait $PID
        CODE=$?
    else
        CODE=127
    fi
    return $CODE
}

PIDS=()
(sleep 5; exit 42) &
PIDS[$!]=1
(sleep 7; exit 43) &
PIDS[$!]=1
(sleep 9; exit 44) &
PIDS[$!]=1
while [ ${#PIDS[@]} -ne 0 ]; do
    date
    for PID in ${!PIDS[@]}; do
        non_blocking_wait $PID
        CODE=$?
        if [ $CODE -ne 127 ]; then
            echo "PID $PID terminated with exit code $CODE"
            unset PIDS[$PID]
        fi
    done
    sleep 2
done

As we see, we’ll call date while there are still PIDs running in the background. When any of them terminates, we remove it from the PIDS array.

Let’s see how it works:

$ ./wait_in_loop.sh 
Tue May  4 17:40:39 2021
Tue May  4 17:40:41 2021
Tue May  4 17:40:43 2021
Tue May  4 17:40:45 2021
PID 12018 terminated with exit code 42
Tue May  4 17:40:47 2021
PID 12019 terminated with exit code 43
Tue May  4 17:40:49 2021
PID 12020 terminated with exit code 44

5.3. Using the SIGCHLD Handler

Finally, we can modify our SIGCHLD handler in the script wait_in_sigchld.sh to support multiple PIDs:

#!/bin/bash

handle_sigchld() {
    for PID in ${!PIDS[@]}; do
        if [ ! -d "/proc/$PID" ]; then
            wait $PID
            CODE=$?
            echo PID $PID terminated with exit code $CODE
            unset PIDS[$PID]
        fi
    done
}

PIDS=()
trap handle_sigchld SIGCHLD

(sleep 9; exit 44) &
PIDS[$!]=1
(sleep 7; exit 43) &
PIDS[$!]=1
(sleep 5; exit 42) &
PIDS[$!]=1
echo Starting background processes with PIDS ${!PIDS[@]}
echo Starting dd
timeout 15s dd if=/dev/zero of=/dev/null
echo dd terminated

In this script, when handle_sigchld() is called, we iterate over all the PIDs to check if any of them has exited.

Let’s see how it works:

$ ./wait_in_sigchld.sh 
Starting background processes with PIDS 31491 31492 31493
Starting dd
PID 31491 terminated with exit code 44
PID 31492 terminated with exit code 43
PID 31493 terminated with exit code 42
dd terminated

6. Conclusion

In this article, we saw how we could obtain the exit code from a process running in the background. We saw three approaches:

  • Calling wait directly, which blocks the execution
  • Wrapping wait in a new function to make it non-blocking
  • Using wait when the signal SIGCHLD arrives

And finally, we saw how to get the exit code of more than one process at the same time.
