1. Overview

In this article, we’ll look at processes in uninterruptible sleep. We’ll begin by describing them together with their counterpart, processes in interruptible sleep. Then, we’ll discuss how to identify and manage them.

2. Interruptible and Uninterruptible Sleep

Before talking about uninterruptible processes, we need to discuss what interruptible processes are. Linux has different states for any given process. A process is running (or runnable) when it is in the R state.

Linux also has two ways of putting a running process (R) to sleep: interruptible sleep (S) and uninterruptible sleep (D). In both cases, the process is waiting for an event to complete. On the one hand, a process in the S state can return to the R state either through an explicit wake-up call or by receiving a signal. When a signal interrupts a blocking system call, the call typically fails with the error EINTR.

On the other hand, the D state means that the process is ignoring signals. Thus, the only way to go from uninterruptible sleep (D) to running (R) is with an explicit wake-up call.
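We can look at these state letters directly on a Linux system. The following is a minimal sketch, assuming that /proc is mounted; the function names describe_state and proc_state are our own and not part of any standard tool:

```shell
# Map a state code (such as "S", "D+" or "Ss") to a short description.
describe_state() {
    case "$1" in
        R*) echo "running or runnable" ;;
        S*) echo "interruptible sleep" ;;
        D*) echo "uninterruptible sleep" ;;
        *)  echo "other" ;;
    esac
}

# Read the one-letter state of a PID from /proc/<pid>/status (Linux-specific).
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Example: describe the state of the current shell.
describe_state "$(proc_state $$)"
```

Only the three states discussed here are mapped; any other state letter is reported as "other".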


2.1. Stopping Processes in Interruptible Sleep With Signals

A process runs in one of two modes: user mode and kernel mode. We can send a signal to a process at any time, but if the process is executing in kernel mode, the signal is only acted upon once the process returns to user mode. We can’t kill a process in the middle of kernel code because this might leave kernel data in an inconsistent state.

Let’s consider what happens with the read() system call. When a process calls read(), for example on a terminal or a pipe, we don’t know how long the operation will take. Thus, the kernel puts the process into interruptible sleep so that the system can use the CPU for something else. The calling process sleeps until the operation completes: it’s blocked waiting for data, so it can’t continue. If, during this time, we send an asynchronous signal (for example, by pressing Ctrl+C), we’ll end the process.
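We can observe this behavior with a small experiment. The sketch below uses sleep as a stand-in for any process blocked in interruptible sleep and assumes a Linux system with /proc mounted. It sends SIGTERM rather than the SIGINT generated by Ctrl+C, because non-interactive shells start background jobs with SIGINT ignored:

```shell
# Start a process that blocks; sleep spends its time in interruptible sleep.
sleep 30 &
pid=$!

# Give it a moment to block, then read its state from /proc (Linux-specific).
sleep 1
state=$(awk '/^State:/ { print $2 }' "/proc/$pid/status")

# Send a signal: the sleeping process wakes up and terminates right away.
kill -TERM "$pid"
status=0
wait "$pid" || status=$?

echo "state before signal: $state"
echo "exit status: $status"   # 128 + 15 (SIGTERM) = 143
```

The script finishes after about one second instead of thirty, because the signal cuts the sleep short.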

However, we can’t stop all system calls like this. A process in uninterruptible sleep ignores the signal and only resumes after an explicit wake-up call.

2.2. Why Can’t We Stop Processes in Uninterruptible Sleep?

As opposed to processes in interruptible sleep, processes in uninterruptible sleep don’t respond to signals while they’re blocked in kernel mode. The system call expects to be woken up exclusively by the event it’s waiting for, such as the completion of an I/O request.

One common example of uninterruptible sleep is creating a directory with mkdir, which sleeps uninterruptibly while it waits for the disk. Normally, mkdir has to perform a few disk operations. If the I/O operation completes successfully, the process continues. If the operation fails, the kernel reports the failure to the process, usually as an error code from the system call or, in some cases, as a signal such as SIGBUS.

Nevertheless, if the directory lives on network storage and there is a network problem, mkdir might get stuck in the uninterruptible sleep state. The only way to leave this state is with an explicit wake-up call from the kernel, not with a signal from the user.

2.3. Rationale Behind Uninterruptible Sleep

When reading the previous lines, it might seem that uninterruptible sleep is not that different from interruptible sleep. Moreover, it might not seem necessary at all to have the uninterruptible sleep state, as interruptible sleep looks like enhanced uninterruptible sleep where the user can terminate the system call.

However, the uninterruptible sleep state has some advantages over interruptible sleep. Programming system calls for interruptible sleep is significantly harder than for uninterruptible sleep. Supporting interruptible sleep requires extra code both in kernel mode (checking for a pending signal after every wake-up and, if there is one, handling it, releasing any resources held, and returning) and in user mode (as the process must respond appropriately to the interrupted system call).

Making a system call sleep uninterruptibly instead keeps its implementation simpler. Developers rely on this type of sleep for short operations, to make calls atomic, or to avoid having to restart the call.

2.4. Is Uninterruptible Sleep the Best Solution?

We have described interruptible and uninterruptible sleep. However, there is a newer state known as killable (TASK_KILLABLE, available since Linux 2.6.25) that might be useful when writing custom kernel code, as it is a compromise between the two types of sleep states.

The killable state is based on the uninterruptible state but accepts fatal signals, which interrupt the sleep. Thus, if the call gets stuck, the process can be killed. The idea comes from observing that interrupting a system call can’t expose an application bug if the process is about to be killed anyway, since the process never sees the result. However, not all system calls implement this state yet. Those that don’t still rely on uninterruptible sleep.

3. Manage Uninterruptible Processes

Now that we know how uninterruptible processes work, we’ll discuss how to manage them. We need to identify, prevent, and kill them when they appear.

3.1. Identification of Uninterruptible Processes

We can check the state of the different processes of our machine with ps:

$ ps a
PID    TTY    STAT  TIME   COMMAND
796    tty1   Sl    75:53  /usr/lib/Xorg -nolisten tcp :0 vt1 -keeptty -auth /tmp/serverauth.MKLkhKNSLR
813    tty1   Ss+   0:21   i3 -a --restart /run/user/1000/i3/restart-state.813
907    tty1   S     9:30   firefox
111189 pts/2  R+    0:00   ps avi /path/to/file/to/edit

We see that there are identifiers for each process state: Ss+, Sl, R+, and so on. Each identifier consists of a letter, optionally followed by further letters or symbols that describe the process’s status within the system. The first letter represents the state of the process and corresponds to the letters mentioned previously (S, R, and D).

We can also use top, which displays the process state in the S column. There, we only see the first letter of the STAT column from ps:

$ top
PID    USER   PR  NI     VIRT     RES     SHR  S   %CPU   %MEM     TIME+   COMMAND
907     foo   20   0    14.0g    2.0g  879928  S    8.3   26.5   9:30.65   firefox
796     foo   20   0  1174872  746824  721272  S    3.7    9.5  75:53.34   /usr/lib/Xorg -nolisten tcp :0 vt1 -keeptty -auth /tmp/serverauth.MKLkhKNSLR
813     foo   20   0  2828196  436624   99976  S    2.7    5.6   0:21.69   i3 -a --restart /run/user/1000/i3/restart-state.813
111189  foo   20   0  2781348  351432  117044  R    1.7    4.5   0:46.73   ps avi /path/to/file/to/edit
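
To pick out only the processes in the D state, we can filter the ps output on its STAT column. The following is a minimal sketch; the function name filter_d_state is our own, and it assumes the column order requested with -o (PID, PPID, STAT, COMMAND):

```shell
# Keep the header row plus every row whose third column (STAT) starts with D.
filter_d_state() {
    awk 'NR == 1 || $3 ~ /^D/'
}

# List all processes with the columns we need and keep only those in D state.
ps -eo pid,ppid,stat,comm | filter_d_state
```

On a healthy system, this usually prints only the header row, since uninterruptible sleeps tend to be very short.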

3.2. Preventing Processes From Entering Uninterruptible Sleep

We can’t completely prevent system calls from entering uninterruptible sleep. As we discussed, it’s an integral part of Linux, and programming with it is helpful. From a user’s point of view, there is not much that we can do. Most uninterruptible system calls complete almost instantaneously, so we usually don’t observe processes in the D state.

However, a process can get stuck in uninterruptible sleep. Apart from faulty I/O connections, we might also have buggy kernel drivers that freeze processes in uninterruptible sleep. Thus, keeping our system drivers up to date and checking the release notes will help to minimize these problems.

From a development point of view, we can try to reduce the number of uninterruptible system calls used in our code. Moreover, if our work extends to kernel code, we might want to use the killable state as an alternative to the uninterruptible sleep state.

3.3. Methods to Stop a Process in Uninterruptible Sleep

If we ever encounter a process in uninterruptible sleep, we first need to check our hardware. If we encounter the issue when using network storage, the server might be down, and the process is waiting for it to recover. Once we know which driver is causing the trouble, we can stop it. We might need rmmod to remove the kernel module supporting the hardware device.
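
When we suspect network storage, a first step is to see which network file systems are mounted. The sketch below lists NFS mount points from the kernel’s mount table; the function name nfs_mounts is our own, and we assume the usual /proc/mounts layout (device, mount point, file system type, and so on):

```shell
# Print the mount points whose file system type is nfs or nfs4.
# The optional argument is a mount table file; it defaults to /proc/mounts.
nfs_mounts() {
    awk '$3 ~ /^nfs4?$/ { print $2 }' "${1:-/proc/mounts}"
}

nfs_mounts
```

Reading the mount table doesn’t touch the mounted file systems themselves, so this check shouldn’t block even when a server is unresponsive, whereas commands that access the mount point may hang.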

Another alternative is to act on the parent of the process in uninterruptible sleep. We can get the identifier of the parent process (known as the PPID) and stop that process. This is sufficient for cases where the parent process is an errant shell. Killing the parent process may also terminate its child processes, which in turn may trigger the explicit wake-up call required by the process in uninterruptible sleep.
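
Looking up the PPID can be sketched as follows, assuming a Linux system with /proc mounted. The function name parent_of and the PID 12345 in the comment are placeholders of our own:

```shell
# Print the parent PID of the given PID, read from /proc/<pid>/status.
parent_of() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status"
}

# Example: show the parent of the current shell.
echo "parent of $$ is $(parent_of $$)"

# For a stuck process with the hypothetical PID 12345, we could then run:
#   kill "$(parent_of 12345)"
```

The same value is available through ps in its PPID column; reading /proc simply avoids depending on a particular ps implementation.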

Finally, the last resort when nothing else works is to suspend to disk or restart the system. We can first try to suspend to disk (also known as hibernating) and resume to see if this unfreezes the process in uninterruptible sleep. If this doesn’t work, we have to restart the system. We might not be able to restart some systems, for example, a network device that has to stay connected. In this case, we should attempt to unfreeze the process with the previous methods.

4. Conclusion

In this article, we’ve talked about processes that enter the uninterruptible sleep state, what this means, and how we can handle them.