1. Introduction

The top command is perhaps the most ubiquitous and convenient way to explore and manage processes in UNIX and Linux environments. Although its usage is often straightforward, understanding why certain entries exist in a given environment might not be.

In this tutorial, we go through the concept of read-copy-update (RCU) with a particular focus on the rcuos and rcuob threads. First, we look at the idea behind RCU and its historical implementation. After that, we explore threads as the drivers of the RCU mechanism. Finally, we briefly touch upon the configuration aspect of RCU.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. Unless otherwise specified, it should work in most POSIX-compliant environments.

2. Read-Copy-Update (RCU)

When multiple workers operate on a piece of data, synchronizing writes can be a tedious task. Ensuring data integrity in such a situation usually involves lock primitives that require careful planning, orchestration, and testing. Further, even if done correctly, this type of synchronization could introduce unacceptable delays, especially when it comes to critical system structures.

Because of this, the Linux kernel often employs read-copy-update (RCU) to synchronize data when multiple threads work with it. This principle comprises several main points:

  • readers don’t need to block
  • writers use a new location and only change pointers to perform an update (or removal)
  • old data lives as long as there are readers attached
  • deletion (or reclamation) takes place when old data has no readers
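
The points above can be illustrated with a userspace analogy. The sketch below is not kernel RCU: it only assumes that files on a single filesystem behave like RCU-protected data, with an atomic rename standing in for the pointer update and an open file descriptor standing in for an attached reader. The file names and the variables old_view and new_view are made up for illustration:

```shell
#!/usr/bin/env bash
# Userspace analogy only: a file stands in for RCU-protected data.

dir=$(mktemp -d)
printf 'version 1\n' > "$dir/data"

# Reader: attach to the current version via file descriptor 3 and keep it open.
exec 3< "$dir/data"

# Writer: build the update in a new location, then publish it with an
# atomic rename (the analog of swapping a pointer).
cp "$dir/data" "$dir/data.new"
printf 'version 2\n' > "$dir/data.new"
mv "$dir/data.new" "$dir/data"

# The attached reader still sees the old data; new readers see the update.
old_view=$(cat <&3)
new_view=$(cat "$dir/data")
echo "old reader: $old_view"
echo "new reader: $new_view"

# Reclamation: once the last reader detaches, the old data can be freed.
exec 3<&-
rm -r "$dir"
```

Neither side blocks here: the reader never waits for the writer, and the old version disappears only after its last reader has closed it.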

Due to their fundamental nature, these concepts were historically integrated with the Interrupt Request (IRQ) mechanism. In particular, reclamation had a very high priority, so data blocks could be freed as soon as they weren’t needed.

However, since storage and memory aren’t as scarce a resource as they once were, taking processor time away from running programs just to prevent stale data blocks from lingering is now rarely vital.

3. RCU Threads

Although RCU is critical to many OS interactions, much of its work has been moved out of interrupt context and further down the chain of priority. Specifically, the Linux kernel creates and manages so-called RCU threads, which handle different aspects of the mechanism.

This way, the operating system (OS) jitter is significantly reduced by preventing an IRQ from stealing time from potentially more important processes. In its most basic form, OS jitter is a measurement of the frequency with which the OS interrupts running programs.

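Conveniently, RCU thread names usually start with rcu. Perhaps most prevalent are the rcuob and rcuos threads, which offload RCU callback processing from a given processor. Here, the trailing b and s denote the RCU-bh and RCU-sched flavors, while the number after the slash identifies the processor: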

$ top -H
[...]
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      20   0 22110 1666 1656 S    0  0.0   0:06.56 init
    2 root      20   0     0    0    0 S    0  0.0   0:00.25 kthreadd
    3 root      20   0     0    0    0 S    0  0.0   0:02.40 ksoftirqd/0
    5 root       0 -20     0    0    0 S    0  0.0   0:00.00 kworker/0:0H
    6 root      20   0     0    0    0 S    0  0.0   0:00.00 kworker/u48:0
    8 root      20   0     0    0    0 S    0  0.0   1:57.92 rcu_sched
    9 root      20   0     0    0    0 S    0  0.0   2:31.03 rcuos/0
   10 root      20   0     0    0    0 S    0  0.0   0:54.04 rcuos/1
   11 root      20   0     0    0    0 S    0  0.0   0:53.16 rcuos/2
   12 root      20   0     0    0    0 S    0  0.0   0:55.61 rcuos/3
   13 root      20   0     0    0    0 S    0  0.0   0:55.16 rcuos/4
   14 root      20   0     0    0    0 S    0  0.0   0:41.83 rcuos/5
   15 root      20   0     0    0    0 S    0  0.0   0:02.56 rcuos/6
   16 root      20   0     0    0    0 S    0  0.0   0:07.35 rcuos/7
   17 root      20   0     0    0    0 S    0  0.0   0:12.36 rcuos/8
   18 root      20   0     0    0    0 S    0  0.0   0:12.14 rcuos/9
   19 root      20   0     0    0    0 S    0  0.0   0:11.51 rcuos/10
   20 root      20   0     0    0    0 S    0  0.0   0:05.66 rcuos/11
   21 root      20   0     0    0    0 S    0  0.0   0:34.42 rcuos/12
   22 root      20   0     0    0    0 S    0  0.0   0:32.66 rcuos/13
   23 root      20   0     0    0    0 S    0  0.0   0:21.45 rcuos/14
   24 root      20   0     0    0    0 S    0  0.0   0:00.88 rcuos/15
   30 root      20   0     0    0    0 S    0  0.0   0:00.00 rcu_bh
   31 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/0
   32 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/1
   33 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/2
   34 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/3
   35 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/4
   36 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/5
   37 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/6
   38 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/7
   39 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/8
   40 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/9
   41 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/10
   42 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/11
   43 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/12
   44 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/13
   45 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/14
   46 root      20   0     0    0    0 S    0  0.0   0:00.00 rcuob/15

Notably, there’s usually a fixed number of RCU threads per thread type, and this count depends on multiple factors:

  • kernel configuration
  • number of processors
  • system configuration
  • system load

In the case above, there are 16 rcuob, 16 rcuos, 1 rcu_bh, and 1 rcu_sched. For some systems, this could mean a misconfiguration due to the high total count, but for others, it might be a necessity.
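
Counting by hand gets tedious on larger machines. As a minimal sketch, assuming thread names arrive one per line on standard input, the hypothetical helper count_rcu_threads below groups them by thread type:

```shell
# Count kernel threads per RCU thread type.
# Reads one thread name per line on stdin, e.g. from: ps -e -o comm=
count_rcu_threads() {
  grep '^rcu' |                 # keep only names that start with rcu
    sed 's|/[0-9]*$||' |        # drop the per-processor suffix such as /12
    LC_ALL=C sort |             # fixed collation so the order is stable
    uniq -c |
    awk '{ print $2, $1 }'      # print "type count"
}

# Usage on a live system:
#   ps -e -o comm= | count_rcu_threads
```

For the listing above, this approach should report 16 for rcuob and rcuos, and 1 for rcu_bh and rcu_sched.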

Of course, there are also instances of problematic kernel or hypervisor behavior that lead to the overallocation of RCU threads.

4. Configuring RCU and RCU Threads

Primarily, we can preconfigure RCU functionality at kernel build time. Of the available settings, three are especially relevant here:

  • CONFIG_RCU_NOCB_CPU_ALL: offloads RCU callback processing from all processors to kernel threads
  • CONFIG_RCU_NOCB_CPU: similar to CONFIG_RCU_NOCB_CPU_ALL, but applies only to the processors we select via the rcu_nocbs boot parameter
  • CONFIG_RCU_USER_QS: lets RCU treat a processor that runs userspace code as quiescent, so RCU doesn’t need to disturb it
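
To see which of these settings a running system actually uses, we can inspect the kernel build configuration and the boot command line. The sketch below makes two assumptions: the build configuration is available as a text file (many distributions install it as /boot/config-$(uname -r)), and the kernel command line is readable from /proc/cmdline. The helper names show_rcu_config and get_rcu_nocbs are hypothetical:

```shell
# Print RCU-related build options from a kernel config file,
# including ones that are explicitly not set.
show_rcu_config() {
  grep -E '^(# )?CONFIG_RCU_' "$1"
}

# Extract the value of rcu_nocbs= from a kernel command line read on stdin.
# Prints nothing when the parameter is absent.
get_rcu_nocbs() {
  tr ' ' '\n' | sed -n 's/^rcu_nocbs=//p'
}

# Usage on a live system (paths are assumptions, adjust as needed):
#   show_rcu_config "/boot/config-$(uname -r)"
#   get_rcu_nocbs < /proc/cmdline
```

An empty result from get_rcu_nocbs simply means that no processors were selected for offloading at boot.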

In fact, all of the above along with CONFIG_RCU_STALL_COMMON are a standard part of the Ubuntu kernel setup.

Generally, knowing how to tweak the RCU policies can be especially critical for production environments that experience a lot of OS jitter.

When it comes to virtual machine (VM) environments, disabling the CPU hotplug feature could help to prevent unexpected processor counts, hence stabilizing the RCU thread counts as well.
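
Since the RCU thread count depends on the processor count, comparing how many processors the kernel considers possible with how many are online can hint at hotplug-related surprises. The following is a minimal sketch, assuming the sysfs files /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online exist and contain processor lists such as 0-15. The helper name count_cpus is hypothetical:

```shell
# Count the processors in a kernel CPU list such as "0-3,8-11".
count_cpus() {
  printf '%s\n' "$1" | tr ',' '\n' | awk -F- '
    NF == 1 { n += 1 }             # single processor, e.g. "5"
    NF == 2 { n += $2 - $1 + 1 }   # inclusive range, e.g. "0-3"
    END     { print n + 0 }'
}

# Usage on a live system:
#   count_cpus "$(cat /sys/devices/system/cpu/possible)"
#   count_cpus "$(cat /sys/devices/system/cpu/online)"
```

If the possible count is much higher than the online count inside a VM, the hypervisor is probably advertising hotpluggable processors, which may explain a larger than expected number of RCU threads.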

5. Summary

In this article, we talked about the kernel read-copy-update (RCU) mechanism, its historical implementation, and current guiding principles. Furthermore, we briefly explored how to configure RCU.

In conclusion, RCU threads can be vital to the smooth operation of the OS because they reduce jitter, but they can also become burdensome if misconfigured.