
If you have a few years of experience in the Linux ecosystem, and you’re interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Overview

Loadable kernel modules let us modify the kernel, so it's also possible to put the kernel into an unknown or unreliable state. When this happens, we say the kernel is tainted. A kernel that marks itself tainted usually runs without problems; the information only becomes relevant when someone wants to investigate a problem or bug.

Once the kernel is tainted, the community can't guarantee that it functions as expected.

2. Taint Status

The taint status is a set of flags that identify specific conditions under which kernel developers can't reliably investigate a kernel issue. For example, if a proprietary kernel module caused an issue, it can't be debugged reliably because its source code isn't available, and its effects cannot be determined.

Similarly, if a serious kernel or hardware failure has previously occurred, the integrity of the kernel space may have been compromised, meaning that any subsequent debug messages generated by the kernel may not be reliable.

2.1. Decoding Status in Kernel Errors

Each kernel bug, oops, or panic report includes the taint status near the top.

Here's a bug report from a kernel that isn't tainted:

BUG: unable to handle kernel paging request at ffffc90012a9c418
Oops: 0000 [#1] SMP
Modules linked in: parport_pc ppdev bnep rfcomm bluetooth libahci ...
CPU: 6 PID: 2925 Comm: ryzom_client Not tainted 3.10.0-031000rc5-generic #201306082135
Hardware name: Gigabyte Technology Co., Ltd.
task: ffff880414908000 ti: ffff880403afc000 task.ti: ffff880403afc000
RIP: 0010:[<ffffffffa03a2ace>] [<ffffffffa03a2ace>] radeon_fence_process+0x8e/0x160 [radeon]

We can find “Not tainted” in the line that begins with “CPU:”, which means the kernel isn't tainted.

Let's look at another bug report, this time from a tainted kernel:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
Oops: 0002 [#1] SMP PTI
CPU: 0 PID: 4424 Comm: insmod Tainted: P         W O 4.20.0-0.rc6.fc30 #1
Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
RIP: 0010:my_oops_init+0x13/0x1000 [kpanic]

Here, we find “Tainted: P         W O”. Each position after “Tainted: ” holds either a flag letter or a blank, and the letters tell us why the kernel is tainted. Some of the common flags are:

  • P: A Proprietary module is loaded in the kernel, that is, a module that isn't under the GNU General Public License (GPL) or a compatible license.
  • G: All loaded modules are licensed under the GPL or a GPL-compatible license, but something else has tainted the kernel; a different flag indicates what.
  • F: A module was loaded with the Force option -f of insmod or modprobe, so its versioning information couldn't be checked.
  • M: The hardware triggered a Machine Check Exception (MCE), which indicates a hardware-related problem.
  • W: The kernel previously issued a Warning.
  • O: An externally built (Out-of-tree) module was loaded.

For more flags, please check the kernel docs.
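As a minimal sketch of the step above, the following shell function pulls the flag letters out of a report line. The name extract_taint_flags is hypothetical, and the parsing assumes that the flag field consists of uppercase letters and blanks and that the kernel release string following it doesn't start with an all-uppercase word:

```shell
# Print the taint flag letters found in a kernel report line.
# extract_taint_flags is a hypothetical helper, not part of any kernel tool.
# The body runs in a subshell, so "set -f" and the variables don't leak out.
extract_taint_flags() (
    flags=""
    case $1 in
        *"Tainted: "*)
            # Keep only the text that follows "Tainted: ".
            rest=${1#*"Tainted: "}
            set -f   # no filename expansion while splitting on whitespace
            for token in $rest; do
                # Assumption: the flag field ends at the first word that
                # contains anything other than uppercase letters.
                case $token in
                    *[!A-Z]*) break ;;
                esac
                flags=$flags$token
            done
            ;;
    esac
    # Lines without "Tainted: " (for example "Not tainted") print an empty line.
    printf '%s\n' "$flags"
)

extract_taint_flags 'CPU: 0 PID: 4424 Comm: insmod Tainted: P         W O 4.20.0-0.rc6.fc30 #1'
```

For the tainted report shown earlier, this prints the letters P, W, and O joined together, which we can then look up one by one.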

2.2. Decoding Status at Runtime

We can check the taint status of a running system:

$ cat /proc/sys/kernel/tainted

If the command prints 0, the kernel isn't tainted. Any other value, such as 4609, is a bit field in which each set bit stands for one taint reason, so we can decode the reasons from it. Some kernel tools ship scripts for this.

Here's a simple one-liner that prints the bits of 4609 one by one:
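The zero versus non-zero check can be wrapped in a small function. This is only a sketch: taint_state is a hypothetical name, and the optional file argument exists purely so that the function can be tried against a sample file instead of the live /proc entry:

```shell
# Report whether the kernel is tainted, based on the numeric taint value.
# taint_state is a hypothetical helper; the optional argument overrides the
# file to read, which defaults to /proc/sys/kernel/tainted.
taint_state() (
    file=${1:-/proc/sys/kernel/tainted}
    value=$(cat "$file") || exit 1
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
    else
        echo "tainted (value $value)"
    fi
)
```

Calling taint_state without arguments checks the running system.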

$ for i in $(seq 18); do echo $(($i-1)) $((4609>>($i-1)&1));done
0 1
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 1
10 0
11 0
12 1
13 0
14 0
15 0
16 0
17 0

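In the output, the left column is the bit position and the right column is the value of that bit. We can look up the bits that are set to 1 in the aforementioned kernel docs to find the reasons. In our case, the reasons are: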

  • 0: proprietary module was loaded
  • 9: kernel issued warning
  • 12: externally-built (out-of-tree) module was loaded
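To avoid the manual lookup, we can map each set bit to its flag letter. The sketch below assumes the bit-to-letter order of the table in the kernel docs for bits 0 to 17 (the same range the one-liner above covers); newer kernels may define more bits, so it's worth checking the table for the kernel in use. The name decode_taint is hypothetical:

```shell
# Print "bit:letter" for every set bit in a taint value.
# decode_taint is a hypothetical helper. The letter string is an assumption
# taken from the kernel docs table for bits 0..17; bit 0 shows P when set
# (when it's clear, the report shows G instead).
decode_taint() (
    value=$1
    letters=PFSRMBUDAWCIOELKXT
    bit=0
    out=""
    while [ "$bit" -lt 18 ]; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            # cut counts characters from 1, so bit N is character N+1.
            letter=$(printf '%s' "$letters" | cut -c "$((bit + 1))")
            out="$out$bit:$letter "
        fi
        bit=$((bit + 1))
    done
    # Drop the trailing space; a value of 0 prints an empty line.
    printf '%s\n' "${out% }"
)

decode_taint 4609
```

For 4609, this reports bits 0, 9, and 12 with the letters P, W, and O, which matches the flags in the tainted bug report from the previous section.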

2.3. Eliminating the Taint

Let's take a bug report like the one in our previous example and look for a line similar to:

Oops: 0000 [#1] SMP

That’s the first Oops since boot-up, as the #1 between the brackets shows. Every Oops and any other problem that happens after that point might be a follow-up problem to that first Oops, even if both look unrelated. Rule this out by getting rid of the cause for the first Oops and reproducing the issue afterward.
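When a log contains several reports, a small filter can locate the first Oops since boot. This is a sketch that assumes the log is fed on standard input (for example from dmesg, which may require elevated privileges) and that the counter appears as [#1] on the Oops line, as in the examples above. The name first_oops is hypothetical:

```shell
# Print the first line that reports Oops number 1, then stop reading.
# first_oops is a hypothetical helper that reads a kernel log on stdin.
first_oops() {
    awk '/Oops: .*\[#1\]/ { print; exit }'
}

# Example with a made-up log excerpt:
printf '%s\n' \
    'BUG: unable to handle kernel paging request at ffffc90012a9c418' \
    'Oops: 0000 [#1] SMP' \
    'Oops: 0002 [#2] SMP PTI' | first_oops
```

On a live system, the input could come from dmesg or a saved log file instead of the printf call.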

Reconfiguring or updating the kernel might also help in this case. The taint status persists even after we undo the cause of the taint. For example, if we unload the proprietary kernel module or fix the hardware error, the kernel stays tainted. We have to restart the system to reset the taint flags.

In addition, when our system uses software that installs its own kernel modules, such as Nvidia's proprietary graphics driver, VirtualBox, or modules from other external sources, the kernel taints itself. In that case, we usually uninstall those modules temporarily before reproducing the actual issue.

However, if the reason for taint is a module that resides in the staging tree of the kernel source, we can report the issue, but we have to make sure the module is the only reason for the taint.
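Checking that a staging module is the only reason for the taint comes down to comparing the taint value with a single bit. The sketch below assumes that staging drivers set bit 10 (value 1024, flag C), as listed in the kernel docs; the name staging_only is hypothetical:

```shell
# Classify a taint value with respect to the staging bit.
# staging_only is a hypothetical helper. Assumption: bit 10 (flag C) marks a
# loaded staging driver, per the kernel docs.
staging_only() (
    value=$1
    staging=$((1 << 10))   # 1024
    if [ "$value" -eq 0 ]; then
        echo "not tainted"
    elif [ "$value" -eq "$staging" ]; then
        echo "staging module is the only taint reason"
    elif [ $((value & staging)) -ne 0 ]; then
        echo "staging module plus other taint reasons"
    else
        echo "tainted for other reasons"
    fi
)

staging_only "$(cat /proc/sys/kernel/tainted)"
```

Only the second outcome matches the case described above, where the issue can be reported despite the taint.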

3. Conclusion

Before reporting an issue to the Linux kernel developers, we should first eliminate the taint. Decoding the taint status tells us the reasons why the kernel is tainted.

This way, the taint status helps us investigate and fix issues in the Linux kernel.
