Managing NVIDIA Drivers and How to Disable and Enable a GPU

1. Introduction

As one of the top brands for video hardware and graphics cards in particular, NVIDIA has support for many platforms. Although previously reserved mostly for gaming, a graphics processing unit (GPU) in personal computers now has alternative applications such as cryptocurrency mining, encryption, and machine learning. These new uses make Linux systems, with their lightweight footprint and flexible kernels, invaluable despite their reduced gaming capabilities.

In this tutorial, we talk about the management of an NVIDIA graphics adapter. First, we discuss NVIDIA driver options. After that, we explore how to check the current video hardware that we have. Next, we delve into ways to configure a given driver. Finally, we turn to methods to disable and reenable a specific NVIDIA GPU.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. It should work in most POSIX-compliant environments unless otherwise specified.

2. Drivers

As part of the Linux graphics system, GPU drivers have the important role of ensuring optimal communication between the kernel and the graphics hardware.

Still, it’s important to match the driver package with the actual hardware in the system. For example, NVIDIA cards belong to different architecture generations, such as Maxwell, and newer driver versions might not support some older cards.

Indeed, NVIDIA provides native drivers for Linux. In fact, there are several main choices with different configuration options, catering to different system types and scenarios:

nvidia driver: proprietary, with broader device and general support
nvidia-open experimental driver: open source, supports fewer devices
nouveau experimental driver: freedesktop.org open-source implementation, limited support for all NVIDIA cards

In theory, both nvidia and nvidia-open can support additional kernel modules:

nvidiafb: framebuffer support
nvidia_modeset: Kernel Mode Setting (KMS) support
nvidia_uvm: Unified Virtual Memory (UVM) support
nvidia_drm: Direct Rendering Manager (DRM) support

Each of these modules can add an extra feature. Their interoperability with each other and with the different nvidia drivers depends on multiple factors. So, if a feature isn’t enabled automatically, tests are usually the best way to establish compatibility.
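To see which of these modules are actually loaded, we can filter the module list. The following is a minimal sketch: loaded_gpu_modules is a hypothetical helper name of ours, and the sample lines are made up in the layout that lsmod prints:

```shell
# loaded_gpu_modules: hypothetical helper; reads lsmod-style lines on stdin
# and prints the names of loaded nvidia* and nouveau modules.
loaded_gpu_modules() {
  awk '$1 ~ /^nvidia/ || $1 == "nouveau" { print $1 }'
}

# Made-up sample in the "name size used-by" layout of lsmod.
sample='nvidia_drm 77824 4
nvidia_modeset 1314816 6 nvidia_drm
nvidia 56778752 270 nvidia_modeset
snd_hda_intel 57344 3'

# Prints the three nvidia* module names, one per line.
printf '%s\n' "$sample" | loaded_gpu_modules

# Live usage: lsmod | loaded_gpu_modules
```

Since the helper only reads standard input, we can try it on saved output before running it against a live system.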

3. Check Video Hardware

First, it’s usually best to verify what our current GPU is:

$ lspci -k | grep -A 2 -E '(3D|VGA)'
08:00.0 VGA compatible controller: NVIDIA Corporation GR666GL [GeForce GX 666] (rev a0)
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia

Here, we use the lspci command with its -k switch to also show any kernel drivers. Further, to ensure we only look at graphics hardware, we filter through grep, showing 2 lines [-A]fter each one matching the (3D|VGA) [-E]xtended regular expression.

In this case, we have an NVIDIA card with the respective drivers.
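If we want to reuse the slot address in later commands, we can extract it from the lspci output. This is only a sketch, assuming the slot is the first field of each matching line; gpu_slots is a hypothetical helper name, and the sample lines are made up:

```shell
# gpu_slots: hypothetical helper; reads lspci-style lines on stdin
# and prints the slot (first field) of each 3D or VGA controller.
gpu_slots() {
  awk '/3D|VGA/ { print $1 }'
}

# Made-up sample lines in the default lspci layout.
sample='08:00.0 VGA compatible controller: NVIDIA Corporation GR666GL [GeForce GX 666] (rev a0)
00:1f.3 Audio device: Example Audio Controller'

# Prints only the slot of the VGA controller.
printf '%s\n' "$sample" | gpu_slots

# Live usage: lspci | gpu_slots
```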

Let’s verify that via the NVIDIA-native nvidia-smi:

$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 535.129                Driver Version: 535.129                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GX 666 ...  Off  | 0000:08:00.0     Off |                  N/A |
| 23%   49C    P8    33W / 200W |  10666MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
[...]

Thus, we see the same graphics adapter at the same PCI bus slot that lspci showed.
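For scripts, the table layout is awkward to parse. As an alternative, nvidia-smi has a query mode with CSV output via its --query-gpu and --format options. The sketch below assumes the three fields index, name, and pci.bus_id in that order; gpu_bus_id is a hypothetical helper name, and the sample line is made up:

```shell
# gpu_bus_id: hypothetical helper; reads "index, name, bus_id" CSV lines
# on stdin and prints the bus ID of the GPU with the given index.
gpu_bus_id() {
  awk -F', ' -v idx="$1" '$1 == idx { print $3 }'
}

# Made-up sample line in the assumed CSV layout.
sample='0, GeForce GX 666, 00000000:08:00.0'

# Prints the bus ID field of GPU 0.
printf '%s\n' "$sample" | gpu_bus_id 0

# Live usage (needs the NVIDIA driver):
# nvidia-smi --query-gpu=index,name,pci.bus_id --format=csv,noheader | gpu_bus_id 0
```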

4. Set Drivers

There are several ways to use a given graphics driver or configure its features:

kernel boot parameters
module loading system
graphics server changes

In some cases, we may need all three.

For example, we can enable kernel mode setting through the modeset parameter of nvidia_drm on the boot command line and then verify the result:

$ cat /proc/cmdline
BOOT_IMAGE=/boot/vmlinuz-5.10.0-666-amd64 root=UUID=606660e8-7a07-4fc2-dead-27ed8425e7a0 nvidia-drm.modeset=1 ro quiet
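In a script, we might want to test for a boot parameter instead of reading the line by eye. Here is a minimal sketch that matches whole, space-separated words only; has_boot_param is a hypothetical helper name, and the sample command line is made up:

```shell
# has_boot_param: hypothetical helper; succeeds if the exact parameter ($1)
# appears as a whole word in the given kernel command line ($2).
has_boot_param() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Made-up sample command line.
sample='BOOT_IMAGE=/boot/vmlinuz root=UUID=example ro quiet'

if has_boot_param quiet "$sample"; then
  echo "quiet is set"
fi

# Live usage: has_boot_param quiet "$(cat /proc/cmdline)"
```

Matching whole words avoids false positives when one parameter name is a prefix of another.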

To change driver options, we can use module parameters as well. For that, we usually create a file under /etc/modprobe.d/:

$ cat /etc/modprobe.d/nvidia.conf
options nvidia <OPTIONS>

Here, we can include any NVIDIA driver module option. Notably, not all hardware supports all options.
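Before adding an option, it can help to check that the installed module actually lists it. The modinfo command prints the parameters of a module with its -p switch, one name:description pair per line. The sketch below only inspects the part before the first colon; module_has_param is a hypothetical helper name, and the sample lines are made up:

```shell
# module_has_param: hypothetical helper; reads "name:description" lines,
# as printed by modinfo -p, and succeeds if the given parameter is listed.
module_has_param() {
  awk -F: -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Made-up sample lines in the modinfo -p layout.
sample='example_option:Example description (int)
other_option:Another description (bool)'

if printf '%s\n' "$sample" | module_has_param example_option; then
  echo "example_option is supported"
fi

# Live usage: modinfo -p nvidia | module_has_param <OPTION>
```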

Finally, we can set the nvidia driver in Xorg via an /etc/X11/xorg.conf.d/ configuration file:

$ cat /etc/X11/xorg.conf.d/10-nvidia.conf
Section "Device"
        Identifier "NVIDIA Card"
        Driver "nvidia"
        VendorName "NVIDIA Corporation"
        BoardName "GeForce GX 666"
EndSection

Now, let’s understand how to enable and disable an NVIDIA card from the shell.

5. Enable and Disable NVIDIA Hardware

Especially in systems with more than one GPU, we might want to switch between different hardware for the rendering of applications. Further, we may choose to disable a given graphics card completely.

Let’s understand how to do the latter.

5.1. Remove Hardware

As with most other physical components, removing the graphics card from the motherboard is a fairly basic way to prevent it from being used.

The exact way to disconnect or reconnect a card depends on a number of factors. However, in most cases, we’d have to turn off the machine and open it up.

Naturally, this isn’t optimal, as it can lead to different issues:

downtime
warranty voiding
malfunction
damage

Further, we’d be unable to restore the function of the GPU without physical access to the machine. So, let’s look at software possibilities.

5.2. Configure BIOS

Commonly, the Basic Input/Output System (BIOS) or Unified Extensible Firmware Interface (UEFI) options include ways to control external hardware and peripherals.

Thus, we might be able to enable and disable any GPU from those interfaces.

Again, the exact way to do that depends on the manufacturer. Interface menus usually differentiate between internal (integrated) and external (discrete) graphics cards, different adapters, and PCI slots, as well as priority.

Let’s see some example categories:

Graphics Configuration
Graphics Device: Integrated Graphics, Discrete Graphics, NVIDIA Optimus
Integrated Graphics: Auto, Forced, Disabled
Internal Graphics: Auto, Disabled, Enabled
Primary Graphics Adapter: Internal, PCI, PCI-E
Onboard VGA: Onboard, Offboard
VGA priority: Auto, Onboard, Offboard

Importantly, some systems also have a switchable graphics setting, which employs the external adapter only when heavy graphics processing is necessary, while the integrated chip is used for most other needs.

Although the BIOS or UEFI method is fairly convenient, needing a reboot to disable, reenable, and prioritize a video card isn’t usually optimal.

5.3. Use Management Tools

After the system is fully started and the kernel takes over, only specialized management tools with privileged access can provide ways to enable or disable the GPU.

In practice, NVIDIA provides the aforementioned nvidia-smi tool, short for NVIDIA System Management Interface (SMI), as a command-line utility built on top of the NVIDIA Management Library (NVML).

So, to fully disable or enable a given NVIDIA graphics adapter via nvidia-smi, we follow three steps:

check current adapters
note down slot numbers
disable or enable slot by toggling modes

Let’s see this process in practice.

First, we check the currently available devices via one of the two means we already discussed:

$ lspci -k | grep -A 2 -E '(3D|VGA)'
08:00.0 VGA compatible controller: NVIDIA Corporation GR666GL [GeForce GX 666] (rev a0)
        Kernel driver in use: nvidia
        Kernel modules: nvidiafb, nouveau, nvidia

In this case, we target the device that nvidia-smi reported with the Bus-Id 0000:08:00.0. So, we use nvidia-smi to disable the GPU on that slot:

$ nvidia-smi --id 0000:xx:00.0 --persistence-mode 0
$ nvidia-smi drain --pciid 0000:xx:00.0 --modify 1
$ nvidia-smi --persistence-mode 1

In the first two commands, we replace xx with the bus number of our card, 08.

Essentially, this sequence performs several actions:

disables --persistence-mode (-pm) for our specific GPU as identified by --id (-i), which accepts an index, UUID, PCI bus ID, or serial number
enables drain via --modify 1 (when persistence mode is off) for the same GPU as identified by --pciid (-p), which only accepts the XXXX:YY:ZZ.a domain:bus:device.function format
ensures all other graphics controllers are in --persistence-mode

Thus, our drained card doesn’t show up or get activated. In general, persistence mode ensures the NVIDIA driver is loaded all the time, not only when requested. On the other hand, drain prevents the GPU from accepting new client applications and is usually employed before turning off the card. Importantly, both of these options are only available on Linux.

At this point, the target device should only be visible via tools like lspci.
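By default, lspci prints addresses in the short bus:device.function form, while the drain subcommand expects the domain as well. The lspci -D switch always shows the domain. Alternatively, a minimal sketch such as the following can add it, assuming the default domain 0000; full_pci_id is a hypothetical helper name:

```shell
# full_pci_id: hypothetical helper; prefixes the PCI domain 0000 when the
# address only has the short bus:device.function form.
full_pci_id() {
  case "$1" in
    *:*:*) printf '%s\n' "$1" ;;       # domain already present
    *)     printf '0000:%s\n' "$1" ;;  # assumption: default domain 0000
  esac
}

full_pci_id 08:00.0   # prints 0000:08:00.0
```

On machines with more than one PCI domain, this assumption doesn't hold, so reading the address from lspci -D is the safer choice there.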

To restore visibility and functionality, we just disable drain mode:

$ nvidia-smi drain --pciid 0000:xx:00.0 --modify 0

Critically, root or sudo privileges are usually required. Further, this may crash processes that are using the GPU.
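Since these commands can disrupt running workloads, it can be useful to preview them first. The sketch below only prints the nvidia-smi invocations from this section for a given PCI bus ID and doesn't run anything; gpu_toggle_commands is a hypothetical helper name of ours:

```shell
# gpu_toggle_commands: hypothetical dry-run helper; prints (does not run)
# the nvidia-smi commands from this section for a PCI bus ID ($1)
# and an action ($2: disable or enable).
gpu_toggle_commands() {
  case "$2" in
    disable)
      printf 'nvidia-smi --id %s --persistence-mode 0\n' "$1"
      printf 'nvidia-smi drain --pciid %s --modify 1\n' "$1"
      printf 'nvidia-smi --persistence-mode 1\n'
      ;;
    enable)
      printf 'nvidia-smi drain --pciid %s --modify 0\n' "$1"
      ;;
    *)
      echo "usage: gpu_toggle_commands <pci-id> disable|enable" >&2
      return 1
      ;;
  esac
}

# Preview the disable sequence for the card from our example.
gpu_toggle_commands 0000:08:00.0 disable
```

After reviewing the printed lines, we can run them manually with root privileges.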

6. Summary

In this article, we explored the management of NVIDIA graphics controllers in a Linux system. Specifically, we checked out different drivers and ways to disable and reenable a particular GPU.

In conclusion, although we can enumerate devices in a system via different means, the best way to configure them often involves the included toolset.
