1. Introduction

The file system cache holds data recently read from secondary storage in main memory. This makes it possible for subsequent requests to obtain the data from the cache instead of reading it again from the slower storage device.

In this tutorial, we’ll learn how to configure file system caching on a Linux system.

2. File System Caching in Linux

File system caching in Linux is a mechanism that allows the kernel to store frequently accessed data in memory for faster access. The kernel uses the page cache to store recently read file data.

For instance, when a program reads data from a file, the kernel performs several tasks:

  1. checks the page cache to see if the data is already in memory
  2. if the data is in memory, the kernel simply returns the data from the cache
  3. otherwise, it reads the data from the drive and stores a copy of it in the cache for future use

In addition, the kernel uses the dentry cache to store directory entries, which map path names to inodes, and the inode cache to store metadata about files and directories.

Hence, the page cache holds file contents, while the dentry and inode caches hold information about file system objects.

The kernel manages these caches with a Least Recently Used (LRU) style policy. In other words, when memory runs short and there’s more data to cache, the kernel evicts the least recently used data first to make room for the new data.
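As a toy model of this eviction policy, the sketch below keeps a list of keys ordered from least to most recently used. It is not the kernel's implementation, which tracks pages on active and inactive lists; the function name lru_access and its argument order are assumptions made for illustration, and the capacity is assumed to be at least 1.

```shell
# lru_access CAPACITY KEY [ENTRY...]
# ENTRY... is the current cache, least recently used first.
# Prints the cache contents after KEY has been accessed.
lru_access() {
    capacity=$1
    key=$2
    shift 2

    # Drop KEY from its old position so it can move to the most recent slot.
    kept=""
    for entry in "$@"; do
        if [ "$entry" != "$key" ]; then
            kept="$kept $entry"
        fi
    done

    # Evict from the front (least recently used) until there is room for KEY.
    set -- $kept
    while [ "$#" -ge "$capacity" ]; do
        shift
    done

    # Append KEY as the most recently used entry.
    set -- "$@" "$key"
    echo "$*"
}
```

For example, with a capacity of 3 and a cache holding a, b, and c, accessing d evicts a, while accessing a again only moves it to the most recent position.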

Let’s now proceed to check the file system cache.

3. Checking the Cache

The vmstat command provides detailed information about virtual memory. In particular, it shows the amount of memory in use for caching:

$ vmstat
procs  -----------memory----------   ---swap---    -----io----   --system--    ------cpu-----
r   b  swpd     free   buff  cache   si     so      bi     bo     in    cs    us sy id wa st
0   0     0  6130448 11032 589532    0       0      422    52     160   362    3  3 76 18  0

The cache column shows the amount of memory used for file system caching in kilobytes. In addition, to get more details using the vmstat command, we can use the -s flag:

$ vmstat -s
8016140 K total memory
1282340 K used memory
 207744 K active memory
 711356 K inactive memory
6133536 K free memory
  11032 K buffer memory
 589232 K swap cache
2097148 K total swap
      0 K used swap
2097148 K free swap
   3458 non-nice user cpu ticks
    389 nice user cpu ticks
   3371 system cpu ticks
  60823 idle cpu ticks
  20782 IO-wait cpu ticks
      0 IRQ cpu ticks
     34 softirq cpu ticks
      0 stolen cpu ticks
 494275 pages paged in
  56168 pages paged out
      0 pages swapped in
      0 pages swapped out
 170063 interrupts
 384058 CPU context switches
1673971944 boot time
   5151 forks

Alternatively, we can use the free command to check the amount of file system cache memory in the system. It shows the memory usage in kilobytes under the buff/cache column:

$ free
                  total        used           free      shared      buff/cache      available
Mem:            8016140     1284652        6130952      144680          600536       6353032
Swap:           2097148           0        2097148

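The -m flag switches the output values to mebibytes. Notably, the value of the buff/cache column is approximately the sum of the buffer memory and swap cache rows of vmstat -s (11032 + 589232 = 600264 K in the outputs above, versus 600536 K reported by free). Despite its label, the swap cache row reports the memory used by the page cache, which is why it’s non-zero even though no swap is in use.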
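To reproduce that sum in a script, one option is to filter the vmstat -s rows by their labels. This is a minimal sketch: buff_cache_kb is a hypothetical helper name, and the row labels are assumed to match the English output shown above.

```shell
# Read `vmstat -s` output on stdin and print the sum, in KiB,
# of the "buffer memory" and "swap cache" rows.
buff_cache_kb() {
    awk '/K buffer memory/ || /K swap cache/ { sum += $1 }
         END { print sum + 0 }'
}

# Typical use on a live system:
#   vmstat -s | buff_cache_kb
```

Fed with the vmstat -s output from this section, the helper prints 600264.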

Next, let’s configure the file system cache for our system.

4. Configuring File System Cache

In general, we can use the sysctl command to configure the file system cache in Linux. The sysctl command reads and modifies kernel parameters at runtime. It can also load settings from the /etc/sysctl.conf file, which holds system-wide kernel parameters that are applied at boot.

4.1. Setting vfs_cache_pressure

With sysctl, we can also set the value of vm.vfs_cache_pressure, which controls the tendency of the kernel to reclaim the memory used for caching directory and inode objects:

$ sudo sysctl -w vm.vfs_cache_pressure=50
vm.vfs_cache_pressure = 50

Here, we set the vfs_cache_pressure value to 50 via the -w switch of sysctl. Since 50 is below the default of 100, the kernel will prefer to retain the inode and dentry caches rather than reclaim them. This can help improve performance on systems with a large number of files.

Notably, a value above 100 makes the kernel prefer to reclaim inodes and dentries over page cache memory. On the other hand, a lower value makes it prefer to reclaim page cache memory over inodes and dentries. Hence, we can adjust the value according to our workload.
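The effect of a given value can be summarized in a small helper. This is only a sketch of the documented behavior, with 100 as the kernel default; describe_cache_pressure is a hypothetical name, and the messages are paraphrases rather than kernel output.

```shell
# Print a short description of what a vm.vfs_cache_pressure value means.
describe_cache_pressure() {
    value=$1
    if [ "$value" -eq 0 ]; then
        # The kernel never reclaims dentries and inodes under memory pressure.
        echo "never reclaim dentries and inodes (risk of running out of memory)"
    elif [ "$value" -lt 100 ]; then
        echo "prefer to retain dentry and inode caches"
    elif [ "$value" -eq 100 ]; then
        echo "default: reclaim at a fair rate relative to the page cache"
    else
        echo "prefer to reclaim dentries and inodes"
    fi
}

# Typical use on a live system:
#   describe_cache_pressure "$(sysctl -n vm.vfs_cache_pressure)"
```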

Next, let’s set the value of swappiness.

4.2. Configuring Swappiness

Swappiness controls how aggressively the kernel moves memory pages out to the swap area. Lowering the value makes the kernel less likely to swap out infrequently used pages, so they’re more likely to stay in RAM for faster access.

Further, we can again use sysctl to set the vm.swappiness parameter:

$ sudo sysctl -w vm.swappiness=10
vm.swappiness = 10

Here, the command sets the value of vm.swappiness to 10. Again, lower values make the kernel prefer to keep more data in RAM. Conversely, higher values make the kernel swap more.

Next, let’s see other parameters we can set to configure the cache.

5. File System Cache Optimization

To optimize file system caching, we can modify several parameters:

  • vm.dirty_background_ratio
  • vm.dirty_background_bytes
  • vm.dirty_ratio
  • vm.dirty_bytes
  • vm.dirty_writeback_centisecs
  • vm.dirty_expire_centisecs

These parameters control how much dirty data can accumulate in memory, and for how long, before the kernel writes it to storage. Importantly, dirty pages are memory pages that have been modified in the cache but aren’t written to secondary storage yet.

Let’s see the dirty_* variables on our system using the sysctl command:

$ sysctl -a | grep dirty
vm.dirty_background_ratio = 10
vm.dirty_background_bytes = 0
vm.dirty_ratio = 20
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_writeback_centisecs = 500

Here, the -a option displays all the variables we can set along with their values. Then the grep command filters all the vm.dirty_* variables.

Let’s start with a summary of how these parameters work.

5.1. vm.dirty_background_ratio

The vm.dirty_background_ratio parameter is the amount of system memory, as a percentage, that can be filled with dirty pages before the kernel starts writing them to the drive in the background. For instance, if we set vm.dirty_background_ratio to 10 on a system with 64GB of RAM, roughly 6.4GB of dirty pages can stay in RAM before background writeback begins.
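The arithmetic behind this example can be sketched as follows. The helper name ratio_to_kb is hypothetical, and using total memory as the base is a simplification: the kernel applies the percentage to available memory (free plus reclaimable pages), so the real threshold is usually somewhat lower.

```shell
# ratio_to_kb TOTAL_KB RATIO
# Print RATIO percent of TOTAL_KB, in KiB, using integer arithmetic.
ratio_to_kb() {
    total_kb=$1
    ratio=$2
    echo $(( total_kb * ratio / 100 ))
}

# Typical use on a live system, reading MemTotal from /proc/meminfo:
#   ratio_to_kb "$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)" 10
```

For the 8016140 K of total memory reported earlier, a ratio of 10 corresponds to about 801614 K of dirty pages.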

Now, let’s configure the value of vm.dirty_background_ratio for our system:

$ sudo sysctl -w vm.dirty_background_ratio=10
vm.dirty_background_ratio = 10

Alternatively, we can set the vm.dirty_background_bytes variable in place of vm.dirty_background_ratio. The *_bytes version takes the amount of memory in bytes. For example, we can set the amount of memory for dirty background caching to 512MB:

$ sudo sysctl -w vm.dirty_background_bytes=536870912

However, the *_ratio variant will become 0 if we set the *_bytes variant, and vice versa.
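Since the *_bytes variables expect a raw byte count, it can help to compute the value rather than type it by hand. The helper below is a small sketch; mib_to_bytes is a hypothetical name, and it treats 1MB as 1024 * 1024 bytes.

```shell
# mib_to_bytes MIB
# Convert a size in mebibytes to bytes.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Typical use on a live system:
#   sudo sysctl -w vm.dirty_background_bytes="$(mib_to_bytes 512)"
```

For 512MB, the helper prints 536870912.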

Next, let’s look at vm.dirty_ratio.

5.2. vm.dirty_ratio

Specifically, vm.dirty_ratio is the absolute maximum amount of system memory, as a percentage, that can be filled with dirty pages. Once this limit is reached, processes that generate new writes are blocked and must wait until dirty pages are written to storage.

Notably, vm.dirty_bytes turns to 0 when we set a percentage for vm.dirty_ratio. To illustrate, let’s define the value for vm.dirty_ratio:

$ sudo sysctl -w vm.dirty_ratio=20
vm.dirty_ratio = 20

Similarly, vm.dirty_ratio will become 0 if we configure a value for vm.dirty_bytes.
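To see how the background and maximum thresholds relate, the sketch below classifies a given amount of dirty memory against both. It is a simplified model: writeback_state is a hypothetical helper, the state names are made up for illustration, and the percentages are applied to total memory rather than to the available memory the kernel actually uses.

```shell
# writeback_state DIRTY_KB TOTAL_KB BACKGROUND_RATIO DIRTY_RATIO
# Print which writeback regime the given amount of dirty memory falls into.
writeback_state() {
    dirty_kb=$1
    total_kb=$2
    background_ratio=$3
    dirty_ratio=$4

    background_kb=$(( total_kb * background_ratio / 100 ))
    limit_kb=$(( total_kb * dirty_ratio / 100 ))

    if [ "$dirty_kb" -ge "$limit_kb" ]; then
        # At or above vm.dirty_ratio: writing processes have to wait.
        echo "writers-throttled"
    elif [ "$dirty_kb" -ge "$background_kb" ]; then
        # At or above vm.dirty_background_ratio: background flushing runs.
        echo "background-writeback"
    else
        echo "below-thresholds"
    fi
}
```

With 8016140 K of memory and ratios of 10 and 20, the two thresholds sit at roughly 801614 K and 1603228 K of dirty pages.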

5.3. The *_centisecs variables

Of course, data cached in the system memory is at risk of loss in case of a power outage. Hence, to limit the window for data loss, the following variables dictate how long dirty data may stay in memory and how often the kernel checks for data to write to secondary storage:

  • vm.dirty_expire_centisecs
  • vm.dirty_writeback_centisecs

The vm.dirty_expire_centisecs variable manages how long dirty data can stay in the cache before it’s written to the drive. Let’s set the variable so that data can stay for 40 seconds in the cache:

$ sudo sysctl -w vm.dirty_expire_centisecs=4000
vm.dirty_expire_centisecs = 4000

In this case, dirty data can stay in the cache for up to 40 seconds before it becomes eligible to be written to the drive. Notably, 1s equals 100 centisecs.
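The unit conversion is simple enough to wrap in two helpers. Both names are hypothetical, and the reverse conversion truncates to whole seconds because it uses integer division.

```shell
# Convert whole seconds to centiseconds (1 s = 100 centisecs).
secs_to_centisecs() {
    echo $(( $1 * 100 ))
}

# Convert centiseconds back to whole seconds (fractions are truncated).
centisecs_to_secs() {
    echo $(( $1 / 100 ))
}

# Typical use on a live system:
#   sudo sysctl -w vm.dirty_expire_centisecs="$(secs_to_centisecs 40)"
```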

Further, vm.dirty_writeback_centisecs sets how often the kernel flusher threads wake up to check whether there’s dirty data to write to secondary storage. Thus, the lower the value, the higher the frequency, and vice versa.

Let’s configure vm.dirty_writeback_centisecs to check the cache every 5 seconds:

$ sudo sysctl -w vm.dirty_writeback_centisecs=500
vm.dirty_writeback_centisecs = 500

Again, the 500 centisecs value is equal to 5 seconds. Next, let’s make our configurations permanent.

6. Modifying the /etc/sysctl.conf File

Having set up the file system caching configurations at runtime, we’ll want to make these changes persistent. To do so, let’s add all the changes in the /etc/sysctl.conf file. The system reads this file during the boot process.

Now, let’s open /etc/sysctl.conf in an editor and add the earlier configurations to it:

vm.vfs_cache_pressure=50
vm.swappiness=10
vm.dirty_ratio=20
vm.dirty_background_ratio=10
vm.dirty_expire_centisecs=4000
vm.dirty_writeback_centisecs=500

Again, the vm.*_ratio variables will become 0 if we set the value for the vm.*_bytes variables, and vice versa.
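Because only one form of each setting can be in effect, a configuration that assigns non-zero values to both the ratio and the bytes variant of the same setting is probably a mistake. The sketch below flags such cases in a key=value file; find_dirty_conflicts is a hypothetical helper, not part of sysctl.

```shell
# Read sysctl-style "key = value" lines on stdin and print the base name of
# every dirty-page setting given a non-zero value in both _ratio and _bytes form.
find_dirty_conflicts() {
    awk -F= '
        {
            key = $1; value = $2
            gsub(/[ \t]/, "", key)
            gsub(/[ \t]/, "", value)
        }
        key ~ /^vm\.dirty(_background)?_ratio$/ && value + 0 != 0 {
            base = key; sub(/_ratio$/, "", base); ratio[base] = 1
        }
        key ~ /^vm\.dirty(_background)?_bytes$/ && value + 0 != 0 {
            base = key; sub(/_bytes$/, "", base); bytes[base] = 1
        }
        END {
            for (base in ratio)
                if (base in bytes)
                    print base
        }
    ' | sort
}

# Typical use on a live system:
#   find_dirty_conflicts < /etc/sysctl.conf
```

The configuration listed in this section produces no output, since it only uses the ratio variants.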

Further, to apply the changes in the /etc/sysctl.conf file without reboot, we can use the -p switch of sysctl:

$ sudo sysctl -p

Lastly, we can list all settings via sysctl -a, or verify a single one by passing the variable name to sysctl:

$ sysctl vm.vfs_cache_pressure
vm.vfs_cache_pressure = 50

Alternatively, we can use the cat command with the full path to the variable under /proc/sys:

$ cat /proc/sys/vm/vfs_cache_pressure
50

Again, we can apply these commands to any parameter to confirm our settings.
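The mapping between a variable name and its file is mechanical: the dots in the name become slashes under /proc/sys. A minimal sketch of that mapping follows; sysctl_key_to_path is a hypothetical helper, and it assumes names such as the vm.* variables used here, where no component itself contains a dot.

```shell
# sysctl_key_to_path KEY
# Print the /proc/sys path that corresponds to a sysctl variable name.
sysctl_key_to_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Typical use on a live system:
#   cat "$(sysctl_key_to_path vm.swappiness)"
```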

7. Conclusion

In this article, we’ve learned how to configure file system caching on a Linux system. We saw how to use and set variables for cache optimization. Also, we set up the changes in the /etc/sysctl.conf file to be persistent on the system.
