Authors Top

If you have a few years of experience in the Linux ecosystem, and you’re interested in sharing that experience with the community, have a look at our Contribution Guidelines.

1. Introduction

Linux memory management is built around the idea that unused memory is wasted memory. So, processes aren't the only consumers of RAM: the kernel is just as eager to use it to cache file data, since accessing files in memory is much faster than reading them from disk.

In this tutorial, we’ll learn about managing and monitoring the Linux page cache.

2. The Page Cache

The kernel uses the page cache to speed up subsequent file accesses. When writing, the new file content is first stored in memory and transferred to the underlying storage device some time later. This way, the system can collect further modifications and write them all out in one I/O operation. Similarly, when reading, the kernel keeps once-read file content around for future use.

These pages stay in memory until the kernel needs it for something else or we clear the cache explicitly.

2.1. Getting Page Cache Information

We have a bunch of tools to report the page cache size. First, let’s use the free command:

$ free -h --si
               total        used        free      shared  buff/cache   available
Mem:             15G        3,0G          9G        457M        2,9G         12G
Swap:            15G        1,0M         15G

So, the buff/cache column tells us that buffers and the page cache together use 2,9 GB of memory — the same information appears in the top output:

MiB Mem :  15888,2 total,   9885,4 free,   3074,2 used,   2928,6 buff/cache

Now, let’s look for the size of the page cache alone in the /proc/meminfo file:

$ grep ^Cached /proc/meminfo
Cached:          2942600 kB
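To put that number in context, we can compute the page cache's share of total memory by parsing /proc/meminfo with awk — a quick sketch that assumes a Linux system:

```shell
# Print Cached as a percentage of MemTotal (values in /proc/meminfo are in kB)
awk '/^MemTotal:/ { total = $2 }
     /^Cached:/   { cached = $2 }
     END { printf "page cache: %.1f%% of RAM\n", cached / total * 100 }' /proc/meminfo
```

Note that the `^Cached:` anchor deliberately skips the separate SwapCached field.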

3. Dropping the Caches

We can drop the page cache by writing an appropriate number to the /proc/sys/vm/drop_caches file. So, with 1 (one), we’re going to ask the kernel to drop the page cache:

$ echo 1 | sudo tee /proc/sys/vm/drop_caches > /dev/null

Next, writing 2 frees the reclaimable dentries and inodes:

$ echo 2 | sudo tee /proc/sys/vm/drop_caches > /dev/null

Finally, passing 3 results in emptying everything — page cache, cached dentries, and inodes:

$ echo 3 | sudo tee /proc/sys/vm/drop_caches > /dev/null

3.1. When the Dropping Fails: Dirty Pages

Notice that the kernel won't drop dirty pages — those holding new or modified file content that hasn't reached the disk yet. Normally, the kernel writes them back on a schedule or under memory pressure. Therefore, we need to flush such content manually before dropping the cache, so let's run the sync command first:

$ sudo bash -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
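We can watch this at work: /proc/meminfo reports the amount of dirty data awaiting write-back, and after sync the counter should fall close to zero. A small check, assuming a Linux system:

```shell
# Check how much modified page cache data (in kB) awaits write-back,
# flush it with sync, then check again
grep ^Dirty: /proc/meminfo
sync
grep ^Dirty: /proc/meminfo
```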

3.2. When the Dropping Fails: Locked Content

The system allows processes to lock file content in memory, so we can't just drop the corresponding pages. As an example, let's use the vmtouch command to cache the test.dat file and lock it there:

$ vmtouch -tl test.dat
LOCKED 492712 pages (1G)

Next, let’s try to drop the caches, just to find out that our efforts have failed:

$ grep ^Cached /proc/meminfo &&
sudo bash -c 'sync; echo 3 > /proc/sys/vm/drop_caches' &&
grep ^Cached /proc/meminfo
Cached:          3336092 kB
Cached:          3169224 kB

So, the difference of roughly 160 MB is obviously too small to account for freeing around 2 GB of locked file content. Now, our only solution is to find the PID of the lock-holding process and terminate it (or have it release the lock) beforehand.
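Instead of guessing, we can locate lock-holding processes by scanning the VmLck field in each process's /proc status file — a sketch that assumes a Linux /proc layout:

```shell
# List processes holding locked (unevictable) memory; VmLck is reported in kB
for status in /proc/[0-9]*/status; do
  # cat guards against processes that exit while we iterate
  cat "$status" 2>/dev/null |
    awk -v f="$status" '/^Name:/  { name = $2 }
                        /^VmLck:/ && $2 > 0 { print f ": " name " has " $2 " kB locked" }'
done
```

On most systems, the output is empty or very short, which makes the culprit easy to spot.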

Finally, notice that the memlock user limit specifies the maximum amount of memory a process can lock. So, let's check it:

$ ulimit -Ha | grep locked
max locked memory           (kbytes, -l) 2033684

4. Tracking Kernel Events

Now, let's take a closer look at the file caching process. To do so, let's track the related kernel events with perf, the powerful Linux profiling tool from the linux-tools-common package:

$ sudo perf record -a -e filemap:mm_filemap_add_to_page_cache,filemap:mm_filemap_delete_from_page_cache sleep 120

So, we focus on two events: mm_filemap_add_to_page_cache, emitted when file content is read into memory, and mm_filemap_delete_from_page_cache, emitted when it's evicted. Finally, we record events for 120 seconds, using the sleep command to set the duration.

Once perf has started recording, let's cache the test.dat file by reading it with cat:

$ cat test.dat > /dev/null

Afterward, let’s inspect the log with:

$ sudo perf script
# ...             
             cat  8423 [002] 13771.287206:      filemap:mm_filemap_add_to_page_cache: dev 8:4 ino 8e0042 page=0x228eb7 pfn=0x228eb7 ofs=0
             cat  8423 [002] 13771.287208:      filemap:mm_filemap_add_to_page_cache: dev 8:4 ino 8e0042 page=0x24defb pfn=0x24defb ofs=4096
# ...

So, we've learned that the cat command with PID 8423 caused caching of the file with inode 8e0042. Each log entry also shows the memory page used and the offset (ofs) within the file.

5. Case Study

Now, let's design a short study of how the kernel handles the page cache under memory pressure. So, let's prepare a file, big_file, with a size comparable to the computer's RAM:

$ ls -hs big_file
12G big_file

Next, let's load the file into memory, again by reading it with cat. Then, let's check the result with fincore:

$ fincore big_file
  RES   PAGES  SIZE FILE
11,3G 2956272 11,3G big_file

With the file content in memory, let's add some memory stress with the stress-ng command. We're going to dispatch two hogs, each consuming 4 GB of RAM, for 60 seconds:

$ stress-ng --vm 2 --vm-bytes 4G --timeout 60s

Let’s keep in mind that we need to start the perf profiler before the hogs run to register the kernel events.

When stress-ng finishes, let’s check the results. First, we’re going to find out how much of the big_file resides in memory now:

$ fincore big_file
  RES   PAGES  SIZE FILE
 7,7G 2012534 11,3G big_file

So, we see that the kernel has evicted around 30% of the file.
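We can double-check that figure with a quick calculation on the fincore page counts (2012534 resident pages out of the 2956272 originally cached):

```shell
# Fraction of big_file's pages evicted from the page cache
awk 'BEGIN { printf "evicted: %.0f%%\n", (1 - 2012534 / 2956272) * 100 }'
```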

Next, let's examine the perf log to track down the corresponding events:

# ...
         kswapd0   105 [001]  3146.624312: filemap:mm_filemap_delete_from_page_cache: dev 8:4 ino 8e0067 page=0x25ddfb pfn=0x25ddfb ofs=344064
         kswapd0   105 [001]  3146.624313: filemap:mm_filemap_delete_from_page_cache: dev 8:4 ino 8e0067 page=0x2e75a3 pfn=0x2e75a3 ofs=348160
# ...

So, we see the delete events for inode 8e0067. Let's convert this hexadecimal value to decimal (9306215) and make sure it's big_file's inode:

$ ls -i big_file
9306215 big_file
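The hexadecimal-to-decimal conversion itself is a shell one-liner, since printf accepts C-style integer constants:

```shell
# Convert the inode number reported by perf (hex) to decimal
printf '%d\n' 0x8e0067
```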

Finally, keep in mind that the kernel assigns the task of reclaiming the page cache to kswapd0, its page-reclaim daemon.

6. Why Drop Caches?

When facing a memory shortage, dropping the cache is a tempting workaround. However, let's discuss whether it's the right measure. First, we observed in our case study that the kernel is quite eager to reclaim cached pages on its own. So, we shouldn't treat a growing cache as a threat, as this memory is still available to applications.

On the other hand, we can't drop dirty or locked pages without additional steps anyway. Thus, we need to synchronize the cache first or even eliminate misbehaving processes. In other words, we should regard a persistently full cache as a symptom of a misconfigured system or application.

Consequently, we shouldn’t view dropping the cache as a normal system administration routine.

7. Conclusion

In this article, we learned about the Linux page cache. First, we briefly clarified the idea of using a memory area to store files. Then, we found out how to monitor the cache size.

We also dropped the cache manually and discussed some limitations to this process. Next, we used kernel events to track the caching process. Then we put that all together as we studied the kernel’s activity in the face of memory shortage.

Finally, we discussed the possible reasons behind manual cache dropping.
