1. Overview

We often need to check or monitor the resource usage of running processes from the Linux command line. Two handy commands for this job are ps and top.

Both commands normally report on many processes at once: top lists every process on the system by default, and ps can do the same with options such as -e. However, sometimes we would like to periodically log the resource usage, such as the CPU usage, of one specific process.

This tutorial will discuss the best way to do that.

2. Introduction to the Problem

Let’s say we want to write the CPU usage of a specific process every two seconds into a log file so that we can analyze the data later.

We will build our solutions based on the ps and top commands.

To make our test easier, let’s take the mpv media player and play an HD movie to simulate a long-running target process whose CPU load fluctuates.

3. CPU Usage: ps vs. top

We know that both the ps and top commands can report the CPU usage of processes.

Before we look at how to get the CPU usage of a single process using these commands, we should first understand how the two commands calculate their CPU usage data.

This is important because their CPU usage values have different meanings.

3.1. Understanding CPU Usage With the ps Command

Let’s take a look at the ps command first. The ps command reports a snapshot status of current processes.

However, its CPU usage value isn’t the instantaneous usage at the moment we run the command. Instead, ps reports the CPU time the process has used divided by the time the process has been running, expressed as a percentage.

So, in other words, it’s the average CPU usage of a process over the time it has been running.
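To make the ratio concrete, here is a minimal sketch of the same arithmetic. The function name lifetime_cpu_pct and the sample numbers are made up for illustration; this isn't how ps is implemented, only the calculation described above:

```shell
#!/bin/bash

# lifetime_cpu_pct CPU_SECONDS ELAPSED_SECONDS
# Hypothetical helper: prints the CPU time used as a percentage of the
# time the process has been running, with one decimal place.
lifetime_cpu_pct() {
    awk -v cpu="$1" -v elapsed="$2" 'BEGIN {
        # Avoid dividing by zero for a process that has only just started
        if (elapsed <= 0) { print "0.0"; exit }
        printf "%.1f\n", cpu * 100 / elapsed
    }'
}

# A process that has used 30 s of CPU time over a 120 s lifetime
lifetime_cpu_pct 30 120    # prints 25.0
```

A busy burst late in a long lifetime barely moves this number, which is why the ps value reacts so slowly. Assuming a procps-ng version of ps that supports the cputimes and etimes output fields, the two inputs for a real process could be read with "ps -o cputimes= -o etimes= -p PID".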

Next, let’s take a look at how the top command calculates its CPU usage value.

3.2. Understanding CPU Usage With the top Command

Unlike the ps command, the top command shows detailed process information in an interactive user interface. In addition, it refreshes the result at an interval that we can define using the -d option.

It calculates the CPU usage value in a different way from the ps command.

The top command reports a process’s share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time.

For example, suppose we set two seconds as the refresh interval, and the CPU usage reports 50% after a refresh. The usage value “50%” means in the last two seconds, the accumulated CPU running time for the process is one second.
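The interval-based calculation can be sketched in the same way. The two helpers below, cpu_ticks and interval_cpu_pct, are hypothetical names introduced for illustration; they mimic the ratio described above by reading the cumulative CPU time from /proc/PID/stat on Linux, and they aren't top’s actual implementation:

```shell
#!/bin/bash

# cpu_ticks PID
# Total CPU time (user + system) the process has used so far, in clock
# ticks, read from /proc/PID/stat (Linux only).
cpu_ticks() {
    local stat
    stat=$(cat "/proc/$1/stat") || return 1
    # Drop "pid (comm) " so a command name with spaces cannot shift fields
    stat=${stat##*) }
    set -- $stat
    # With pid and comm removed, utime and stime are fields 12 and 13
    echo $(( ${12} + ${13} ))
}

# interval_cpu_pct TICKS_BEFORE TICKS_AFTER INTERVAL_SECONDS TICKS_PER_SECOND
# CPU time used during the interval as a percentage of the interval length.
interval_cpu_pct() {
    awk -v a="$1" -v b="$2" -v secs="$3" -v hz="$4" \
        'BEGIN { printf "%.1f\n", (b - a) / hz / secs * 100 }'
}

# The example from the text: one second of CPU time (100 ticks at
# 100 ticks per second) within a two-second interval
interval_cpu_pct 1000 1100 2 100    # prints 50.0
```

For a live process, we could call cpu_ticks with the PID, sleep for two seconds, call it again, and pass both readings to interval_cpu_pct together with the value of "getconf CLK_TCK".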

Despite both ps and top reporting the CPU usage value in a percentage form, the meanings are different. Therefore, we should choose the right tool depending on the requirement. Otherwise, our analysis based on the logged data may go in the wrong direction.

4. Writing the CPU Usage to a Log File Every Two Seconds

Now, we’ll create ps- and top-based shell scripts that write the CPU usage of a given process into a log file.

As we’ve said earlier, we’ll take the running mpv process as an example. Also, we’ll skip the argument check and error handling parts in our example scripts.
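For reference, the skipped argument check could look roughly like the sketch below. The function name check_args and the messages are assumptions made for illustration and don’t appear in the scripts that follow:

```shell
#!/bin/bash

# check_args PROCESS_NAME LOG_FILE
# Hypothetical helper: returns non-zero unless exactly two arguments are
# given and a process with the given name is currently running.
check_args() {
    if [ "$#" -ne 2 ]; then
        echo "Usage: $0 PROCESS_NAME LOG_FILE" >&2
        return 1
    fi
    if ! pidof "$1" > /dev/null; then
        echo "No running process named $1" >&2
        return 1
    fi
}

# Intended use at the top of a script:
#   check_args "$@" || exit 1
```

Checking for the process up front means the logging loop never starts with an empty PID.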

4.1. Building a Shell Script Based on the ps Command

We can use the command “ps -C PROCESS_NAME -o %cpu” to retrieve the CPU usage of the given process, for example:

$ ps -C mpv -o %cpu
%CPU
63.8

We can build a simple shell script to write the CPU usage to a log file every two seconds:

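$ cat ./cpu_usage_ps.sh
#!/bin/bash

PNAME="$1"      # name of the process to watch, e.g. mpv
LOG_FILE="$2"   # file that the samples are appended to

while true ; do
    # timestamp :: name[PID] %CPU value reported by ps
    echo "$(date) :: $PNAME[$(pidof ${PNAME})] $(ps -C ${PNAME} -o %cpu | tail -1)%" >> "$LOG_FILE"
    sleep 2
done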

The script above expects two arguments: a process name and a log file path.

We wrap the logging command in a “while true” loop. Therefore, when we start the script, it’ll keep writing the CPU usage of the given process to the log file until we terminate it manually.

Also, we write the current time “$(date)”, process name and PID “$PNAME[$(pidof ${PNAME})]” together with the CPU usage data to the log file.

Next, let’s start the script:

$ ./cpu_usage_ps.sh mpv /tmp/log_ps.txt

Then, we can monitor changes to the log file using the tail -f command:

$ tail -f /tmp/log_ps.txt
Fri Sep  3 10:05:08 PM CEST 2021 :: mpv[215406] 40.9%
Fri Sep  3 10:05:10 PM CEST 2021 :: mpv[215406] 40.9%
Fri Sep  3 10:05:12 PM CEST 2021 :: mpv[215406] 41.0%
Fri Sep  3 10:05:14 PM CEST 2021 :: mpv[215406] 41.1%
Fri Sep  3 10:05:16 PM CEST 2021 :: mpv[215406] 41.2%
Fri Sep  3 10:05:18 PM CEST 2021 :: mpv[215406] 41.2%
...

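As the output shows, our script writes the CPU usage of the mpv process to the specified log file every two seconds. The values change only slowly because, as we saw earlier, ps reports the average over the whole lifetime of the process.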
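One caveat of the script above: if several processes share the same name, pidof prints all of their PIDs, while tail -1 keeps only the last %CPU value. A minimal sketch of one way to log each PID on its own line follows; the function name format_ps_samples is hypothetical, and it assumes its input is a list of “PID %CPU” pairs:

```shell
#!/bin/bash

# format_ps_samples NAME TIMESTAMP
# Hypothetical helper: reads "PID %CPU" pairs on stdin and prints one log
# line per process in the same layout as the script above.
format_ps_samples() {
    local name="$1" stamp="$2" pid cpu
    while read -r pid cpu; do
        # Skip blank lines
        [ -n "$pid" ] || continue
        printf '%s :: %s[%s] %s%%\n' "$stamp" "$name" "$pid" "$cpu"
    done
}

# Two made-up samples standing in for real ps output
printf '%s\n' ' 215406 40.9' ' 215500  3.2' | format_ps_samples mpv "$(date)"
```

Inside the loop, the input could come from "ps -C mpv -o pid= -o pcpu=", where pcpu is the alias of %cpu and the empty headers after “=” suppress the header line.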

4.2. Using the top Command to Print the Status of a Single Process in Batch Mode

Usually, the top command will start in an interactive interface mode to show the process status.

However, the -b option tells top to run in batch mode, which prints each refresh as plain text so that we can redirect the output to a file or pipe it to another program.

Furthermore, we need to set the refresh interval by using the -d option.

For example, we can use the -p option to restrict the output to the PID of the mpv process and make the top command keep reporting its status every two seconds until we manually kill it:

$ top -b -d 2 -p $(pidof mpv)
top - 21:26:39 up 1 day, 12:05,  1 user,  load average: 1.42, 1.35, 1.34
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  5.3 us,  4.8 sy,  1.6 ni, 87.2 id,  0.0 wa,  0.5 hi,  0.5 si,  0.0 st
MiB Mem :  31891.0 total,   6762.8 free,  10898.7 used,  14229.6 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.  16885.9 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 185292 kent      20   0 3077452 282804  82880 S   56.7   0.9   2:01.35 mpv

top - 21:26:41 up 1 day, 12:05,  1 user,  load average: 1.47, 1.36, 1.35
 ( ... summary omitted ... )
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 185292 kent      20   0 3077452 283332  82880 S   58.0   0.9   2:01.41 mpv

top - 21:26:43 up 1 day, 12:05,  1 user,  load average: 1.47, 1.36, 1.35
 ( ... summary omitted ... )
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 185292 kent      20   0 3077452 284124  82880 S   57.0   0.9   2:01.47 mpv

top - 21:26:45 up 1 day, 12:05,  1 user,  load average: 1.43, 1.36, 1.34
 ( ... summary omitted ... )
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 185292 kent      20   0 3077452 284916  82880 S   53.5   0.9   2:01.54 mpv
^C

As the output above shows, the top command prints the process status every two seconds. We’ve terminated the top command by pressing Ctrl-C after four iterations.

For our problem, we only need the CPU usage data. Therefore, we need to extract the timestamp and the CPU usage from this output.

4.3. Building a Shell Script Based on the top Command

awk is a powerful tool for processing text. So, let’s pipe the top output to an awk command to solve the problem:

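$ cat cpu_usage_top.sh
#!/bin/bash
PNAME="$1"
LOG_FILE="$2"
PID=$(pidof ${PNAME})

# Sample the process every two seconds in batch mode and let awk pick
# the timestamp and the %CPU column out of each report
top -b -d 2 -p $PID | awk \
    -v cpuLog="$LOG_FILE" -v pid="$PID" -v pname="$PNAME" '
    # summary line: remember the time of this iteration
    /^top -/{time = $3}
    # process line (field 1 is the numeric PID): log field 9, the %CPU value
    $1+0>0 {printf "%s %s :: %s[%s] CPU Usage: %d%%\n", \
            strftime("%Y-%m-%d"), time, pname, pid, $9 > cpuLog
            fflush(cpuLog)}'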

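The script isn’t hard to understand. The first awk rule matches the summary line that starts with “top -” and saves its third field, the current time. The second rule matches the process line, whose first field is the numeric PID, and writes the ninth field, the %CPU value, to the log file. Note that the %d format truncates the value to an integer.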

We should note that at the end of the awk action, we must call the fflush function so that each new line is written to the log file immediately.

Otherwise, awk buffers the output, and we may lose data when we kill the script manually.
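To see the two awk rules in isolation, here is a minimal sketch that runs them on lines copied from the sample top output shown earlier. The function name extract_top_cpu is hypothetical, and the field positions assume top’s default column layout. Unlike the script above, it prints the %CPU value as is instead of truncating it with %d:

```shell
#!/bin/bash

# extract_top_cpu
# Hypothetical helper: reads "top -b" output on stdin and prints one
# "TIME %CPU" pair per reported iteration.
extract_top_cpu() {
    awk '
        # Summary line, e.g. "top - 21:26:39 up 1 day, ...": field 3 is the time
        /^top -/ { time = $3 }
        # Process line: field 1 is a numeric PID, field 9 is the %CPU column
        $1 + 0 > 0 { printf "%s %s\n", time, $9; fflush() }
    '
}

# Feed it two lines copied from the sample output
extract_top_cpu <<'EOF'
top - 21:26:39 up 1 day, 12:05,  1 user,  load average: 1.42, 1.35, 1.34
 185292 kent      20   0 3077452 282804  82880 S   56.7   0.9   2:01.35 mpv
EOF
# prints: 21:26:39 56.7
```

With a live process, the same function could sit at the end of the pipeline, for example "top -b -d 2 -p PID | extract_top_cpu >> LOG_FILE".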

Now, let’s give this script a try. First, we start the script and pass mpv and the log file /tmp/log_top.txt as arguments:

$ ./cpu_usage_top.sh mpv /tmp/log_top.txt

Similarly, we watch the log file and check that our script writes the expected data to it every two seconds:

$ tail -f /tmp/log_top.txt
2021-09-03 22:14:37 :: mpv[215406] CPU Usage: 40%
2021-09-03 22:14:39 :: mpv[215406] CPU Usage: 43%
2021-09-03 22:14:41 :: mpv[215406] CPU Usage: 40%
2021-09-03 22:14:43 :: mpv[215406] CPU Usage: 54%
2021-09-03 22:14:45 :: mpv[215406] CPU Usage: 57%
2021-09-03 22:14:47 :: mpv[215406] CPU Usage: 56%
...

As we can see, our script works as expected.

5. Conclusion

In this article, we’ve built simple shell scripts to log the CPU usage of a single process.

Further, we’ve discussed why the CPU usage values from ps and top have different meanings. We should choose the right tool depending on the requirement.
