Last updated: March 18, 2024
Filtering log file entries based on a date range is an important task in system administration. Log files provide a record of system events and errors that can be useful for troubleshooting and monitoring. We can use the timestamped entries in a log file to filter content within a specific date range.
In this tutorial, we’ll explore how to filter log file entries using Bash scripts based on a single date range or multiple ones.
Let’s suppose we want to filter entries in the /var/log/auth.log file. Inspecting the log file, we observe entries with timestamps ranging from Sep 17 till the current date:
$ sudo cat /var/log/auth.log
Sep 17 06:22:17 debian login[652]: pam_unix(login:session): session opened for user sysadmin by LOGIN(uid=0)
...
Sep 23 08:34:24 debian sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
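Our sample task is two-fold: first, we extract the entries logged within the last hour, and second, we extract the entries logged between 10:00:00 and 10:05:00 on each day from Sep 19 to Sep 21.
For the first objective, we specify a single start and end time, while in the second, we specify multiple start and end times spanning three days.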
Let’s see how we can solve this task.
To extract the entries in /var/log/auth.log that were logged within the last hour, we specify the start and end times as Unix timestamps using the date command:
$ start_time="$(date -d 'now - 1 hour' +'%s')"
$ end_time="$(date -d now +'%s')"
The -d option specifies a date, such as now for the current time or now - 1 hour for one hour ago. The +'%s' format, in turn, prints the date as Unix time, that is, the number of seconds since the epoch. Thus, we can compare dates numerically.
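To see why Unix time makes the comparison simple, the following sketch converts two fixed dates and compares them as plain integers. The helper name to_epoch and the sample dates are only illustrative, and the -d option assumes GNU date:

```shell
#!/usr/bin/env bash
# to_epoch is a hypothetical helper: it converts a date string to Unix time.
# Assumes GNU date, which supports the -d option.
to_epoch() {
    date -d "$1" +'%s'
}

# Two sample dates, chosen only for illustration
earlier="$(to_epoch 'Sep 23 2023 07:35:00')"
later="$(to_epoch 'Sep 23 2023 08:35:00')"

# Both values are plain integers, so the usual numeric tests apply
if [ "$earlier" -lt "$later" ]; then
    echo "difference: $(( later - earlier )) seconds"
fi
```

Since both values are integers, the same -ge and -le tests can later decide whether a log entry falls between two bounds.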
Next, we extract the timestamps from /var/log/auth.log using the cut command:
$ sudo cat /var/log/auth.log | cut -d ' ' -f 1-3
...
Sep 23 08:35:13
Sep 23 08:35:13
The -d option used with cut specifies the delimiter, whereas the -f option selects the fields to extract by position. Here, fields 1 to 3 hold the month, the day, and the time.
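One caveat is worth keeping in mind: in the traditional syslog format, single-digit days are padded with a space (for example, Sep  7), so splitting on single spaces shifts the fields. A possible workaround, sketched below with a made-up sample line and a hypothetical helper name, is to let awk split on runs of whitespace instead:

```shell
#!/usr/bin/env bash
# Sample entry with a space-padded day (illustrative, not taken from the real log)
sample='Sep  7 06:22:17 debian login[652]: session opened'

# cut treats every single space as a delimiter, so field 2 is empty
# and the time is cut off
echo "$sample" | cut -d ' ' -f 1-3      # prints: Sep  7

# extract_timestamp is a hypothetical helper: awk splits on runs of
# whitespace, so month, day, and time are always fields 1 to 3
extract_timestamp() {
    awk '{ print $1, $2, $3 }'
}
echo "$sample" | extract_timestamp      # prints: Sep 7 06:22:17
```

For the sample log in this tutorial, where every day has two digits, both approaches return the same three fields.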
After this, we can use a while loop to read and extract the timestamp from each log entry before converting it to Unix time. Then, we compare these Unix timestamps against the start and end times specified earlier.
The filter_log.sh script implements the entire procedure:
$ cat filter_log.sh
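#!/usr/bin/env bash
# Bounds of the range as Unix timestamps
start_time="$(date -d 'now - 1 hour' +'%s')"
end_time="$(date -d now +'%s')"
sudo cat /var/log/auth.log | while IFS= read -r line; do
    # Extract the timestamp (month, day, time) and convert it to Unix time
    time_stamp=$(echo "$line" | cut -d ' ' -f 1-3)
    time_stamp=$(date -d "$time_stamp" +'%s')
    # Print the entry only if it falls within the range
    if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
        echo "$line"
    fi
done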
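To summarize, the script sets the start and end times, reads the log file line by line, extracts the timestamp of each entry, converts it to Unix time, and prints the entry only if that value lies between the two bounds.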
Next, we grant the script execute permissions using chmod:
$ chmod +x filter_log.sh
Finally, we run the script:
$ ./filter_log.sh
Sep 23 07:45:01 debian CRON[29156]: pam_unix(cron:session): session opened for user root by (uid=0)
...
Sep 23 08:35:41 debian sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
Notably, the extracted log entries fall within the last hour from the current time.
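To try the same logic without reading the real log, the loop can be wrapped in a function that takes the bounds as arguments and reads entries from standard input. This is only a sketch: filter_range and the sample entries are hypothetical, and a year is appended to each timestamp on the assumption that all entries belong to that year (syslog timestamps carry no year, so date would otherwise assume the current one):

```shell
#!/usr/bin/env bash
# filter_range START END YEAR: print the stdin lines whose timestamp lies in
# [START, END], given as Unix time. Hypothetical helper; assumes GNU date.
filter_range() {
    start_time="$1"
    end_time="$2"
    year="$3"
    while IFS= read -r line; do
        # Month, day, and time are the first three whitespace-separated fields
        time_stamp="$(echo "$line" | awk '{ print $1, $2, $3 }')"
        # Skip lines whose timestamp date cannot parse
        time_stamp="$(date -d "$time_stamp $year" +'%s' 2>/dev/null)" || continue
        if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
            echo "$line"
        fi
    done
}

# Fixed bounds and made-up entries keep the example reproducible
start="$(date -d 'Sep 23 2023 07:35:00' +'%s')"
end="$(date -d 'Sep 23 2023 08:35:00' +'%s')"
sample_log='Sep 23 06:10:02 debian sshd[100]: sample entry before the range
Sep 23 07:45:01 debian CRON[29156]: sample entry inside the range
Sep 23 09:00:00 debian sudo: sample entry after the range'

printf '%s\n' "$sample_log" | filter_range "$start" "$end" 2023
```

With the real log, the last line could instead read the file through sudo cat /var/log/auth.log and pipe it into the same function.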
We can set up a script named multi_range_filter.sh to extract the log entries recorded between 10:00:00 and 10:05:00 on each day from Sep 19 to Sep 21:
$ cat multi_range_filter.sh
#!/usr/bin/env bash
# Log timestamps carry no year, so we use the same year for ranges and entries
year=2023
start_times=()
end_times=()
for day in {19..21}; do
    start_times+=("$(date -d "Sep $day $year 10:00:00" +'%s')")
    end_times+=("$(date -d "Sep $day $year 10:05:00" +'%s')")
done
n="${#start_times[@]}"
sudo cat /var/log/auth.log | while IFS= read -r line; do
    time_stamp=$(echo "$line" | cut -d ' ' -f 1-3)
    time_stamp=$(date -d "$time_stamp $year" +'%s')
    for ((index = 0; index < n; index++)); do
        start_time="${start_times[$index]}"
        end_time="${end_times[$index]}"
        if [ "$time_stamp" -ge "$start_time" ] && [ "$time_stamp" -le "$end_time" ]; then
            echo "$line"
            # One matching range is enough, so we stop checking the others
            break
        fi
    done
done
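The script first builds two arrays that hold the start and end times of the three ranges as Unix timestamps. It then reads the log file line by line, converts the timestamp of each entry to Unix time, and compares it against every range.
Notably, we introduce an inner for loop to iterate over each pair of corresponding start and end times. Therefore, as we read each line of the log file, we check if its Unix timestamp falls within any of the three date ranges. If so, we print the line.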
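The range check can also be factored into a small helper that returns as soon as one range matches. The sketch below passes the ranges as start and end pairs of positional arguments; in_any_range is a hypothetical name, and the sample values are arbitrary integers standing in for Unix timestamps:

```shell
#!/usr/bin/env bash
# in_any_range TS START1 END1 [START2 END2 ...]
# Succeeds (exit status 0) if TS lies within at least one [START, END] pair.
in_any_range() {
    ts="$1"
    shift
    while [ "$#" -ge 2 ]; do
        if [ "$ts" -ge "$1" ] && [ "$ts" -le "$2" ]; then
            return 0   # the first match is enough
        fi
        shift 2        # move on to the next start/end pair
    done
    return 1
}

# Arbitrary sample ranges: 100-200 and 150-300 overlap on purpose
if in_any_range 175 100 200 150 300; then
    echo "175 is inside at least one range"
fi
if ! in_any_range 350 100 200 150 300; then
    echo "350 is outside every range"
fi
```

Because the helper reports a single yes or no per timestamp, a caller that prints the line on success prints it once, even when ranges overlap.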
Finally, we can run and test the script:
$ ./multi_range_filter.sh
Sep 19 10:00:01 debian CRON[19718]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 19 10:00:01 debian CRON[19718]: pam_unix(cron:session): session closed for user sysadmin
Sep 20 10:00:01 debian CRON[16730]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 20 10:00:01 debian CRON[16730]: pam_unix(cron:session): session closed for user sysadmin
Sep 21 10:00:01 debian CRON[2051]: pam_unix(cron:session): session opened for user sysadmin by (uid=0)
Sep 21 10:00:02 debian CRON[2051]: pam_unix(cron:session): session closed for user sysadmin
We see that the timestamps of the extracted log entries fall within the three date ranges.
In this article, we explored how to filter the entries of a log file based on a date range or a series of ranges.
In particular, we used a scripting approach that relies on extracting the timestamps in the log file and converting them to Unix time. The timestamps are then compared numerically to those defining the date range or ranges, and log entries are printed only if their timestamps fall within one of them.