1. Overview
When working with logs in Linux, we often need to focus on a particular date range. For example, if a program crashed during the night, we can filter its log file down to that interval alone. This can significantly speed up our debugging.
In this tutorial, we’ll learn ways to read log files between two dates. Firstly, we’ll learn the command that we can use for time-based filtering. Secondly, we’ll look at different time formats and see how they can affect the command we use. Finally, we’ll discover alternative tools to filter based on time intervals.
2. The sed Command
There are many tools that we can use for the time-based filtering of log files. However, in this tutorial, we’ll mainly focus on the sed command as a log filter due to its reliability and simplicity of usage. Moreover, sed is a POSIX standard tool, so it’s available on most Linux distributions.
To make sed filter the log file based on the dates, we run the following command:
$ sed -n '/START_DATE/,/FINISH_DATE/p' LOG_FILE
Let’s see what the symbols in the command mean:
- the -n option suppresses the default behavior of printing every line that sed reads
- START_DATE is the pattern that marks the start of our search
- FINISH_DATE is the pattern that marks the end of our search
- both patterns are surrounded by forward slashes and separated by a comma, which together define a range of lines
- the p command prints every line within that range
- LOG_FILE is the name of our log file
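Notably, sed treats both dates as regular expressions and matches them against the text of each line; it doesn’t interpret them as dates. Printing starts at the first line that matches START_DATE and stops at the next line that matches FINISH_DATE. So this command works correctly only when both the START_DATE and FINISH_DATE patterns exist in the log file.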
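To see what happens when one of the patterns is absent, here is a minimal sketch. It uses a temporary file with made-up sample lines rather than a real log, and the dates 2021-09-19 and 2021-09-25 are chosen only because they don't occur in it:

```shell
# Create a throwaway sample log (the content is made up for illustration)
log=$(mktemp)
cat > "$log" <<'EOF'
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2022-09-20 13:00:00 INFO line3
EOF

# End pattern is absent: the range never closes, so sed prints
# from the start match through the last line of the file
missing_end=$(sed -n '/2021-09-21/,/2021-09-25/p' "$log")
printf 'end pattern missing:\n%s\n' "$missing_end"

# Start pattern is absent: the range never opens, so nothing is printed
missing_start=$(sed -n '/2021-09-19/,/2021-09-21/p' "$log")
printf 'start pattern missing: [%s]\n' "$missing_start"

rm -f "$log"
```

In the first case, sed prints line2 and line3 even though line3 is a year later. In the second case, the output is empty.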
We’ll see examples of sed usage in the next section.
3. Filter Based on Time Formats
Before we begin, let’s quickly look at the standard time format symbols:
- %Y – year
- %y – last two digits of the year (00..99)
- %m – month (01..12)
- %b – locale’s abbreviated month name (e.g., Jan)
- %B – locale’s full month name (e.g., January)
- %d – day of the month (e.g., 01)
- %H – hour (00..23)
- %M – minute (00..59)
- %S – second (00..60)
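As a quick illustration of these symbols, the sketch below renders one fixed timestamp in several layouts with the date command. It assumes GNU date, since the -d option isn't part of POSIX, and the variable names are only for illustration:

```shell
# Render one fixed timestamp in three different layouts.
# Note: -d is a GNU date extension; other implementations may not support it.
ts='2021-09-20 13:00:00'

iso=$(date -d "$ts" '+%Y-%m-%d %H:%M:%S')
bracketed=$(date -d "$ts" '+[%Y-%m-%d %H:%M:%S]')
slashed=$(date -d "$ts" '+%d/%m/%Y %H:%M:%S')

printf '%s\n' "$iso" "$bracketed" "$slashed"
```

Generating the search patterns this way can help avoid typos when the log format differs from the way we usually write dates.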
Now, we’ll run the sed command for different time-logging formats.
3.1. Format %Y-%m-%d %H:%M:%S
Let’s look at the log file with this time format using the cat command:
$ cat file_1.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO line3
2021-09-22 12:59:59 INFO line4
2022-09-20 13:00:00 INFO line5
We can see the file with five lines in the defined time format.
Let’s use the sed command to filter this log file between the dates 2021-09-20 and 2021-09-22:
$ sed -n '/2021-09-20/,/2021-09-22/p' file_1.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO line3
2021-09-22 12:59:59 INFO line4
We can see that line 5 has been filtered out because it lies outside our time range. Notably, the range closes at the first line that matches 2021-09-22, so any later lines from that day wouldn’t be printed.
Similarly, we can include the hours, minutes, and seconds in our search. For example, let’s filter the log between the times 2021-09-20 and 2021-09-21 13:00:00:
$ sed -n '/2021-09-20/,/2021-09-21 13:00:00/p' file_1.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
Now, we are left with lines 1 and 2, as all the rest don’t match the range.
3.2. Format [%Y-%m-%d %H:%M:%S]
This format is similar to the above with the only difference being that now we use square brackets around the time.
Let’s look at our log file:
$ cat file_2.log
[2021-09-20 13:00:00] INFO line1
[2021-09-21 13:00:00] INFO line2
[2021-09-21 13:00:01] INFO line3
[2021-09-22 12:59:59] INFO line4
[2022-09-20 13:00:00] INFO line5
We can see that the log times are identical to the earlier example.
To print this log for any time frame, we can use the same patterns as before, because sed matches a pattern anywhere in the line and the brackets don’t need to be part of it. For example, let’s filter for the time range between 2021-09-20 and 2021-09-22:
$ sed -n '/2021-09-20/,/2021-09-22/p' file_2.log
[2021-09-20 13:00:00] INFO line1
[2021-09-21 13:00:00] INFO line2
[2021-09-21 13:00:01] INFO line3
[2021-09-22 12:59:59] INFO line4
As expected, line 5 hasn’t been printed.
3.3. Format %d/%m/%Y %H:%M:%S
Let’s look at the log file containing this time format:
$ cat file_3.log
20/09/2021 13:00:00 INFO line1
21/09/2021 13:00:00 INFO line2
21/09/2021 13:00:01 INFO line3
22/09/2021 12:59:59 INFO line4
20/09/2022 13:00:00 INFO line5
The main difference in this format is the presence of the / symbols between the day, month, and year.
Let’s try to filter the dates between 20/09/2021 and 22/09/2021 using the same command as above:
$ sed -n '/20/09/2021/,/22/09/2021/p' file_3.log
sed: -e expression #1, char 5: unknown command: `0'
Oops. We’ve got the unknown command error. This is because sed uses the / symbol as its pattern delimiter. So, it treats the first / inside the date as the end of the search pattern and tries to interpret the characters that follow as a command.
To run the command correctly, we’ll need to put the \ symbol in front of each / symbol in the START_DATE and FINISH_DATE patterns:
$ sed -n '/20\/09\/2021/,/22\/09\/2021/p' file_3.log
20/09/2021 13:00:00 INFO line1
21/09/2021 13:00:00 INFO line2
21/09/2021 13:00:01 INFO line3
22/09/2021 12:59:59 INFO line4
Now, the log file has been filtered as expected.
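Alternatively, sed lets us choose a different delimiter for an address by starting it with a backslash followed by the new delimiter character. The sketch below assumes that # never appears in the dates, and it works on a temporary file with made-up sample lines rather than on file_3.log itself:

```shell
# Create a throwaway sample log in the %d/%m/%Y format
log=$(mktemp)
cat > "$log" <<'EOF'
20/09/2021 13:00:00 INFO line1
21/09/2021 13:00:00 INFO line2
22/09/2021 12:59:59 INFO line3
20/09/2022 13:00:00 INFO line4
EOF

# \# opens an address that uses # as its delimiter,
# so the / characters inside the dates need no escaping
custom_delim=$(sed -n '\#20/09/2021#,\#22/09/2021#p' "$log")
printf '%s\n' "$custom_delim"

rm -f "$log"
```

This prints the first three lines, just as the escaped version would, and can be easier to read when a pattern contains many slashes.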
3.4. Pattern in the Log Data
Although the above examples will work in most cases, there’s an exception to the rule.
Let’s look at the following log file:
$ cat file_4.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO 2021-09-22
2021-09-22 12:59:59 INFO line4
2022-09-20 13:00:00 INFO line5
As we can see, line 3 contains a pattern that can confuse the search procedure. For example, let’s run sed for the dates 2021-09-20 and 2021-09-22:
$ sed -n '/2021-09-20/,/2021-09-22/p' file_4.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO 2021-09-22
Now, the output is missing line 4 because sed stopped its search when it found the exact pattern match in line 3.
One way to ensure this misbehavior doesn’t happen is to add the ^ symbol in front of the patterns. This symbol makes sed match the pattern only if it’s located at the beginning of a line:
$ sed -n '/^2021-09-20/,/^2021-09-22/p' file_4.log
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO 2021-09-22
2021-09-22 12:59:59 INFO line4
Now, the time-filtering works correctly. Similar corrections might be necessary, depending on the time format.
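For instance, in the bracketed format from section 3.2, every line starts with the [ symbol, which is special in a regular expression and has to be escaped once we anchor the pattern. Here is a minimal sketch, using a temporary file with made-up sample lines:

```shell
# Create a throwaway sample log in the [%Y-%m-%d %H:%M:%S] format,
# with a date inside the message on the second line
log=$(mktemp)
cat > "$log" <<'EOF'
[2021-09-20 13:00:00] INFO line1
[2021-09-21 13:00:00] INFO saw 2021-09-22 in payload
[2021-09-22 12:59:59] INFO line3
[2022-09-20 13:00:00] INFO line4
EOF

# ^ pins the match to the start of the line,
# and \[ matches the literal opening bracket
anchored=$(sed -n '/^\[2021-09-20/,/^\[2021-09-22/p' "$log")
printf '%s\n' "$anchored"

rm -f "$log"
```

With the anchored patterns, the date in the message of the second line no longer closes the range, so the first three lines are printed.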
3.5. Other Time Formats
In general, the sed command we’ve considered above can be applied to any time format. However, we shouldn’t forget the main principles we discussed:
- The START_DATE and FINISH_DATE patterns should exist in the log file.
- Formats that contain the / symbol need each / in the pattern escaped with a backslash.
- If the START_DATE or FINISH_DATE patterns exist in the log message, we can add the ^ symbol in front of the patterns to enable matches only at the start of the line.
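These principles can be combined into a small helper function. The sketch below is only one possible approach: filter_range is a hypothetical name, not a standard tool, and it assumes that the timestamp is the very first thing on each line:

```shell
# filter_range START END FILE (hypothetical helper, not a standard tool)
filter_range() {
    # Backslash-escape the characters that are special in a basic regular
    # expression, plus the / delimiter, so the dates are matched literally
    start=$(printf '%s' "$1" | sed 's|[][\.*^$/]|\\&|g')
    end=$(printf '%s' "$2" | sed 's|[][\.*^$/]|\\&|g')
    # Anchor both patterns to the beginning of the line
    sed -n "/^${start}/,/^${end}/p" "$3"
}

# Throwaway sample log with a date inside the message on the second line
log=$(mktemp)
cat > "$log" <<'EOF'
20/09/2021 13:00:00 INFO line1
21/09/2021 13:00:00 INFO note 22/09/2021
22/09/2021 12:59:59 INFO line3
20/09/2022 13:00:00 INFO line4
EOF

result=$(filter_range '20/09/2021' '22/09/2021' "$log")
printf '%s\n' "$result"

rm -f "$log"
```

The call prints the first three lines. Both patterns still have to exist in the file, as the helper only takes care of escaping and anchoring.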
Now that we know how to use the sed command, let’s look at some other time-filtering tools.
4. Other Time-Filtering Tools
Although the sed command is a great tool to use, there are situations when we need more flexibility in printing the time range.
For example, we may not know an exact timestamp that appears in the log. In this situation, we’d need to define an approximate time range. For such a scenario, we can use specialized time-searching tools like dategrep.
To install dategrep, we can use Git and build it ourselves:
$ git clone https://github.com/mdom/dategrep.git
...
$ cd dategrep
$ ./build-standalone
...
dategrep uses the following syntax:
$ dategrep --start "START_DATE" --end "FINISH_DATE" --format "TIME_FORMAT" LOG_FILE
Here, TIME_FORMAT is the time format written using the standard time symbols. For example, let’s now filter file_1.log from earlier based on the time range between 2021-09-20 12:59:59 and 2021-09-20 13:00:01:
$ dategrep --start "2021-09-20 12:59:59" --end "2021-09-20 13:00:01" --format "%Y-%m-%d %H:%M:%S" file_1.log
2021-09-20 13:00:00 INFO line1
As we can see, dategrep correctly printed the log from line 1 only as the other lines don’t match the time range. This example shows that, unlike the simple sed filters, dategrep can process time intervals even if their exact patterns don’t exist in the log file.
However, tools like dategrep are limited in the time formats they accept.
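When installing an extra tool isn’t an option, a plain string comparison in awk can serve as a rough substitute. This is a different technique from dategrep, sketched here under two assumptions: the timestamp occupies the first 19 characters of each line, and the format orders its fields from year down to second, so that alphabetical order equals chronological order:

```shell
# Throwaway sample log in the %Y-%m-%d %H:%M:%S format
log=$(mktemp)
cat > "$log" <<'EOF'
2021-09-20 13:00:00 INFO line1
2021-09-21 13:00:00 INFO line2
2021-09-21 13:00:01 INFO line3
2021-09-22 12:59:59 INFO line4
2022-09-20 13:00:00 INFO line5
EOF

# Compare the first 19 characters (the timestamp) as strings.
# This relies on the year-month-day order, so it doesn't
# apply to formats like %d/%m/%Y
in_range=$(awk -v from='2021-09-20 12:59:59' -v to='2021-09-21 13:00:00' '
    { ts = substr($0, 1, 19) }
    ts >= from && ts <= to
' "$log")
printf '%s\n' "$in_range"

rm -f "$log"
```

Here, neither boundary has to exist in the file: the start time 2021-09-20 12:59:59 doesn’t appear in the log, yet lines 1 and 2 are printed.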
5. Conclusion
In this article, we learned how to print a log file segment based on a time range. Firstly, we looked at the sed command and applied it to different timing formats. Then, we discovered the dategrep tool, which is a good alternative if we need more flexibility.