1. Introduction

When working with logs in Linux, we often need to focus on a particular date range. For example, if we have a program that crashed during the night, we can filter a log file for this program based on that interval alone. This can significantly speed up our debugging.

In this tutorial, we’ll learn ways to read log files between two dates. Firstly, we’ll learn the command that we can use for time-based filtering. Secondly, we’ll look at different time formats and see how they can affect the command we use. Finally, we’ll discover alternative tools to filter based on time intervals.

2. The sed Command

There are many tools that we can use for the time-based filtering of log files. However, in this tutorial, we’ll mainly focus on the sed command as a log filter due to its reliability and simplicity of usage. Moreover, sed is a POSIX standard tool, so it’s available on most Linux distributions.

To make sed filter the log file based on the dates, we run the following command:

$ sed -n '/START_DATE/,/FINISH_DATE/p' LOG_FILE

Let’s see what the symbols in the command mean:

  • the -n option suppresses sed’s default behavior of printing every line it reads
  • START_DATE is the start date sequence for our search
  • FINISH_DATE is the end date sequence for our search
  • both sequences are surrounded by forward slashes and separated by a comma, which together form a range address
  • the p command prints every line that falls within the range selected by the preceding address
  • LOG_FILE is the name of our log file
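
As a minimal sketch of how the template can be filled in, the snippet below keeps the two dates in shell variables. The variable names (start_date, finish_date, log_file, result) and the three-line sample log are assumptions made purely for illustration:

```shell
# Hypothetical sample log, created only so the sketch is self-contained
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2022-09-20 13:00:00   INFO line3
EOF

start_date='2021-09-20'
finish_date='2021-09-21'

# Double quotes let the shell expand the variables before sed reads the script
result=$(sed -n "/${start_date}/,/${finish_date}/p" "$log_file")
printf '%s\n' "$result"

rm -f "$log_file"
```

With this sample, the first two lines should be printed. If a date contained the / symbol, it would have to be escaped before being placed into the script.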

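Notably, sed treats each pattern as a regular expression and matches it against the text of each line. It doesn’t interpret the patterns as dates. So this command works as intended only when lines matching both the START_DATE and FINISH_DATE patterns exist in the log file.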

However, sometimes, we may only want to search log entries before or after one single date pattern. We’ll discuss this scenario in the tutorial as well.

So next, let’s see examples of sed usage.

3. Filter Based on Time Formats

Before we begin, let’s quickly look at the standard time format symbols:

  • %Y – year
  • %y – last two digits of the year (00..99)
  • %m – month (01..12)
  • %b – locale’s abbreviated month name (e.g., Jan)
  • %B – locale’s full month name (e.g., January)
  • %d – day of the month (e.g., 01)
  • %H – hour (00..23)
  • %M – minute (00..59)
  • %S – second (00..60)

Now, we’ll run the sed command for different time-logging formats.

3.1. Format %Y-%m-%d %H:%M:%S

Let’s look at the log file with this time format using the cat command:

$ cat file_1.log 
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 12:59:59   INFO line4
2022-09-20 13:00:00   INFO line5

The file contains five lines, each starting with a timestamp in this format.

Let’s use the sed command to filter this log file between the dates 2021-09-20 and 2021-09-22:

$ sed -n '/2021-09-20/,/2021-09-22/p' file_1.log 
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 12:59:59   INFO line4

We can see that line 5 has been filtered out because it falls outside our time range.

Similarly, we can include the hours, minutes, and seconds in our search. For example, let’s filter the log between the times 2021-09-20 and 2021-09-21 13:00:00:

$ sed -n '/2021-09-20/,/2021-09-21 13:00:00/p' file_1.log 
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2

Now, we’re left with lines 1 and 2, since sed stops printing at the first line that matches the end pattern.
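
One consequence is worth checking when several entries share the end date: the range closes at the first line that matches the end pattern, so later entries from the same day aren’t printed. The following sketch illustrates this with a hypothetical log (the file contents and the log_file and result names are assumptions, not part of the examples above):

```shell
# Hypothetical log with two entries on the end date
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-22 12:59:59   INFO line3
2021-09-22 18:30:00   INFO line4
2022-09-20 13:00:00   INFO line5
EOF

# The range closes on line3, so line4 (same end date) is not printed
result=$(sed -n '/2021-09-20/,/2021-09-22/p' "$log_file")
printf '%s\n' "$result"

rm -f "$log_file"
```

If all entries from the end date are needed, a more specific end pattern (for example, one that includes the time of the last wanted entry) is one possible way around this.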

3.2. Format [%Y-%m-%d %H:%M:%S]

This format is the same as the previous one, except that the timestamp is enclosed in square brackets.

Let’s look at our log file:

$ cat file_2.log 
[2021-09-20 13:00:00]   INFO line1
[2021-09-21 13:00:00]   INFO line2
[2021-09-21 13:00:01]   INFO line3
[2021-09-22 12:59:59]   INFO line4
[2022-09-20 13:00:00]   INFO line5

We can see that the log times are identical to the earlier example.

To print this log between any time frame, we can use the same format as before. For example, let’s filter for the time range between 2021-09-20 and 2021-09-22:

$ sed -n '/2021-09-20/,/2021-09-22/p' file_2.log 
[2021-09-20 13:00:00]   INFO line1
[2021-09-21 13:00:00]   INFO line2
[2021-09-21 13:00:01]   INFO line3
[2021-09-22 12:59:59]   INFO line4

As expected, line 5 hasn’t been printed.

3.3. Format %d/%m/%Y %H:%M:%S

Let’s look at the log file containing this time format:

$ cat file_3.log 
20/09/2021 13:00:00   INFO line1
21/09/2021 13:00:00   INFO line2
21/09/2021 13:00:01   INFO line3
22/09/2021 12:59:59   INFO line4
20/09/2022 13:00:00   INFO line5

The main difference in this format is the presence of the / symbols between the day, month, and year.

Let’s try to filter the dates between 20/09/2021 and 22/09/2021 using the same command as above:

$ sed -n '/20/09/2021/,/22/09/2021/p' file_3.log 
sed: -e expression #1, char 5: unknown command: `0'

Oops. We’ve got the unknown command error. This is because sed uses the / symbol as the pattern delimiter. So, it treats the second / as the end of the pattern and tries to interpret the following 0 as a command.

To run the command correctly, we’ll need to put the \ symbol in front of each / symbol in the START_DATE and FINISH_DATE patterns:

$ sed -n '/20\/09\/2021/,/22\/09\/2021/p' file_3.log 
20/09/2021 13:00:00   INFO line1
21/09/2021 13:00:00   INFO line2
21/09/2021 13:00:01   INFO line3
22/09/2021 12:59:59   INFO line4

Now, the log file has been filtered as expected.
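
As an alternative to escaping every slash, sed also accepts a different delimiter for an address when the pattern is introduced with a backslash, as in \%PATTERN%. Here is a minimal sketch of that approach, assuming the % character doesn’t appear in the pattern itself. The sample file and the log_file and result names are created only for this illustration:

```shell
# Hypothetical sample in the %d/%m/%Y format
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
20/09/2021 13:00:00   INFO line1
21/09/2021 13:00:00   INFO line2
22/09/2021 12:59:59   INFO line3
20/09/2022 13:00:00   INFO line4
EOF

# \% starts the address and % ends it, so the slashes need no escaping
result=$(sed -n '\%20/09/2021%,\%22/09/2021%p' "$log_file")
printf '%s\n' "$result"

rm -f "$log_file"
```

This should print the first three lines of the sample, just as the escaped version would.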

3.4. Pattern in the Log Data

Although the above examples will work in most cases, there’s an exception to the rule.

Let’s look at the following log file:

$ cat file_4.log 
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO 2021-09-22
2021-09-22 12:59:59   INFO line4
2022-09-20 13:00:00   INFO line5

As we can see, line 3 contains a pattern that can confuse the search procedure. For example, let’s run sed for the dates 2021-09-20 and 2021-09-22:

$ sed -n '/2021-09-20/,/2021-09-22/p' file_4.log
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO 2021-09-22

Now, the output is missing line 4 because sed stopped its search when it found the exact pattern match in line 3.

One way to ensure this misbehavior doesn’t happen is to add the ^ symbol in front of the patterns. This symbol makes sed match the pattern only if it’s located at the beginning of a line:

$ sed -n '/^2021-09-20/,/^2021-09-22/p' file_4.log
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO 2021-09-22
2021-09-22 12:59:59   INFO line4

Now, the time-filtering works correctly. Similar corrections might be necessary, depending on the time format.
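
For instance, with the bracketed format from section 3.2, the line starts with the [ symbol rather than the year, so the anchor has to include an escaped bracket. The sketch below assumes a hypothetical file whose second line carries a date in its message. The log_file and result names are illustrative only:

```shell
# Hypothetical bracketed log with a date inside a message
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
[2021-09-20 13:00:00]   INFO line1
[2021-09-21 13:00:01]   INFO 2021-09-22
[2021-09-22 12:59:59]   INFO line3
[2022-09-20 13:00:00]   INFO line4
EOF

# \[ matches a literal opening bracket; ^ pins it to the start of the line
result=$(sed -n '/^\[2021-09-20/,/^\[2021-09-22/p' "$log_file")
printf '%s\n' "$result"

rm -f "$log_file"
```

Here, the date in the second line’s message no longer ends the range early, and the first three lines are printed.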

3.5. Other Time Formats

In general, the sed command we’ve considered above can be applied to any time format. However, we shouldn’t forget the main principles we discussed:

  1. The START_DATE and FINISH_DATE patterns should exist in the log file.
  2. In formats that contain the / symbol, each / in the patterns needs to be escaped with a backslash.
  3. If the START_DATE or FINISH_DATE patterns exist in the log message, we can add the ^ symbol in front of the patterns to enable matches only at the start of the line.

4. Filtering Logs by One Single Date

We’ve seen examples of filtering log entries using the sed command with two dates. Sometimes, we only want to get the logs before or after a date. Next, let’s see how to get the required data using sed.

We’ll use the file_5.log file as the input example:

$ cat file_5.log
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8

4.1. Before a Date

Let’s say we want to get log entries before ‘2021-09-27’, including the first entry from that date. In this case, we can pass a range address to the sed command: 1, /^2021-09-27/. Here, the date pattern isn’t new to us. The ‘1’ address tells sed to start from the first line. So, ‘1, /pattern/’ means: from the first line until the first line matching /pattern/ (inclusive).

Next, let’s test it on our input file:

$ sed -n '1, /^2021-09-27/p' file_5.log
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6

4.2. After a Date

Similarly, we can use the address ‘/pattern/, $’ to select everything from the first matching line through the last line of the input file. So, let’s get all log entries from ‘2021-09-27’ onward:

$ sed -n '/^2021-09-27/, $p' file_5.log 
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8

As the output shows, we’ve got the expected log entries.
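
Both addresses include the line that matches the date pattern. If the boundary line should be left out, one possible variation is to delete the unwanted range with the d command instead of printing the wanted one. The sketch below recreates the contents of file_5.log in a temporary file. The before and after variable names are assumptions for illustration:

```shell
# Same contents as file_5.log, recreated so the sketch is self-contained
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8
EOF

# Strictly before: delete from the first match to the end of the file
before=$(sed '/^2021-09-27/,$d' "$log_file")

# Strictly after: delete from the first line through the first match
after=$(sed '1,/^2021-09-27/d' "$log_file")

printf 'before:\n%s\n' "$before"
printf 'after:\n%s\n' "$after"

rm -f "$log_file"
```

Without the -n option, sed prints every line that isn’t deleted, so the first command should yield lines 1 to 5 and the second lines 7 and 8.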

5. A Word About the Disadvantage of Regex-Based Pattern Matching

Using the sed command with a proper address, we can obtain log entries in a time range from a sorted log file. However, as we’ve mentioned earlier, sed’s /PATTERN/ address does regex-based pattern matching. It requires the pattern to appear in the input. Otherwise, the command may produce unexpected results. Next, let’s see an example.

Let’s say we want to get all log entries after ‘2021-09-25’ from file_5.log. We’ve learned we can use the address ‘/^2021-09-25/, $’ to get the logs. Now, let’s test it:

$ sed -n '/^2021-09-25/, $p' file_5.log
$

The sed command outputs nothing. Similarly, if we try to use the ‘1, /^2021-09-25/’ address to get logs before ‘2021-09-25’, the sed command simply outputs all lines from the input file:

$ sed -n '1, /^2021-09-25/p' file_5.log   
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8

We’ve got these incorrect outputs since no line in the input matches the pattern /^2021-09-25/. Therefore, when we use sed -n '/pattern1/, /pattern2/p' to filter inputs, we should ensure that the input file contains lines matching both patterns. Otherwise, we may get unexpected results.
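
One simple safeguard is to check with grep that the pattern exists before running sed. The snippet below is only a sketch of that idea. The pattern and result variables and the warning message are assumptions, and the log is a copy of file_5.log:

```shell
# Same contents as file_5.log, recreated so the sketch is self-contained
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8
EOF

pattern='^2021-09-25'

# grep -q exits with status 0 only if at least one line matches
if grep -q "$pattern" "$log_file"; then
    result=$(sed -n "/${pattern}/,\$p" "$log_file")
    status='filtered'
else
    result=''
    status='pattern-missing'
    echo "no line matches ${pattern}; not filtering" >&2
fi

printf '%s\n' "$status"
rm -f "$log_file"
```

Since no entry from 2021-09-25 exists in this file, the sketch takes the else branch and reports the missing pattern instead of silently printing nothing.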

Finally, we’ll quickly see how to solve this problem using the awk command:

#after the date
$ awk '$1 >= "2021-09-25"' file_5.log
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8

#before the date
$ awk '$1 <= "2021-09-25"' file_5.log
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5

awk supports a C-like script language. Further, it ships with a rich set of built-in functions. So, it can be more flexible than sed sometimes.
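
The same comparison can be extended to a range between two dates. The sketch below assumes the timestamps use the %Y-%m-%d %H:%M:%S format, where plain string comparison happens to agree with chronological order. It wouldn’t work as-is for a format like %d/%m/%Y. The start and end variable names and the chosen dates are illustrative only:

```shell
# Same contents as file_5.log, recreated so the sketch is self-contained
log_file=$(mktemp)
cat > "$log_file" <<'EOF'
2021-09-20 13:00:00   INFO line1
2021-09-21 13:00:00   INFO line2
2021-09-21 13:00:01   INFO line3
2021-09-22 13:59:59   INFO line4
2021-09-22 14:59:59   INFO line5
2021-09-27 02:57:59   INFO line6
2021-09-28 10:59:59   INFO line7
2022-09-29 08:00:00   INFO line8
EOF

# Date-only range: neither boundary has to exist in the file
by_date=$(awk -v start='2021-09-21' -v end='2021-09-25' \
    '$1 >= start && $1 <= end' "$log_file")

# Date and time: join the first two fields before comparing
by_time=$(awk -v start='2021-09-21 13:00:01' -v end='2021-09-22 14:00:00' \
    '{ ts = $1 " " $2 } ts >= start && ts <= end' "$log_file")

printf 'by date:\n%s\n' "$by_date"
printf 'by time:\n%s\n' "$by_time"

rm -f "$log_file"
```

With this input, the date-only filter should return lines 2 to 5, and the date-and-time filter lines 3 and 4.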

However, we won’t dive into awk approaches as we focus on using sed to filter log files here. We have a collection of informative articles addressing awk’s pragmatic usages for problem-solving purposes.

6. Other Time Logging Tools

Although the sed command is a great tool to use, there are situations when we need more flexibility in printing the time range.

For example, we often may not know the exact time pattern to use. In this situation, we’d need to define an approximate time range. For such a scenario, we can use specialized time-searching tools like dategrep.

To install dategrep, we can use Git and build it ourselves:

$ git clone https://github.com/mdom/dategrep.git
...
$ cd dategrep
$ ./build-standalone
...

dategrep uses the following format:

$ dategrep --start "START_DATE" --end "FINISH_DATE" --format "TIME_FORMAT" LOG_FILE

Here, TIME_FORMAT is the time format written using the standard time symbols. For example, let’s now filter file_1.log from earlier based on the time range between 2021-09-20 12:59:59 and 2021-09-20 13:00:01:

$ dategrep --start "2021-09-20 12:59:59" --end "2021-09-20 13:00:01" --format "%Y-%m-%d %H:%M:%S" file_1.log
2021-09-20 13:00:00   INFO line1

As we can see, dategrep correctly printed the log from line 1 only as the other lines don’t match the time range. This example shows that, unlike the simple sed filters, dategrep can process time intervals even if their exact patterns don’t exist in the log file.

However, tools like dategrep have limitations in the time formats they accept.

7. Conclusion

In this article, we learned how to print a log file segment based on a time range. Firstly, we looked at the sed command and applied it to different time formats. Next, we filtered logs by a single date and saw where regex-based matching falls short. Finally, we discovered the dategrep tool, which is a good alternative if we need more flexibility.