1. Overview

File organization is important when managing our system. Numbering is a great way to store data that are consecutive in time or to represent a version.

However, it’s also essential to easily process these files and their information. For example, analyzing their differences can tell us what changed in the system. Automation is critical for this, and a quick way to find data is the first step.

In this tutorial, we’ll look at different ways to find files that end with a number in Bash. Moreover, we’ll also look at methods for filtering the results and executing commands on them. The commands presented here are written with the Bash shell in mind, so they might not work with other shells.

2. Using find

The find command is used to search for files in our system.

We select a directory for the tool to check and use flags to list the properties a file has to match. For example, we can find files based on their name, size, creation/modification dates, or type.

find can list the matching files, execute commands on them, or delete them.
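As a minimal sketch of executing a command on the results, the example below runs basename on every match through the -exec action. The scratch directory and the file names (img1, img2, notes) are hypothetical, and the "*[0-9]" pattern is the one explained in the next subsection.

```shell
# Scratch directory with hypothetical sample files (names are illustrative only)
demo=$(mktemp -d)
touch "$demo/img1" "$demo/img2" "$demo/notes"

# Run basename on every regular file whose name ends with a digit.
# {} is replaced by each matching path, and \; terminates the command.
executed=$(cd "$demo" && find . -type f -name "*[0-9]" -exec basename {} \; | LC_ALL=C sort)
printf '%s\n' "$executed"

rm -rf "$demo"
```

Any other command can take the place of basename here, which is what makes find useful as the first step of an automated task.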

2.1. Matching Numbers at the End

We can use Bash wildcards with the -name flag or regular expressions (regex) with the -regex flag for pattern matching.

First, let’s match the files with the -name flag:

$ find Pictures/ -name "*[0-9]"

This command retrieves only the entries whose names end with a digit.

In the pattern above, the * matches any sequence of characters, including an empty one. The square brackets match a single character from the set they enclose, and 0-9 is the range of digits.

This way, the command returns every name that ends in a digit, whatever comes before it.

Instead of [0-9], we can also use the [:digit:] character class, which has to be placed inside a bracket expression:

$ find Pictures/ -name "*[[:digit:]]"

Here, instead of specifying the range, we match the digit character class.
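To compare the two patterns, here is a small sketch that runs both against a scratch directory. The file names (photo1, photo22, photo3.jpg, readme) are made up for illustration, and -type f is added only to keep directories out of the results.

```shell
# Scratch directory with hypothetical sample files
demo=$(mktemp -d)
touch "$demo/photo1" "$demo/photo22" "$demo/photo3.jpg" "$demo/readme"

# Range form: the name must end with a character between 0 and 9
by_range=$(cd "$demo" && find . -type f -name "*[0-9]" | LC_ALL=C sort)

# Class form: the name must end with a character of the digit class
by_class=$(cd "$demo" && find . -type f -name "*[[:digit:]]" | LC_ALL=C sort)

printf 'range:\n%s\nclass:\n%s\n' "$by_range" "$by_class"

rm -rf "$demo"
```

Both commands list ./photo1 and ./photo22, while photo3.jpg is skipped because its name ends with a letter.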

We should always wrap the pattern in quotes to avoid shell expansion. Otherwise, the shell evaluates the unquoted pattern before find runs. If any file in the current directory matches, the pattern is replaced with those filenames, and find can fail or return the wrong results.
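The sketch below makes the expansion visible. It uses printf as a stand-in for find so that we can see the arguments the command would receive; the directory and the file names (img1, img2) are hypothetical.

```shell
# Scratch directory with two hypothetical files that match the pattern
demo=$(mktemp -d)
touch "$demo/img1" "$demo/img2"

# Unquoted: the shell replaces the pattern with the matching filenames
# before the command runs. printf stands in for find to reveal its arguments.
unquoted=$(cd "$demo" && printf '%s ' -name *[0-9])

# Quoted: the pattern reaches the command unchanged
quoted=$(cd "$demo" && printf '%s ' -name "*[0-9]")

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

rm -rf "$demo"
```

In the unquoted case, find would receive -name img1 img2 instead of the pattern, which isn't a valid expression.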

With the -regex flag, the pattern-matching rules are slightly different:

$ find Pictures/ -regex ".*[0-9]"

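In this case, the only visible difference is that * is now .*, since the . and * have different functions in regex.

The dot matches any single character, and the asterisk repeats the preceding element any number of times, including zero. In addition, -regex is matched against the whole path rather than only the filename, so the leading .* also has to cover the directory part.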
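As a small illustration of how -regex is applied to the whole path, the sketch below runs two expressions against a scratch directory. The file names (report7, report) are hypothetical.

```shell
# Scratch directory with hypothetical sample files
demo=$(mktemp -d)
touch "$demo/report7" "$demo/report"

# The expression has to match the entire path, for example ./report7
whole=$(cd "$demo" && find . -type f -regex ".*[0-9]")

# Without the leading .* the expression can't cover the ./ prefix,
# so nothing is returned even though a file is called report7
partial=$(cd "$demo" && find . -type f -regex "report[0-9]")

printf 'whole:   %s\npartial: %s\n' "$whole" "$partial"

rm -rf "$demo"
```

This is the main practical difference from -name, which only looks at the last component of the path.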

2.2. Matching Numbers Before the File Extension

We can use the same methods to find files with a number right before the file extension.

Let’s match a number in front of the .txt extension:

$ find Downloads/ -name "*[0-9].txt"

Here, we return any file with a digit immediately before the .txt extension.

To be more general, we keep the . and replace the extension with *, which matches any file extension:

$ find Downloads/ -name "*[0-9].*"

Now, let’s look at the differences in the regex command:

$ find Downloads/ -regex ".*[0-9]\..*"

As seen in the example, we need to be careful to escape the . in the pattern.

In regex, the dot represents any character. However, we want to match the . specifically, so we escape it with a backslash (\.).

This way, we remove the special function of the character, and it’ll only match the dot.
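To see what the escape changes, this sketch compares the escaped and unescaped versions of the expression. The file names (data12, data3.csv) are made up for the example.

```shell
# Scratch directory with hypothetical sample files
demo=$(mktemp -d)
touch "$demo/data12" "$demo/data3.csv"

# Escaped dot: a digit must be followed by a literal dot
escaped=$(cd "$demo" && find . -type f -regex '.*[0-9]\..*' | LC_ALL=C sort)

# Unescaped dot: a digit followed by any character is enough,
# so data12 matches too (the 1 is followed by the 2)
unescaped=$(cd "$demo" && find . -type f -regex '.*[0-9]..*' | LC_ALL=C sort)

printf 'escaped:\n%s\nunescaped:\n%s\n' "$escaped" "$unescaped"

rm -rf "$demo"
```

Only the escaped version restricts the results to names with a digit right before a dot.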

2.3. Matching a Date

If the number is a date, we can use it to filter our files. This can be useful if we want to look at the latest entries or at a specific point in time.

Depending on the format, we might need to make slight adjustments. Here, we’ll follow the ISO 8601 standard for date format (YYYY-MM-DD).

Let’s start by matching all files with the date in the name:

$ find Logs/ -name "*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*"

The pattern consists of four digits for the year, followed by two digits for the month and two digits for the day, each separated by a hyphen.

We use an asterisk at the beginning to match whatever precedes the date in the filename. We also use an asterisk at the end in case the date comes before the file extension.

However, wildcard patterns get repetitive, since they have no way to express a repetition count.

Let’s shorten it by using regex and additional special characters:

$ find Logs/ -regextype egrep -regex '.*[0-9]{4}-[0-9]{2}-[0-9]{2}.*'

In this pattern, instead of repeating [0-9], we use {} to indicate the number of occurrences.

We also use .* to match the filename at the beginning and the file extension at the end.

Moreover, we also need to set -regextype, a GNU find option that selects the regex syntax. It has to appear before the -regex test it applies to.

The default syntax is emacs, which stems from the regex implementation of the text editor and doesn’t treat unescaped curly braces as an interval.

Instead, we’re using the egrep (or posix-egrep) syntax, which allows a shorter pattern for matching the string.
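The following sketch checks that the wildcard and the regex versions select the same file. It assumes GNU find, since -regextype isn't available in every implementation, and the log names are hypothetical.

```shell
# Scratch directory with hypothetical log files
demo=$(mktemp -d)
touch "$demo/app-2023-01-24.log" "$demo/app-2023-1-24.log" "$demo/app.log"

# Wildcard version: every digit is spelled out
glob=$(cd "$demo" && find . -type f \
  -name "*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*")

# Regex version: intervals express the repetition (GNU find assumed)
regex=$(cd "$demo" && find . -type f -regextype posix-egrep \
  -regex '.*[0-9]{4}-[0-9]{2}-[0-9]{2}.*')

printf 'glob:  %s\nregex: %s\n' "$glob" "$regex"

rm -rf "$demo"
```

Both commands return only ./app-2023-01-24.log, because the month in app-2023-1-24.log has a single digit.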

Lastly, it’s important to note that this pattern accepts any digits, so it also retrieves invalid dates (e.g., 9999-99-99). A stricter pattern can prevent this.
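One possible way to tighten the pattern is to restrict the month to 01-12 and the day to 01-31 with alternations. This is only a sketch under the assumption that GNU find is available, the file names are hypothetical, and the pattern still accepts impossible combinations such as 2023-02-31.

```shell
# Scratch directory with one valid and two invalid hypothetical dates
demo=$(mktemp -d)
touch "$demo/app-2023-01-24.log" "$demo/app-9999-99-99.log" "$demo/app-2023-13-01.log"

# Month limited to 01-12 and day limited to 01-31 (GNU find assumed)
strict=$(cd "$demo" && find . -type f -regextype posix-egrep \
  -regex '.*[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01]).*')

printf '%s\n' "$strict"

rm -rf "$demo"
```

Checking the number of days per month would need a real date parser rather than a longer regex.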

2.4. Filtering the Files

Now, we can modify these queries to filter our files according to the number.

Let’s see how we can filter our files numbered between 18 and 21:

$ find Documents/ -regex '.*\(1[8-9]\|2[0-1]\).*'

As seen in the example, we need to specify the 1 and the 2. Although ranges exist, they only apply to single characters (e.g., [0-9]). Since the pattern is surrounded by .*, these two digits can appear anywhere in the path, so a name containing 180 or 2021 matches as well.

We use the parentheses to enclose a group. This way, we can then use the vertical bar. This symbol acts as an ‘OR’, matching either of the sides.

Additionally, we also need to escape the parentheses and the vertical bar. Since we don’t define the regextype, we’re using the emacs syntax. In emacs, these symbols need to be escaped to be interpreted as their special function (and not the literal characters).
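Because the expression is wrapped in .*, it accepts the two digits wherever they appear. As a sketch of one way to require a standalone number, the second command below only accepts the number when it isn't adjacent to another digit. It assumes GNU find for -regextype, and the file names are hypothetical.

```shell
# Scratch directory with hypothetical numbered files
demo=$(mktemp -d)
touch "$demo/file17" "$demo/file18" "$demo/file21.txt" "$demo/file180" "$demo/file2021"

# Original expression: 18-21 may appear anywhere, even inside 180 or 2021
loose=$(cd "$demo" && find . -type f -regex '.*\(1[8-9]\|2[0-1]\).*' | LC_ALL=C sort)

# Stricter sketch: a non-digit before the number, and a non-digit
# or the end of the path after it (GNU find assumed)
strict=$(cd "$demo" && find . -type f -regextype posix-egrep \
  -regex '.*[^0-9](1[89]|2[01])([^0-9].*)?' | LC_ALL=C sort)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"

rm -rf "$demo"
```

The loose version returns four files, including file180 and file2021, while the strict one keeps only file18 and file21.txt.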

We can also use this to filter dates in the filename.

For example, let’s filter the results so our query only returns dates between days 24 and 26:

$ find Logs/ -regextype egrep -regex '.*[0-9]{4}-[0-9]{2}-2[4-6].*'

We apply this filter by specifying the numbers in the pattern. In this case, in the day field, we use the 2 and the [4-6] range for the filter.

2.5. Retrieving the Number

Besides matching the files, sometimes we need to retrieve the number in their names. For example, the numbers can be used to check which log is the oldest or the most recent.

To do this, we need to pipe the output of find to the sed command. sed is a tool that can be used for pattern matching and string manipulation.

This way, we can strip the filename and be left with the number:

$ find Documents/ -name "*[0-9]*" | sed 's/[^0-9]//g'

In the command, we pipe the paths that match the find command to sed. Then, we use s (substitute) to replace whatever matches the first pattern with the second. The / separates the two. Since find prints the whole path, digits in directory names are kept as well.

Since ^ at the start of a bracket expression negates it, the first pattern matches every character that’s not within the range [0-9]. Each of these characters is replaced with the empty string, because the second pattern is empty.

The g flag applies this substitution to every match in the line instead of only the first one.
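Since sed receives whatever find prints, the digits of a parent directory end up in the result too. The sketch below shows the effect and one possible workaround with -printf '%f\n', a GNU find action that prints only the file name. The directory 2023 and the file report7.txt are hypothetical.

```shell
# Scratch directory with a hypothetical numbered subdirectory and file
demo=$(mktemp -d)
mkdir "$demo/2023"
touch "$demo/2023/report7.txt"

# The full path ./2023/report7.txt goes through sed,
# so the digits of the directory name are kept as well
from_path=$(cd "$demo" && find . -type f -name "*[0-9]*" | sed 's/[^0-9]//g')

# -printf '%f\n' prints the file name without its directories (GNU find assumed)
from_name=$(cd "$demo" && find . -type f -name "*[0-9]*" -printf '%f\n' | sed 's/[^0-9]//g')

printf 'from path: %s\nfrom name: %s\n' "$from_path" "$from_name"

rm -rf "$demo"
```

The first command yields 20237, while the second one yields 7, the number that actually belongs to the file.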

We can also use this to retrieve dates.

However, to keep the format intact, we also need to not match the hyphen:

$ find Logs/ -name "*[0-9]*" | sed 's/[^0-9-]//g'

By also keeping the hyphen out of the removed characters, we extract the dates in their original format.

Nonetheless, any other hyphen in the path is kept in the output as well.
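When the name contains extra hyphens, an alternative is to print only the part that looks like a date instead of deleting everything else. The sketch below uses grep -oE for this, which is a swapped-in technique rather than the sed approach above; the -o option is assumed to be available, as it is in GNU and BSD grep. The log name is hypothetical.

```shell
# Scratch directory with a hypothetical log whose name has extra hyphens
demo=$(mktemp -d)
touch "$demo/web-server-2023-01-24.log"

# sed approach: every hyphen in the path survives
with_sed=$(cd "$demo" && find . -type f -name "*[0-9]*" | sed 's/[^0-9-]//g')

# grep approach: -o prints only the text matched by the date expression
with_grep=$(cd "$demo" && find . -type f -name "*[0-9]*" \
  | grep -oE '[0-9]{4}-[0-9]{2}-[0-9]{2}')

printf 'sed:  %s\ngrep: %s\n' "$with_sed" "$with_grep"

rm -rf "$demo"
```

Here, sed leaves two stray hyphens in front of the date, while grep returns the date alone.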

3. Conclusion

In this article, we looked at ways to find files that end with a number. We explored different paths depending on the format of the filename and the format of the number.

Numbered files can help system administrators stay organized. They provide a way to order logs, versions, and backups, either with numbers or dates.

Finally, we also looked at how we could filter files and retrieve the numbers. Filtering can be useful to analyze files or execute commands on them. Retrieving the numbers can help us manipulate the file numbers or dates directly.
