Last updated: February 21, 2025
When working with multiple files and directories, we often need to filter and select specific files, or text from those files, based on patterns. Wildcard expansion (globbing) is a powerful way to list files that match a certain pattern. In some cases, we may need to retrieve only a specific match.
In this tutorial, we’ll learn different ways to get the Nth match from wildcard expansion. The code in this tutorial was tested on a Debian 12 system using GNU Bash 5.1.16.
First, let’s make sure that we have all the prerequisites ready, including a directory containing multiple files with the same extension:
$ mkdir newdir && cd newdir
$ touch file{1..5}.txt
The mkdir command creates the newdir directory, while the cd command changes the current working directory to the newly created newdir directory. Next, the touch command creates five empty files using the filenames generated from the brace expansion.
Subsequently, we can check the newly created files with ls:
$ ls
file1.txt file2.txt file3.txt file4.txt file5.txt
As we can see, there are five files in the current directory. We’ll list the files in this newly created directory (newdir) and print the Nth match.
We can use Bash arrays to store multiple values and access individual values using their indices.
Let’s store the filenames in an array and extract only the third match:
$ files=(*.txt) && echo "${files[2]}"
file3.txt
This command uses *.txt as a glob pattern to match all .txt files in the current directory and stores the matches in an indexed array named files. Then, the echo command outputs the third element, which has index 2 because array indices start at 0.
Alternatively, we can explicitly declare the array and then access the Nth match using the index:
$ declare -a files=(*.txt)
$ echo "${files[2]}"
The declare -a command defines an indexed array. Although Bash arrays don’t require explicit declaration, declare -a ensures that the variable is recognized as an array.
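Moreover, we can also use the printf command to print specific array elements:
$ printf '%s' "${files[0]}"
file1.txt
Here, the filename is passed as an argument to the %s format specifier rather than used as the format string itself, so any % or backslash characters in a filename are printed literally. Notice that the output from the echo command ends with a newline (\n), while printf doesn’t add one unless the format string includes it explicitly.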
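To make the array approach reusable, we can wrap it in a small function. The following is a minimal sketch rather than a definitive implementation: the function name nth_match and the temporary demo directory are our own assumptions. It takes a 1-based N followed by the already-expanded glob, and rejects an N that is out of range:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Nth (1-based) argument after N itself.
# Called as `nth_match N *.txt`, the shell expands the glob before the call.
nth_match() {
  local n=$1
  shift
  local -a matches=("$@")
  # Reject a zero, negative, or out-of-range N.
  if (( n < 1 || n > ${#matches[@]} )); then
    echo "nth_match: no match number $n" >&2
    return 1
  fi
  printf '%s\n' "${matches[n-1]}"
}

# Demo in a throwaway directory so existing files are left alone.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file{1..5}.txt

third=$(nth_match 3 *.txt)
echo "$third"
```

Because the matches never pass through a pipe, filenames containing spaces or newlines stay intact. One caveat: without the nullglob shell option, a pattern that matches nothing is passed through literally, so nth_match 1 would print the pattern itself.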
We can also use sed to filter and find the Nth match from a list.
Let’s retrieve only the second match:
$ ls *.txt | sed -n '2p'
file2.txt
The first part of the command lists all the files with the .txt extension (ls *.txt) and then passes the output to the sed command, which prints only the second line from the input.
We can replace the 2 from the above command with any number to print the corresponding match from the list of .txt files.
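As a sketch of that substitution, we can keep N in a shell variable. The variable name n and the temporary directory are assumptions made for illustration, and we swap ls for printf so that the list comes straight from the glob. The q command makes sed stop reading once the requested line is printed:

```shell
#!/usr/bin/env bash
# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file{1..5}.txt

n=2
# Double quotes let the shell expand ${n}; sed then sees 2{p;q;}
second=$(printf '%s\n' *.txt | sed -n "${n}{p;q;}")
echo "$second"
```

Like the ls-based pipeline, this approach assumes that no filename contains a newline.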
Additionally, we can use sed with an extended regular expression to retrieve the Nth match:
$ printf "%s " *.txt | sed -E 's/([^ ]+ ){2}([^ ]+).*/\2/'
file3.txt
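Let’s understand the substitution pattern (s/ / /) in detail. First, ([^ ]+ ){2} matches two consecutive runs of non-space characters, each followed by a space, i.e., the first two filenames. Next, ([^ ]+) captures the following filename as the second group, and .* matches the rest of the line. Finally, \2 replaces the entire match with the content of the second group.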
We can retrieve the Nth match by replacing 2 (from {2}) with N-1, ensuring that the pattern skips the first N-1 matches and outputs only the Nth match.
Alternatively, we can use the same sed command with an array:
$ declare -a files=(*.txt)
$ echo "${files[*]}" | sed -E 's/([^ ]+ ){2}([^ ]+).*/\2/'
file3.txt
First, we define an indexed array with all the .txt files from the current working directory. Then, ${files[*]} concatenates all elements of files as a string with spaces between them while the echo command outputs this string.
Next, the pipe (|) takes the output of the preceding command (echo) and passes it as input to sed. Finally, the sed command uses an extended regular expression (-E) to print only the third match.
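The regular expression can also be built from a variable. The sketch below assumes a hypothetical function name, nth_sed, and a temporary demo directory. It computes N-1 with arithmetic expansion and adds the -n option together with the p flag, so that nothing is printed when N exceeds the number of matches. Without them, sed would print the whole unmodified line in that case:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Nth .txt match using a sed substitution.
nth_sed() {
  local n=$1
  # Skip n-1 space-terminated names and capture the next one as group 2.
  # -n plus the p flag prints only when the substitution succeeds.
  printf '%s ' *.txt |
    sed -n -E "s/^([^ ]+ ){$((n - 1))}([^ ]+).*/\2/p"
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file{1..5}.txt

nth_sed 3
```

Since the pattern treats spaces as separators, this sketch only works for filenames that don't contain spaces.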
awk is a powerful command for finding and processing specific patterns.
The awk command provides a quick way to extract specific records:
$ ls *.txt | awk 'NR==4'
file4.txt
This command selects the fourth record (NR==4) from the list of .txt files in the current directory.
When its output goes to a pipe, ls already prints one filename per line. We can make this explicit with the -1 option:
$ ls -1 *.txt | awk 'NR==4'
file4.txt
The ls -1 *.txt construct lists all .txt files in the current directory, one per line. Then, awk uses the record number (NR) variable to filter and display only the fourth line, i.e., file4.txt.
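To avoid hardcoding the record number, we can pass it to awk with the -v option. This is a minimal sketch in which the variable name n and the demo directory are assumptions. We feed awk from printf instead of ls, and the exit statement stops processing as soon as the requested record is printed:

```shell
#!/usr/bin/env bash
# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file{1..5}.txt

n=4
# -v copies the shell value into the awk variable n before any input is read.
fourth=$(printf '%s\n' *.txt | awk -v n="$n" 'NR == n { print; exit }')
echo "$fourth"
```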
grep offers a wide range of options for pattern matching, including support for both basic and extended regular expressions.
Let’s retrieve the second match using grep:
$ printf "%s\n" *.txt | grep -n . | grep '^2:'
2:file2.txt
This command displays the second match from the list of all the .txt files from the current directory.
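Let’s break down the command. First, printf "%s\n" *.txt prints each matching filename on its own line. Next, grep -n . matches every non-empty line and prefixes it with its line number and a colon. Finally, grep '^2:' keeps only the line whose prefix is 2:.
This pipeline selects the match by its line number and prints the line number along with the match.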
Alternatively, we can remove the line number and print only the match:
$ printf "%s\n" *.txt | grep -n . | grep '^2:' | cut -d: -f2-
file2.txt
This command extracts and prints the second match from the list of .txt files in the current directory. The cut command uses a colon (:) as the delimiter and extracts everything from the second field onward, removing the line number.
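The -f2- field range matters when a filename itself contains a colon. The sketch below illustrates this with hypothetical filenames and a variable n of our own choosing. With -f2 alone, cut would truncate the name at the next colon:

```shell
#!/usr/bin/env bash
# Demo in a throwaway directory with colons in the filenames.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 'a:1.txt' 'b:2.txt' 'c:3.txt'

n=2
# grep -n prefixes "N:"; the second grep selects line N; cut drops the prefix.
with_range=$(printf '%s\n' *.txt | grep -n . | grep "^${n}:" | cut -d: -f2-)
only_field=$(printf '%s\n' *.txt | grep -n . | grep "^${n}:" | cut -d: -f2)
echo "$with_range"
echo "$only_field"
```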
We can use a combination of head and tail, along with the find command, to retrieve the Nth match from a list:
$ find . -maxdepth 1 -name "*.txt" | sort | head -n 4 | tail -n 1
./file4.txt
This command retrieves the fourth match found from the list of .txt files.
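Let’s take a closer look at the options used in this command. First, -maxdepth 1 restricts find to the current directory without descending into subdirectories, while -name "*.txt" matches names that end in .txt. Next, sort orders the results, since find doesn’t guarantee any particular order. Finally, head -n 4 keeps the first four lines, and tail -n 1 prints the last of those lines.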
We can substitute 4 in the head -n 4 construct with N to extract the Nth match.
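One caveat of this pipeline is that when N is larger than the number of matches, head passes every line through and tail prints the last file instead of nothing. The following sketch guards against that by counting the matches first. The function name nth_by_head_tail and the demo directory are assumptions made for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Nth .txt file found in the current directory.
nth_by_head_tail() {
  local n=$1 total
  # Count the matches so that an out-of-range N can be rejected.
  total=$(find . -maxdepth 1 -name '*.txt' | wc -l)
  (( n >= 1 && n <= total )) || return 1
  find . -maxdepth 1 -name '*.txt' | sort | head -n "$n" | tail -n 1
}

# Demo in a throwaway directory.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file{1..5}.txt

nth_by_head_tail 4
```

Running find twice keeps the sketch short. As with the other line-based pipelines, it assumes that filenames don't contain newlines.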
In this article, we learned several ways to get the Nth match from wildcard expansion.
Firstly, we created a dataset and used array indexing to extract the Nth match from wildcard expansion. Then, we explored the sed command with extended regular expression, followed by the quick way with awk to get the Nth match.
Next, we used the grep command to include line numbers and retrieve matches by their specific line numbers. Finally, we combined the find command with head and tail to extract a specific match. Although we can select any method depending on our current preferences and needs, sed is often the preferred and standard way to get the Nth match from wildcard expansion.