1. Introduction

When working with multiple files and directories, we often need to filter and select specific files, or text from those files, based on patterns. Wildcard expansion (globbing) is a powerful way to list files that match a certain pattern. In some cases, we may need to retrieve only a specific match.

In this tutorial, we’ll learn different ways to get the Nth match from wildcard expansion. The code in this tutorial was tested on a Debian 12 system using GNU Bash 5.1.16.

2. Sample Dataset and Toolset

First, let’s make sure that we have all the prerequisites ready, including a directory containing multiple files with the same extension:

$ mkdir newdir && cd newdir
$ touch file{1..5}.txt

The mkdir command creates the newdir directory, while the cd command changes the current working directory to the newly created newdir directory. Next, the touch command creates five empty files using the filenames generated from the brace expansion.

Subsequently, we can check the newly created files with ls:

$ ls
file1.txt  file2.txt  file3.txt  file4.txt  file5.txt

As we can see, there are five files in the current directory. We’ll list the files in this newly created directory (newdir) and print the Nth match.

3. Using Array Indexing

We can use Bash arrays to store multiple values and access individual values using their indices.

Let’s store the filenames in an array and extract only the third match:

$ files=(*.txt) && echo "${files[2]}" 
file3.txt

This command uses *.txt as a glob pattern to match all .txt files in the current directory and stores the matches in an indexed array named files. Then, the echo command outputs the third element: because array indices start at 0, index 2 refers to the third match.
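As a minimal sketch of the same idea for an arbitrary N, the helper below copies the expanded matches into an array and converts a 1-based position into the 0-based index. The name nth_of is hypothetical (it isn't a Bash builtin), and the function assumes Bash, since it relies on arrays and arithmetic evaluation:

```shell
# nth_of N MATCH...: print the Nth (1-based) argument after N.
# Hypothetical helper for illustration; requires Bash.
nth_of() {
  local n=$1
  shift
  local items=("$@")                 # the expanded matches, as an indexed array
  # Reject positions outside 1..number of matches
  (( n >= 1 && n <= ${#items[@]} )) || return 1
  printf '%s\n' "${items[n-1]}"      # arrays are 0-based, so subtract 1
}

# Usage from a directory with matching files: nth_of 3 *.txt
nth_of 3 file1.txt file2.txt file3.txt file4.txt file5.txt
```

If the glob matches nothing, Bash passes the pattern through literally unless the nullglob option is set, so the range check alone can't detect that case.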

Alternatively, we can explicitly declare the array and then access the Nth match using the index:

$ declare -a files=(*.txt)
$ echo "${files[2]}"

The declare -a command defines an indexed array. Although Bash arrays don’t require explicit declaration, declare -a ensures that the variable is recognized as an array.

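We can also use the printf command to print a specific array element:

$ printf "%s\n" "${files[0]}"
file1.txt

Notice that echo always ends its output with a newline (\n), while printf adds one only when the format string includes it. Passing the filename as an argument to a fixed format such as %s\n, rather than using the filename itself as the format string, also keeps any % or backslash characters in the filename from being interpreted.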

4. Using the sed Command

We can also use sed to filter and find the Nth match from a list.

Let’s retrieve only the second match:

$ ls *.txt | sed -n '2p'
file2.txt

The first part of the command lists all the files with the .txt extension (ls *.txt) and then passes the output to the sed command, which prints only the second line from the input.

We can replace the 2 from the above command with any number to print the corresponding match from the list of .txt files.
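To avoid editing the sed script by hand, the position can come from a shell variable. The following is a small sketch: nth_line is a hypothetical helper name, and the q command is an optional addition that stops sed once the wanted line is printed:

```shell
# nth_line N: print only line N of standard input (hypothetical helper).
nth_line() {
  # Double quotes let the shell insert N; the script becomes e.g. 2{p;q;}
  sed -n "${1}{p;q;}"
}

# Usage from a directory with matching files: ls *.txt | nth_line 2
printf '%s\n' file1.txt file2.txt file3.txt | nth_line 2
```

When the input has fewer than N lines, the helper prints nothing.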

Additionally, we can use sed with an extended regular expression to retrieve the Nth match from a space-separated list:

$ printf "%s " *.txt | sed -E 's/([^ ]+ ){2}([^ ]+).*/\2/'
file3.txt

Let’s understand the substitution pattern (s/ / /) in detail:

  • ([^ ]+ ) matches one word (a sequence of non-space characters) together with the space that follows it
  • {2} repeats that group twice, so the pattern consumes the first two words
  • ([^ ]+) captures the third word
  • .* matches everything else on the line
  • \2 refers to the second capture group, which holds the third word

We can retrieve the Nth match by replacing 2 (from {2}) with N-1, so that the pattern skips the first N-1 matches and outputs only the Nth one. Because the pattern treats every space as a separator, this approach assumes that the filenames themselves contain no spaces.
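The N-1 substitution can also be left to the shell. Here is a hedged sketch with a hypothetical helper named nth_word. It assumes N is at least 2 and that the line holds at least N space-separated words, because sed leaves a line unchanged when the pattern doesn't match:

```shell
# nth_word N: print the Nth space-separated word of each input line
# (hypothetical helper; assumes N >= 2 and words without spaces).
nth_word() {
  # $(( $1 - 1 )) is the number of leading words to skip;
  # ^ anchors the match at the start of the line
  sed -E "s/^([^ ]+ ){$(( $1 - 1 ))}([^ ]+).*/\2/"
}

# Usage from a directory with matching files: printf "%s " *.txt | nth_word 3
printf '%s ' file1.txt file2.txt file3.txt file4.txt | nth_word 3
```

The script is in double quotes so that the shell performs the arithmetic expansion. The \2 back-reference still reaches sed unchanged, since a backslash before a digit isn't special inside double quotes.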

Alternatively, we can use the same sed command with an array:

$ declare -a files=(*.txt)
$ echo "${files[*]}" | sed -E 's/([^ ]+ ){2}([^ ]+).*/\2/'
file3.txt

First, we define an indexed array with all the .txt files from the current working directory. Then, "${files[*]}" joins all elements of files into a single string, separated by spaces (the first character of IFS by default), and the echo command outputs this string.

Next, the pipe (|) takes the output of the preceding command (echo) and passes it as input to sed. Finally, the sed command uses an extended regular expression (-E) to print only the third match.

5. Using the awk Command

awk is a powerful command for finding and processing specific patterns.

The awk command provides a quick way to extract specific records:

$ ls *.txt | awk 'NR==4'
file4.txt

This command selects the fourth record (NR==4) from the list of .txt files in the current directory.

Because ls writes one filename per line when its output goes to a pipe, the -1 option isn't strictly required here, but it makes the one-per-line format explicit:

$ ls -1 *.txt | awk 'NR==4'
file4.txt

The ls -1 *.txt construct lists all .txt files in the current directory, one per line. Then, awk uses the record number (NR) variable to filter and display only the fourth line, i.e., file4.txt.
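The record number doesn't have to be hard-coded in the awk program. A minimal sketch, using a hypothetical helper named nth_record, passes it in with the -v option and stops reading once the record is printed:

```shell
# nth_record N: print only record N of standard input (hypothetical helper).
nth_record() {
  # -v passes the shell value into awk as the variable n;
  # exit stops reading as soon as the record has been printed
  awk -v n="$1" 'NR == n { print; exit }'
}

# Usage from a directory with matching files: ls -1 *.txt | nth_record 4
printf '%s\n' file1.txt file2.txt file3.txt file4.txt file5.txt | nth_record 4
```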

6. Using the grep Command

grep offers a wide range of options for pattern matching, including support for both basic and extended regular expressions.

Let’s retrieve the second match using grep:

$ printf "%s\n" *.txt | grep -n . | grep '^2:' 
2:file2.txt

This command displays the second match from the list of all the .txt files from the current directory.

Let’s break down the command:

  • printf "%s\n" *.txt expands *.txt and prints each matching filename on its own line
  • | (pipe) takes the output of the preceding command and passes it as input to the following command
  • grep -n . prefixes every non-empty line with its line number and a colon
  • grep '^2:' keeps only the line that starts with 2: (the second match)

This command finds the match by its line number and prints the line number along with the match.

Alternatively, we can remove the line number and print only the match:

$ printf "%s\n" *.txt | grep -n . | grep '^2:' | cut -d: -f2-
file2.txt

This command extracts and prints the second match from the list of .txt files in the current directory. The cut command uses a colon (:) as the delimiter and extracts everything from the second field onward, removing the line number.
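The pipeline can be wrapped in a small function that takes N as a parameter. This is only a sketch with a hypothetical name, nth_by_number. The sample input includes a filename containing a colon to show why the cut field list is 2- rather than just 2:

```shell
# nth_by_number N: number the input lines, keep line N, drop the number
# (hypothetical helper mirroring the pipeline above).
nth_by_number() {
  grep -n . | grep "^$1:" | cut -d: -f2-
}

# The second name contains a colon on purpose
printf '%s\n' a.txt 'b:c.txt' d.txt | nth_by_number 2
```

With -f2 alone, cut would return only b and lose the rest of the filename.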

7. Using find in Combination With head and tail

We can use a combination of head and tail, along with the find command, to retrieve the Nth match from a list:

$ find . -maxdepth 1 -name "*.txt" | sort | head -n 4 | tail -n 1
./file4.txt

This command retrieves the fourth match found from the list of .txt files.

Let’s take a closer look at the options used in this command:

  • find . starts the search in the current directory
  • -maxdepth 1 limits the search to only the current directory, preventing find from looking inside subdirectories
  • -name "*.txt" filters the results to include only files with a .txt extension
  • sort sorts the filenames in lexicographical order, since find doesn't guarantee any particular output order
  • head -n 4 selects the first 4 lines from the sorted list
  • tail -n 1 extracts the last of those lines, i.e., the fourth match

We can substitute 4 in the head -n 4 construct with N to extract the Nth match.
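As a rough sketch, the same pipeline can take N, the directory, and the name pattern as parameters. The function name nth_found and the order of its arguments are assumptions made for illustration:

```shell
# nth_found N DIR PATTERN: print the Nth path, in sorted order, among the
# entries of DIR whose names match PATTERN (hypothetical helper).
nth_found() {
  find "$2" -maxdepth 1 -name "$3" | sort | head -n "$1" | tail -n 1
}

# Usage from a directory with matching files: nth_found 4 . '*.txt'
```

One caveat of the head and tail combination is that when fewer than N files match, tail -n 1 still prints the last available match instead of nothing.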

8. Conclusion

In this article, we learned several ways to get the Nth match from wildcard expansion.

First, we created a dataset and used array indexing to extract the Nth match from wildcard expansion. Then, we explored the sed command, both with line addressing and with an extended regular expression, followed by a quick way to get the Nth match with awk.

Next, we used the grep command to include line numbers and retrieve matches by their specific line numbers. Finally, we combined the find command with head and tail to extract a specific match. We can select any of these methods depending on our preferences and needs: sed and awk offer concise one-liners for line-oriented lists, while array indexing doesn't parse any command output and therefore also works when filenames contain spaces or newlines.