Last updated: March 18, 2024
Searching text in files is a very common task. In Linux, there are multiple ways to find a specific text string or pattern within a file. We can achieve this using commands like grep, awk, and find. Each of these commands offers unique features and functionalities for text searching and pattern matching.
In this tutorial, we’ll learn different ways to find lines containing N digits using the grep and perl commands. We’ll also learn to extract lines containing N numbers from a given input file.
To illustrate the use of grep to find lines containing N digits, we’ll use a sample dataset:
$ cat sample.txt
1 lazy dog
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
780 zippers 20 clippers 56
When 1002 zombies arrive
We save this sample dataset in a file named sample.txt. This will help us follow and test the commands used in the next sections.
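As a convenience, the dataset above can be recreated with a here-document. This is only a sketch that writes sample.txt into the current directory; the quoted EOF delimiter keeps the shell from expanding anything inside the text.

```shell
# Recreate the sample dataset from this tutorial in the current directory.
# The quoted 'EOF' delimiter disables variable and command expansion.
cat > sample.txt <<'EOF'
1 lazy dog
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
780 zippers 20 clippers 56
When 1002 zombies arrive
EOF
```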
We can extract lines with a specific number of digits (N) from a file using either grep or perl. Both of these commands provide several options to search for patterns in files and directories, enabling us to customize the search criteria and perform advanced pattern matching.
We can use grep to extract lines containing a specific number of digits:
grep -E '^[^0-9]*([0-9][^0-9]*){N}$' sample.txt
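In this example, grep is the command used to find patterns in files, -E enables extended regular expressions, '^[^0-9]*([0-9][^0-9]*){N}$' is the regular expression that matches lines with exactly N digits, and sample.txt is the file we want to search.
Let’s break down the regular expression used to extract lines with exactly N digits:
^ anchors the match at the start of the line
[^0-9]* matches any run of non-digit characters, including an empty one
([0-9][^0-9]*) matches a single digit followed by any run of non-digit characters
{N} repeats the preceding group exactly N times
$ anchors the match at the end of the line, so no additional digits can follow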
We can replace N in the above command with any number, thus searching for lines in a file that contain a specific number of digits.
Let’s try to find all the lines containing exactly 3 digits from the sample.txt file:
$ grep -E '^[^0-9]*([0-9][^0-9]*){3}$' sample.txt
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
In this example, we extracted three lines containing exactly 3 digits each.
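To avoid editing the pattern by hand for each value of N, the command can be wrapped in a small shell function. The sketch below is one possible way to do it; the name lines_with_exact_digits is hypothetical and not part of grep.

```shell
# Hypothetical helper: print the lines of FILE that contain exactly N digits.
# Usage: lines_with_exact_digits N FILE
lines_with_exact_digits() {
    # Double quotes let the shell expand $1 inside the interval {N};
    # \$ keeps the end-of-line anchor as a literal dollar sign.
    grep -E "^[^0-9]*([0-9][^0-9]*){$1}\$" "$2"
}
```

For instance, lines_with_exact_digits 3 sample.txt runs the same search as the command above.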
In addition, we can also extract lines with N or more digits. To do so, we remove the $ at the end of the regular expression. For example, we can extract lines with at least 3 digits:
$ grep -E '^[^0-9]*([0-9][^0-9]*){3}' sample.txt
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
780 zippers 20 clippers 56
When 1002 zombies arrive
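An alternative way to express "N or more" is the open-ended interval {N,}, which keeps the $ anchor in place. The following is a minimal sketch under that assumption; the function name lines_with_min_digits is made up for illustration.

```shell
# Hypothetical helper: print the lines of FILE that contain at least N digits.
# Usage: lines_with_min_digits N FILE
lines_with_min_digits() {
    # {N,} means "N or more repetitions" in extended regular expressions,
    # so the whole line may still be anchored with ^ and $.
    grep -E "^[^0-9]*([0-9][^0-9]*){$1,}\$" "$2"
}
```

Calling lines_with_min_digits 3 sample.txt should select the same lines as the unanchored pattern shown above.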
Moreover, we can find and extract the lines with exactly N digits using other commands as well.
We can use Perl’s regular expression mechanism to find lines containing N digits in a similar way.
For example, we can construct a Perl command to extract all lines with exactly 3 digits:
$ perl -ne 'print if s/\d/$&/g == 3' sample.txt
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
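In the above example, perl invokes the Perl interpreter, -n iterates over the lines of the given input file, -e executes the Perl code given on the command line, 'print if s/\d/$&/g == 3' is the Perl code to execute, and sample.txt is the file to search.
Let’s take a closer look at the Perl code used in the above command:
s/\d/$&/g replaces every digit with itself ($& holds the matched text), so the line is left unchanged
the substitution returns the number of replacements it made, which equals the number of digits in the line
== 3 compares that count with 3
print if prints the current line only when the comparison is true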
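To make the value returned by the substitution visible, we can print it next to each line. This is only an illustrative sketch; the function name digit_counts is hypothetical.

```shell
# Hypothetical helper: prefix each line of FILE with its digit count and a tab.
# Usage: digit_counts FILE
digit_counts() {
    # s/\d/$&/g returns the number of substitutions performed; adding 0
    # turns the empty string returned for "no match" into a numeric 0.
    perl -ne '$n = (s/\d/$&/g) + 0; print "$n\t$_"' "$1"
}
```

Lines whose count equals 3 are exactly the ones selected by the == 3 comparison.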
By adding a + after \d in the substitution command, we can find the lines with N numbers using the Perl command:
$ perl -ne 'print if s/\d+/$&/g == 3' sample.txt
energetic dogs count: 2 3 5
car 3 cats 5 total 8
780 zippers 20 clippers 56
The updated substitution command uses the \d+ pattern, which matches a run of one or more consecutive digits as a single unit. Thus, the above example now extracts lines with 3 numbers, each having one or more digits.
Moreover, we can also change the comparison operator to express a different condition. For instance, using <=, we get lines with 3 or fewer numbers:
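Counting numbers rather than digits is also possible with grep, although the pattern becomes longer. The sketch below is an assumption-laden alternative, not the method used in this tutorial: it requires N to be at least 1, and the name lines_with_exact_numbers is hypothetical.

```shell
# Hypothetical helper: print the lines of FILE that contain exactly N numbers,
# where a number is a run of one or more consecutive digits. Assumes N >= 1.
# Usage: lines_with_exact_numbers N FILE
lines_with_exact_numbers() {
    # One number, then N-1 further numbers, each separated from the previous
    # one by at least one non-digit character.
    grep -E "^[^0-9]*[0-9]+([^0-9]+[0-9]+){$(( $1 - 1 ))}[^0-9]*\$" "$2"
}
```

With N set to 3, this should select the same lines of sample.txt as the Perl command with \d+.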
$ perl -ne 'print if s/\d+/$&/g <= 3' sample.txt
1 lazy dog
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
780 zippers 20 clippers 56
When 1002 zombies arrive
In this example, we extracted lines with 3 or fewer numbers. Since no line in sample.txt contains more than 3 numbers, all six lines are printed.
Additionally, we can use grep to extract lines with digit counts within a specific range:
$ grep -E '^[^0-9]*([0-9][^0-9]*){N,M}$' sample.txt
In this command, N is the lower bound and M is the upper bound of a range. Both are included in the range. For example, we can extract lines with digit counts within a range of 1 to 3:
$ grep -E '^[^0-9]*([0-9][^0-9]*){1,3}$' sample.txt
1 lazy dog
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
In this example, we extracted lines with digit counts within a range of 1 to 3 using grep.
We can also construct a Perl command to do the same:
$ perl -ne 'print if (s/\d/$&/g) >= 1 && (s/\d/$&/g) <= 3' sample.txt
1 lazy dog
energetic dogs count: 2 3 5
all 345 dogs
car 3 cats 5 total 8
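In the above example, we combined two conditions with the logical AND operator (&&) to extract only the lines that satisfy both. The substitution runs twice per line, but since it replaces each digit with itself, both runs return the same count.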
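If running the substitution twice feels wasteful, the digit count can be computed once and stored in a variable. The sketch below is one possible variant under that assumption: it counts matches with Perl's list-assignment idiom instead of a substitution, and the function name lines_with_digits_between is hypothetical.

```shell
# Hypothetical helper: print the lines of FILE whose digit count lies
# between LO and HI, both inclusive.
# Usage: lines_with_digits_between LO HI FILE
lines_with_digits_between() {
    # BEGIN removes LO and HI from @ARGV before -n starts reading FILE.
    # "$n = () = /\d/g" counts all digit matches on the current line.
    perl -ne 'BEGIN { ($lo, $hi) = splice(@ARGV, 0, 2) }
              $n = () = /\d/g;
              print if $n >= $lo && $n <= $hi' "$1" "$2" "$3"
}
```

For example, lines_with_digits_between 1 3 sample.txt expresses the same range as the commands above.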
In this article, we learned how to extract lines with N digits from a file. We used the grep and perl commands to extract lines with exactly N digits and with N or more digits. We also learned how to match lines with N numbers and how to change the comparison operator in the perl command. Finally, we learned how to extract lines with digit counts within a specific range.