1. Introduction

Filtering specific data from text files or command output is a common task in Linux. Locating numeric values manually can be labor-intensive, but the grep command offers an efficient way to extract them, which is especially handy when searching for error codes or performance metrics.

In this tutorial, we’ll learn how to use grep to select only numeric values in Linux.

2. Sample Dataset

Before moving forward, let’s ensure we have a sample dataset to demonstrate different approaches for selecting only numeric values:

$ cat sample.txt
"The quick 2 brown 34 fox 54 jumps over 42 a lazy dog."
Sixty 45 zippers 76 were 6 quickly 5 picked from the woven jute bag.
"Brown jars 23" prevented the mixture 34 from 123 freezing too quickly.

In this basic dataset, we see three lines of text, each containing several standalone numbers interspersed among the words.

We save this sample dataset in a file named sample.txt. This helps us follow and test the commands in the next sections. For simplicity, let’s assume we only consider whole numbers, not floating-point values.
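One way to create the file is a shell here-document. This is only a sketch; any text editor works just as well:

```shell
# Write the three sample lines to sample.txt in the current directory.
# Quoting the delimiter ('EOF') prevents any shell expansion of the text.
cat > sample.txt <<'EOF'
"The quick 2 brown 34 fox 54 jumps over 42 a lazy dog."
Sixty 45 zippers 76 were 6 quickly 5 picked from the woven jute bag.
"Brown jars 23" prevented the mixture 34 from 123 freezing too quickly.
EOF

# Confirm the file holds three lines
grep -c '' sample.txt
```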

3. Using Only the grep Command

grep offers a wide range of options for pattern matching, including the ability to search for numeric values. Further, it fully supports both basic and extended regular expressions.

3.1. Using --only-matching (-o) and Basic Regular Expressions

We can use grep to select only the matching part via the --only-matching (-o) option:

$ grep -o '[0-9]\+' sample.txt 
2
34
54
42
45
76
6
5
23
34
123

This command selects only the numeric values from the input file, i.e., sample.txt.

Let’s break down the options used in this grep command:

  • the -o option tells grep to print only the matched (non-empty) parts of matching lines, each on its own output line
  • '[0-9]\+' is the regular expression used to match one or more consecutive digits
  • sample.txt is the input file

In a basic regular expression, characters such as +, ?, {, |, and ( are treated as literals, so we have to escape them with a backslash (\) to give them their special meaning. Notably, \+ is a GNU extension rather than part of POSIX basic regular expressions. We can get each separate digit on a new line by removing the \+ from the basic regular expression used in the above command.
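To illustrate the effect of the quantifier, here is a small sketch that uses a one-line demo file (demo.txt is a hypothetical name, used only for this example). It also shows a spelling of "one or more digits" that avoids the \+ extension, assuming a grep that supports -o, as GNU and BSD grep do:

```shell
# Small demo input so the snippet runs on its own
printf '%s\n' 'fox 54 jumps over 42' > demo.txt

# Without a quantifier, every digit is a separate match: 5, 4, 4, 2
single_digits=$(grep -o '[0-9]' demo.txt)
echo "$single_digits"

# "A digit followed by zero or more digits" needs no escaped +: 54, 42
whole_numbers=$(grep -o '[0-9][0-9]*' demo.txt)
echo "$whole_numbers"
```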

3.2. Using Extended Regular Expressions

Alternatively, we can use extended regular expressions (-E) without escaping the + character:

$ grep -oE '[0-9]+' sample.txt

In this command, we achieve the same result because the -E option enables the extended regular expression syntax, in which + works as a quantifier without escaping.
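As a quick check of that equivalence, the sketch below compares the basic and extended forms on a small demo file (demo.txt is a hypothetical name). It also includes the POSIX character class [[:digit:]], which is an addition not used elsewhere in this article, and it assumes GNU grep for the \+ form:

```shell
# Small demo input so the snippet runs on its own
printf '%s\n' 'jars 23 mixture 34 from 123' > demo.txt

bre=$(grep -o '[0-9]\+' demo.txt)          # basic syntax, escaped +
ere=$(grep -oE '[0-9]+' demo.txt)          # extended syntax, plain +
cls=$(grep -oE '[[:digit:]]+' demo.txt)    # POSIX character class

# Report whether the three patterns selected identical text
if [ "$bre" = "$ere" ] && [ "$ere" = "$cls" ]; then
    echo "all three patterns select the same numbers"
fi
echo "$ere"
```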

3.3. Adding Line Numbers

Furthermore, we can get line numbers of the matches by using the -n option:

$ grep -oEn '[0-9]+' sample.txt 
1:2
1:34
1:54
1:42
2:45
2:76
2:6
2:5
3:23
3:34
3:123

This command displays the line number of each match, followed by a colon and the matched part.
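Because every output line has the form N:match, it's easy to process further. The following sketch, which assumes a small hypothetical demo file, splits the two fields in a shell loop and then counts how many numbers each line contains:

```shell
# Small demo input so the snippet runs on its own; line 2 has no digits
printf '%s\n' 'a 2 b 34' 'no digits here' 'c 45 d 76 e 6' > demo.txt

# IFS=: splits each "N:match" line into the line number and the value
report=$(grep -oEn '[0-9]+' demo.txt | while IFS=: read -r lineno value; do
    printf 'line %s -> %s\n' "$lineno" "$value"
done)
echo "$report"

# Keep only the line numbers, count repeats with uniq -c,
# then let awk reorder the columns to "line: count"
counts=$(grep -oEn '[0-9]+' demo.txt | cut -d: -f1 | uniq -c | awk '{print $2 ": " $1}')
echo "$counts"
```

Lines without any number, such as line 2 here, produce no output from grep and therefore don't appear in either result.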

3.4. Changing the Output Terminator

We can also print all numeric values on a single line, with no visible separator, using the --null-data (-z) option:

$ grep -Eoz '[0-9]+' sample.txt 
23454424576652334123

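With the -z option, grep treats its input and output as sequences of NUL-terminated records, so each match is followed by a NUL byte instead of a newline. Most terminals don't display NUL bytes, which is why the numbers appear to run together.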
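Since the NUL bytes are still present in the output, we can make them visible by translating them into another character. The sketch below assumes GNU grep and a small hypothetical demo file. It also shows an alternative that joins ordinary newline-separated output with paste, which may be simpler when the goal is just a one-line list:

```shell
# Small demo input so the snippet runs on its own
printf '%s\n' 'fox 54 jumps' 'over 42 a lazy 7 dog' > demo.txt

# Translate each NUL terminator into a comma to reveal the separators
grep -Eoz '[0-9]+' demo.txt | tr '\0' ','
echo

# Join the regular newline-separated matches with spaces instead,
# which avoids NUL bytes altogether
joined=$(grep -oE '[0-9]+' demo.txt | paste -s -d ' ' -)
echo "$joined"
```

The first command leaves a trailing comma because every match, including the last one, is terminated rather than separated.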

4. Using a Combination of grep and awk

We can use the awk command alongside grep to group all the matches by their respective line numbers:

$ grep -oEn '[0-9]+' sample.txt | awk -F: '{line[$1] = (line[$1] ? line[$1]" "$2 : $2)} END {for (key in line) print key":", line[key]}'
1: 2 34 54 42
2: 45 76 6 5
3: 23 34 123

This command collects all the matches from each line and joins them into a single output line, making it easier to identify numbers from different lines.

First, the grep command outputs only numeric values with their corresponding line numbers. Then, the pipe (|) takes the output of the preceding command (grep) and passes it as input to the following command, i.e., awk.

Let’s take a closer look at the awk part of this command:

  • -F: sets the field separator to a colon (:), so $1 represents the line number and $2 represents the matched numeric value
  • line[$1] refers to an element of an associative array called line, keyed by the line number, whose value holds the numeric matches found on that line
  • (line[$1] ? line[$1]" "$2 : $2) builds that value by either appending the new match to the existing value with a space or by creating a new entry if no value exists
  • END {} marks the END block of the awk program, which is executed after processing all the input
  • {for (key in line) print key":", line[key]} uses a for loop construct to loop through the line associative array and prints the line number (key) followed by a colon and the corresponding group of matched numeric values, i.e., line[key]; the iteration order of for (key in line) is unspecified in awk, so the groups aren't guaranteed to appear in line order

Additionally, we can use different separators while building the associative array, such as a tab (\t) or a hyphen (-). For example, (line[$1] ? line[$1]"\t"$2 : $2) adds a tab between numeric values from the same line.

5. Conclusion

In this article, we learned how to use grep for selecting only numeric values.

Firstly, we created a dataset and used grep with both basic and extended regular expressions to select only numeric values from the dataset. Then, we explored the -n (--line-number) option to print the line numbers of the matching patterns. Next, we used the -z (--null-data) option to display all the numeric values together on one line.

Lastly, we used awk along with the grep command for grouping the numeric values by their corresponding line numbers.