1. Introduction

When working with strings in Bash, whether parsing user input, cleaning data, or processing command output, there are times when we need to extract numbers.

For instance, we may want to calculate the sum of numbers stored in a string, or validate user input to ensure it contains only numeric characters. In such cases, removing all non-numeric characters from a string becomes essential.

In this tutorial, we’ll discuss various methods to remove all non-numeric characters from a string and provide examples to illustrate each approach.

2. Problem Statement

We use a predefined string stored in the test_string variable to demonstrate how to remove all non-numeric characters:

$ test_string="abc123def456"

The value inside test_string is abc123def456 and the expected output in each case is 123456.

Let’s start with the first approach.

3. Using Parameter Expansion

Bash provides built-in capabilities for handling string manipulation, one of which is parameter expansion:

$ numeric_only="${test_string//[^0-9]/}"
$ echo "$numeric_only"

The double slashes (//) tell Bash to replace all occurrences of a pattern within test_string.

Specifically, [^0-9] is a shell pattern (a glob-style bracket expression, not a regular expression) that matches any single character that's not a digit from 0 to 9. We left the replacement string blank after the second / to remove the matched non-numeric characters from the string.

Furthermore, the dollar sign ($) before test_string provides access to the stored value.

Then, we store the modified string, which now contains only digits, inside numeric_only. Finally, we output the result on the screen using the echo command.
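
As a minimal sketch, the same expansion can be wrapped in a small function so that it's reusable on other inputs. The function name strip_non_digits and the phone-number sample are hypothetical and only serve as an illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: print only the digits of the first argument
strip_non_digits() {
    local input="$1"
    # ${input//pattern/} deletes every match of the shell pattern
    printf '%s\n' "${input//[^0-9]/}"
}

strip_non_digits "abc123def456"        # prints 123456
strip_non_digits "+1 (555) 010-9999"   # prints 15550109999
```

Since this runs entirely inside the shell, no external process is started.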

4. Using the tr Command

The tr (translate) command is another useful tool for handling string manipulation.

Thus, we can use it to delete all non-numeric characters:

$ numeric_only=$(echo "$test_string" | tr -d -c '0-9')

Here, we first pass the output of the echo "$test_string" command to tr as an input using a pipe (|).

After that, the -d option instructs tr to delete the characters based on the given criteria. Meanwhile, -c instructs tr to delete only those characters that do not match the provided set '0-9'.
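
A possible variation, shown here as a sketch, feeds the string to tr through a Bash here-string instead of an echo pipeline and uses the POSIX character class [:digit:] in place of the 0-9 range:

```bash
#!/usr/bin/env bash
test_string="abc123def456"

# -c complements the set and -d deletes, so every character that is
# not a digit is removed, including the newline the here-string appends
numeric_only=$(tr -cd '[:digit:]' <<< "$test_string")
echo "$numeric_only"   # prints 123456
```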

5. Using the grep Command

We commonly use grep for search patterns in text, but we can also adapt it to filter out non-numeric characters:

$ numeric_only=$(echo "$test_string" | grep -o '[0-9]\+' | tr -d '\n')

In this case, the -o flag ensures that grep prints only the matched parts, and the regex [0-9]\+ matches one or more digits. Since grep -o prints each match on a separate line (here, 123 and 456), we pipe its output through tr -d '\n' to join the matches into 123456.
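
The following sketch shows how grep -o reports matches for our sample: each run of digits comes out on its own line, and deleting the newlines yields a single string. The variable names matches and joined are ours, and we use the -E flag with [0-9]+ as an equivalent spelling of the pattern:

```bash
#!/usr/bin/env bash
test_string="abc123def456"

# grep -o emits each run of digits on a separate line
matches=$(echo "$test_string" | grep -oE '[0-9]+')
echo "$matches"
# prints:
# 123
# 456

# Deleting the newlines joins the runs into one string
joined=$(echo "$test_string" | grep -oE '[0-9]+' | tr -d '\n')
echo "$joined"   # prints 123456
```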

6. Using the awk Command

The awk command is a versatile tool for text-processing and pattern scanning.

As expected, we can use awk to strip out non-numeric characters:

$ numeric_only=$(echo "$test_string" | awk '{gsub(/[^0-9]/, ""); print}')

The gsub(/[^0-9]/, "") function globally substitutes (gsub) all non-numeric characters with an empty string. The awk command processes the string and prints the modified result, containing only digits.
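
Because awk works record by record, the same program also cleans multi-line input one line at a time. Here is a small sketch, assuming a hypothetical three-line input stored in a variable named input:

```bash
#!/usr/bin/env bash
# Hypothetical multi-line input: one mixed token per line
input=$'abc123def456\norder-42\nroom 7b'

# gsub() edits the current record ($0) in place; print emits the result
cleaned=$(printf '%s\n' "$input" | awk '{ gsub(/[^0-9]/, ""); print }')
echo "$cleaned"
# prints:
# 123456
# 42
# 7
```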

7. Using the sed Command

The sed (stream editor) command can be used for powerful text manipulation.

Hence, we can use sed to remove non-numeric characters in a string with a simple substitution command:

$ numeric_only=$(echo "$test_string" | sed 's/[^0-9]//g')

In this case, sed matches all non-digit characters and replaces them with nothing, effectively deleting them.

Here, s stands for substitute, indicating that we're performing a substitution operation. Additionally, the g flag applies the substitution globally, so that all matches are replaced, not just the first one.
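
To make the effect of the g flag concrete, this sketch runs the substitution with and without it. The variable names first_only and all_matches are hypothetical:

```bash
#!/usr/bin/env bash
test_string="abc123def456"

# Without g, sed replaces only the first non-digit on the line
first_only=$(echo "$test_string" | sed 's/[^0-9]//')
echo "$first_only"    # prints bc123def456

# With g, every non-digit on the line is replaced
all_matches=$(echo "$test_string" | sed 's/[^0-9]//g')
echo "$all_matches"   # prints 123456
```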

8. Using the perl Command

The perl command is known for its robust text manipulation capabilities, and we can use it to quickly remove non-numeric characters:

$ numeric_only=$(echo "$test_string" | perl -pe 's/[^0-9]//g')

In the part of the command after the pipe, the -p option tells perl to process each line of input and automatically print the result. Meanwhile, -e executes a Perl script directly from the command line. Similar to sed, we use s/[^0-9]//g for global substitution.

9. Using a Conditional Statement

Using conditional statements in Bash can be an effective way to implement logic and control the flow of the scripts, especially for tasks involving string handling.

However, this approach takes several lines, so we place it in a Bash script that extracts the numeric characters from a string:

$ cat non_numeric.sh
#!/bin/bash
test_string="abc123def456"
numeric_only=""
for (( i=0; i<${#test_string}; i++ )); do
    char="${test_string:$i:1}"
    if [[ "$char" =~ [0-9] ]]; then
        numeric_only+="$char"
    fi
done
echo "$numeric_only"

In this Bash script, we first initialize an empty variable, numeric_only, to store the digits we extract.

Next, we use the for loop to iterate over each character in test_string by its index. The ${test_string:$i:1} retrieves the character at the current index, ensuring the loop processes all characters. Within the loop, we assign the current character to the char variable.

Subsequently, the if statement checks whether the current character matches the regular expression [0-9]. If the condition is true, we append the character to the numeric_only variable using the append operator (+=). Notably, it's at this point in the script that we might insert extra logic based on the character, which isn't possible with the Bash parameter expansion method.

Finally, after the loop completes, the echo command outputs the updated string containing only the numeric characters.
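
As an illustration of such extra logic, the sketch below also counts how many characters were discarded. The removed counter and the use of a case statement in place of the if test are our own additions, not part of the script above:

```bash
#!/usr/bin/env bash
test_string="abc123def456"
numeric_only=""
removed=0   # hypothetical extra counter

for (( i = 0; i < ${#test_string}; i++ )); do
    char="${test_string:i:1}"
    case "$char" in
        [0-9]) numeric_only+="$char" ;;        # keep digits
        *)     removed=$(( removed + 1 )) ;;   # count everything else
    esac
done

echo "$numeric_only"                 # prints 123456
echo "removed $removed characters"   # prints removed 6 characters
```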

10. Conclusion

In this article, we explored various methods for removing all non-numeric characters from a string.

We demonstrated effective ways to achieve this task using parameter expansion, tr, grep, awk, sed, perl, and a loop with a conditional statement.

Each method has its own advantages, and we can choose the best approach based on the specific requirement.