1. Overview

When using regular expressions in Java, we sometimes need to match a pattern in its literal form, without processing any metacharacters it contains.

In this quick tutorial, let's see how we can escape metacharacters inside regular expressions both manually and using the Pattern.quote() method provided by Java.

2. Without Escaping Metacharacters

Let's consider a string holding a list of dollar amounts:

String dollarAmounts = "$100.25, $100.50, $150.50, $100.50, $100.75";

Now, let's imagine we need to search for occurrences of a specific amount of dollars inside it. Let's initialize a regular expression pattern string accordingly:

String patternStr = "$100.50";

First off, let's find out what happens if we execute our regex search without escaping any metacharacters:

public void whenMetacharactersNotEscaped_thenNoMatchesFound() {
    // patternStr is compiled as-is, so '$' and '.' keep their regex meaning
    Pattern pattern = Pattern.compile(patternStr);
    Matcher matcher = pattern.matcher(dollarAmounts);

    // count every place where the pattern is found
    int matches = 0;
    while (matcher.find()) {
        matches++;
    }

    assertEquals(0, matches);
}

As we can see, the matcher fails to find even a single occurrence of $100.50 within our dollarAmounts string. This is because patternStr starts with a dollar sign, which is a regular expression metacharacter that marks the end of a line, so nothing can match after it. The dot is a metacharacter as well: it matches any character rather than only a literal period.
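To make the anchor behaviour concrete, here is a minimal sketch. The class name DollarAnchorSketch and the helper countMatches are made up for illustration and are not part of the article's test class:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DollarAnchorSketch {

    // Hypothetical helper: counts how many times the regex is found in the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        String dollarAmounts = "$100.25, $100.50, $150.50, $100.50, $100.75";

        // '$' anchors at the end of the input, so "100.50" would have to follow the end.
        System.out.println(countMatches("$100.50", dollarAmounts)); // 0

        // Used as an anchor, '$' matches only the "75" that closes the input.
        System.out.println(countMatches("75$", dollarAmounts));     // 1

        // "50" occurs several times, but never right at the end.
        System.out.println(countMatches("50$", dollarAmounts));     // 0
    }
}
```

The last two calls suggest that the engine is treating the dollar sign as a position rather than as a character, which is why our original pattern can never match.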

As we might expect, we'd face the same issue with every other regex metacharacter. We wouldn't be able to search for mathematical expressions that use carets (^) for exponents, like “5^3”, or for text that contains backslashes (\), such as “users\bob”.
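The caret and backslash cases can be sketched the same way. The class name CaretAndBackslashSketch and the helper found are assumptions made for this example only; the second pair of calls escapes each metacharacter with a backslash, which is one way to make it literal:

```java
import java.util.regex.Pattern;

class CaretAndBackslashSketch {

    // Hypothetical helper: true if the regex is found anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String power = "5^3";
        String path = "users\\bob"; // the text users\bob

        // Unescaped, '^' is a start-of-line anchor and "\b" is a word boundary.
        System.out.println(found("5^3", power));         // false
        System.out.println(found("users\\bob", path));   // false

        // A backslash in front of each metacharacter makes it literal.
        System.out.println(found("5\\^3", power));       // true
        System.out.println(found("users\\\\bob", path)); // true
    }
}
```

Note that neither unescaped pattern throws an exception; both compile and simply fail to match, which can make this kind of bug easy to miss.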

3. Manually Ignore Metacharacters

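Next, let's escape the metacharacters within our regular expression before we perform our search. We'll wrap the pattern in \Q and \E, which tell the regex engine to treat everything between them as literal text: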

public void whenMetacharactersManuallyEscaped_thenMatchingSuccessful() {
    // \Q starts a quoted (literal) section and \E ends it
    String metaEscapedPatternStr = "\\Q" + patternStr + "\\E";
    Pattern pattern = Pattern.compile(metaEscapedPatternStr);
    Matcher matcher = pattern.matcher(dollarAmounts);

    // count every place where the literal text is found
    int matches = 0;
    while (matcher.find()) {
        matches++;
    }

    assertEquals(2, matches);
}

This time, our search succeeds. However, this isn't an ideal solution, for a couple of reasons:

  • The string concatenation needed to escape the metacharacters makes the code more difficult to follow.
  • The hard-coded \Q and \E values make the code less clean.
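Hard-coded escapes are also easy to get subtly wrong. The sketch below compares the \Q...\E wrapper with escaping each metacharacter individually, and shows what happens when one escape is forgotten. The class ManualEscapeSketch, the helper countMatches, and the sample text "$100x50" are assumptions introduced for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ManualEscapeSketch {

    // Hypothetical helper: counts how many times the regex is found in the input.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    public static void main(String[] args) {
        String dollarAmounts = "$100.25, $100.50, $150.50, $100.50, $100.75";

        // Every metacharacter escaped by hand: '$' and '.' are both literal.
        String backslashEscaped = "\\$100\\.50";
        // The \Q...\E wrapper, written out as a single hard-coded string.
        String quoted = "\\Q$100.50\\E";

        System.out.println(countMatches(backslashEscaped, dollarAmounts)); // 2
        System.out.println(countMatches(quoted, dollarAmounts));           // 2

        // Forgetting to escape the dot leaves it as a wildcard, so "$100x50" matches too.
        System.out.println(countMatches("\\$100.50", "$100x50"));          // 1
        System.out.println(countMatches(backslashEscaped, "$100x50"));     // 0
    }
}
```

Both manual forms work on our sample data, but each relies on the author remembering every metacharacter or every delimiter by hand.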

4. Use Pattern.quote()

Finally, let's see the easiest and cleanest way to ignore metacharacters in our regular expressions.

Java provides a quote() method in its Pattern class that returns a literal pattern string for the given string:

public void whenMetacharactersEscapedUsingPatternQuote_thenMatchingSuccessful() {
    // Pattern.quote() returns a pattern string that matches patternStr literally
    String literalPatternStr = Pattern.quote(patternStr);
    Pattern pattern = Pattern.compile(literalPatternStr);
    Matcher matcher = pattern.matcher(dollarAmounts);

    // count every place where the literal text is found
    int matches = 0;
    while (matcher.find()) {
        matches++;
    }

    assertEquals(2, matches);
}
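As a rough sketch of how this might be reused, the helper below wraps Pattern.quote() so that any literal can be counted, and a second helper combines a quoted literal with ordinary regex syntax. The class PatternQuoteSketch and the methods countLiteral and countWithPrefix are hypothetical names, not part of the article's source code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PatternQuoteSketch {

    // Counts how many times the compiled pattern is found in the input.
    private static int count(Pattern pattern, String input) {
        Matcher matcher = pattern.matcher(input);
        int matches = 0;
        while (matcher.find()) {
            matches++;
        }
        return matches;
    }

    // Hypothetical helper: counts occurrences of a literal, whatever metacharacters it contains.
    static int countLiteral(String literal, String input) {
        return count(Pattern.compile(Pattern.quote(literal)), input);
    }

    // Hypothetical helper: a quoted literal prefix followed by ordinary regex syntax.
    static int countWithPrefix(String literalPrefix, String regexSuffix, String input) {
        return count(Pattern.compile(Pattern.quote(literalPrefix) + regexSuffix), input);
    }

    public static void main(String[] args) {
        String dollarAmounts = "$100.25, $100.50, $150.50, $100.50, $100.75";

        System.out.println(countLiteral("$100.50", dollarAmounts));        // 2
        System.out.println(countLiteral("5^3", "5^3 = 125"));              // 1
        System.out.println(countLiteral("users\\bob", "C:\\users\\bob"));  // 1

        // A literal '$' followed by digits, a literal dot, and two digits.
        System.out.println(countWithPrefix("$", "\\d+\\.\\d{2}", dollarAmounts)); // 5
    }
}
```

Quoting only the literal part keeps the rest of the expression free to use metacharacters as usual, which is handy when the literal comes from user input or configuration.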

5. Conclusion

In this article, we looked at how we can process regular expression patterns in their literal forms.

We saw how not escaping regex metacharacters failed to provide the expected results and how escaping metacharacters inside regex patterns can be performed manually and using the Pattern.quote() method.

The full source code for all the code samples used here can be found over on GitHub.
