Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Overview

Pattern-matching functions enable us to search and filter string data based on specific patterns. PostgreSQL provides us with pattern-matching functions, which we can use to find strings matching a pattern or perform complex searches within a column.

In this tutorial, we’ll illustrate various pattern-matching functions in PostgreSQL using the Student table in the Baeldung Database Schema.

2. The Student Table

This table stores each student’s information:

SELECT * FROM Student;
  id  |      name       | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+-----------------+-------------+------------+-----------------+-----------------+------
 1001 | John Liu        |   123345566 | 2001-04-05 | 2020-01-15      | 2024-06-15      | 4.00
 1003 | Rita Ora        |   132345166 | 2001-01-14 | 2020-01-15      | 2024-06-15      | 4.20
 1007 | Philip Lose     |   321345566 | 2001-06-15 | 2020-01-15      | 2024-06-15      | 3.80
 1010 | Samantha Prabhu |  3217165566 | 2001-03-21 | 2020-01-15      | 2024-06-15      | 4.90
 1011 | Vikas Jain      |   321345662 | 2001-07-18 | 2020-01-15      |                 | 3.30
 1101 | Jia Grey        |  1345236267 | 2001-02-05 | 2020-01-15      | 2024-06-15      | 3.98
...

We’ll use it to explain how various pattern-matching functions work in PostgreSQL.

3. Basic Pattern Matching With LIKE and ILIKE

In PostgreSQL, LIKE is a case-sensitive operator that performs simple pattern matching within string values. It supports two wildcard characters; the percent sign (%) matches zero or more characters, while the underscore (_) matches exactly one character.

The ILIKE operator functions the same as LIKE, but it’s case-insensitive, meaning it doesn’t differentiate between uppercase and lowercase letters.

3.1. Using LIKE

To begin, let’s find students whose names start with a specific string:

SELECT * 
FROM Student 
WHERE name LIKE 'John%';

 id  |   name   | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+----------+-------------+------------+-----------------+-----------------+------
 1001 | John Liu |   123345566 | 2001-04-05 | 2020-01-15      | 2024-06-15      | 4.00

The above query returns all information from the Student table where the name in the name column begins with John. The wildcard character % matches any sequence of zero or more characters after the word John.

Next, let’s find students whose names have a specific number of characters. We can achieve this by using _ to represent a single character:

SELECT * 
FROM Student 
WHERE name LIKE 'P____p%';

  id  |     name     | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+--------------+-------------+------------+-----------------+-----------------+------
 1007 | Philip Lose  |   321345566 | 2001-06-15 | 2020-01-15      | 2024-06-15      | 3.80
 1617 | Philip Mohan |  3103455662 | 2002-05-15 | 2021-01-15      | 2025-06-15      | 5.00

In this example, we return all rows and columns from the Student table where the name column contains a string that starts with P, followed by exactly four characters, then p as the sixth character, and any number of additional characters.

3.2. Using ILIKE

Since the ILIKE operator performs case-insensitive matching, we can use it to find students whose names contain a specific substring, regardless of whether it’s capitalized or not:

SELECT * 
FROM Student 
WHERE name ILIKE '%rob%';

  id  |     name      | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+---------------+-------------+------------+-----------------+-----------------+------
 1619 | Sam Roberts   |  9203455662 | 2002-06-21 | 2021-01-15      | 2025-06-15      | 4.30
 2008 | Julia Roberts |  1212446677 | 2003-06-12 | 2022-01-15      | 2025-06-15      | 3.04

The above query returns all students whose names contain the substring rob in the name column, regardless of the case. For example, both Roberts and roberts would match.

In addition, we can combine ILIKE with other filtering conditions. To explain, let’s find students whose names start with a specific letter and have a certain GPA:

SELECT * 
FROM Student 
WHERE name ILIKE 's%' AND gpa > 3.5;

  id  |      name       | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+-----------------+-------------+------------+-----------------+-----------------+------
 1010 | Samantha Prabhu |  3217165566 | 2001-03-21 | 2020-01-15      | 2024-06-15      | 4.90
 1619 | Sam Roberts     |  9203455662 | 2002-06-21 | 2021-01-15      | 2025-06-15      | 4.30

Here, we return all students whose names begin with the letter s and whose gpa is greater than 3.5.

4. The SIMILAR TO Operator

In PostgreSQL, we use the SIMILAR TO operator for pattern matching within strings. It follows SQL-standard regular expressions that combine elements of LIKE, such as wildcards, with regular expression-like syntax.

This operator follows a basic syntax:

string SIMILAR TO 'pattern'

Here, string represents the value we want to evaluate, while pattern represents the pattern we want to match against the string.

4.1. Finding Students With Specific Name Patterns

To begin, let’s find students whose names start and end with a specific letter:

SELECT *  
FROM Student  
WHERE name SIMILAR TO '[JRP]%y';

  id  |     name      | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+---------------+-------------+------------+-----------------+-----------------+------
 1101 | Jia Grey      |  1345236267 | 2001-02-05 | 2020-01-15      | 2024-06-15      | 3.98
 2001 | Reena Roy     |  9023455613 | 2003-01-13 | 2022-01-15      | 2025-06-15      | 4.15
 2009 | Pollards Grey |  6503453662 | 2003-04-19 | 2022-01-15      | 2025-06-15      | 4.65
...

The above query returns students whose names start with either the letter JR, or P and end with the letter y.

In the pattern [JRP]%y, [JRP] represents a character class that allows us to match any one character from the specified set, % matches any number of characters in between, and y ensures the name ends with the letter y.

4.2. Finding Students Born in Specific Years

We can also find students who were born between 2000 and 2002:

SELECT *  
FROM Student  
WHERE birth_date::TEXT SIMILAR TO '200[0-2]-%';

  id  |      name       | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+-----------------+-------------+------------+-----------------+-----------------+------
 1001 | John Liu        |   123345566 | 2001-04-05 | 2020-01-15      | 2024-06-15      | 4.00
 1003 | Rita Ora        |   132345166 | 2001-01-14 | 2020-01-15      | 2024-06-15      | 4.20
...
 1607 | Peter Liu       |  3043455662 | 2002-04-14 | 2021-01-15      | 2025-06-15      |     
 1610 | Ritu Raj        |  3203455662 | 2002-02-05 | 2021-01-15      | 2025-06-15      |     
...

In this example, we retrieve records of all students born between 2000 and 2002. We use birth_date::TEXT to convert the birth_date column into a string, allowing us to apply string-based pattern matching.

Next, the pattern 200[0-2]-% matches the literal character 200, followed by a character class that matches any single digit from 0 to 2, then a hyphen, and any sequence of characters.

5. Advanced Pattern Matching With POSIX Regular Expressions

PostgreSQL provides us with POSIX regular expression operators, which we can use to perform advanced pattern matching. These operators allow us to perform complex searches. Furthermore, POSIX regular expressions also support several special characters we can use to create sophisticated search patterns.

5.1. The ~ and ~* Operators

The ~ operator performs a case-sensitive match based on a regular expression pattern. To demonstrate, let’s find students whose names start and end with specific letters:

SELECT * 
FROM Student 
WHERE name ~ '^R.*a$';

  id  |   name   | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+----------+-------------+------------+-----------------+-----------------+------
 1003 | Rita Ora |   132345166 | 2001-01-14 | 2020-01-15      | 2024-06-15      | 4.20

The above query returns records from the Student table, where the student’s name begins with R and ends with a.

On the other hand, the ~* operator is similar to the ~ operator but performs case-insensitive matching. For instance, let’s find students whose names contain a specific substring:

SELECT * 
FROM Student 
WHERE name ~* 'Bert';

  id  |      name      | nationala_id | birth_date | enrollment_date | graduation_date | gpa  
------+----------------+-------------+------------+-----------------+-----------------+------
 1110 | Albert Decosta |  2617897011 | 2001-02-21 | 2020-01-15      | 2024-06-15      | 4.00
 1619 | Sam Roberts    |  9203455662 | 2002-06-21 | 2021-01-15      | 2025-06-15      | 4.30
 2008 | Julia Roberts  |  1212446677 | 2003-06-12 | 2022-01-15      | 2025-06-15      | 3.04

Here, we return information on all students whose names contain the substring Bert. This query will match names containing the substring bert, BERT, or bErT, etc.

5.2. The !~ and !~* Operators

The !~ operator checks if a string doesn’t match a specific pattern. It’s also case-sensitive. Let’s use it to exclude students who were enrolled in a specific year:

SELECT * 
FROM Student 
WHERE enrollment_date::TEXT !~ '^2020';

  id  |      name      | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+----------------+-------------+------------+-----------------+-----------------+------
 1607 | Peter Liu      |  3043455662 | 2002-04-14 | 2021-01-15      | 2025-06-15      |     
 1610 | Ritu Raj       |  3203455662 | 2002-02-05 | 2021-01-15      | 2025-06-15      |     
 1617 | Philip Mohan   |  3103455662 | 2002-05-15 | 2021-01-15      | 2025-06-15      | 5.00
...

This query returns information on all students except those enrolled in the year 2020. We use ^ to match the beginning of the string.

The !~* operator is case-insensitive and also checks if a string doesn’t match a specific pattern:

SELECT * 
FROM Student 
WHERE gpa::TEXT !~* '^[4-5]\.';

  id  |     name      | national_id | birth_date | enrollment_date | graduation_date | gpa  
------+---------------+-------------+------------+-----------------+-----------------+------
 1007 | Philip Lose   |   321345566 | 2001-06-15 | 2020-01-15      | 2024-06-15      | 3.80
 1011 | Vikas Jain    |   321345662 | 2001-07-18 | 2020-01-15      |                 | 3.30
 1101 | Jia Grey      |  1345236267 | 2001-02-05 | 2020-01-15      | 2024-06-15      | 3.98
 1103 | Rose Rit      |  1323612067 | 2001-05-14 | 2020-01-15      |                 | 3.57
...

In this example, we extract all students except those with a gpa higher than 4.0.

6. Conclusion

In this article, we explored pattern-matching functions in PostgreSQL. We focused on the LIKE, ILIKE, SIMILAR TO operators, and POSIX regular expressions.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.