Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Overview

In SQL, we often need to ensure that we insert records into a table only if they don’t already exist. This operation, known as insert if not exists, helps to maintain database integrity by preventing duplicate entries. However, SQL doesn’t provide a universal syntax to perform this operation across the different database systems.

In this tutorial, we’ll discuss various approaches for performing an insert if not exists operation in MySQL, PostgreSQL, and SQL Server. To demonstrate, let’s use the Baeldung University database schema.

2. Understanding the Problem

In the Baeldung University database schema, we’ll work with the Student table, which stores each student’s id, name, national_id, birth_date, enrollment_table, graduation_date, and gpa:

+------+-----------------+-------------+------------+-----------------+-----------------+------+
| id   | name            | national_id | birth_date | enrollment_date | graduation_date | gpa  |
+------+-----------------+-------------+------------+-----------------+-----------------+------+
| 1001 | John Liu        |   123345566 | 2001-04-05 | 2020-01-15      | 2024-06-15      |    4 |
| 1003 | Rita Ora        |   132345166 | 2001-01-14 | 2020-01-15      | 2024-06-15      |  4.2 |
| 1007 | Philip Lose     |   321345566 | 2001-06-15 | 2020-01-15      | 2024-06-15      |  3.8 |
...

Above, id is unique for each student. Therefore, we can utilize it to prevent the addition of duplicates to the Student table. Let’s now discuss how to perform the insert if not exists operation.

3. Implementation in MySQL

First, let’s discuss how to apply this operation in MySQL.

3.1. Using INSERT IGNORE

With MySQL, one approach is to use the INSERT IGNORE statement to prevent duplicate records. This statement ignores inserting the new record if the operation causes a unique constraint or duplicate key violation — for instance, an existing id:

INSERT IGNORE INTO Student (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
VALUES (2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4);

This command attempts to add a new student whose id is 2018. So, if a student with this id exists, MySQL ignores the insertion without throwing an error. As a result, there’s no creation of a duplicate record. To clarify, we don’t get feedback.

If we don’t want feedback, we can implement this approach.

3.2. Using INSERT … ON DUPLICATE KEY UPDATE

Another approach we can utilize in MySQL is the INSERT … ON DUPLICATE KEY UPDATE statement. It enables us to specify what should happen if the insertion causes a conflict when a duplicate key is found:

INSERT INTO Student (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
VALUES (2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4)
ON DUPLICATE KEY UPDATE id = id;

Let’s break down the instructions above:

  • INSERT INTO Student (…) – specifies the addition of a new record to the Student table
  • VALUES (…) – displays the values for the preceding columns above
  • ON DUPLICATE KEY UPDATE id = id – instructs MySQL on what happens if the INSERT operation causes a conflict because a record with a similar unique key (id) already exists

Typically, we would use ON DUPLICATE KEY UPDATE to update some fields in the existing record. However, id = id means that id is set to itself, hence nothing changes in the record.

4. Implementation in PostgreSQL

Next, let’s discuss how to apply this operation in PostgreSQL.

4.1. Using INSERT … ON CONFLICT

PostgreSQL provides the INSERT … ON CONFLICT statement, enabling us to define what happens when a conflict arises during an INSERT operation:

INSERT INTO Student (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
VALUES (2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4)
ON CONFLICT (id) DO NOTHING;

Here, ON CONFLICT (id) DO NOTHING handles a scenario whereby inserting a record results in a conflict. So, this conflict is compared against the unique constraint id. Then, DO NOTHING instructs PostgreSQL to ignore this insertion if there’s a conflict — in this case, if a record with the id of 2018 already exists in the Student table.

4.2. Using a DO Block

In PostgreSQL, we can utilize a DO block to define and execute multiple SQL statements in a single block:

DO $$
BEGIN
    IF NOT EXISTS (
        SELECT 1 FROM Student WHERE id = 2018
    ) THEN
        INSERT INTO Student (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
        VALUES (2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4);
    ELSE
        RAISE NOTICE 'A student already exists with this id. No insertion performed!';
    END IF;
END $$;

Let’s discuss the instructions above:

  • DO $$ – the DO keyword initiates the block and the delimiter $$ marks the start and end of the block
  • BEGIN … END – specifies where to hold the SQL statements
  • IF NOT EXISTS – checks if a student with the id of 2018 exists in the Student table
  • THEN INSERT INTO … – adds the new student if the student with an id of 2018 doesn’t exist in the table
  • ELSE RAISE NOTICE … – raises a notice message if the student with an id of 2018 exists in the table

So, let’s have a look at this notice message:

NOTICE:  Student with this id already exists. No insertion performed!

The execution remains unaffected by the notice message being displayed.

Using DO blocks in PostgreSQL enables us to explore complex logic compared to the INSERT … ON CONFLICT statement.

5. Implementation in SQL Server

Further, let’s discuss how to apply this operation in SQL Server.

5.1. Using MERGE

In SQL Server, we can use the MERGE statement to insert a new record if it doesn’t exist or even update the record if it does exist:

MERGE INTO Student AS target
USING (VALUES (2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4))
       AS source (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
ON target.id = source.id
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
    VALUES (source.id, source.name, source.national_id, source.birth_date, source.enrollment_date, source.graduation_date, source.gpa);

Let’s analyze the commands:

  • MERGE INTO Student AS target – selects the Student table, which is aliased as target
  • Using … AS source – provides the row of values we want to insert, which is aliased as source
  • ON target.national_id = source.national_id – checks if a record with the same id already exists in the Student table
  • WHEN NOT MATCHED BY TARGET THEN – initiates the INSERT operation, which adds the new record if no record with the same id is found

So, if the id exists, the MERGE statement condition WHEN NOT MATCHED BY TARGET isn’t met. Meanwhile, if the id doesn’t exist, the INSERT operation adds the new student record to the Student table.

5.2. Using EXCEPT

We can use the EXCEPT clause in SQL Server to achieve a similar outcome to the MERGE statement:

INSERT INTO Student (id, name, national_id, birth_date, enrollment_date, graduation_date, gpa)
SELECT 2018, 'Samuel Karanja', 32201610, '2002-05-18', '2022-01-15', '2025-06-15', 4
EXCEPT
SELECT id, name, national_id, birth_date, enrollment_date, graduation_date, gpa
FROM Student
WHERE id = 2018;

Here’s the breakdown of the instructions:

  • INSERT INTO Student (…) – attempts to insert a new student with the data specified after the SELECT clause below it
  • SELECT … EXCEPT SELECT … – compares the new student data we want to add to the existing data in the table whereby the new data is ignored if a student with the id of 2018 already exists in the table

This command adds a new student record to the Student table only if a student with the same id of 2018 doesn’t already exist in the table.

6. Conclusion

In this article, we delved into how to perform an insert if not exists operation in SQL.

Inserting records only when they don’t exist helps to maintain data integrity in databases. To explain, this operation helps to prevent duplicate entries and ensure data consistency. We’ve covered various ways to accomplish this since every database system has its unique way of performing this operation.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.