How to Join on the Same Table Twice in SQL Baeldung on SQL

1. Introduction

In SQL, joining the same table is synonymous with joining the table to itself, a technique known as a self-join. Essentially, this technique empowers comparisons between rows within the same table.

In this tutorial, we’ll explore how to perform self-joins, understand their use cases, and provide practical examples.

2. Exploring Self-Joins

To begin with, a self-join is a regular JOIN, but the table is joined with itself. In addition, this concept relies heavily on using table aliases to differentiate the instances of the table.

Now, let’s review a common syntax that involves joining the table using an alias and then applying the join condition:

SELECT x.column1, y.column2
FROM Table_name x
JOIN Table_name y
ON x.common_field = y.common_field;

So, this SQL query selects column1 from one instance of Table_name (aliased as x) and column2 from another instance of Table_name (aliased as y), where the values in common_field are the same in both instances of the table.

Here, this simple syntax of self-join retrieves related information from a single table or from two instances of the same table based on some common criteria.

3. Simple Example

Before moving to specific use cases, let’s use the syntax we mentioned in the previous section on the table named faculty from the Baeldung schema:

SELECT x.name, x.position, x.national_id, y.name, y.position, y.national_id
FROM Faculty x
JOIN Faculty y ON x.id = y.id;

Here’s the result:

      name       |      position       | national_id |      name       |      position       | national_id
-----------------+---------------------+-------------+-----------------+---------------------+-------------
 Anubha Gupta    | Professor           |  1018901231 | Anubha Gupta    | Professor           |  1018901231
 Anubha Gupta    | Associate Professor |  1018901231 | Anubha Gupta    | Associate Professor |  1018901231
 Anubha Gupta    | Assistant Professor |  1018901231 | Anubha Gupta    | Assistant Professor |  1018901231
 Peter Pan       | Professor           |  2130989011 | Peter Pan       | Professor           |  2130989011
 Peter Pan       | Associate Professor |  2130989011 | Peter Pan       | Associate Professor |  2130989011

As shown above, the query used two instances of the same Faculty table to self-join. Now, the table had duplicate columns, each picked from different instances.

As such, the query above increased the columns on the same table by joining them twice.

4. Use Cases for Self-Joins

Let’s consider the common use cases for self-joins.

4.1. Querying Hierarchical Data

In hierarchical data, each row is related to another row in the same table; for example, organizational and extended family datasets. Now, let’s explore the use case on the Faculty table to track the career progression of faculty members:

SELECT 
    f1.name AS faculty_name,
    f1.position AS current_position,
    f1.start_date AS current_start_date,
    f1.end_date AS current_end_date,
    f2.position AS previous_position,
    f2.start_date AS previous_start_date,
    f2.end_date AS previous_end_date
FROM 
    Faculty f1
LEFT JOIN 
    Faculty f2 
ON 
    f1.national_id = f2.national_id
    AND f1.start_date > f2.end_date
ORDER BY 
    f1.name, f1.start_date;

Let’s take a look at the result:

  faculty_name   |  current_position   | current_start_date | current_end_date |  previous_position  | previous_start_date | previous_end_date
-----------------+---------------------+--------------------+------------------+---------------------+---------------------+-------------------
 Ajit Singh      | Teaching Assistant  | 2009-05-11         | 2012-05-10       |                     |                     |
 Ajit Singh      | Assistant Professor | 2012-05-11         | 2024-06-08       | Teaching Assistant  | 2009-05-11          | 2012-05-10
 Anubha Gupta    | Assistant Professor | 2004-05-11         | 2007-01-08       |                     |                     |
 Anubha Gupta    | Associate Professor | 2007-01-09         | 2010-01-10       | Assistant Professor | 2004-05-11          | 2007-01-08
 Anubha Gupta    | Professor           | 2010-01-11         | 2027-03-11       | Assistant Professor | 2004-05-11          | 2007-01-08
 Anubha Gupta    | Professor           | 2010-01-11         | 2027-03-11       | Associate Professor | 2007-01-09          | 2010-01-10
 AV Subramanium  | Assistant Professor | 2011-05-11         |                  |                     |                     |
 Badrinath Ho    | Professor           | 2016-03-11         |                  |                     |                     |
 Ballu Singh     | Teaching Assistant  | 2022-12-11         | 2024-02-02       |                     |                     |

Now, let’s proceed to explain the query and review the result. So, we queried the Faculty table by creating two instances using aliases represented by f1 and f2. Furthermore, we used the SELECT clause to pick the columns we needed from each instance of the table.

Subsequently, the LEFT JOIN clause is applied to the second instance of the Faculty table with ON clause to specify conditions that the table be joined on national_id columns. The condition is that f1.start_date (current_start_date) be greater than f2.end_date (previous_end_date).

As shown in the result, we’re able to join a single table twice and apply conditions to perform the task of obtaining the career progression of the faculty member based on their current and previous jobs.

4.2. Finding Duplicates

Next, let’s use self-joins to find duplicate values within our table by joining the table on the column that needs to be checked for duplicates.

Let’s proceed to join the table by national_id and exclude rows with the same id from joining with themselves:

SELECT 
    f1.id AS id1, 
    f1.name AS name1, 
    f1.national_id AS national_id1, 
    f1.position AS position1, 
    f2.id AS id2, 
    f2.name AS name2, 
    f2.national_id AS national_id2, 
    f2.position AS position2
FROM 
    Faculty f1
JOIN 
    Faculty f2 
ON 
    f1.national_id = f2.national_id 
    AND f1.id <> f2.id
ORDER BY 
    f1.national_id, f1.start_date;

Here’s the result of the query:

 id1 |      name1      | national_id1 |      position1      | id2 |      name2      | national_id2 |      position2
-----+-----------------+--------------+---------------------+-----+-----------------+--------------+---------------------
 134 | Wlliam Liu      |    231456789 | Teaching Assistant  | 131 | Wlliam Liu      |    231456789 | Professor
 134 | Wlliam Liu      |    231456789 | Teaching Assistant  | 133 | Wlliam Liu      |    231456789 | Assistant Professor
 134 | Wlliam Liu      |    231456789 | Teaching Assistant  | 132 | Wlliam Liu      |    231456789 | Associate Professor
 133 | Wlliam Liu      |    231456789 | Assistant Professor | 134 | Wlliam Liu      |    231456789 | Teaching Assistant
 133 | Wlliam Liu      |    231456789 | Assistant Professor | 132 | Wlliam Liu      |    231456789 | Associate Professor
 133 | Wlliam Liu      |    231456789 | Assistant Professor | 131 | Wlliam Liu      |    231456789 | Professor
 132 | Wlliam Liu      |    231456789 | Associate Professor | 131 | Wlliam Liu      |    231456789 | Professor
 132 | Wlliam Liu      |    231456789 | Associate Professor | 134 | Wlliam Liu      |    231456789 | Teaching Assistant
 132 | Wlliam Liu      |    231456789 | Associate Professor | 133 | Wlliam Liu      |    231456789 | Assistant Professor
 131 | Wlliam Liu      |    231456789 | Professor           | 133 | Wlliam Liu      |    231456789 | Assistant Professor
 131 | Wlliam Liu      |    231456789 | Professor           | 134 | Wlliam Liu      |    231456789 | Teaching Assistant
 131 | Wlliam Liu      |    231456789 | Professor           | 132 | Wlliam Liu      |    231456789 | Associate Professor

The table shows the pairwise combinations of positions, confirming that the same national_id is associated with different id and position values.

This demonstrates how the same individual has moved through various roles within the institution, providing a clear example of how to detect duplicate values in SQL by joining the same table.

5. Conclusion

In this article, we’ve explored joining the same table twice in SQL, which is powerful for comparing rows within the same table, querying hierarchical data, and detecting duplicates.

So, by mastering this, we can effectively manage and analyze complex data relationships within a single table.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.

Full Archive

About Baeldung