Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

Joining two SELECT statement results in SQL is a fundamental operation for combining data from multiple tables based on common columns or conditions.

In this tutorial, we’ll explore the concept of SQL joins step-by-step, starting with the basics of SELECT statements and gradually moving into the syntax and applications of joining techniques.

2. Basic SQL SELECT Statements

Before delving into SQL joins, it’s essential to grasp the basics of SELECT statements. SELECT statements form the foundation of querying data from relational databases.

A typical SELECT statement consists of the following components:

SELECT column1, column2, ...
FROM table_name
WHERE condition;

SELECT specifies the columns or expressions that we want to retrieve from the database. FROM indicates the table or tables from which to fetch the data and WHERE optionally filters rows based on specified criteria.

We can consider a table named Student with columns id, name, national_id, and gpa from our University database. To retrieve all students from the table, we would use the following SELECT statement:

SELECT id, name, national_id, gpa
FROM Student;

This query selects all columns (id, name, national_id, gpa) from the Student table.

Besides retrieving individual rows, SELECT statements can also perform aggregate functions like COUNT, SUM, AVG, etc., to calculate summary information across multiple rows:

SELECT national_id, COUNT(*) AS num_students
FROM Student
GROUP BY national_id;

This query counts the number of students of each nation using the COUNT(*) aggregate function and groups the results by national_id.

3. Understanding JOIN Queries

In SQL, joins are used to combine rows from two or more tables based on related columns. This allows the retrieval of data that spans multiple tables and establishes relationships between them. There are several types of joins, each serving different purposes.

3.1. INNER JOIN

Retrieves records that have matching values in both tables based on a specified condition:

SELECT *
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;

3.2. LEFT and RIGHT JOIN

Retrieves all records from the left table and the matched records from the left or right table respectively. If there’s no match, NULL values are returned for the right table columns:

SELECT *
FROM table1
LEFT JOIN table2 ON table1.column_name = table2.column_name;
RIGHT JOIN table2 ON table1.column_name = table2.column_name;

3.3. FULL OUTER JOIN

Retrieves all records when there is a match in either the left or right table. If there’s no match, NULL values are returned for the opposite table’s columns:

SELECT *
FROM table1
FULL OUTER JOIN table2 ON table1.column_name = table2.column_name;

3.4. JOIN Conditions

Joins are typically specified using the ON keyword, where we define the columns from each table that should be used to match rows:

SELECT *
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name;

Alternatively, we can specify join conditions in the WHERE clause:

SELECT *
FROM table1, table2
WHERE table1.column_name = table2.column_name;

Moreover, we can add additional conditions to our JOIN using the AND keyword:

SELECT columns
FROM table1
INNER JOIN table2 ON table1.column_name1 = table2.column_name1
  AND table1.column_name2 = table2.column_name2;

To join more than two tables, we can chain multiple JOIN statements together:

SELECT columns
FROM table1
INNER JOIN table2 ON table1.column_name = table2.column_name
INNER JOIN table3 ON table2.column_name = table3.column_name;

4. Common Use Cases

SQL joins are powerful tools for combining data from multiple tables. Here are some common scenarios where joins are used:

In addition to the Student table, let’s consider the Course table that stores details about courses, including id (primary key) and name.

Joins allow us to fetch related information from different tables. For example, to retrieve students along with their course names:

SELECT s.id, s.name, c.name AS course_name
FROM Registration r
INNER JOIN Student s ON r.student_id = s.id
INNER JOIN Course c ON r.course_id = c.id;

4.2. Aggregating Data

We can use joins with aggregate functions (e.g., COUNT, SUM, AVG) to calculate summary information across related tables. For instance, to count the number of students in each course:

SELECT c.name AS course_name, COUNT(r.student_id) AS num_students
FROM Course c
LEFT JOIN Registration r ON c.id = r.course_id
GROUP BY c.name;

4.3. Filtering Results

Joins can be used to filter results based on specific conditions across tables. For example, to retrieve only students who are enrolled in the “Database Systems” course:

SELECT s.id, s.name, c.name AS course_name
FROM Registration r
INNER JOIN Student s ON r.student_id = s.id
INNER JOIN Course c ON r.course_id = c.id
WHERE c.name = 'Database Systems';

4.4. Handling NULL Values

Different types of joins (e.g., LEFT JOIN, RIGHT JOIN) help manage NULL values when there are no matching rows in the joined table. This is useful for scenarios where we want to include all records from one table, regardless of whether there’s a match in the other table.

5. Advanced SELECT Techniques

In SQL, the SELECT statement is versatile and supports various advanced techniques beyond basic column selection. These techniques enhance query capabilities and provide more flexibility in data retrieval and manipulation.

5.1. Subqueries

Subqueries, also known as nested queries or inner queries, are SELECT statements embedded within another SQL statement, such as SELECT, INSERT, UPDATE, or DELETE.

They are useful for performing operations that require data from multiple tables or complex filtering conditions:

SELECT column1, column2
FROM table1
WHERE column1 IN (SELECT column1 FROM table2 WHERE condition);

In this example, the subquery (SELECT column1 FROM table2 WHERE condition) retrieves values from table2 based on a specified condition, which is then used to filter rows from table1.

5.2. Common Table Expressions (CTEs)

CTEs allow us to define temporary result sets that we can reference within a SELECT, INSERT, UPDATE, or DELETE statement.

They improve the readability and maintainability of complex queries by breaking them into logical blocks:

WITH cte_name AS (
    SELECT column1, column2
    FROM table1
    WHERE condition
)
SELECT *
FROM cte_name;

Here, cte_name represents a temporary result set defined by the SELECT statement inside the CTE. Subsequent queries within the same session can reference and manipulate it further.

5.3. Window Functions

Window functions allow us to perform calculations across a set of rows related to the current row. They provide a way to compute ranks, percentiles, cumulative totals, and other aggregations without grouping the rows into a single output row.

SELECT column1, column2, 
    ROW_NUMBER() OVER (PARTITION BY column1 ORDER BY column2 DESC) AS row_num
FROM table1;

In this example, ROW_NUMBER() is a window function that assigns a sequential integer to each row within its partition defined by PARTITION BY column1, ordered by column2 DESC.

5.4. Conditional Expressions

Conditional expressions allow us to control the flow of data retrieval based on specified conditions within the SELECT statement.

This includes CASE statements, which provide conditional logic similar to IF-THEN-ELSE constructs in programming languages:

SELECT column1, column2,
    CASE
        WHEN condition1 THEN result1
        WHEN condition2 THEN result2
        ELSE default_result
    END AS result_column
FROM table1;

Here, the CASE statement evaluates condition1 and condition2, returning result1 or result2 based on the condition met, or default_result if neither condition is true.

5.5. Pivot and Unpivot Operations

Pivoting and unpivoting transform rows into columns (pivot) or columns into rows (unpivot), respectively:

SELECT *
FROM (
    SELECT category, value
    FROM table1
) AS src
PIVOT (
    SUM(value)
    FOR category IN ('Category1', 'Category2', 'Category3')
) AS pivoted_table;

These operations are useful for reshaping data and summarizing information across different categories. In this example, PIVOT aggregates value based on distinct category values (‘Category1’, ‘Category2’, ‘Category3’) and transforms them into separate columns in the result set.

6. Example Joining Results from Multiple SELECT Statements

We can consider a university database where each record represents a student and their courses. We can create a summary table that includes each student’s name, the total number of courses they are enrolled in, and the number of courses they have passed (where gpa exceeds 2.0) by joining the results of two SELECT statements.

First, we can use CTE to calculate the total number of courses and passed courses for each student separately, and then combine these results into a single query:

WITH TotalCourses AS (
    SELECT student_id, COUNT(*) AS total_courses
    FROM Registration
    GROUP BY student_id
), 
PassedCourses AS (
    SELECT student_id, COUNT(*) AS passed_courses
    FROM Registration r
    INNER JOIN Student s ON r.student_id = s.id
    WHERE s.gpa > 2.0
    GROUP BY student_id
)
SELECT s.name, t.total_courses, COALESCE(p.passed_courses, 0) AS passed_courses
FROM TotalCourses t
LEFT JOIN PassedCourses p ON t.student_id = p.student_id
INNER JOIN Student s ON t.student_id = s.id;

This approach first calculates the total number of courses and passed courses for each student using two CTEs, TotalCourses and PassedCourses. It then uses a LEFT JOIN to combine these results, ensuring representation for every student, even if they have not passed courses.

The COALESCE function displays students with no passed courses, showing a count of zero.

7. Including UNION and UNION ALL

While not technically joins, it’s worth mentioning UNION and UNION ALL as operations that combine results from multiple SELECT statements:

SELECT column1, column2
FROM table1
UNION
SELECT column1, column2
FROM table2;

8. Conclusion

Mastering the art of joining two SELECT statement results in SQL empowers us to manipulate and combine data from multiple tables effectively. By understanding the syntax of JOIN statements, common use cases, and advanced SELECT techniques, we can write efficient queries that meet specific business requirements and analytical needs.