Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

In SQL, summing distinct rows is essential when working with datasets containing duplicate entries that may arise from joins or redundant data. Specifically, the accuracy and integrity of the aggregated results both rely on the uniqueness of rows when summing.

In this tutorial, we’ll learn how to SUM DISTINCT rows in SQL. The examples in this article are based on the Baeldung University schema and are compatible with MySQL, PostgreSQL, and SQL Server.

2. Using Subqueries

A subquery, or an inner query, enables the execution of intermediate queries that help filter, compute, or compare values in the outer query. Subqueries are also useful for breaking down complex logic into simpler, more readable components.

2.1. Using DISTINCT Within Subquery

The DISTINCT keyword removes duplicate rows from the result set:

SELECT SUM(credits) 
FROM (
SELECT DISTINCT textbook, credits FROM Course
) AS distinct_rows;

To begin with, the subquery selects the DISTINCT combinations of textbook and credits from the Course table, removing any duplicate rows. Then, the outer query calculates the SUM of the credits from the resulting distinct rows.

So, let’s see the results:

+--------------+
| SUM(credits) |
+--------------+
|          345 |
+--------------+
1 row in set (0.01 sec)

Additionally, we can assign a specific name to the column returned by the outer query using the AS keyword:

SELECT SUM(credits) AS distinct_credits 
FROM (
SELECT DISTINCT textbook, credits FROM Course
) AS distinct_rows;

Consequently, this query returns the same result, with distinct_credits as the alias for SUM(credits).

Similarly, we can add other columns in the subquery to find DISTINCT rows based on multiple columns.

2.2. Using UNION Within Subquery

The UNION operation combines the result sets of two or more SELECT statements, returning unique rows by default.

Thus, we can use UNION to retrieve DISTINCT rows in a subquery:

SELECT SUM(credits) AS distinct_credits 
FROM (
SELECT textbook, credits FROM Course UNION SELECT textbook, credits FROM Course
) AS combined;
+------------------+
| distinct_credits |
+------------------+
|              345 |
+------------------+
1 row in set (0.00 sec)

The subquery removes any duplicates from the Course table based on identical combinations of textbook and credits, ensuring that only unique pairs are considered in the SUM calculation of the outer query.

Furthermore, we can verify that the rows are distinct by executing another subquery:

SELECT textbook, credits FROM Course UNION SELECT textbook, credits FROM Course;
+------------------------------------------------------------------------+---------+
| textbook                                                               | credits |
+------------------------------------------------------------------------+---------+
| Structures : Or Why Things Don't Fall Down by Gordon                   |       7 |
| Introduction to Geotechnical Engineering by Das and Sivakugan          |       7 |
| Mechanics of Solids by Ross                                            |       7 |
| Mechanics of Fluids by Smith                                           |       7 |
| Modeling Tools                                                         |       7 |
| Advanced Statistics                                                    |       7 |
| Geotechnical Engineering: Principles and Practices by Coduto           |       7 |
| Engineering Mechanics: Solids by Egor                                  |       7 |
...
| Advanced Dynamics by Donald                                            |       5 |
| Operations Management: Processes and Supply Chains                     |       5 |
+------------------------------------------------------------------------+---------+
55 rows in set (0.00 sec)

This confirms that the subquery returns only the distinct rows.

3. Using Common Table Expression (CTE)

A Common Table Expression (CTE) represents a temporary result set generated from a query for use within the scope of a larger SQL statement.

Let’s see it in action:

WITH unique_textbooks AS 
(
SELECT DISTINCT textbook, credits FROM Course
) 
SELECT SUM(credits) FROM unique_textbooks;
+--------------+
| SUM(credits) |
+--------------+
|          345 |
+--------------+
1 row in set (0.00 sec)

The WITH keyword defines a CTE named unique_textbooks and selects DISTINCT rows from the Course table based on the combination of two rows, i.e., textbook and credits. The main SELECT query sums the credits column from the records defined in the CTE.

4. Conclusion

In this article, we learned different ways to SUM distinct rows in SQL.

First, we briefly discussed the subquery and its usage coupled with the DISTINCT and UNION clauses to sum the distinct rows. Then, we used a Common Table Expression (CTE) to select distinct rows, and subsequently calculated their sum.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.