Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

The aggregation functions, like SUM in SQL, are important for summarizing data. These functions can average, total, and count attributes, affording useful insights over many rows of data.

The SUM function adds values across a column. It’s widely applicable in financial reporting, inventory, and academic records. That said, real-world analysis requires more than adding up the values in a column. For example, we might want to sum only active records or separate the records into groups.

In this tutorial, we’ll consider the SUM function with conditions using the WHERE clause, GROUP BY, and other more advanced techniques like the CASE function for customized summing.

Examples in this article use the Baeldung sample schema and will work in MySQL, PostgreSQL, and SQL Server.

2. The Basics of the SUM Function

SUM returns the total of a numeric column. Whether we are summing up sales, calculating total course credits, or tracking inventory, the SUM function effectively aggregates data across multiple rows. Its versatility has made SUM one of the more commonly used SQL functions.

Let’s count the overall number of credits available in all the courses:

SELECT SUM(credits) FROM Course;

The result of the query is as follows when we run it:

 SUM(credits)
--------------
          517
(1 row)

Here, the SUM function adds together every value in the credits column of the Course table. This is useful in giving a quick summary of all unfiltered data.

3. Using the WHERE Clause

The WHERE clause is useful when we want to filter the rows. We can combine this with SUM to calculate the total of rows that meet certain criteria.

For example, let’s get the total number of credits only active courses:

SELECT SUM(credits) FROM Course WHERE is_active = 'Yes';

When we run the query, we get the result as:

 SUM(credits)
--------------
          483
(1 row)

Here, the WHERE clause filters the query to include only rows for which is_active equals ‘Yes‘. When combined with SUM, we can know the number of credit hours being actively offered by the university.

4. Using the GROUP BY Clause

The GROUP BY function allows us to group the data first and then apply aggregate functions like SUM. This is useful when summarizing the total by department, date, or any other grouping we determine.

For instance, if we wanted the total number of credits per department, we could indicate that the summations should be segmented by department_id:

SELECT department_id, SUM(credits) 
FROM Course 
GROUP BY department_id; Running this query, the outcome is as follows:
 department_id | SUM(credits)
---------------+--------------
             1 |          195
             2 |           78
             3 |           71
             4 |           83
             5 |           90
(5 rows)

In this query, the GROUP BY clause arranges the result from the query based on department_id. SUM then tells us how many credits are offered by each department.

5. Merging GROUP BY and WHERE Clauses

These GROUP BY and WHERE clauses can be combined to enable the filtering of data before the categorization takes place. This is very helpful when one wants to summarize certain groups’ data while wanting to include only rows that meet certain conditions.

For example, let’s now combine the previous two queries to calculate how many credits are actively offered by each department:

SELECT department_id, SUM(credits) 
FROM Course 
WHERE is_active = 'Yes' 
GROUP BY department_id;

This is what happens when we execute the query:

 department_id | SUM(credits)
---------------+--------------
             1 |          178
             2 |           78
             3 |           71
             4 |           78
             5 |           78
(5 rows)

In this query, the WHERE clause provides the conditions for filtering; in this case, is_active = ‘Yes’. Then, GROUP BY department_id groups it into respective departments, and SUM(credits) adds up each active department.

6. Combining Multiple Conditions

SQL frequently requires that data be summed based on multiple conditions. We can filter on multiple conditions at once by using logical operators such as AND or OR in the WHERE clause of our query. This is useful when we have to sum the values that correspond to more than one criterion.

Let’s find out the number of credits for a specific department_id only for the active courses:

SELECT SUM(credits) 
FROM Course 
WHERE is_active = 'Yes' AND department_id = 4;

When we run the query, we get the following result:

 SUM(credits)
--------------
          78
(1 row)

This query has two conditions: One is is_active = ‘Yes’, which will filter in only those active courses; the other one is department_id = 4. AND ensures that for a row to be part of the sum, both conditions need to be met.

7. Using CASE Clause

The CASE statement in SQL allows for more flexible conditional logic. We can sum on a condition that may vary within a query. This is quite useful in applying rules for different subsets of data.

For instance, the sum of credits can be done for courses that are active and not active as follows:

SELECT 
SUM(CASE WHEN is_active = 'Yes' THEN credits ELSE 0 END) AS active_credits, 
SUM(CASE WHEN is_active = 'No' THEN credits ELSE 0 END) AS inactive_credits 
FROM Course;

When we execute this query, the sums are separated:

 active_credits | inactive_credits
----------------+-------------------
            483 |                34
(1 row)

In the above query, the first CASE statement sums credits where is_active = ‘Yes’, and the second sums credits where is_active = ‘No’. The ELSE 0 clause ensures that non-matching rows are excluded from the sum.

We can calculate as many conditional sums as we like in one query, which is especially useful if we want to compare and analyze data from different subsets.

8. Conclusion

In this article, we discussed the fundamentals of the SUM function, a valuable tool in data analysis. We looked at how SUM combines with WHERE to apply conditions and how GROUP BY can group summations into buckets. Finally, we looked at the CASE statement which allows for multiple conditional summations to be in the same query.

These are techniques that enable us to have not only more efficient queries but also personalized reports as datasets grow bigger and more complicated.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.