Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: November 30, 2024
When analyzing time series data, a 7-day rolling average smooths out daily fluctuations, giving us a better view of trends. In addition, we commonly use rolling averages in metrics like daily sales, temperatures, and grades.
In this tutorial, we’ll cover creating an SQL query for calculating a 7-day rolling average. First, we’ll discuss a general approach to calculating a 7-day rolling average using OVER. After that, we’ll explore how to achieve this using indexed views in MSSQL Server, materialized views in MySQL, and materialized views with refresh in PostgreSQL.
Note that we’ll be referring to the Baeldung University database schema for this article.
To get started, we’ll create a new table called DailyEnrollment which tracks daily enrollments in a course. The table has two columns:
This will be the base data for our rolling average calculation.
First, let’s create the table:
CREATE TABLE DailyEnrollment (
date DATE PRIMARY KEY,
total_registrations INT
);
After that, we insert some sample data for later analysis:
INSERT INTO DailyEnrollment (date, total_registrations) VALUES
('2024-10-01', 15),
('2024-10-02', 20),
('2024-10-03', 25),
('2024-10-04', 22),
('2024-10-05', 18),
('2024-10-06', 30),
('2024-10-07', 28)
The general approach to calculating a 7-day rolling average uses window functions. We can calculate rolling data across a given window with an OVER clause and a window function like AVG.
For instance, to calculate a rolling average for the DailyEnrolment table, we can do the following:
SELECT date, total_registrations, AVG(total_registrations) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM DailyEnrollment
ORDER BY date;
This query produces the following output:
date total_registrations rolling_avg_7_days
2024-10-01 15 15.00
2024-10-02 20 17.50
2024-10-03 25 20.00
2024-10-04 22 20.50
2024-10-05 18 20.00
2024-10-06 30 21.67
2024-10-07 28 22.57
In the SQL query, the OVER clause makes it possible to order the data by date. ROWS BETWEEN 6 PRECEDING AND CURRENT ROW is what defines the window of rows to calculate the rolling 7-day average for total_registrations.
As we can also see, this approach alleviates what would otherwise be a complex set of subqueries.
Another approach to calculating the rolling average in MSSQL Server is to use indexed views. Indexed views are useful for enhancing performance when calculating rolling averages. By creating an indexed view, we precompute and store our rolling average, speeding up query times for frequent use.
To create an indexed view in MSSQL, we use the CREATE VIEW clause in our SQL query:
CREATE VIEW v_DailyEnrollmentWithAvg
WITH SCHEMABINDING
AS
SELECT date, total_registrations, AVG(total_registrations) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM
DailyEnrollment;
GO
CREATE UNIQUE CLUSTERED INDEX idx_DailyEnrollmentWithAvg
ON v_DailyEnrollmentWithAvg (date);
Let’s break down the query:
To view the output of v_DailyEnrollmentWithAvg, we can execute a typical SELECT statement as we’ll see the same result as before.
In MySQL, we can create materialized views to optimize the query for calculating a 7-day rolling average. While MySQL doesn’t have native materialized views, we can simulate them by creating a table that holds the precomputed averages and setting up a scheduled job to update it periodically.
Let’s create materialized views for the DailyEnrollment table:
CREATE EVENT update_rolling_avg
ON SCHEDULE EVERY 1 DAY
DO
REPLACE INTO DailyEnrollmentRollingAvg
SELECT date, total_registrations, AVG(total_registrations) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM DailyEnrollment;
And now let’s look at the explanation of this query:
To manually test the query, we copy the REPLACE INTO query portion from the CREATE EVENT statement and run it directly. This will insert the rolling average data into DailyEnrollmentRollingAvg, just as the scheduled event would, without needing to wait for the scheduled time. This way, we can verify that the query performs as expected.
Lastly, PostgreSQL enables us to create materialized views with refresh. For instance, to create a materialized view that calculates a rolling average for total_registrations in the DailyEnrollment table we use the CREATE MATERIALIZED VIEW clause:
CREATE MATERIALIZED VIEW RollingAvgEnrollment AS
SELECT date, total_registrations,
ROUND(AVG(total_registrations) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
), 2) AS rolling_avg_7_days
FROM DailyEnrollment
ORDER BY date;
Since materialized views in PostgreSQL store data, they don’t update automatically. We need to refresh them periodically using the REFRESH MATERIALIZED VIEW clause:
REFRESH MATERIALIZED VIEW RollingAvgEnrollment;
The materialized view calculates the rolling averages and stores them. By refreshing the view, we ensure that new data is incorporated into the average calculation without needing to rerun the query for each user request.
In this article, we covered various methods to calculate a 7-day rolling average in SQL.
Initially, we discussed a general query to calculate a 7-day rolling average. Subsequently, we explored performance optimizations in SQL Server, MySQL, and PostgreSQL.