Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

When analyzing time series data, a 7-day rolling average smooths out daily fluctuations, giving us a better view of trends. In addition, we commonly use rolling averages in metrics like daily sales, temperatures, and grades.

In this tutorial, we’ll cover creating an SQL query for calculating a 7-day rolling average. First, we’ll discuss a general approach to calculating a 7-day rolling average using OVER. After that, we’ll explore how to achieve this using indexed views in MSSQL Server, materialized views in MySQL, and materialized views with refresh in PostgreSQL.

Note that we’ll be referring to the Baeldung University database schema for this article.

2. Table Structure

To get started, we’ll create a new table called DailyEnrollment which tracks daily enrollments in a course. The table has two columns:

  • date – the primary key, and
  • total_registrations – the total registrations on that date

This will be the base data for our rolling average calculation.

First, let’s create the table:

CREATE TABLE DailyEnrollment (
date DATE PRIMARY KEY,
total_registrations INT
);

After that, we insert some sample data for later analysis:

INSERT INTO DailyEnrollment (date, total_registrations) VALUES
('2024-10-01', 15),
('2024-10-02', 20),
('2024-10-03', 25),
('2024-10-04', 22),
('2024-10-05', 18),
('2024-10-06', 30),
('2024-10-07', 28)

3. General Approach

The general approach to calculating a 7-day rolling average uses window functions. We can calculate rolling data across a given window with an OVER clause and a window function like AVG.

For instance, to calculate a rolling average for the DailyEnrolment table, we can do the following:

SELECT date, total_registrations, AVG(total_registrations) OVER (
ORDER BY date
ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM DailyEnrollment
ORDER BY date;

This query produces the following output:

date       total_registrations rolling_avg_7_days
2024-10-01         15            15.00
2024-10-02         20            17.50
2024-10-03         25            20.00
2024-10-04         22            20.50
2024-10-05         18            20.00
2024-10-06         30            21.67
2024-10-07         28            22.57

In the SQL query, the OVER clause makes it possible to order the data by date. ROWS BETWEEN 6 PRECEDING AND CURRENT ROW is what defines the window of rows to calculate the rolling 7-day average for total_registrations.

As we can also see, this approach alleviates what would otherwise be a complex set of subqueries.

4. Using Indexed Views in MSSQL Server

Another approach to calculating the rolling average in MSSQL Server is to use indexed views. Indexed views are useful for enhancing performance when calculating rolling averages. By creating an indexed view, we precompute and store our rolling average, speeding up query times for frequent use.

To create an indexed view in MSSQL, we use the CREATE VIEW clause in our SQL query:

CREATE VIEW v_DailyEnrollmentWithAvg
WITH SCHEMABINDING
AS
SELECT date, total_registrations, AVG(total_registrations) OVER (
    ORDER BY date
    ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM
DailyEnrollment;
GO
CREATE UNIQUE CLUSTERED INDEX idx_DailyEnrollmentWithAvg
ON v_DailyEnrollmentWithAvg (date);

Let’s break down the query:

  • WITH SCHEMABINDING binds the view to the schema of the underlying tables
  • GO indicates the end of the SQL statement and allows for the execution of the preceding statements in the batch
  • CREATE UNIQUE CLUSTERED INDEX creates a unique clustered index named idx_DailyEnrollmentWithAvg
  • ON v_DailyEnrollmentWithAvg (date) specifies that the index will be created on the date column of the view

To view the output of v_DailyEnrollmentWithAvg, we can execute a typical SELECT statement as we’ll see the same result as before.

5. Using Manual Materialized Views in MySQL

In MySQL, we can create materialized views to optimize the query for calculating a 7-day rolling average. While MySQL doesn’t have native materialized views, we can simulate them by creating a table that holds the precomputed averages and setting up a scheduled job to update it periodically.

Let’s create materialized views for the DailyEnrollment table:

CREATE EVENT update_rolling_avg
ON SCHEDULE EVERY 1 DAY
DO
REPLACE INTO DailyEnrollmentRollingAvg
SELECT date, total_registrations, AVG(total_registrations) OVER (
   ORDER BY date
   ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) AS rolling_avg_7_days
FROM DailyEnrollment;

And now let’s look at the explanation of this query:

  • CREATE EVENT update_rolling_avg creates a scheduled event named update_rolling_avg
  • ON SCHEDULE EVERY 1 DAY specifies that the event runs once a day
  • DO marks the start of the actions to be performed by the event
  • REPLACE INTO DailyEnrollmentRollingAvg inserts data into the DailyEnrollmentRollingAvg table, replacing any existing rows with the same primary key value

To manually test the query, we copy the REPLACE INTO query portion from the CREATE EVENT statement and run it directly. This will insert the rolling average data into DailyEnrollmentRollingAvg, just as the scheduled event would, without needing to wait for the scheduled time. This way, we can verify that the query performs as expected.

6. Using Materialized Views with Refresh in PostgreSQL

Lastly, PostgreSQL enables us to create materialized views with refresh. For instance, to create a materialized view that calculates a rolling average for total_registrations in the DailyEnrollment table we use the CREATE MATERIALIZED VIEW clause:

CREATE MATERIALIZED VIEW RollingAvgEnrollment AS
SELECT date, total_registrations,
ROUND(AVG(total_registrations) OVER (
      ORDER BY date
      ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
), 2) AS rolling_avg_7_days
FROM DailyEnrollment
ORDER BY date;

Since materialized views in PostgreSQL store data, they don’t update automatically. We need to refresh them periodically using the REFRESH MATERIALIZED VIEW clause:

REFRESH MATERIALIZED VIEW RollingAvgEnrollment;

The materialized view calculates the rolling averages and stores them. By refreshing the view, we ensure that new data is incorporated into the average calculation without needing to rerun the query for each user request.

7. Conclusion

In this article, we covered various methods to calculate a 7-day rolling average in SQL.

Initially, we discussed a general query to calculate a 7-day rolling average. Subsequently, we explored performance optimizations in SQL Server, MySQL, and PostgreSQL.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.