Clustered and Non-Clustered Indexes in SQL Baeldung on SQL

1. Overview

Indexes in SQL are critical for enhancing the performance and efficiency of database queries. They provide a structured way to quickly retrieve data without scanning the entire table.

There are two primary types of indexes: clustered and non-clustered. Understanding their differences and use cases can significantly impact database performance and query optimization.

In this tutorial, we’ll explore clustered and non-clustered indexes, discussing their characteristics, practical examples, and key differences. We’ll conclude with a comparative analysis to highlight their distinct features.

2. Clustered Index

A clustered index determines the physical order of data in a table. It sorts and stores the data rows in the table based on the index key. Furthermore, each table can only have one clustered index because it can sort data rows in only one order. We often refer to a table with a clustered index as a clustered table.

Let’s examine some of the characteristics of a clustered index:

Determines the physical order of data in the table
Each table can have only one clustered index
Provides faster data retrieval for the indexed column(s)
May result in slower data modification operations (INSERT, UPDATE) due to the need to maintain the order

2.1. Example

Here, we’ll consider the Student table in the Baeldung University database schema. Let’s create a clustered index on the id column:

CREATE TABLE Student (
    id INT PRIMARY KEY NOT NULL, -- clustered index
    name VARCHAR(60),
    national_id BIGINT NOT NULL, 
    birth_date DATE,
    enrollment_date DATE,
    graduation_date DATE,
    UNIQUE (id)
);

By creating the table with the id column as the primary key, we automatically create a clustered index on the id column. This clustered index maintains the order of the records based on the id column.

Next, let’s perform a SELECT * query to observe the physical order of the data:

SELECT * FROM Student LIMIT 5;
+------+-----------------+-------------+------------+-----------------+-----------------+
| id   | name            | national_id | birth_date | enrollment_date | graduation_date |
|------+-----------------+-------------+------------+-----------------+-----------------|
| 1001 | John Liu        | 123345566   | 2001-04-05 | 2020-01-15      | 2024-06-15      |
| 1003 | Rita Ora        | 132345166   | 2001-01-14 | 2020-01-15      | 2024-06-15      |
| 1007 | Philip Lose     | 321345566   | 2001-06-15 | 2020-01-15      | 2024-06-15      |
| 1010 | Samantha Prabhu | 3217165566  | 2001-03-21 | 2020-01-15      | 2024-06-15      |
| 1011 | Vikas Jain      | 321345662   | 2001-07-18 | 2020-01-15      | <null>          |
+------+-----------------+-------------+------------+-----------------+-----------------+
SELECT 5
Time: 0.022s

Notably, the data is in ascending order based on the id column. This change in the physical order of data occurs because the clustered index maintains the order of the records based on the id column.

3. Non-Clustered Index

A non-clustered index, on the other hand, doesn’t alter the physical order of the data. Instead, it creates a separate data structure that contains sorted values of the indexed column(s) along with pointers to the corresponding rows in the original table. A table can have multiple non-clustered indexes, each providing a different way to look up data.

Let’s review some characteristics of a non-clustered index:

Doesn’t change the physical order of data in the table
Allows multiple non-clustered indexes per table
Results in slower data retrieval compared to a clustered index for indexed columns
Offers faster data modification operations compared to clustered indexes

3.1. Example

Let’s use the Student table again, but this time, let’s create a non-clustered index on the name column:

CREATE INDEX idx_student_name ON Student(name);

This index allows for faster retrieval of records based on the name column without affecting the physical order of the data in the table.

Next, let’s perform a query that utilizes the non-clustered index:

SELECT * FROM Student WHERE name = 'Rita Ora';
+------+----------+-------------+------------+-----------------+-----------------+
| id   | name     | national_id | birth_date | enrollment_date | graduation_date |
|------+----------+-------------+------------+-----------------+-----------------|
| 1003 | Rita Ora | 132345166   | 2001-01-14 | 2020-01-15      | 2024-06-15      |
+------+----------+-------------+------------+-----------------+-----------------+
SELECT 1
Time: 0.010s

This query uses the non-clustered index idx_student_name to quickly find the record for Rita Ora without scanning the entire table.

4. Differences Between Clustered and Non-Clustered Index

Let’s look at the key differences between clustered and non-clustered indexes:

Feature	Clustered Index	Non-Clustered Index
Definition	Determines the physical order of data in a table	Creates a separate structure pointing to the data rows
Number per Table	Only one per table	Multiple indexes allowed per table
Speed for Range Queries	Faster, as data is stored in sorted order	Slower, as it requires additional lookups
Use Case	Optimal for columns frequently used in range queries and sorting operations	Ideal for columns frequently used in search conditions but not sorted
Performance	Faster retrieval of data	Slower retrieval compared to clustered indexes
Memory Usage	Requires less memory for operations	Requires more memory for operations
Data Structure	Stores and sorts the data rows	Stores indexed column values and pointers to actual data
Storage of Pointers	Stores actual data at leaf level	Stores pointers to data rows
Order of Data	Clustered key defines the order of data within a table	Index key defines the order of data within the index
Size	Larger primary clustered index	Smaller non-clustered index
Example	PRIMARY KEY(id)	CREATE INDEX idx_student_name ON Student(name)

5. Conclusion

In this article, we’ve explored the fundamental differences between clustered and non-clustered indexes in SQL. Understanding these differences is crucial for optimizing database performance and making informed decisions about index implementation.

By leveraging the right type of index, we can significantly enhance the efficiency of our database queries and overall data management.

Full Archive

About Baeldung