Baeldung Pro – SQL – NPI EA (cat = Baeldung on SQL)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Overview

Indexes in SQL are critical for enhancing the performance and efficiency of database queries. They provide a structured way to quickly retrieve data without scanning the entire table.

There are two primary types of indexes: clustered and non-clustered. Understanding their differences and use cases can significantly impact database performance and query optimization.

In this tutorial, we’ll explore clustered and non-clustered indexes, discussing their characteristics, practical examples, and key differences. We’ll conclude with a comparative analysis to highlight their distinct features.

2. Clustered Index

A clustered index determines the physical order of data in a table. It sorts and stores the data rows in the table based on the index key. Furthermore, each table can only have one clustered index because it can sort data rows in only one order. We often refer to a table with a clustered index as a clustered table.

Let’s examine some of the characteristics of a clustered index:

  • Determines the physical order of data in the table
  • Each table can have only one clustered index
  • Provides faster data retrieval for the indexed column(s)
  • May result in slower data modification operations (INSERT, UPDATE) due to the need to maintain the order

2.1. Example

Here, we’ll consider the Student table in the Baeldung University database schema. Let’s create a clustered index on the id column:

CREATE TABLE Student (
    id INT PRIMARY KEY NOT NULL, -- clustered index
    name VARCHAR(60),
    national_id BIGINT NOT NULL, 
    birth_date DATE,
    enrollment_date DATE,
    graduation_date DATE,
    UNIQUE (id)
);

By creating the table with the id column as the primary key, we automatically create a clustered index on the id column. This clustered index maintains the order of the records based on the id column.

Next, let’s perform a SELECT * query to observe the physical order of the data:

SELECT * FROM Student LIMIT 5;
+------+-----------------+-------------+------------+-----------------+-----------------+
| id   | name            | national_id | birth_date | enrollment_date | graduation_date |
|------+-----------------+-------------+------------+-----------------+-----------------|
| 1001 | John Liu        | 123345566   | 2001-04-05 | 2020-01-15      | 2024-06-15      |
| 1003 | Rita Ora        | 132345166   | 2001-01-14 | 2020-01-15      | 2024-06-15      |
| 1007 | Philip Lose     | 321345566   | 2001-06-15 | 2020-01-15      | 2024-06-15      |
| 1010 | Samantha Prabhu | 3217165566  | 2001-03-21 | 2020-01-15      | 2024-06-15      |
| 1011 | Vikas Jain      | 321345662   | 2001-07-18 | 2020-01-15      | <null>          |
+------+-----------------+-------------+------------+-----------------+-----------------+
SELECT 5
Time: 0.022s

Notably, the data is in ascending order based on the id column. This change in the physical order of data occurs because the clustered index maintains the order of the records based on the id column.

3. Non-Clustered Index

A non-clustered index, on the other hand, doesn’t alter the physical order of the data. Instead, it creates a separate data structure that contains sorted values of the indexed column(s) along with pointers to the corresponding rows in the original table. A table can have multiple non-clustered indexes, each providing a different way to look up data.

Let’s review some characteristics of a non-clustered index:

  • Doesn’t change the physical order of data in the table
  • Allows multiple non-clustered indexes per table
  • Results in slower data retrieval compared to a clustered index for indexed columns
  • Offers faster data modification operations compared to clustered indexes

3.1. Example

Let’s use the Student table again, but this time, let’s create a non-clustered index on the name column:

CREATE INDEX idx_student_name ON Student(name);

This index allows for faster retrieval of records based on the name column without affecting the physical order of the data in the table.

Next, let’s perform a query that utilizes the non-clustered index:

SELECT * FROM Student WHERE name = 'Rita Ora';
+------+----------+-------------+------------+-----------------+-----------------+
| id   | name     | national_id | birth_date | enrollment_date | graduation_date |
|------+----------+-------------+------------+-----------------+-----------------|
| 1003 | Rita Ora | 132345166   | 2001-01-14 | 2020-01-15      | 2024-06-15      |
+------+----------+-------------+------------+-----------------+-----------------+
SELECT 1
Time: 0.010s

This query uses the non-clustered index idx_student_name to quickly find the record for Rita Ora without scanning the entire table.

4. Differences Between Clustered and Non-Clustered Index

Let’s look at the key differences between clustered and non-clustered indexes:

Feature Clustered Index Non-Clustered Index
Definition Determines the physical order of data in a table Creates a separate structure pointing to the data rows
Number per Table Only one per table Multiple indexes allowed per table
Speed for Range Queries Faster, as data is stored in sorted order Slower, as it requires additional lookups
Use Case Optimal for columns frequently used in range queries and sorting operations Ideal for columns frequently used in search conditions but not sorted
Performance Faster retrieval of data Slower retrieval compared to clustered indexes
Memory Usage Requires less memory for operations Requires more memory for operations
Data Structure Stores and sorts the data rows Stores indexed column values and pointers to actual data
Storage of Pointers Stores actual data at leaf level Stores pointers to data rows
Order of Data Clustered key defines the order of data within a table Index key defines the order of data within the index
Size Larger primary clustered index Smaller non-clustered index
Example PRIMARY KEY(id) CREATE INDEX idx_student_name ON Student(name)

5. Conclusion

In this article, we’ve explored the fundamental differences between clustered and non-clustered indexes in SQL. Understanding these differences is crucial for optimizing database performance and making informed decisions about index implementation.

By leveraging the right type of index, we can significantly enhance the efficiency of our database queries and overall data management.