1. Introduction

Structured Query Language (SQL) is the backbone of relational database management systems (RDBMS). It provides a powerful and efficient way to store, manipulate, and retrieve data. Among the various components that make up a relational database, keys play a crucial role in ensuring the integrity, consistency, and accessibility of data.

Keys in SQL are columns or sets of columns that identify records and establish relationships between tables. They enforce uniqueness, maintain data integrity, and enable efficient querying and data manipulation. Understanding the different types of keys and their purposes is therefore essential for database design and optimization.

In this article, we’ll look at the types of keys used in SQL, including primary, foreign, unique, composite, candidate, surrogate, and secondary keys. We’ll explore their definitions, purposes, and characteristics, with examples to illustrate their use.

2. Primary Key

The primary key is a column or a set of columns in a table that uniquely identifies each row in that table.

It’s a critical component of a database’s structure: its values must be unique and can’t be NULL, so every record can be retrieved, updated, or deleted unambiguously. This property is known as entity integrity. The primary key also serves as the target that other tables reference, making it easier to link records across tables.

For example, in a table of employees, the employee_id column can be designated as the primary key, ensuring each employee has a unique identifier:

CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    hire_date DATE
);
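To see the constraint being enforced, here is a minimal sketch that runs the table definition above through Python's built-in sqlite3 module against an in-memory database. Using SQLite is an assumption made for convenience, and the sample rows and the function name demo_primary_key are hypothetical:

```python
import sqlite3

def demo_primary_key():
    """Return True if the database rejects a second row with the same employee_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Employees (
            employee_id INT PRIMARY KEY,
            first_name VARCHAR(50),
            last_name VARCHAR(50),
            hire_date DATE
        )
    """)
    # Hypothetical sample row
    conn.execute("INSERT INTO Employees VALUES (1, 'Ada', 'Lovelace', '2024-01-15')")
    try:
        # Reuses employee_id 1, so the primary key constraint should fail
        conn.execute("INSERT INTO Employees VALUES (1, 'Alan', 'Turing', '2024-02-01')")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True
    finally:
        conn.close()
    return rejected

print(demo_primary_key())
```

Other database systems report the violation with their own error types and messages, but the duplicate is refused in the same way.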

3. Foreign Key

A foreign key is a column or a set of columns in one table that references the primary key (or another unique key) of another table.

The purpose of a foreign key is to ensure that the values in the referencing table correspond to valid rows in the referenced table, thereby establishing a relationship between the two tables. Therefore, foreign keys help maintain consistency and integrity by ensuring that references between tables are valid and that operations on the data don’t break these relationships.

Consider a database with two tables, Orders and Customers. The Orders table contains a foreign key column customer_id that references the customer_id column in the Customers table. This relationship ensures that each order is associated with a valid customer:

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100)
);

CREATE TABLE Orders (
    order_id INT PRIMARY KEY,
    order_date DATE,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
);
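The following sketch exercises the relationship with Python's built-in sqlite3 module, which is an assumption made so that the example can run anywhere. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, a SQLite-specific switch that other systems don't need. The sample rows and the function name demo_foreign_key are hypothetical:

```python
import sqlite3

def demo_foreign_key():
    """Return (orphan_rejected, delete_rejected) for two invalid operations."""
    conn = sqlite3.connect(":memory:")
    # SQLite-specific: foreign key enforcement is off by default
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE Customers (
            customer_id INT PRIMARY KEY,
            first_name VARCHAR(50),
            last_name VARCHAR(50),
            email VARCHAR(100)
        );
        CREATE TABLE Orders (
            order_id INT PRIMARY KEY,
            order_date DATE,
            customer_id INT,
            FOREIGN KEY (customer_id) REFERENCES Customers(customer_id)
        );
    """)
    conn.execute("INSERT INTO Customers VALUES (1, 'Ada', 'Lovelace', 'ada@example.com')")
    # Valid: customer 1 exists
    conn.execute("INSERT INTO Orders VALUES (100, '2024-03-01', 1)")

    try:
        # Invalid: there is no customer 999
        conn.execute("INSERT INTO Orders VALUES (101, '2024-03-02', 999)")
        orphan_rejected = False
    except sqlite3.IntegrityError:
        orphan_rejected = True

    try:
        # Invalid: order 100 still references customer 1
        conn.execute("DELETE FROM Customers WHERE customer_id = 1")
        delete_rejected = False
    except sqlite3.IntegrityError:
        delete_rejected = True

    conn.close()
    return orphan_rejected, delete_rejected

print(demo_foreign_key())
```

Both operations fail because each would leave an order pointing at a customer that doesn't exist.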

4. Unique Key

A unique key is a column or a set of columns whose values must be distinct across all rows in a table.

Unlike primary keys, which also enforce uniqueness, unique keys can allow NULL values. How many NULLs are permitted depends on the DBMS: MySQL and PostgreSQL, for example, allow multiple NULLs in a unique column by default, whereas SQL Server allows only one. A table can also have several unique keys but only one primary key.

For instance, in a table of user logins, we can create a composite unique key on username and email. It constrains the pair of values rather than each column individually:

CREATE TABLE UserLogins (
    login_id INT PRIMARY KEY,
    username VARCHAR(50),
    email VARCHAR(100),
    UNIQUE (username, email)
);
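As a rough illustration of what this constraint does and doesn't prevent, the sketch below loads the table into an in-memory SQLite database through Python's sqlite3 module (an assumption chosen for portability). The sample logins and the function name demo_composite_unique are hypothetical:

```python
import sqlite3

def demo_composite_unique():
    """Return (duplicate_rejected, row_count) after three insert attempts."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE UserLogins (
            login_id INT PRIMARY KEY,
            username VARCHAR(50),
            email VARCHAR(100),
            UNIQUE (username, email)
        )
    """)
    conn.execute("INSERT INTO UserLogins VALUES (1, 'ada', 'ada@example.com')")
    # Same username, different email: a new combination, so it is accepted
    conn.execute("INSERT INTO UserLogins VALUES (2, 'ada', 'ada@work.example')")
    try:
        # Same username and same email as row 1: the pair already exists
        conn.execute("INSERT INTO UserLogins VALUES (3, 'ada', 'ada@example.com')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    row_count = conn.execute("SELECT COUNT(*) FROM UserLogins").fetchone()[0]
    conn.close()
    return duplicate_rejected, row_count

print(demo_composite_unique())
```

If each username also has to be unique on its own, a separate UNIQUE (username) constraint would be needed in addition to the composite one.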

5. Composite Key

A composite key is a key that consists of two or more columns used together to guarantee the uniqueness of each row in a table. This combination of columns ensures that each record can be uniquely identified by the combined values, even if individual columns contain duplicate values.

Any key can be composite, including the unique key from the previous section. When a composite key serves as the primary key, it also enforces entity integrity: none of its columns may contain NULL, unlike the columns of a unique key.

For example, consider a table Enrollments that tracks student enrollment in courses:

CREATE TABLE Enrollments (
    student_id INT,
    course_id INT,
    enrollment_date DATE,
    PRIMARY KEY (student_id, course_id)
);

In this example, the combination of student_id and course_id is the composite primary key of the Enrollments table. A student can appear in many rows, and so can a course, but each student-course pair can appear only once.
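A short sketch can confirm which rows the composite primary key accepts. It assumes an in-memory SQLite database driven by Python's sqlite3 module; the enrollment rows and the function name demo_composite_primary_key are hypothetical:

```python
import sqlite3

def demo_composite_primary_key():
    """Return (accepted_rows, duplicate_rejected)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Enrollments (
            student_id INT,
            course_id INT,
            enrollment_date DATE,
            PRIMARY KEY (student_id, course_id)
        )
    """)
    # Individual columns repeat (student 1 twice, course 101 twice),
    # but every (student_id, course_id) pair is distinct
    rows = [
        (1, 101, "2024-09-01"),
        (1, 102, "2024-09-01"),
        (2, 101, "2024-09-02"),
    ]
    conn.executemany("INSERT INTO Enrollments VALUES (?, ?, ?)", rows)
    try:
        # Student 1 is already enrolled in course 101
        conn.execute("INSERT INTO Enrollments VALUES (1, 101, '2024-09-05')")
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    accepted_rows = conn.execute("SELECT COUNT(*) FROM Enrollments").fetchone()[0]
    conn.close()
    return accepted_rows, duplicate_rejected

print(demo_composite_primary_key())
```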

6. Candidate Key

A candidate key is a set of columns within a table that can uniquely identify each row without any redundant or unnecessary columns. It serves as a potential candidate for the primary key of the table. Although a table may have multiple candidate keys, typically only one is selected as the primary key.

We can consider a table Students with the following columns: student_id, email, and student_code. Each of these columns can potentially serve as a candidate key:

CREATE TABLE Students (
    student_id INT PRIMARY KEY,
    email VARCHAR(100) UNIQUE,
    student_code VARCHAR(20) UNIQUE
);

In this example, student_id, email, and student_code are all candidate keys because each can uniquely identify a student on its own. We select student_id as the primary key, which leaves email and student_code as alternate keys enforced with UNIQUE constraints.
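One way to check that all three columns behave as keys is to try duplicating each of them in turn. The sketch below does this against an in-memory SQLite database via Python's sqlite3 module, which is an assumption for the sake of a runnable example. The student rows and the function name demo_candidate_keys are hypothetical:

```python
import sqlite3

def demo_candidate_keys():
    """Map each candidate key column to whether a duplicate value was rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Students (
            student_id INT PRIMARY KEY,
            email VARCHAR(100) UNIQUE,
            student_code VARCHAR(20) UNIQUE
        )
    """)
    conn.execute("INSERT INTO Students VALUES (1, 'ada@example.com', 'S-001')")

    # Each attempt repeats exactly one value from the existing row
    attempts = {
        "student_id": (1, "new@example.com", "S-002"),
        "email": (2, "ada@example.com", "S-003"),
        "student_code": (3, "other@example.com", "S-001"),
    }
    rejected = {}
    for column, row in attempts.items():
        try:
            conn.execute("INSERT INTO Students VALUES (?, ?, ?)", row)
            rejected[column] = False
        except sqlite3.IntegrityError:
            rejected[column] = True
    conn.close()
    return rejected

print(demo_candidate_keys())
```

A check like this only shows that the declared constraints are enforced. Whether a column is really a candidate key is a design decision about the data, not something sample rows can prove.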

7. Surrogate Key

A surrogate key is an artificial identifier, typically generated by the database management system (DBMS), that has no business meaning of its own. Unlike natural keys, which rely on existing data attributes, surrogate keys exist solely to identify records.

For example, consider a table Customers with a surrogate key defined using auto-increment:

CREATE TABLE Customers (
    customer_id INT PRIMARY KEY AUTO_INCREMENT,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100) UNIQUE
);

Here, the database system automatically assigns customer_id as a surrogate key. It uniquely identifies each customer record, regardless of changes in their names or email addresses. We typically define surrogate keys as an integer column whose values the database generates. AUTO_INCREMENT is the MySQL syntax; other systems offer equivalents, such as identity columns in PostgreSQL and SQL Server. Either way, each new record receives a unique identifier, independent of natural data attributes.
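The same idea can be tried out with Python's built-in sqlite3 module. Note the substitution: SQLite doesn't accept the AUTO_INCREMENT keyword, so this sketch uses SQLite's own INTEGER PRIMARY KEY AUTOINCREMENT instead. The customer rows and the function name demo_surrogate_key are hypothetical:

```python
import sqlite3

def demo_surrogate_key():
    """Return (first_id, second_id, id_after_update)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Customers (
            -- SQLite spelling of an auto-generated integer key
            customer_id INTEGER PRIMARY KEY AUTOINCREMENT,
            first_name VARCHAR(50),
            last_name VARCHAR(50),
            email VARCHAR(100) UNIQUE
        )
    """)
    insert = "INSERT INTO Customers (first_name, last_name, email) VALUES (?, ?, ?)"
    # customer_id is omitted, so the database generates it
    first_id = conn.execute(insert, ("Ada", "Lovelace", "ada@example.com")).lastrowid
    second_id = conn.execute(insert, ("Alan", "Turing", "alan@example.com")).lastrowid

    # Changing a natural attribute leaves the surrogate key untouched
    conn.execute(
        "UPDATE Customers SET email = ? WHERE customer_id = ?",
        ("ada@new.example", first_id),
    )
    id_after_update = conn.execute(
        "SELECT customer_id FROM Customers WHERE email = ?", ("ada@new.example",)
    ).fetchone()[0]
    conn.close()
    return first_id, second_id, id_after_update

print(demo_surrogate_key())
```

Because other tables would reference customer_id rather than email, the email change doesn't need to be propagated anywhere else.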

8. Secondary Key

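A secondary key is a column or a set of columns that we index to speed up queries that filter or sort on criteria other than the primary key. Unlike a primary or unique key, it doesn’t have to be unique.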

We can consider a table Employees with the following columns:

CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    department_id INT,
    INDEX (department_id)
);

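Here, employee_id serves as the primary key. We create an index on department_id as a secondary key to optimize queries that filter employees by department. The inline INDEX clause shown here is MySQL syntax; systems such as PostgreSQL and SQLite use a separate CREATE INDEX statement instead.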
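To check whether a query actually uses the secondary index, we can ask the database for its query plan. The sketch below assumes SQLite, accessed through Python's sqlite3 module, and therefore creates the index with a separate CREATE INDEX statement. The index name idx_employees_department, the sample rows, and the function name demo_secondary_index are hypothetical, and the exact plan text varies between SQLite versions:

```python
import sqlite3

def demo_secondary_index():
    """Return SQLite's query plan text for a lookup by department_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE Employees (
            employee_id INT PRIMARY KEY,
            first_name VARCHAR(50),
            last_name VARCHAR(50),
            department_id INT
        )
    """)
    # Secondary (non-unique) index on the column used for filtering
    conn.execute("CREATE INDEX idx_employees_department ON Employees (department_id)")
    conn.executemany(
        "INSERT INTO Employees VALUES (?, ?, ?, ?)",
        [(1, "Ada", "Lovelace", 10), (2, "Alan", "Turing", 10), (3, "Grace", "Hopper", 20)],
    )
    plan = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT first_name, last_name FROM Employees WHERE department_id = ?",
        (10,),
    ).fetchall()
    conn.close()
    # The human-readable description is the last column of each plan row
    return " ".join(row[-1] for row in plan)

print(demo_secondary_index())
```

If the plan mentions idx_employees_department, the lookup goes through the index instead of scanning the whole table. Other systems expose similar information through their own EXPLAIN commands.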

9. Best Practices in SQL Key Management

To manage SQL database keys effectively, we should prefer stable surrogate keys over natural keys whose values can change. Furthermore, we should define a primary key for each table and enforce referential integrity with foreign keys. Also, normalization reduces redundancy, indexing frequently queried columns enhances performance, and clear documentation of keys and relationships makes the schema easier to maintain.

This hierarchy clarifies key relationships and dependencies in SQL databases, focusing on data integrity, relationships, and performance:

[Figure: SQL Hierarchy of Keys]

10. Conclusion

In conclusion, effective key management is fundamental to maintaining a well-structured and efficient SQL database. Well-chosen keys enforce data integrity constraints, while indexing speeds up data retrieval.