Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: July 15, 2024
Efficient querying is vital for application performance and data security. SELECT * misuse can cause inefficiencies, exposure of sensitive data, and database schema complexities. As databases grow, it’s important to understand when to avoid this practice.
In this tutorial, we’ll show how SELECT * in SQL queries affects performance, security, and code readability.
SELECT * in SQL gets every column from a table. When we use it, the RDBMS returns all columns from the specified table.
This works well for data exploration or ad-hoc queries in development.
Using SELECT * retrieves all columns from a table, even if they are unnecessary for the task. This increases data retrieval and processing time. To fix this problem, we should specify only the columns we need. That way, our database system will work better and run more smoothly.
We’ll consider our university database with its Student table. This table has several columns. Those are id, name, national_id, birth_date, enrollment_date, graduation_date and GPA, e.g.:
| id | name | national_id | birth_date | enrollment_date | graduation_date | GPA |
| 1001 | John Liu | 123345566 | 2001-04-05 | 2020-01-15 | 2024-06-15 | 4 |
| 1003 | Rita Ora | 132345166 | 2001-01-14 | 2020-01-15 | 2024-06-15 | 4.2 |
To retrieve the list of students, we can execute the following command:
SELECT * FROM Student;
It will show all the columns related to each student. However, what if we only need name, enrollment_date, and GPA for a specific task? SELECT * will return more data than we need, causing an unnecessary data transfer. To avoid this, we should specify only the columns we’ll work with:
SELECT name, enrollment_date, GPA FROM Student;
That way, we transfer less data.
Using SELECT * in SQL queries can be harmful because it might expose private data. With this method, all columns of a table are shown, which could include secret information like passwords or financial details. This mistake makes it easier for fraudsters to access user files and sensitive data.
To prevent this from happening, developers should select only the required columns in the queries. This ensures that only necessary information gets shared with users.
The utilization of SELECT * in SQL queries is associated with a major drawback, namely vulnerability to schema changes. Such alterations can result in various issues related to compatibility and maintenance. A table schema can undergo modifications through column additions, removals, or renaming. So, there may be unexpected changes in the query results.
If we add a new column to an existing table, any SELECT * query will also include this new column. This can cause problems for applications that are not well-equipped to handle additional data.
Moreover, there is another issue to consider. Some applications might expect some columns that aren’t there anymore. They may also access the returned data by column indices, which have now changed.
We’ll consider the table with columns for student details. To access the birth_date column of a row by index, we might use row[4] in our code. If we remove one column (like national_id), row[4] will now refer to enrollment_date. This can break the application logic.
Even when using SELECT *, the code that uses the results should be robust to schema changes.
Using SELECT * in SQL queries can make the code hard to read. This lack of clarity isn’t good for applications that rely on databases. Developers may not know what data is being used when columns aren’t clearly stated. This can cause mistakes or confusion among team members who work on shared projects.
Specifying the column list in SQL queries makes them easier to read and understand. For example, if we want to access id, name, and national_id from the Student table, we should include those column names in our query. That way, it will be clear which data the query returns. This will also make debugging easier because developers will know columns are involved and won’t have to keep checking back on a table schema.
Though not always advisable, SELECT * can have benefits in specific cases:
In this article, we explained how specifying necessary columns in SQL queries improves efficiency, security, and code readability.
SELECT * can be used for development, testing, ad-hoc queries, data exploration, and small tables.
Although SELECT * is sometimes convenient, it has limitations and risks. It can negatively affect the performance, security, and maintainability of the applications relying on its results.