Learn through the super-clean Baeldung Pro experience:
>> Membership and Baeldung Pro.
No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.
Last updated: August 5, 2024
MySQL has ranked as one of the leading open-source relational database management systems (RDBMS) for many years. It has various storage engines tailored to different performance, reliability, and functionality requirements.
Understanding these engines is crucial for database administrators, developers, and IT professionals.
In this tutorial, we’ll explain MySQL storage engines, their features, advantages, and applications.
A storage engine is a component of an RDBMS that stores, retrieves, and manipulates data.
Storage engines define how to organize data on disk and execute queries. They oversee and process transactions, ensuring data security and integrity.
Storages can be optimized for various purposes, such as high-speed data access, transaction processing, or data archiving.
The following command displays a list of available storage engines and their statuses:
SHOW ENGINES;
The result shows us the existing engines and indicates the available ones:
| Engine | Support | Comment | Transactions | XA | Savepoints |
|---|---|---|---|---|---|
FEDERATED |
NO | Federated MySQL storage engine | NULL | NULL | NULL |
MEMORY |
YES | Hash-based, stored in memory, useful for temporary tables | NO | NO | NO |
InnoDB |
DEFAULT | Supports transactions, row-level locking, and foreign keys | YES | YES | YES |
| PERFORMANCE_SCHEMA | YES | Performance Schema | NO | NO | NO |
| MyISAM | YES | MyISAM storage engine | NO | NO | NO |
| MRG_MYISAM | YES | collection of identical MyISAM tables | NO | NO | NO |
| BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears.) | NO | NO | NO |
| CSV | YES | CSV storage engine | NO | NO | NO |
| ARCHIVE | YES | Archive storage engine | NO | NO | NO |
When creating a table in MySQL, use the keyword ENGINE to select the storage engine:
CREATE TABLE table_name (
<column_definition>
) ENGINE=<storage_engine>;
Here, <column_definition> outlines the attributes of each table’s columns: their names, data types, constraints, or other properties. ENGINE=<storage_engine> specifies the particular storage engine we want to use for the table.
MySQL uses the default storage engine if we don’t specify the engine in the CREATE TABLE statement.
In MySQL, we can use different storage engines for different tables in the same database. This option provides flexibility to optimize performance according to each table’s needs.
If our application requires the same features, we may not need different engines dedicated to specific tables. For example, if all tables need strong transactional support, using InnoDB for every table will simplify database management.
FEDERATED tables can access data on remote MySQL servers, simplifying data integration across distributed systems. They are perfect for cases requiring data distribution without replication mechanisms.
We can leverage it to combine information from regional databases into a centralized reporting system.
The MEMORY storage engine provides high-speed data access. It stores data in RAM rather than on a disk. Therefore, we can read and write data amazingly fast.
However, data in MEMORY tables are lost when we restart the server. This volatility makes MEMORY unsuitable for permanent storage. This engine is suitable only for data we can regenerate easily or when data loss isn’t fatal for our application.
For example, we can use MEMORY for a real-time analytics dashboard that reports live metrics and statistics. Incoming data streams through MEMORY tables are stored and processed quickly. Hence, users can instantly access up-to-date information. If we restart the server and lose the current data, it doesn’t matter, as the new live data will come soon.
Since version 5.5, InnoDB has been the default storage engine for MySQL. It’s widely recognized for its robustness and flexibility. InnoDB provides full ACID compliance in transactions. This guarantees a high level of data integrity. It also guarantees users a consistent view of the data at any given time.
InnoDB supports row-level locking, which can help significantly reduce contention and enhance concurrency. It also supports foreign keys, which enforce referential integrity and maintain relationships between tables.
InnoDB’s fault-recovery mechanism ensures no data will be lost if the server fails. This reliability is important in production environments.
For instance, this engine is useful for online stores that handle many sales. It ensures all orders are processed correctly and keeps inventory counts accurate and consistent to prevent issues like double orders.
MySQL has a storage engine known as PERFORMANCE_SCHEMA. It’s a monitoring tool that offers a full picture of the server’s performance. This engine gathers performance data such as queries, transactions, I/O operations, and memory usage. It doesn’t store users or data, unlike the other storage engines.
PERFORMANCE_SCHEMA allows the database administrator to evaluate and optimize performance.
Before InnoDB, MySQL’s default storage engine was MyISAM. MyISAM is still available and can be used, but it’s not as complete and secure as InnoDB, which is ideal for many applications.
MyISAM uses table-level locking. This ensures faster reading. It can also be used for full-text indexing, which enables effective full-text searches on large datasets. It’s also helpful for applications with content management systems and document databases.
MyISAM is easy to set up but doesn’t support transactions or foreign keys. This restricts its potential for applications that need strong data integrity. A blog platform can benefit from its fast reading and full-text indexing capabilities.
MRG_MYISAM storage in MySQL manages multiple similar MyISAM tables as a single table (by merging them). For merging, the tables must share identical column structures. This means they should have the same column names, data types, and column order.
This engine is a good choice for managing large datasets partitioned across tables but queried as one.
The MRG_MYISAM engine supports MyISAM features such as fast read performance and table-level locking. It also enables the data to be distributed across several files for improved organization and is useful in applications requiring efficient data processing, such as logging systems or data warehousing.
However, MHG_MYISAM doesn’t support transactions or foreign key constraints. This is problematic for applications that require data integrity and transactional consistency. Additionally, since the engine uses table-level locking, write operations can become a bottleneck under high concurrency.
The BLACKHOLE storage engine discards any data that’s written to it.
BLACKHOLE is an excellent choice for testing and debugging. It’s often used in replication setups and filters out unwanted data. We can also use it to produce dummy tables for testing purposes.
The CSV storage engine is designed for simplicity. It stores data in the comma-separated values (CSV) format.
CSV tables are easy to create and manipulate, so they’re ideal for quick data import and export tasks. However, CSV doesn’t support indexes or transactions. It’s limited to simple data storage and manipulation tasks.
Alternatively, when migrated from one system to another, a CSV table can serve to temporarily store data. The simplicity makes it easy to pull data from the source and import them into the target system.
This storage engine is suitable for large datasets. It stores them automatically in a compressed format. Compression reduces storage requirements and enhances I/O performance.
ARCHIVE is the ideal choice for historical data recording. It’s also optimized for mass data insertion, batch data processing, or data warehousing. However, ARCHIVE doesn’t support indexes. As a result, its design is simpler, but query performance isn’t as high as compared to other systems like MyISAM.
A practical use case for ARCHIVE is log storage since ARCHIVE can handle large volumes of data with little overhead.
The table below summarizes MySQL storage engines.
| Storage engine | Main Characteristics | Use Cases |
|---|---|---|
| FEDERATED | Accesses data on remote MySQL servers, no data replication | Distributed data integration and centralized reporting systems. |
| MEMORY | Data stored in RAM, high-speed access, volatility | Temporary data storage, real-time analytics dashboards |
| InnoDB | Default engine, supports transactions, row-level locking, foreign keys, ACID compliance, fault-recovery | Transactional applications, e-commerce sites |
| PERFORMANCE_SCHEMA | Monitoring server performance, doesn’t store user data | Database performance evaluation and optimization |
| MyISAM | Table-level locking, no transactions, supports full-text indexing | Read-heavy applications, content management systems, document databases, blog platforms |
| MRG_MyISAM | Merges multiple MyISAM tables, table-level locking, no transactions | Large datasets partitioned across tables, logging systems, data warehousing |
| BLACKHOLE | Discards written data, used for testing and debugging | Replication setups, filtering unwanted data, creating dummy tables |
| CSV | Stores data in CSV format, no indexes or transactions | Quick data import/export tasks, temporary data storage during system migrations |
| ARCHIVE | Compressed storage, for large volumes, no indexes | Historical data recording, mass insertion, batch processing, log storage |
Users often prefer InnoDB and MyISAM storage engines. InnoDB supports transactions, row-level locking, and other features such as foreign keys. These features maintain data integrity and reliability. On the other hand, MyISAM is most suitable for query performance. It offers full-text indexing and table-level locking for faster operations.
In this article, we explained the storage engines we can use in MySQL.
There are several engines in MySQL we can use. They serve different purposes, and choosing the most suitable one optimizes performance and ensures data integrity at the desired level.
InnoDB is usually a good choice for most cases because of its features and reliability. It’s also the default engine.