Last updated: June 13, 2024
In today’s tech landscape, distributed systems are trending due to their advantages over monolithic systems. However, everything in software architecture is a trade-off, and neither solution is bulletproof. One common challenge when it comes to distributed systems is ensuring data consistency across multiple nodes.
In this tutorial, we’ll compare two approaches to managing distributed transactions: Two-Phase Commit and the Saga Pattern.
A transaction is a set of operations we want to perform on our data. Typically, transactions exhibit an all-or-nothing behavior: the transaction is committed if all operations succeed, or it is rolled back if any operation fails. In a monolithic system, this behavior is typically managed by the database.
A distributed transaction also involves a set of operations on data but across multiple services. These transactions are more complex than regular transactions because they require ensuring that each service remains consistent with the others. This involves coordinating the actions and states of multiple independent services to maintain overall data consistency.
As an example, let’s consider the following orchestrated scenario. This is a simulation of a dummy workflow for an online shop:
In this workflow, we can see that placing an order triggers updates in three other services: Payment Service, Inventory Service, and Delivery Service. We’re using the Order Placement Service as an orchestrator and it sends the requests sequentially to our three domain services. Therefore, we have a distributed transaction because we’re updating three different databases.
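To see why this workflow needs a coordination protocol at all, here is a minimal sketch of an orchestrator that calls the three services sequentially with nothing else in place. The class name, the in-memory maps standing in for each service's database, and the `inStock` flag are all assumptions made for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical orchestrator; each map stands in for a separate service's database.
class NaiveOrderPlacement {
    final Map<String, String> paymentDb = new HashMap<>();
    final Map<String, String> inventoryDb = new HashMap<>();
    final Map<String, String> deliveryDb = new HashMap<>();

    // Calls the three services one after another with no transaction protocol.
    void placeOrder(String orderId, boolean inStock) {
        paymentDb.put(orderId, "CHARGED");
        if (!inStock) {
            // The payment is already persisted and nothing rolls it back.
            throw new IllegalStateException("Out of stock for order " + orderId);
        }
        inventoryDb.put(orderId, "RESERVED");
        deliveryDb.put(orderId, "SCHEDULED");
    }
}
```

If the inventory step fails, the payment record stays in place while the other two databases hold nothing for the order. This is the kind of inconsistency that a distributed transaction protocol has to prevent or repair.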
The Two-Phase Commit (2PC) protocol is designed to ensure all nodes in a distributed system either commit or roll back a transaction. Therefore, we can achieve atomicity across multiple database nodes in the context of a distributed transaction.
One node acts as the coordinator, also known as the transaction manager, and initiates the 2PC. The protocol consists, unsurprisingly, of two phases: the preparation phase and the commit phase.
In the preparation phase, the transaction coordinator starts the process by sending a prepare request to all participating nodes. Each participant checks whether it can complete the transaction given its current state and resources.
After that, each participant responds to the coordinator with a vote. There are two possible responses: yes, if the participant is able to commit, and no, if it is not.
Each participant must make its decision durable before replying, so that the decision survives a crash. A pattern such as a Write-Ahead Log is therefore needed for fault tolerance:
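A participant's side of the preparation phase could look roughly like the sketch below. It is an illustration under stated assumptions, not a production implementation: `InventoryParticipant`, its stock counter, and the in-memory list standing in for a durable write-ahead log are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

enum Vote { YES, NO }

// Hypothetical participant; the names and the stock model are illustrative only.
class InventoryParticipant {
    private final int stock;
    private int reserved;                                 // stands in for held locks
    private final List<String> log = new ArrayList<>();   // stands in for a durable write-ahead log

    InventoryParticipant(int stock) {
        this.stock = stock;
    }

    // Prepare phase: check feasibility, record the decision, then vote.
    Vote prepare(String txId, int quantity) {
        if (quantity > stock - reserved) {
            log.add(txId + " VOTE_NO");
            return Vote.NO;
        }
        reserved += quantity;                      // hold the resources until commit or abort
        log.add(txId + " PREPARED " + quantity);   // log the decision before replying
        return Vote.YES;
    }

    List<String> log() {
        return List.copyOf(log);
    }
}
```

The important ordering is that the log entry is written before the vote is returned. A real participant would flush that entry to stable storage so it can recover its promise after a restart.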
In the commit phase, the coordinator collects the votes from all participants. If all participants voted yes, the coordinator decides to commit the transaction. If any participant voted no, the coordinator decides to abort the transaction.
Based on its decision, the coordinator sends commit or abort requests to all the participants. Each participant performs the required action and releases any acquired locks. After that, each participant has to inform the coordinator that it has completed the commit or abort operation:
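Putting both phases together, the coordinator's logic can be sketched as follows. The `TwoPhaseParticipant` interface, the `TwoPhaseCoordinator` class, and the `FixedVoteParticipant` test double are assumptions introduced for this example. A normal method return stands in for the acknowledgment, and timeouts, retries, and the coordinator's own durable log are deliberately left out.

```java
import java.util.List;

// Hypothetical participant contract; a real one would involve network calls.
interface TwoPhaseParticipant {
    boolean prepare(String txId);   // true = "yes" vote, false = "no" vote
    void commit(String txId);
    void abort(String txId);
}

class TwoPhaseCoordinator {
    private final List<TwoPhaseParticipant> participants;

    TwoPhaseCoordinator(List<? extends TwoPhaseParticipant> participants) {
        this.participants = List.copyOf(participants);
    }

    // Returns true if the transaction was committed, false if it was aborted.
    boolean execute(String txId) {
        // Phase 1: collect a vote from every participant (&= does not short-circuit).
        boolean allYes = true;
        for (TwoPhaseParticipant p : participants) {
            allYes &= p.prepare(txId);
        }
        // Phase 2: send the same decision to every participant.
        for (TwoPhaseParticipant p : participants) {
            if (allYes) {
                p.commit(txId);
            } else {
                p.abort(txId);
            }
        }
        return allYes;
    }
}

// Toy in-memory participant that always votes the same way.
class FixedVoteParticipant implements TwoPhaseParticipant {
    private final boolean vote;
    String outcome = "PENDING";

    FixedVoteParticipant(boolean vote) {
        this.vote = vote;
    }

    public boolean prepare(String txId) { return vote; }
    public void commit(String txId) { outcome = "COMMITTED"; }
    public void abort(String txId) { outcome = "ABORTED"; }
}
```

A single `false` vote is enough to make every participant, including those that voted yes, receive an abort.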
The Saga Pattern is an alternative approach to managing distributed transactions, especially useful in microservices architecture where long-lived transactions are involved.
Instead of ensuring atomicity through a single, all-or-nothing transaction, this pattern decomposes a transaction into a series of smaller, independent sub-transactions, also called local transactions. Each sub-transaction is managed by a separate service, and together, they form a saga. If a local transaction fails, the saga executes a series of compensating transactions to undo the changes that were made by the preceding local transactions.
In sagas, we have three different types of transactions:
There are two common saga implementation approaches: choreography and orchestration.
In a choreography-based approach, each local transaction publishes events that trigger local transactions in other services. Therefore, no central coordinator is telling the saga participants what to do.
Example Workflow:
If any step fails, each service involved must execute a compensating transaction to revert its changes.
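A choreography-based saga for the online shop could be sketched as below. The synchronous in-memory `EventBus` stands in for a message broker, and the event names (`OrderCreated`, `PaymentCompleted`, and so on) are hypothetical, chosen only to illustrate how services react to each other's events without a coordinator:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Minimal synchronous event bus; a real system would use a message broker.
class EventBus {
    private final Map<String, List<Consumer<String>>> handlers = new HashMap<>();
    final List<String> published = new ArrayList<>();   // trace of all events, in order

    void subscribe(String eventType, Consumer<String> handler) {
        handlers.computeIfAbsent(eventType, k -> new ArrayList<>()).add(handler);
    }

    void publish(String eventType, String orderId) {
        published.add(eventType);
        for (Consumer<String> handler : handlers.getOrDefault(eventType, List.of())) {
            handler.accept(orderId);
        }
    }
}

class ChoreographySaga {
    // Wires hypothetical services together purely through events.
    static EventBus wire(boolean inventoryAvailable) {
        EventBus bus = new EventBus();
        // Payment service: charges on a new order, refunds if inventory fails later.
        bus.subscribe("OrderCreated", id -> bus.publish("PaymentCompleted", id));
        bus.subscribe("InventoryFailed", id -> bus.publish("PaymentRefunded", id));
        // Inventory service: reserves stock once the payment went through.
        bus.subscribe("PaymentCompleted", id ->
                bus.publish(inventoryAvailable ? "InventoryReserved" : "InventoryFailed", id));
        // Delivery service: schedules delivery once stock is reserved.
        bus.subscribe("InventoryReserved", id -> bus.publish("DeliveryScheduled", id));
        // Order service: cancels the order as its own compensating transaction.
        bus.subscribe("PaymentRefunded", id -> bus.publish("OrderCancelled", id));
        return bus;
    }
}
```

Calling `ChoreographySaga.wire(false).publish("OrderCreated", "o1")` triggers the failure path: the inventory failure event causes the payment service to refund, which in turn causes the order to be cancelled.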
In an orchestration-based saga, a central orchestrator (or coordinator) manages the entire transaction. The orchestrator sends commands to each service to perform their local transactions. It also handles any failures by sending commands to execute compensating transactions as necessary.
Example Workflow:
If any sub-transaction fails, the orchestrator sends commands to undo the preceding steps.
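The orchestrator's behavior can be sketched in a few lines. `SagaStep` and `SagaOrchestrator` are hypothetical names, and each local transaction is modeled as a function that returns `false` on failure. The sketch assumes compensations always succeed, which a real implementation cannot take for granted.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.BooleanSupplier;

// One saga step: a local transaction plus the compensation that undoes it.
class SagaStep {
    final BooleanSupplier action;    // returns false when the local transaction fails
    final Runnable compensation;

    SagaStep(BooleanSupplier action, Runnable compensation) {
        this.action = action;
        this.compensation = compensation;
    }
}

class SagaOrchestrator {
    // Runs the steps in order; on failure, compensates completed steps in reverse order.
    static boolean run(List<SagaStep> steps) {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            if (step.action.getAsBoolean()) {
                completed.push(step);
            } else {
                // The failed step made no change, so only earlier steps are undone.
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Using a stack means the most recent successful step is compensated first, so with payment, inventory, and delivery steps, a delivery failure releases the inventory before refunding the payment.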
There are several distinctions between 2PC and the Saga Pattern, which should be carefully considered before choosing one over the other.
Two-Phase Commit ensures strong consistency by maintaining atomicity across distributed transactions. All participating nodes either commit or roll back the transaction, leading to a consistent state across the system.
On the other hand, the Saga Pattern ensures eventual consistency rather than strong consistency. Each sub-transaction is committed independently, and if a failure occurs, compensating transactions are executed to revert the changes. This approach may result in temporary inconsistencies.
The 2PC protocol is best suited for short-lived transactions. This approach requires locks to be held until the transaction is either committed or aborted, which can lead to performance issues in long-running transactions.
Saga Pattern is more suitable for long-lived transactions since each local transaction is independent. Locks are not held for the entire duration of the saga, minimizing performance bottlenecks.
Two-Phase Commit is simpler to implement in terms of logic since it relies on a single atomic operation. However, it can be challenging to scale and manage due to the need for coordination and the risk of blocking in case of failures.
Sagas are more complex to implement because they require defining compensating transactions and managing partial failures.
In 2PC, a central coordinator manages the whole transaction, which makes it a single point of failure. An orchestration-based saga has a similar central component. However, the Saga Pattern also offers a more decentralized alternative: choreography.
Usually, strong consistency comes at the cost of scalability. 2PC is therefore less scalable, because it requires coordination and holds locks across distributed nodes until the transaction completes.
On the other hand, the Saga Pattern is more scalable because each service manages its transactions independently.
As we’ve seen throughout this article, both approaches address the problem of managing distributed transactions. Each has advantages and disadvantages that should be carefully considered before choosing one. Ultimately, the choice depends on the system’s specific requirements for consistency, scalability, and fault tolerance.