Distributed Systems: The Byzantine Generals Problem

1. Introduction

In this tutorial, we’ll learn what the Byzantine generals problem is. We’ll explain the problem in detail and discuss various available algorithms to solve it.

2. The Byzantine Generals Problem

The Byzantine generals problem is a well-known concept in distributed computing and computer science that describes the difficulty of coordinating the actions of several independent parties in a distributed system.

Marshall Pease, Robert Shostak, and Leslie Lamport developed the idea in 1982.

We often frame the problem as a military metaphor, in which a group of generals are camping around a city and must decide whether to attack or retreat. The generals can only communicate with one another by sending messages, but they cannot be sure that the messages are authentic. Therefore, the generals must find a way to reach a consensus despite the possibility of deception and betrayal.

The metaphor describes a group of generals encamped around a city who must decide whether to attack or retreat. The generals can only communicate with one another by sending messages, but they cannot be sure that the messages are real.

Therefore, the generals must find a way to reach a consensus despite the possibility of deception and betrayal. The metaphor highlights the problem of achieving consensus in a decentralized environment, where the parties cannot trust one another and must rely on communication to coordinate their actions.

3. The Byzantine Generals Problem as a Distributed Systems Challenge

We use the military metaphor presented above to illustrate the difficulties of achieving consensus in a decentralized environment and the importance of finding solutions to the problem to maintain the integrity of distributed systems.

We often consider this problem a fundamental challenge in distributed systems, as it illustrates the difficulties of achieving consensus in a decentralized environment. The problem is particularly relevant in distributed systems such as blockchain, where multiple parties must reach a consensus on the system’s state to maintain its integrity.

The following diagram depicts a group of nodes representing the generals, connected by lines representing the communication channels:

The diagram illustrates these nodes in different colors to represent the loyal generals, the ones who are Byzantine (mischievous) or those who failed. The diagram also shows the decision-making process, where each general sends their vote (attack or retreat) to the other generals and decides based on the majority vote.

A label on each circle indicates whether the general is loyal, Byzantine, or failed. The diagram uses lines to show the communication channels between the generals, with some lines indicating unreliable communication.

The diagram includes arrows indicating the message that is being sent. A red or green arrow indicates the final decision (attack or retreat), and a star with the arrow indicates whether the decision was correct (finally).

4. Key Parts of the Byzantine Generals Problem

The Byzantine generals problem has several key parts that are important to understand:

there are multiple independent parties (generals in the military metaphor) that must coordinate their actions
the parties cannot rely on a central authority and must coordinate their actions in a decentralized environment
we can only communicate with one another by sending messages
we cannot ensure that the messages are authentic and, therefore, cannot fully trust one another
the parties must reach a consensus despite the possibility of deception and betrayal from other parties

To summarize, the problem illustrates the difficulty of achieving consensus in a decentralized environment where parties cannot trust one another and must rely on communication to coordinate their actions.

5. Common Solutions to the Byzantine Generals Problem

There are several common solutions to the Byzantine generals problem.

One of the earliest solutions proposed to the Byzantine generals problem is the Byzantine fault tolerance (BFT) algorithm, which is based on the concept of a consensus protocol. BFT is a mechanism that allows a group of parties to reach a consensus despite faulty actors. A supermajority of parties must agree on a decision before implementing it.

Another solution is the practical Byzantine fault tolerance (PBFT) algorithm, a variation of BFT. PBFT is more efficient and scalable than BFT and widely used in blockchain systems such as Ethereum.

A more recent solution is the Tendermint consensus algorithm, a form of BFT designed to be more efficient and secure than traditional BFT algorithms. Tendermint uses a combination of voting and proof-of-stake mechanisms to achieve consensus, and we use it in several blockchain projects such as Cosmos and Terra.

Raft Consensus and Paxos Algorithms are other solutions to the Byzantine generals problem based on network par. The design of the raft consensus algorithm handles network partition, a situation where there are disruptions in communication channels between parties. The design of the Paxos algorithm handles network partition and ensures that all parties agree on a single outcome.

Delegated Proof of Stake (DPoS), Proof of Authority (PoA), and HoneyBadger BFT Algorithms are other commonly used algorithms for solving the Byzantine generals problem.

6. Real-Life Situations Where the Byzantine Generals Problem Is Applicable

The Byzantine generals problem is a widely applicable concept in distributed systems and computer science and can be found in several real-life situations. The table below shows the domains and applications in which we apply the Byzantine generals problem solutions:

Domain	Application
Blockchain Technology	The problem of achieving consensus in a decentralized environment is particularly relevant in the context of blockchain systems, where multiple parties must reach a consensus on the state of the system to maintain its integrity.
Distributed Databases	In distributed databases, multiple nodes need to agree on the state of the data, and The Byzantine generals problem illustrates the difficulties of achieving this consensus in a decentralized environment.
Cloud Computing	In cloud computing, multiple parties provide resources, and the problem of achieving consensus on the state of the system is critical for maintaining its integrity.
Distributed Ledgers	Distributed ledgers, such as blockchain, require multiple parties to agree on the state of the ledger, and the Byzantine generals problem illustrates the difficulties of achieving this consensus in a decentralized environment.
Distributed Storage	In distributed storage, multiple parties need to agree on the state of the data, and the Byzantine generals problem illustrates the difficulties of achieving this consensus in a decentralized environment.
Multi-Agent Systems	In multi-agent systems, multiple agents need to agree on the state of the system and make decisions, and the Byzantine generals problem illustrates the difficulties of achieving this consensus in a decentralized environment.
Internet of Things (IoT)	In IoT, multiple devices need to agree on the state of the system and make decisions, and the Byzantine generals problem illustrates the difficulties of achieving this consensus in a decentralized environment.
Distributed Systems	The Byzantine generals problem is applicable in any distributed system where multiple parties need to agree on the state of the system and make decisions, and it illustrates the difficulties of achieving this consensus in a decentralized environment.

7. Anatomy of Any Byzantine Generals Problem Solution Algorithm

Here is a pseudocode that explains and solves the Byzantine generals problem using any Byzantine generals problem solution algorithm:

algorithm Main(G):
    // INPUT
    //    G = array of generals
    // OUTPUT
    //    a solution to the Byzantine generals problem

    g <- getLength(G)    // Number of generals
    M <- Array to store messages from each general
    B <- Array to store besieged (surrendered) generals
    b <- Number of besieged generals

    for i <- 1 to g:    // Acquiring messages ("Attack" or "Retreat") from Generals
        M[i] <- ReadMessage(G[i])    // Messages are written to message array M

    MajorityDecision <- CheckMajority(G[1] ... G[g], M[1] ... M[g])

    if MajorityDecision = true:
        Print("A supermajority of Generals agree to Attack")

    B <- CheckBesieged(G[1] ... G[g], M[1] ... M[g], MajorityDecision)
    b <- getLength(B)    // Number of besieged generals

    Print("Names of besieged generals:")
    for i <- 1 to b:
        Print(B[i].Name)    // Printing names of besieged generals

In this pseudocode, we define the number of generals and the number of Byzantine generals as $g$ and $b$ , respectively. We use the array of messages to store the messages sent by each general.

7.1. The CheckMajority Function

The $CheckMajority$ function checks if a supermajority of the generals agrees on a decision. Different algorithms use different mechanisms to calculate the supermajority condition. The function returns true when we achieve a supermajority:

algorithm CheckMajority(G, M, Threshold):
    // INPUT
    //    G = array of generals
    //    M = array of messages
    //    Threshold = the threshold for majority (algorithm-specific)
    // OUTPUT
    //    A binary result on a supermajority decision to attack

    g <- getLength(G)    // Number of generals
    n <- getLength(M)    // Number of messages

    MajorityDecision <- false    // The supermajority decision
    Measurement <- 0    // Initialize Measurement as a number

    for i <- 1 to n:    // Analyzing all messages
        Algorithm-specific steps go here...
        Measurement = Threshold:    // The measurement qualifies the supermajority condition
        MajorityDecision <- true

    return MajorityDecision

Sometimes we may use a subset of the message array rather than the entire one. By superset of messages, we imply the set containing more than one instance of messages from a particular general, sent at different times or to different destinations. The $CheckMajority$ function is a distinct function which we do not merge with the main function for a variety of reasons:

Depending on the algorithm used to address the Byzantine generals problem, we may call the $CheckMajority$ function numerous times in the main function
For some solution algorithms, we may invoke the $CheckMajority$ function recursively for a subset or superset of messages
Some algorithms may use more than one $CheckMajority$ function. The main function may call one or more $CheckMajority$ functions on a subset or superset of messages

7.2. The CheckBesieged Function

Different algorithms use different mechanisms to measure the trust scores of the generals. By checking each general’s incoming and outgoing message pairs, the $CheckBesieged$ function determines which generals have possibly been besieged (surrendered). When evaluating the generals, the $\boldsymbol{MajorityDecision}$ is also taken into account:

algorithm CheckBesieged(G, M, MajorityDecision, TrustThreshold):
    // INPUT
    //    G = array of generals
    //    M = array of messages
    //    MajorityDecision = the supermajority decision
    //    TrustThreshold = the trust threshold (algorithm-specific)
    // OUTPUT
    //    A list of besieged generals

    g <- getLength(G)    // Number of generals
    n <- getLength(M)    // Number of messages
    B <- empty array     // Array to store besieged generals
    k <- 0               // Index for array B
    TrustMeasurement <- 0   // Initialize TrustMeasurement as a number

    for i <- 1 to g:
        TrustMeasurement <- 0  // Initialize for each general
        for j <- 1 to n:  // Analyzing all message pairs
            Algorithm-specific steps go here ...
            TrustMeasurement <- Updated value for TrustMeasurement  // Updating the trust measurement for general Gi

        if TrustMeasurement <= TrustThreshold:  // Besieged general Gi identified
            k <- k + 1
            B[k] <- G[i]  // Gi added to array B

    return B[1:k]

Some algorithms may use the list of besieged generals in decision-making (in future rounds), such as when determining the reputation score for generals.

8. Examples of Byzantine Generals Problem Solution Algorithms

As seen in the above section, the general structure of the algorithms used to solve the Byzantine generals problem has three main functions. We assume the $Main$ and $CheckBesieged$ functions are identical for the following algorithms. We are comparing the presented algorithms based on how they arrive at the supermajority decision.

8.1. Solving the Byzantine Generals Problem Using the Byzantine Fault Tolerance (BFT) Algorithm

In the Byzantine fault tolerance (BFT) Algorithm, the $CheckMajority$ function checks if a supermajority of the generals agrees on a decision. It does this by counting the number of “Attack” in the messages array and checking if it’s greater than $g/2$ . If it is, it returns $true$ , indicating that a supermajority of the generals agrees on the decision:

algorithm CheckMajorityBFT(G, M):
    // INPUT
    //    G = array of generals
    //    M = array of messages
    // OUTPUT
    //    A binary result on a supermajority decision to attack

    Threshold <- g / 2

    for i <- 1 to n:
        if M[i] = "Attack":
            Measurement <- Measurement + 1
        // Updating the measurement
    
    return Measurement >= Threshold

8.2. Solving the Byzantine Generals Problem Using the Reputation Scores of Generals

A reputation score helps you understand and compare a general’s reputation to that of other generals. We can calculate this based on various factors. For the current example, let us assume that the reputation scores are already available to the $CheckMajority$ function. The $CheckMajority$ function now attaches weights to the messages coming from different generals. It does this by measuring the number of “Attack” in the messages array and checking if it’s greater than a threshold value:

algorithm CheckMajority(G, M, R, Threshold):
    // INPUT
    //    G = array of generals
    //    M = array of messages
    //    R = array of reputation scores (A number between 0 and 1)
    //        Ri = score for general sending messages Mi
    //    Threshold = the threshold for weighted majority decision
    // OUTPUT
    //    A binary result on a supermajority decision to attack

    n <- getLength(M)  // Number of messages

    Measurement <- 0  // Initialize Measurement as a number

    for i <- 1 to n:
        if M[i] = "Attack":
            Measurement = Threshold:
        MajorityDecision <- true
    else:
        MajorityDecision <- false

    return MajorityDecision

9. Conclusion

In this article, we learned the origins of the Byzantine generals problem and its commonly deployed solutions. We explored key components of a possible solution to the problem and studied the Byzantine fault tolerance algorithm in detail.

Learn Java Collections

Learn Spring

Learn Maven

View All Courses

Core Concepts

Operating Systems

Neural Networks

Graph Theory

Latex

Full Archive

About Baeldung