Two Generals’ Problem and Idempotency
Introduction
The Two Generals’ Problem is a thought experiment in computer science and distributed systems that highlights the challenge of achieving reliable communication over an unreliable network. The problem remains unsolved and illustrates the inherent difficulty of coordinating actions between two parties (generals, in this case) when messages can be lost or interrupted.
The Setup
Two generals are stationed on opposite hills and need to coordinate an attack on a city located between them. For the attack to succeed, both generals must attack at the same time. The only way for them to communicate is by sending a messenger across the valley, but the messenger might be intercepted or lost.
The Problem
The difficulty arises when the generals try to reach a consensus. Since both must attack simultaneously and one army alone is not enough to capture the city, communication is essential. However, the lack of guaranteed delivery creates an inconsistent state.
For example, General B might send a message: “Let’s attack at 10:00 on June 1st.” After sending it, General B cannot be sure if the messenger reached General A or was intercepted. Without certainty, General B hesitates to attack. As a solution, General A could send back a confirmation: “I have received your message to attack at 10:00 on June 1st.” Yet, this messenger might also be intercepted, leaving General A unsure if the confirmation arrived. General B could then send a “confirmation of confirmation,” but this, too, might be lost.
This leads to the realization that no matter how many confirmations are exchanged, there is no absolute guarantee that both generals will agree on a plan to attack.
Possible scenarios:
- If the first messenger is captured, General A never receives the plan, and neither general will attack.
- If General A receives the message but their confirmation is lost, General B will assume the message was not delivered.
- Even if both the message and the confirmation are received, either side may still hesitate, fearing the last confirmation was lost.
In all cases, communication unreliability prevents guaranteed consensus.
Comparison with TCP
TCP uses a three-way handshake (SYN, SYN-ACK, ACK) to establish a reliable connection. This ensures that both sides can send messages to each other, but it does not verify the content of those messages. For data transmission, TCP uses duplicate cumulative acknowledgements and timeout-based retransmissions to ensure delivery.
In the Two Generals’ Problem, consensus must be reached without retries, which is fundamentally different. Every message exchanged has a risk of loss, and the more messages sent, the greater the chance of losing one. TCP aims for reliable delivery, not guaranteed consensus on a plan, so it does not solve the Two Generals’ Problem.
Mitigation Strategies
While 100% reliability is impossible, the impact can be reduced:
Sending many messengers with serial numbers:
General B could send multiple messengers, each with a unique sequence number. If at least one messenger arrives, General A receives the message and can estimate channel reliability by noting missing numbers. TCP uses a similar sequence numbering approach, and Kafka applies it persistently to guarantee “exactly once” delivery.
Periodic message sending:
General B could send a message, wait a fixed time (enough for delivery and return), and resend if no confirmation arrives. This introduces retries, similar to TCP’s timeout-based retransmission. However, even with retries, all messages could still be lost in an unreliable network.
Both strategies risk duplicate messages. In computing, handling duplicates requires an idempotent consumer—a receiver that can process the same message multiple times without changing the result.
Idempotency
Idempotency means an operation can be applied repeatedly without changing the outcome beyond the first application. It allows safe retries by guaranteeing at-most-once effects.
In remote calls, it’s impossible to know for sure whether a request was received due to possible failures en route or during response. In APIs, idempotency means clients can send the same request repeatedly and get the same result. To implement this, the receiver uses a unique client-generated key (commonly a UUID) to detect and discard duplicates.
Implementation
On the receiving side, the system checks a persistent store to see if a request with the same key has been processed before. This is often implemented with a database enforcing a unique constraint.
Two approaches to persisting the key:
- Eager: Insert before processing the message (requires cleanup if processing fails).
- Lazy: Insert after processing, in the same transaction, to allow rollback if the key already exists.
When handling duplicates, it’s important to decide whether to ignore them, log them, or take corrective action. The key’s lifetime should match the maximum expected retry window, ranging from seconds to days depending on system requirements.
Idempotency in HTTP APIs
HTTP methods like GET, PUT, DELETE, HEAD, OPTIONS, and TRACE are idempotent by design. POST and PATCH are not, since multiple calls can create multiple resources or increment values. The IETF proposes using an Idempotency-Key header to make unsafe operations idempotent.
A server implementing this:
- Stores the key, request hash, and response.
- On receiving a duplicate with the same key and identical request hash, it returns the stored response.
- If the hash differs, it returns an error.
Conclusion
The Two Generals’ Problem shows that over an unreliable network, perfect coordination is impossible. Even with retries, uncertainty remains, and duplicates must be handled safely. Idempotency provides a practical way to make retrying safe, ensuring the same request can be processed multiple times without side effects.