By the end of today’s lecture, you will learn:
In the Byzantine Broadcast problem:
attack
or to retreat
(or 0
or 1
, or any other binary choice).There are a total of \(n\) parties, including the leader.
In the synchronous setting, our goal is to design a protocol – that is, a method that each party can use to decide what messages to send and to whom – that protects the honest parties as follows.
So far, we have two protocols that provide Byzantine Broadcast in the synchronous setting: the Dolev-Strong protocol and the phase-king protocol.
Theorem. In a plain pairwise authenticated channel model without any setup assumptions such as a PKI, Byzantine Broadcast is impossible if at least \(n/3\) parties are corrupt.
Finally, starting from synchronous broadcast, we can construct a blockchain in the synchronous, permissioned setting.
Goal for this lecture: Can we build a Byzantine reliable broadcast protocol that is asynchronous, does not require trusted setup, and can tolerate the maximum number of malicious parties \(f \leq \frac{n-1}{3}\)?
|
Dolev-Strong |
Phase-King |
Goal Protocol |
---|---|---|---|
Synchronous network |
YES | YES | NO |
PKI |
YES |
NO | NO |
Number of malicious parties |
\(f \leq n\) | \(f \leq \frac{n-1}{3}\) | \(f \leq \frac{n-1}{3}\) |
Number of rounds |
\(f+1\) | \(3(f+1)\) | constant |
At this point you may be asking yourself: why not stick with the synchronous network assumption? So far, both protocols we have seen have been synchronous.
However, this is a strong assumption to make. We can’t assume that the network delays are always predictable and that the messages will arrive within some time bound \(\Delta\). We can’t assume that all the clocks of the nodes are synchronized and that all nodes can enter and exist rounds at the exact time.
Basically: the internet is unpredictible and clocks are imprecise!
Today, we will examine a new model, known as the asynchronous model, that eliminates the requirement for nodes to have their own clocks.
Within an asynchronous consensus protocol, each node is only activated when it receives a message from the network, at which point it performs some computation and selects messages to send (if any). This model looks challenging because the adversary can delay messages by any bounded time.
The only good news about an asynchronous network is that we still assume that the adversary cannot drop messages in transit. So if a message is sent from honest Alice to honest Bob, it will eventually arrive at its destination… even if it takes awhile.
(On the other hand, messages sent from a malicious Mallory might never arrive – because the malicious party might never have sent it in the first place!)
Recall the broadcast properties we have seen:
In asynchronous broadcast, the properties are slightly different in two ways.
There is no termination property
The reason for this is that the adversary can delay messages arbitrarly, and there is no failure detector. As a result, honest parties don’t know whether they are dealing with a slow message/party or a failed party that didn’t send anything. In particular, if the leader is malicious, then they can simply choose not to send anything. The honest recipients will then wait forever for a message that in reality is never coming but that they believe is just very slow to arrive.
We add the word “eventually” to the consistency and validity definitions.
The reason for adding this disclaimer is that the adversary can delay messages for a long time, but it must eventually let them pass between honest parties.
Definition. An Asynchronous Byzantine Reliable Broadcast protocol satisfies the following two properties:
Let’s try to build an asynchronous Byzantine reliable broadcast protocol.
Tips about the asynchronous setting. Before we begin, here are some helpful tips to keep in mind when working in an asynchronous system:
A party cannot decide an action based on a timer or the ending of a round. For example, it is not permitted to say that a party should perform a particular action if it does not hear from a party after some time (say, a few seconds).
In an asynchronous system, an honest party never should wait to hear from more than \(n-f\) parties before sending a message of its own. If an honest party waits for more than \(n-f\) parties, it risks waiting forever since there are \(f\) malicious parties who might never send anything.
Following on the previous point: if a party hears from \(n-f\) parties, then \(f\) of those \(n-f\) parties might be malicious. If \(n \geq 3f+1\), then the best that we can say for sure is that \(f+1\) of these parties are honest (out of the \(n-f\) parties that replied).
We will construct a protocol in which each party has at most two “steps,” or occasions to send data. These would be called “rounds” in synchronous protocols, but now they no longer have to occur at the same time by all parties.
In each step, an honest party will send a maximum of only one message. (As always, we can never impose limits on what the malicious parties do.)
After these two steps, each party chooses an output as follows:
Once again, we provide pseudocode from the perspective of a single honest party \(P_i\). (By convention, we presume that the first party \(P_1\) is the leader.)
Note that our code is written in an event-driven programming style. That’s because we have no idea when messages arrive. Whenever they do come though, a party can react to it and potentially send a message of its own.
input: the leader has an input message v, other parties have no input
leader_send: the leader party P_1 sends <leader, v> to all parties (the other parties send nothing)
echo: upon hearing <leader, v> from the leader P_1 for the first time,
send <echo, v> to all parties
output: upon hearing n-f <echo, v> messages coming from n-f distinct parties containing the same value v,
output v
(implicitly, otherwise output nothing, i.e., never terminate)
Question. Does this protocol satisfy the two properties of an asynchronous Byzantine reliable broadcast protocol, reproduced below?
The answer is no!
Let’s call the protocol above the Simplified Bracha protocol, for reasons we will see in a few minutes.
In this protocol, it is possible that some honest parties complete the protocol but others do not – which violates the consistency rule.
Still, we are making progress. This protocol satisfies a relaxed notion of consistency.
Theorem. If \(n \geq 3f + 1\), then the Simplified Bracha protocol satisfies validity and weak consistency.
Proof of validity:
Proof of weak consistency:
In this section, we will study Bracha’s reliable broadcast protocol. Initially designed by Gabriel Bracha in 1987, this protocol is one of the most important building blocks for Asynchronous Byzantine Fault tolerant protocols.
Our goal here is to strengthen the consistency property of the previous protocol. So far, we have a weaker notion of consistency between the parties that finish. That is:
Bracha’s idea is to add a third communication, called the “ready step.” The idea is that if an honest party received enough echos of a single message to be happy with it in the previous protocol, then it will send it along one more time and wait for other honest parties to finish the protocol.
Only when the honest party makes sure that everyone can else has the ability to determine the output value, then it can itself finish the protocol and output the desired value. In this way, either all honest parties complete the protocol, or none of them do.
After the leader and echo steps from before, let’s add a third communication round that essentially serves as a kind of second echo. Only if a party hears \(n-f\) of these new ready
messages do they complete the protocol.
Here is pseudocode of this new protocol. Only the ready and output procedures are changed from before.
input: the leader has an input message v, other parties have no input
leader_send: the leader party P_1 sends <leader, v> to all parties (the other parties send nothing)
echo: upon hearing <leader, v> from the leader P_1 for the first time,
send <echo, v> to all parties
ready: upon hearing n-f <echo, v> messages coming from n-f distinct parties containing the same value v,
send <ready, v> to all parties
output: upon hearing n-f <ready, v> messages coming from n-f distinct parties containing the same value v,
output v
(implicitly, otherwise output nothing, i.e., never terminate)
Question. Unfortunately, this protocol does not quite work to achieve asynchronous broadcast. Can you see why?
Remember the helpful tips from above about working in an asynchronous system:
The issue here is with hint #3. If an honest party hears \(n-f\) ready
messages, then:
Question. Can you think of a way to fix this problem and make the protocol work?
Gabriel Bracha’s solution to this problem is to add an amplification step. We should amplify the number of ready
messages from \(f+1\) honest parties to \(n-f\) honest parties, so that all honest parties can eventually finish!’
Specifically, there are now two events that will cause an honest party to send a ready
message: either it has received \(n - f\) echo messages or it has received \(f + 1\) ready messages. The latter constraint means that at least one honest party is ready to finish the protocol.
input: the leader has an input message v, other parties have no input
leader_send: the leader party P_1 sends <leader, v> to all parties (the other parties send nothing)
echo: upon hearing <leader, v> from the leader P_1 for the first time,
send <echo, v> to all parties
ready_event1: upon hearing n-f <echo, v> messages from n-f distinct parties containing the same value v,
send <ready, v> to all parties
ready_event2: upon hearing f+1 <ready, v> messages from f+1 distinct parties containing the same v, and not having previously sent a ready,
send <ready, v> to all parties
output: upon hearing n-f <ready, v> messages coming from n-f distinct parties containing the same value v,
output v
(implicitly, otherwise output nothing, i.e., never terminate)
This fixes the only remaining problem, and the construction is now complete!
Theorem. If \(n \geq 3f + 1\), then Bracha’s protocol is an asynchronous Byzantine reliable broadcast protocol.
Proof of validity: This proof is largely similar to the previous proof for the simplified Bracha protocol, only now each party communicates up to three times.
If the leader is honest, then every honest party is going to hear the value \(v\) after the leader step, and therefore every honest party will echo \(v\) to all other parties.
Hence, every honest party will eventually hear at least \(n-f\) echo messages with the value \(v\) coming from the honest parties and will be able to terminate send a ready message with \(v\).
Therefore, every honest party will eventually hear at least \(n - f\) ready messages with the same \(v\) and at most \(f\) ready messages containing other values. Since there are only \(f\) malicious parties, the adversary Mallory cannot convince even a single honest party to output a different value \(v'\).
It remains to prove consistency. To do so, it is helpful first to consider the following lemma.
Lemma. If two honest parties send a ready
message, then they must send it for the same value \(v\).
Proof of the lemma. This is also a proof by contradiction.
Assume for the sake of contradiction that two parties \(P_1\) and \(P_2\) output two different values \(v_1\) and \(v_2\) respectively.
This means that party \(P_1\) either:
echo
messages from distinct honest parties with the same value \(v_1\), orready
message from another honest party \(P_i\) containing the value \(v_1\).If the second case is true, then it means that party \(P_i\) also either heard \(n-2f\) echos with \(v_1\) or heard a ready message with \(v_1\) from yet another honest party. Eventually, we will be able to find some honest party that was the first to create a ready message with value \(v_1\), and that party must have heard \(n-2f\) echo
messages from distinct honest parties with the same value \(v_1\).
Similarly, party \(P_2\) (or some other honest party) heard \(n-2f\) echo
messages from distinct honest parties with the same value \(v_2\).
Since \(n \geq 3f+1\), by the pigeonhole principle it follows that there was at least one honest party who sent an echo message containing values \(v_1\) and \(v_2\) to two different recipients. But this is a contradiction, because honest parties stick to the protocol and send the same echo message to everyone.
Now we can complete the proof of Bracha’s broadcast protocol.
Proof of consistency. We want to show that: if an honest party outputs \(v\), then all honest parties will eventually output \(v\).
ready
message will do so for the same value \(v\).ready
messages with \(v\), of which at least \(f+1\) came from honest parties.ready
message to everyone. As a consequence, all honest parties will be able to send a ready
for message \(v\) due to seeing the ready messages from these \(f+1\) honest parties (if they have not already done so due to seeing \(n-f\) echoes).ready
message containing the same value \(v\), so all of them will complete the protocol.Note that it still remains possible that no honest parties reach the state of being able to output a message. Hence, asynchronous protocols need not terminate; they can stall forever.
In the last lecture, we learned that we can use synchronous Byzantine broadcast to build a permissioned blockchain.
Question. Can we use asynchronous Byzantine broadacast to build a permissioned blockchain in the asynchronous setting?
Actually, this doesn’t quite work. The problem is that Bracha’s broadcast protocol doesn’t guarantee termination.
The synchronous blockchain protocol that we designed last time relied on the idea that each party takes a turn being the designated leader and creating a new block (first party \(P_1\), then party \(P_2\), and so on). But in the asynchronous setting, we may never be certain when the epoch where Party 1 is the designated leader has ended, so that we can move on to the next epoch with Party 2 as the designated leader.
At first glance, this looks to be an insurmountable problem. It is actually impossible to build an asynchronous Byzantine Broadcast protocol that guarantees termination. The problem is that the designated leader might simply refuse to participate in the protocol, and the honest parties will wait forever to listen for a message that may never arrive.
In some sense, the problem here stems from the fact that we are putting all of our faith in a single party to serve as the designated leader.
So let’s back away from that. We can define a new, slightly more general question called Byzantine Agreement. This is like Byzantine Broadcast, except that now everyone gets to send a message at the same time.
For this new problem:
attack
or retreat
.Let’s modify our properties from Byzantine Broadcast to work in the new, more sophisticated setting of Byzantine Agreement.
Consistency is the same as before:
all honest parties output the same value.
Validity only needs to hold if everyone starts with the same message… or at least, all the honest parties do, since we can never be sure what the malicious parties are doing. Formally, we guarantee that:
if all honest parties start with the same input value, then this shared value is guaranteed to become the output value.
Liveness is the asynchronous version of what we used to call termination. It states that:
all honest nodes output eventually (assuming that messages are delivered eventually, albeit with arbitrary delays)
In the synchronous setting, there are many ways to build Byzantine Agreement. One simple approach is to run \(n\) copies of a Byzantine Broadcast protocol (e.g., Dolev-Strong) in parallel, where everyone is the designated leader in one of them, and then take a majority vote of the results.
This works as long as \(f < \frac{n}{2}\), since otherwise the malicious parties are in the majority and can overrule the honest parties’ values. (And it turns out that \(f < \frac{n}{2}\) is the best that one can do.)
But the asynchronous setting is where things really get interesting. It turns out that:
CommonCoin
, it is possible to build a randomized protocol for asynchronous Byzantine Agreement that can withstand up to \(f < \frac{n}{3}\) malicious parties!In other words, we can achieve liveness with Byzantine Agreement, even though it was impossible to accomplish with Byzantine Broadcast. This is because we are no longer relying on a single leader who might just refuse to participate in the protocol; instead, everyone is allowed to contribute equally.
I don’t have time to describe the randomized protocol for asynchronous Byzantine Agreement in today’s class. You will read it for yourself in Chapter 13 of Elaine Shi’s textbook.
Hopefully this insight also provides better context for Satoshi Nakamoto’s contribution. The Bitcoin blockchain also achieves consistency and liveness in an asynchronous network. It does so in a permissionless setting where up to 1/2 of the computing power can be malicious (rather than up to \(1/3\) of parties).
This is just one of many amazing applications of a CommonCoin
! Next time, we will explore in more detail the problem of generating common randomness.
DS 653 — Lecture 12