Lecture 10: Byzantine Broadcast, Without Setup
Announcements
- Required reading: Shi, chapters 3-5
- HW 4 is due on Friday 2/28 at 10pm
- Test 1 has been graded, and revisions are due on Monday 3/3 at 10pm
Learning outcomes
By the end of today’s lecture, you will learn:
- The role that trusted setup can play in the design of a cryptographic protocol, and the benefit of avoiding trusted setup.
- That Byzantine Broadcast is impossible to achieve without a PKI, if at least 1/3 of the parties are malicious.
- That Byzantine Broadcast is possible without a PKI using the phase-king protocol, if fewer than 1/3 of the parties are malicious.
Review from last lecture
Last time, we began our exploration of the Byzantine Broadcast problem. In this problem:
- The commanding general, called the leader, is trying to send a message to
attack
or toretreat
(or0
or1
, or any other binary choice). - The remaining generals, called parties or nodes, want to reach agreement on the strategy.
There are a total of \(n\) parties, including the leader.
However, the generals cannot gather in a single location. They can only send messages back and forth to each other, one at a time.
This restriction is similar to how the Internet works today.
“The Internet is just the world passing notes in a classroom.”
– Jon Stewart
Even worse, a subset of \(f\) generals (possibly including the leader) may secretly be malicious. The malicious generals don’t care about the `attack` vs. `retreat` decision; instead, their goal is to break consensus.
Our goal is to design a protocol – that is, a method that each party can use to decide what messages to send and to whom – that guarantees the following properties.
- Termination: All honest parties eventually complete the protocol and output something (i.e., the protocol doesn’t go on forever).
- Validity: If the leader is honest and sends the bit \(b\), then all honest parties should output \(b\).
- Consistency: If two honest parties output \(b\) and \(b'\) respectively, then \(b = b'\).
(We sometimes omit a discussion of the termination property when it is clear by construction.)
Note that our security guarantees only protect the honest parties. That’s because we have no idea what the malicious parties will do:
- We cannot stop them from taking a different action, and
- We cannot protect one malicious party from another malicious party.
This is a common feature of all of the security guarantees we will see throughout this course: we only strive to protect the honest parties. In particular: if the leader is malicious, we still insist that the honest parties reach a consistent decision. It doesn’t matter whether the decision is 0 or 1, as long as they all agree.
Last time, we also discussed several possible types of assumptions that we could make about the communication network.
- Network type: The synchronous network assumption asserts that there is an upper bound on the time it takes for a message to travel between two parties, and as a result the protocol can be split into rounds that each last for this maximum length. An asynchronous network imposes no such requirement.
- Adversary type: We assume some upper bound on the number of parties \(f\) under the adversary’s control. The adversary might control a dishonest majority, or there might be an honest majority that can reach agreement through a majority vote.
- Network setup: With a public key infrastructure (PKI), every party knows the public key of every other party, so anyone can verify digital signatures that are relayed to them. Without this kind of setup, we assume only that there are authenticated channels between each pair of parties.
Finally, we concluded the last lecture by constructing the Dolev-Strong protocol. This protocol allows \(n\) parties to solve the Byzantine Broadcast problem in the following setting:
- Synchronous network
- Dishonest majority
- PKI setup exists
The key idea of Dolev-Strong is to pass along signature chains of messages signed and countersigned by lots of parties.
The Dolev-Strong protocol guarantees termination, consistency, and validity even if any \(f \leq n\) of the parties are malicious. However, the protocol is slow: it requires \(f + 1\) rounds of communication to reach consistency.
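To make this recap concrete, here is a minimal sketch of the signature-chain acceptance check at the heart of Dolev-Strong. The chain format and the `verify(pk, message, sig)` callback are simplifications assumed for illustration, not last lecture's exact conventions.

```python
# Sketch of the round-r acceptance check in Dolev-Strong (assumed conventions).
# A chain is a list of (signer, signature) pairs on the bit being broadcast.
def valid_chain(bit, chain, round_r, leader, pki, verify):
    """Accept in round r only if the chain starts with the leader, contains
    round_r distinct signers, and each signature covers the bit plus every
    earlier signature (i.e., each party countersigned what it received)."""
    if len(chain) < round_r:
        return False
    signers = [signer for signer, _ in chain]
    if signers[0] != leader or len(set(signers)) != len(signers):
        return False
    transcript = str(bit).encode()
    for signer, sig in chain:
        if signer not in pki or not verify(pki[signer], transcript, sig):
            return False
        transcript += sig  # the next signer countersigns everything so far
    return True
```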
10.1 Relying on Trusted Setup
The Dolev-Strong protocol is an elegant solution to broadcast in the face of adversaries. So why don't we use it as the basis for a cryptocurrency like Bitcoin?
There are many answers to this question, but one important reason is the need for trusted setup.
Remember that Dolev-Strong assumes the existence of a public key infrastructure… that is, a mapping from public keys to identities.
You can think of a PKI as similar to a telephone directory, just for cryptographic public keys rather than telephone numbers.
Relying on a PKI can be dangerous and unstable!
- It requires the parties to trust that the PKI is created properly, or else Dolev-Strong may give a false sense of confidence.
- Even if there were a trusted PKI creator: building this directory is a tedious, thankless task. If the creator stops performing this role, then the network may no longer be able to reach consensus.
Still, a PKI is a powerful tool when we do have one, because it makes signatures transferable. To explain this property, consider the following scenario with three parties, Far, Les, and Mor (a short code sketch follows the list).
- Suppose Far makes a statement to Mor, such as “I would like to broadcast the bit 0.”
- If this statement is signed, then Mor can later show the message and signature to Les. Then Les can use the PKI to find Far’s public key and verify the signature.
- This serves as proof of what Far thinks, even if Far has left the conversation!
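Here is a short sketch of that transferability property in code, using Ed25519 signatures from the third-party `cryptography` package (any signature scheme would work; the message text and names are just for illustration):

```python
# Far signs a statement; Mor relays it; Les verifies it against Far's public
# key from the PKI, without ever talking to Far directly.
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

far_signing_key = Ed25519PrivateKey.generate()
pki = {"Far": far_signing_key.public_key()}  # the trusted directory

statement = b"I would like to broadcast the bit 0."
signature = far_signing_key.sign(statement)  # Far -> Mor

# Later: Mor forwards (statement, signature) to Les, who checks it herself.
try:
    pki["Far"].verify(signature, statement)
    print("Les is convinced that Far said this, even though Far has left.")
except InvalidSignature:
    print("Les rejects the relayed claim.")
```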
But without signatures, a party’s claims about statements made by another party are not verifiable.
Example. Consider the following attempt at a Byzantine Broadcast protocol where the two possible messages are `M` and `L`. In this case, Les is unable to tell the difference between the following two activities:
- Far is honest, but Mor is malicious and incorrectly relays the message to Les (shown in the figure below).
- Far is malicious and sent the wrong message to Mor, who correctly relayed it to Les (not shown).
If we had a PKI, then we could settle this discrepancy with digital signatures.
In other words:
- We can achieve Byzantine Broadcast for any number of malicious parties with a PKI (in the synchronous model)
- But we cannot always do the same without a PKI.
So far, we have only shown that one specific two-round protocol fails to meet the consistency property of Byzantine Broadcast. But in fact this problem turns out to be impossible in general!
Also, it is not a coincidence that my (informal) counterexample above had 3 parties, of whom 1 was malicious. If we had four parties in the picture instead, we could have distinguished between the two possibilities of a malicious Far and a malicious Mor.
Question. How could we do this?
10.2 Broadcast without Trusted Setup
Great, so let’s try to build a Byzantine Broadcast protocol without using trusted setup like a PKI!
Unfortunately, that’s difficult to do.
Theorem. In a plain pairwise authenticated channel model without any setup assumptions such as a PKI, Byzantine Broadcast is impossible if at least \(n/3\) parties are malicious – that is, if \(n \leq 3f\).
This theorem was first proven by Pease, Shostak, and Lamport in 1980, and the proof was later simplified by Fischer, Lynch, and Merritt in 1985. Today, we’ll look at the FLM85 impossibility proof.
As a reminder, here are the three properties that Byzantine Broadcast is supposed to satisfy with a synchronous network.
- Termination: all honest parties eventually reach a decision.
- Consistency (aka agreement): all honest parties decide the same value.
- Validity: if the leader is honest, then its value is the decision value.
Proof. We will only describe why this theorem holds in the case of \(n = 3\) parties. The theorem does generalize to any choice of \(n\) and \(f\) with \(n \leq 3f\), although we won't prove it.
Consider any protocol \(\Pi\) (taking any number of rounds) involving 3 parties `F`, `L`, and `M`, where `F` is the leader. Suppose for the sake of contradiction that \(\Pi\) is a 3-party protocol that achieves Byzantine Broadcast.
Note that each party is expecting to talk to two others – let’s say one party on its left and another party on its right.
So in order to describe the protocol \(\Pi\), it suffices to write computer programs that explain how each of `F`, `L`, and `M` reacts to messages received from its left or right neighbor. The programs should codify what the party computes, who to talk to next, and what message to send.
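One (purely hypothetical) way to picture such a program is as a small object with a single handler for the messages arriving from its two neighbors; note that nothing in this interface tells the party how many other parties exist in the network.

```python
# Sketch of the "party as a local program" abstraction used in the proof.
# The interface is invented for illustration: a party only reacts to messages
# labelled "left" or "right" and never observes the global topology.
from dataclasses import dataclass, field

@dataclass
class PartyProgram:
    name: str                         # "F", "L", or "M"
    state: dict = field(default_factory=dict)

    def on_round(self, left_msg, right_msg):
        """Consume this round's messages from the two neighbors and return
        (message_for_left, message_for_right) for the next round."""
        self.state.setdefault("transcript", []).append((left_msg, right_msg))
        # ... the protocol Pi would specify the real logic for F, L, and M ...
        return left_msg, right_msg    # placeholder: echo each neighbor back
```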
Now let's do something strange: let's say there are actually 6 parties arranged in a hexagon.
As far as each individual party is concerned, to them it looks the same as before: they can talk with one party on their left, and another party on their right. So these parties can also execute the protocol \(\Pi\)!
Suppose we initiate this network with `F` sending the message 0 and `F'` sending the message 1. What happens next?
Using the consistency and validity rules, let's look at what happens to the two parties on the top left of the hexagon, `L` and `M`, when one party is malicious.
- Suppose that the leader `F` is malicious and both `L` and `M` are honest. Then, by the consistency property on the `L`-`M`-`F` triangle, it must be the case that `M` and `L` decide the same value.
- Suppose that `L` is malicious. Since the leader `F` is honest, by the validity property in the same `L`-`M`-`F` triangle, it must be the case that `M` decides 0.
- Suppose that `M` is malicious, and now look at the `M`-`L`-`F'` triangle instead. Since the leader is honest, by the validity property it must be the case that `L` decides 1.
But these three statements are in contradiction! So no such protocol \(\Pi\) can exist that satisfies consistency and validity. This completes the proof.
This contradiction would not have been possible if there were a PKI. In that case, we could not have two parties `F` and `F'` both acting in the role of the leader, because the honest parties could verify signatures to distinguish between the two.
To recap: Byzantine Broadcast becomes much harder without trusted setup like a PKI.
- With setup, the Dolev-Strong protocol provides Byzantine Broadcast with any number of malicious parties \(f \leq n\) (and the guarantee remains meaningful as long as \(f \leq n-2\), i.e., at least two parties are honest).
- Without setup, the best one can possibly hope for is a Byzantine Broadcast protocol with \(f < n/3\) malicious parties.
10.3 Constructing Byzantine Broadcast Without Setup
This leads to the question: can we actually construct a Byzantine Broadcast protocol that tolerates \(f < n/3\) malicious parties?
The good news is that the answer is yes.
In the remainder of today’s lecture, we will construct a Byzantine Broadcast protocol that provides termination, consistency, and validity in the following setting:
- Synchronous network
- Any number of parties \(n\), of which \(f\) are malicious with \(n \geq 3f + 1\)
- Authenticated pairwise channels (i.e., you know who you’re talking to), but no PKI or other setup
Here is the high-level idea of the protocol that we will construct today, which is called the phase-king protocol.
- Before the protocol starts, the parties are open-minded about what the output value will be (either 0 or 1).
- Starting with the leader, each party gets a chance to broadcast a value.
- The parties check to see if they have reached consistency on the value. If so, they keep this value no matter what happens in later rounds. Otherwise, they remain open-minded.
In order to turn this idea into a fully-specified Byzantine Broadcast protocol:
- We will first construct a sub-protocol called `Gradecast`.
- We will then use `Gradecast` as a useful building block within the full phase-king protocol.
10.4 The Gradecast protocol
Let there be \(n\) parties, which we label as \(P_1, P_2, \ldots, P_n\). The `Gradecast` protocol is not about broadcasting a value. Instead, it is about checking or verifying whether the parties have reached consensus about the value.
- Each party \(P_i\) has an input value \(v_i \in \{0, 1\}\).
- Each party has two outputs: a (potentially changed) value \(v\) and a \(\textit{grade}\) that reflects their knowledge about the value.
The grades reflect party \(P_i\)’s belief in the mutual knowledge of \(v\)… in the same way that the teacher provided mutual information to the muddy children. Here are the possible grades.
- 2: Party \(P_i\) is confident that all honest parties know the same value \(v\) (even if others are not yet equally confident).
- 1: Party \(P_i\) believes that there are many honest parties who know the value \(v\), but isn’t sure yet whether this value is known to all honest parties.
- 0: Party \(P_i\) has no confidence in its current value.
Our goal is to design a protocol that allows the parties to reach similar levels of confidence, in the sense that their grades will differ by at most one.
In more detail, we want to design the `Gradecast` protocol to have the following security guarantees.
Knowledge of Agreement: If an honest party outputs a value with grade 2, then all honest parties output this value. That is: we will have mutual knowledge, even if some parties are not yet convinced.
Validity: If all honest parties have the same input value, then all of them output this value with grade 2. In this case, we will have a higher degree of common knowledge of \(v\) – every honest party knows it, and knows that the other parties know it.
Here is how we can build a `Gradecast` protocol.
This protocol is shown from the perspective of a single honest party \(P_i\). Remember that all honest parties perform the same procedure, whereas malicious parties can do whatever they want.
input: each party P_i holds a message v_i

Round 1: party P_i sends v_i to all parties (including party P_i itself)

Round 2: if party P_i receives the message b from n-f distinct parties in round 1,
         then send b to all parties (including party P_i itself)
         else send nothing

output: if n-f distinct parties sent b to party P_i in round 2,
        then output b with grade 2
        else if f+1 distinct parties sent b to party P_i in round 2,
        then output b with grade 1
        else output v_i with grade 0
At first glance, it might appear that this pseudocode is not well-defined.
In round 2: what if an honest party hears two different messages \(b\) and \(b'\) from \(n-f\) distinct parties in round 1? Then it isn’t clear what party \(P_i\) should do, since we would like honest parties only to send one message in round 2.
At the output stage: what if an honest party \(P_i\) hears two different messages \(b\) and \(b'\) from \(n-f\) distinct parties in round 2? Or what if \(P_i\) hears two different messages from \(f+1\) distinct parties in round 2? Then it isn’t clear what party \(P_i\) should do, since parties are only allowed to reach one output decision.
Fortunately, it turns out that none of these issues can ever arise!
Question. Can you see why?
As a hint: remember that \(n \geq 3f + 1\), and that each of the (at least) \(n-f\) honest parties sends the same value to everyone in a particular round. Only the \(f\) malicious parties might send different messages to different parties.
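If you want to experiment with these claims, here is a minimal single-machine simulation sketch of `Gradecast` (my own toy harness, assuming synchronous rounds and \(n \geq 3f+1\); malicious behavior is supplied by a caller-provided function rather than real network adversaries):

```python
# Toy, single-machine sketch of the Gradecast protocol above (not the
# lecture's reference code). Honest parties follow the pseudocode; each
# malicious party is modelled as a function (round, recipient) -> bit or None.
from collections import Counter

def gradecast(inputs, f, byz=None):
    """inputs: dict honest party -> input bit. byz: dict malicious party ->
    message-choosing function. Returns dict honest party -> (value, grade)."""
    byz = byz or {}
    parties = list(inputs) + list(byz)
    n = len(parties)

    def deliver(round_no, honest_msgs):
        """Collect each honest recipient's inbox for this round. Honest senders
        send the same message to everyone; malicious senders may equivocate."""
        inbox = {p: [] for p in inputs}
        for sender in parties:
            for recipient in inputs:
                if sender in byz:
                    msg = byz[sender](round_no, recipient)
                else:
                    msg = honest_msgs.get(sender)
                if msg is not None:
                    inbox[recipient].append(msg)
        return inbox

    # Round 1: every honest party multicasts its input.
    inbox1 = deliver(1, dict(inputs))

    # Round 2: echo b only if b arrived from at least n - f distinct parties.
    round2 = {}
    for p in inputs:
        winners = [b for b, c in Counter(inbox1[p]).items() if c >= n - f]
        round2[p] = winners[0] if winners else None
    inbox2 = deliver(2, round2)

    # Output rule: grade 2 for n - f echoes, grade 1 for f + 1, else grade 0.
    outputs = {}
    for p in inputs:
        counts = Counter(inbox2[p])
        best, votes = counts.most_common(1)[0] if counts else (inputs[p], 0)
        if votes >= n - f:
            outputs[p] = (best, 2)
        elif votes >= f + 1:
            outputs[p] = (best, 1)
        else:
            outputs[p] = (inputs[p], 0)
    return outputs
```

For example, `gradecast({"P1": 0, "P2": 0, "P3": 0}, f=1, byz={"P4": lambda rnd, to: 1})` gives every honest party the output `(0, 2)`, which is exactly the validity guarantee analyzed next.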
Analysis of the Gradecast protocol
Let's show that this `Gradecast` protocol achieves validity and knowledge of agreement, as long as \(n \geq 3f+1\).
We begin with validity, which if you recall is defined as:
If all honest parties have the same input value, then all of them output this value with grade 2.
Theorem. The `Gradecast` protocol satisfies validity.
- If all honest parties start with the same message \(v\), then at least \(n-f\) parties will send \(v\) in round 1.
- As a result, every honest party receives \(v\) from at least \(n-f\) distinct parties in round 1, so at least \(n-f\) parties will send \(v\) in round 2.
- Hence, all honest parties output value \(v\) with grade 2.
It is slightly more complicated to prove the knowledge of agreement property. As a starting point, let’s prove the following lemma.
Lemma. No two honest parties send conflicting round 2 messages.
Proof of lemma. Suppose for the sake of contradiction that there exist two honest parties \(P_i\) and \(P_j\) such that:
- Honest party \(P_i\) sees \(n−f\) round 1 messages for \(v\), and
- Honest party \(P_j\) sees \(n−f\) messages for a different value \(v' \neq v\).
By the pigeonhole principle:
- Any two sets of \(n - f\) parties (out of \(n\) total parties) must have at least \(n - 2f\) parties in common.
- Since \(n \geq 3f+1\), note that \(n - 2f \geq f+1\) so the two honest parties must have heard from at least \(f+1\) parties in common.
- Since at most \(f\) of these parties are malicious, at least one of them must be an honest party \(P_h\).
However, honest parties send only one value in round 1. So this is a contradiction, and it must not be the case that two honest parties can receive enough conflicting round 1 messages in order to send conflicting round 2 messages.
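As a sanity check on the arithmetic, take the smallest interesting instance, \(n = 4\) and \(f = 1\): any two sets of \(n - f = 3\) senders (out of 4 parties) overlap in at least

\[(n-f) + (n-f) - n \;=\; n - 2f \;=\; 2 \;\geq\; f + 1\]

parties, and at least one of those two common senders must be honest.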
Now we can prove the knowledge of agreement property, which states:
If an honest party outputs a value with grade 2, then all honest parties output this value.
Theorem. The `Gradecast` protocol satisfies knowledge of agreement.
- If an honest party outputs a value \(v\) with grade 2, then it saw at least \(n-f\) round 2 messages for \(v\).
- At least \(n-2f \geq f+1\) of those messages came from honest parties, and honest parties send the same round 2 message to everyone. So every honest party sees at least \(f+1\) round 2 messages for \(v\).
- Moreover, from the Lemma above, no honest party sends a round 2 message for a conflicting value, so there cannot be \(f+1\) round 2 messages for any value other than \(v\).
- As a consequence, all honest parties output \(v\), with a grade of either 1 or 2.
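Using the toy `gradecast` sketch from earlier, here is one quick (hypothetical) check of this property, with disagreeing honest inputs and an equivocating malicious party:

```python
# Knowledge-of-agreement check with the toy gradecast sketch from Section 10.4.
# P3 starts with a different input, and malicious P4 equivocates per recipient.
out = gradecast(
    {"P1": 0, "P2": 0, "P3": 1},
    f=1,
    byz={"P4": lambda rnd, to: 0 if to in ("P1", "P2") else 1},
)
grade2_values = {v for (v, g) in out.values() if g == 2}
# If any honest party reached grade 2 on v, then every honest party output v.
assert all(v in grade2_values for (v, _) in out.values()) or not grade2_values
print(out)  # in this run: {'P1': (0, 2), 'P2': (0, 2), 'P3': (0, 1)}
```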
10.5 The Phase-King protocol
Finally, let's construct a new Byzantine Broadcast protocol based on `Gradecast`.
This new protocol is called the phase-king protocol. The name comes from the fact that the main idea is for everyone to take a turn being the “king.” (There are actually many variations of the phase-king protocol; I will present just one of them in this class.)
The phase-king protocol proceeds in a sequence of \(f+1\) phases, each of which requires 3 rounds of interaction.
In each phase, one party is chosen to be the king and gets to select a value \(v\) to send to the others. The real leader \(P_1\) is the first king, and then party \(P_2\), then \(P_3\), and so on. When a party receives this value:
- If the party is already happy with their value from previous phases – meaning that they have a `Gradecast` grade of 2 – then they ignore the current king.
- Otherwise, the party adopts the value sent by the king.
Here is some pseudocode that describes this protocol in detail. It is written from the perspective of a single party \(P_i\). As always: each honest party runs the same code.
We assume that they have already agreed upon a convention for which party to label as \(P_1\) (this should be the leader), \(P_2\), \(P_3\), and so on.
input: v   # at the start, only the leader has an input; everyone else is clueless

grade = 0  # initialize the variable grade

for each phase j from 1 to f+1:
    # first round of the phase
    party P_j sends its value to all parties (other parties send nothing)
    if grade < 2, then v := value received from party P_j

    # remaining two rounds of the phase
    (v, grade) := Gradecast(v)

output: at the end of round 3(f+1), output v
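Continuing with the same toy harness as in Section 10.4 (single machine, synchronous rounds, hypothetical hooks for malicious behavior), the whole phase-king loop is a thin wrapper around `gradecast`:

```python
# Toy sketch of the phase-king loop, reusing the gradecast sketch from
# Section 10.4. byz describes malicious behavior inside gradecast; byz_king
# describes what a malicious king sends to each recipient (a malicious king
# must therefore have an entry in byz_king).
def phase_king(parties, leader_value, f, byz=None, byz_king=None):
    """parties: global order P_1..P_n, with P_1 the leader. Returns the final
    value held by each honest party after f + 1 phases."""
    byz, byz_king = byz or {}, byz_king or {}
    honest = [p for p in parties if p not in byz]
    values = {p: leader_value if p == parties[0] else 0 for p in honest}
    grades = {p: 0 for p in honest}

    for j in range(1, f + 2):          # phases 1 .. f+1; the king is P_j
        king = parties[j - 1]
        # Phase round 1: the king multicasts; open-minded parties adopt it.
        for p in honest:
            if grades[p] < 2:
                values[p] = byz_king[king](p) if king in byz else values[king]
        # Phase rounds 2-3: gradecast the (possibly updated) values.
        out = gradecast(dict(values), f, byz)
        values = {p: v for p, (v, g) in out.items()}
        grades = {p: g for p, (v, g) in out.items()}
    return values
```

With an honest leader, every honest party should reach grade 2 in phase 1 and never change its value again, matching the validity argument below.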
This protocol achieves our goals for Byzantine Broadcast.
Theorem. The phase-king protocol achieves termination, validity, and consistency.
Proof. This protocol clearly terminates after \(3(f+1)\) rounds.
To see that validity holds:
- If the leader is honest, then in the first round it broadcasts the same value \(v\) to everyone.
- As a result, the validity property of the Gradecast protocol ensures that everyone outputs \(v\) with a grade of \(2\).
- Thereafter, the honest parties ignore the proposals made by subsequent kings in the remaining phases. Their final decision will be \(v\).
Proving the consistency property is a bit more complicated. Recall that we anoint \(f+1\) different kings, in sequence. Since there are only \(f\) malicious parties, at least one king is honest.
Let \(P_j\) denote the first honest king. We’re going to split our discussion into three parts.
The end of phase \(j-1\) (before \(P_j\)’s turn as the king):
- At this stage, it might be the case that no honest parties currently have a grade of 2. In this case, they will be open-minded and listen to the honest king.
- Or it might be the case that some honest party has a grade of 2 already. In this case, Gradecast’s knowledge of agreement property guarantees that all honest parties (including \(P_j\)) also have the same value.
During phase \(j\):
- The honest king \(P_j\) will send its value to all parties. Anyone who didn’t already have a grade of 2 will switch to it. If anyone did have a grade of 2, then there’s no need to switch because the honest king sent the same value.
- The validity property of the subsequent Gradecast protocol ensures that, after phase \(j\), all honest parties will have the same value \(v\), with a grade of 2.
In the remaining phases:
- A malicious king could try to send an incorrect value \(v'\).
- But the honest parties already have a grade of 2, so they won’t listen to the remaining kings.
This completes the proof of consistency.
Efficiency of phase-king
Let’s conclude today by looking at the efficiency of the phase-king protocol. It may have been (relatively!) easy to describe, but unfortunately it is not the fastest protocol at performing the task of Byzantine Broadcast.
Phase-King requires \(3(f+1)\) rounds of interaction. This is a factor of \(3\) slower than the best possible outcome of \(f+1\) rounds (and this lower bound follows from precisely the same issue with common knowledge that Dolev-Strong had).
Phase-King requires \(O(n^3)\) messages to be exchanged between the parties. It turns out that there are alternative ways to perform Byzantine Broadcast that only require \(O(n^2)\) messages – that is, about the number of messages required for each of the parties to communicate with the others.
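To see where these counts come from (counting one point-to-point send as one message), here is a rough tally:

\[
\text{rounds} = 3(f+1), \qquad \text{messages} \;\leq\; (f+1)\big(n + 2n^2\big) = O(n^3) \text{ when } f = \Theta(n),
\]

where each phase costs one king multicast (\(n\) messages) plus two all-to-all Gradecast rounds (at most \(2n^2\) messages).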
Finally, let’s compare the Dolev-Strong protocol with the new Phase-King protocol, in terms of the underlying assumptions:
| | Dolev-Strong | Phase-King |
|---|---|---|
| Synchronous network | YES | YES |
| PKI | YES | NO |
| Number of malicious parties | \(f \leq n\) | \(f \leq \frac{n-1}{3}\) |
| Number of rounds | \(f+1\) | \(3(f+1)\) |
Next time, we’ll discuss how to solve the broadcast problem when the network is unreliable and we cannot rely on the synchrony assumption either.