Lecture 9: Byzantine Broadcast

Announcements

Challenge problems 1 and 2 due on Sunday 2/23
HW 4 will be released tomorrow and due next Friday 2/28
Tej’s Friday office hours are moved to 2-3:30pm

Learning outcomes

By the end of today’s lecture, you will learn:

How to broadcast a message \(m\) from one party to many others, such that everyone receives \(m\) and knows that everyone else received \(m\) too.
A formal specification of the broadcast problem, and the different kinds of assumptions we might make about the parties and the network connecting them.
Our first solution to the broadcast problem, called the Dolev-Strong protocol.

Review from Unit 1

Bitcoin was invented in 2009 by an anonymous party (or parties) holding the pseudonym Satoshi Nakamato. It relies on the following assumption:

The majority of the computational power used for mining (at any given time) is controlled by honest miners.

If this assumption holds, then Bitcoin achieves the following two security goals:

Liveness: Every transaction is eventually finalized by all honest nodes.
Safety: Honest nodes do not finalize different blocks at the same height.

The most important property of the Bitcoin blockchain is that all participants eventually reach agreement over the state of its public ledger, called the blockchain.

Remember: in this course, the term “agreement” means that everyone agrees on what you said… not that they share the same opinion.

Agreement is a deceptively powerful property. It means that everyone in the Bitcoin network has common knowledge, in the sense of the Muddy Children puzzle. If Alice sends 1 coin to Bob on the Bitcoin blockchain, then:

Bob knows he received 1 coin.
Everyone else knows that Bob received 1 coin.
Bob knows that everyone else knows that Bob received 1 coin (and so on…)

The common knowledge property is why Bob is convinced that the rest of the world will declare it valid when Bob tries to spend the coin later. And this is true even though Bob doesn’t communicate with the entire world directly.

9.1 Examining the Components of Bitcoin

At its core, Bitcoin carefully constructs and combines two components:

Broadcast: A connected peer-to-peer network of Bitcoin nodes and miners that gossip any transaction or block they hear, in a way that is censorship-resistant.
Leader election: A randomized lottery system to decide which miner gets to create the next block.

We have spent the past few lectures discussing how to record transactions within a blockchain using Bitcoin’s proof of work.

As long as the majority of computational power is controlled by honest miners, then everybody’s view of the blockchain is eventually consistent.
But relying on the honest majority of computing power is admittedly somewhat strange. It would be nice instead to rely on the honest majority of people: that is, that there are more good people than malicious people in the world.

The reason that Bitcoin requires an honest majority of computing power is to protect against Sybil attacks, in which one malicious person Mallory claims to be many different people in an attempt to increase its chance of winning the randomized leader election.

But what if we already knew the set of participants ahead of time? For instance, what if we want to create a blockchain among all the students in this class?

Then we wouldn’t need to protect against Sybil attacks; it’s not possible for me to pretend to be 100 different students in order to inflate my power in the randomized leader election, since all of you know that I am just one person.

In Unit 2 of this course, we ask the question:

Can we build a blockchain to achieve consensus on the Internet with no proof of work? No miners? Maybe not even a cryptocurrency?

Remember “blockchain = broadcast + leader election.” So, we will address this question in pieces.

Starting today, we will discuss how to build a censorship-resistant version of the Broadcast protocol. We will tackle leader election in future lectures.

9.2 The Byzantine Generals’ Problem

Today, we will consider a famous problem in computer science called the Byzantine generals problem. It was initially posed in a 1982 paper by Leslie Lamport, Robert Shostak, and Marshall Pease as follows.

“Imagine that several divisions of the Byzantine army are camped outside an enemy city, each division commanded by its own general. The generals can communicate with one another only by messenger. After observing the enemy, they must decide upon a common plan of action. However, some of the generals may be traitors, trying to prevent the loyal generals from reaching agreement.”

Byzantine Broadcast (image source: Wikipedia)

Note that there are two important goals to satisfy in this problem:

The loyal generals must be consistent. They could all decide to Attack, or they could all decide to Retreat. But if the army is divided, then they will fail for sure.
If a particular general (who we will call the “commander”) has some keen insight into whether the army should Attack or Retreat, then the generals should validly obey this order if the commander is honest.

If the generals were all in a common room, such as this classroom, then the problem would be easy to solve:

Leader election: Elect one of the generals to be their commander.
Broadcast: Have the commanding general shout in the classroom whether to Attack or Retreat. All loyal generals will obey this order.

It doesn’t matter who the commanding general is, as long as all loyal generals agree on the result of the election. The commander could even be a traitor! Still we would achieve our desired goal, which is that all loyal generals perform the same action.

To reiterate: we will solve the leader election problem in a future lecture. For now, suppose that has already happened.

Stating the problem

The big challenge here is that the generals are not in a common room; they can only talk via one-to-one communication, and nobody knows for sure whether the commanding general might be giving conflicting orders behind their back.

To state the problem precisely: there are \(n\) generals, they all know each other’s identities and have 1-to-1 communication channels with each other. One of them is already known to everyone to be the commander, and the other generals are called lieutenants.

The commander would like to propose an order that is either Attack or Retreat to all generals, such that: 1. All loyal generals reach the same decision; and 2. If the commander is loyal, then all loyal lieutenants will obey the commander’s order.

If we somehow knew for sure that the commanding general is honest, then this problem is easy! The commanding general can send its order to everyone, and the loyal generals just follow the orders of the commanding general.

However, the problem becomes really hard when the commander is malicious. The commanding general might send different orders to different generals!

Question. What can we do to reach agreement, even if the commanding general is malicious?

One approach might be to add another round of communication in which each general gossips the message that they heard to everyone else. Unfortunately, even this does not solve the problem. That’s because the gossiped messages could themselves be wrong, if they were sent by traitorous lieutenants.

Image source: Lamport, Shostak, Pease 1982

We need to design a more sophisticated protocol that is capable of withstanding a malicious commander!

Defining the problem

Let’s turn back to computers now. Let’s assume that Attack and Retreat are encoded as the bits 0 and 1, respectively. Furthermore, let’s refer to the \(n\) computers as parties, and refer to the commander as the leader.

A Byzantine Broadcast (BB) protocol must satisfy the following two requirements:

Validity: If the leader is honest and sends the bit \(b\), then all honest parties should output \(b\).
Consistency: If two honest parties output \(b\) and \(b'\) respectively, then \(b = b'\). (This property must hold whether or not the leader is honest.)

We will add some more requirements to this definition later.

Remember than an honest party is one who follows the protocol. The other parties are called malicious and can behave however they want; that is, they can deviate from the rules of the protocol in any way that they see fit.

Question. If validity doesn’t exist then consistency is trivially easy. Can you guess why?

If the leader is malicious, then it only matters that the parties are consistent (i.e., there is no validity requirement in this case). The parties can output any bit \(b\), as long as it’s the same bit for all of them.

Also one quick note about the definition: although we will only consider Byzantine Broadcast for a single bit \(b\) of information in this course, the idea can easily be extended to a \(k\)-bit long message by running the protocol \(k\) times.

9.3 Modeling Assumptions

Before we construct a Byzantine Broadcast protocol, we should think about what we believe to be true of the underlying network and participants. For instance:

Can the leader cheat? Can the leader fail to send a message entirely (and can we detect it)?
How long does it take for a message to reach the other parties?
Is the leader’s message even guaranteed to reach all parties?
How many parties are there? How many of them are malicious?
Can we restrict the type of cheating that the parties do? Do we know who the bad parties are?
Do the parties (including the leader) have access to randomness?

Modeling the network

There are two kinds of networks that we could consider for the parties to communicate with each other. (In the Byzantine generals problem, this corresponds to assumptions about how long it takes the messengers to transmit messages between the generals.)

Synchronous network: We have an upper bound on the time it takes for a message to travel between two parties. The protcol is broken into rounds. If a message doesn’t arrive in a specific round then the leader is assumed to be malicious (or traitorous, in the Byzantine generals problem).
Asynchronous network: We have no upper bound on the time it takes for a message to travel between parties. We cannot differentiate between honest slow leaders and malicious parties that choose not to send any messages.

In both network types: if an honest party sends a message, then the message is eventually received.

Number of honest vs. malicious parties

We will only consider security for the honest participants. As a result, we assume that there must be some honest parties/generals in the world. (After all, if everyone is malicious, then who is left to protect?)

Just like with Bitcoin: we can make an assumption that some number \(h\) of the parties are honest, and the remaining \(f = n - h\) parties are malicious (or faulty).

Note that we are not assuming that any one particular party is good, just that there is goodness out there in the world somewhere.

We often categorize a protocol based on whether the majority of participants are honest or malicious.

Honest majority (\(f < n/2\)): More than half the parties are honest. Sometimes we might insist upon an honest supermajority like \(f < n/3\) or \(f < n/4\).
Dishonest majority (\(f \geq n/2\)): Most parties are malicious (i.e., dishonest). In this scenario, we often consider the worst-case option of \(f = n - 1\); that is, there might only be one honest person in the world to protect.

Use of digital signatures

We can also categorize protocols based on how much they take advantage of digital signatures.

Authenticated pair-wise channels: Every party is connected to every other party, and they somehow have a way to know who is communicating with them. In other words: no party can impersonate another, so a party Bob can be confident that he is receiving messages from another party Alice.
Public key infrastructure (PKI): Every party knows the public key of every other party. In this case, not only can Bob know for sure that he is communicating with Alice, but he receives digitally signed messages from Alice and can transfer them along to another party Carol.

You can think of a public key infrastructure as similar to a telephone directory, just for cryptographic public keys rather than telephone numbers.

Telephone directory (image source: Wikipedia)

9.4 Designing a Broadcast Protocol

[The remainder of this lecture follows the Decentralized Thoughts blog.]

The Dolev-Strong protocol solves the Byzantine Broadcast problem in the following settings:

Synchronous network
Dishonest majority (there can be as many malicious parties as Mallory wants!)
Uses authenticated pair-wise channels and digital signatures with a PKI

Let’s number the parties from \(1\) to \(n\). By convention, let’s say that the leader party is #1. Suppose that \(f = 1\) of the \(n\) parties is malicious… however, we don’t know which one is malicious! It could be the leader, or it could be someone else.

As a first attempt, let’s try to implement Byzantine Broadcast for \(f=1\) in \(2\) rounds.

We are going to do something similar to the Commander-Lieutenant 2 round protocol from the Lamport, Shostak, Pease 1982 paper, but we will add digital signatures to the messages in order to distinguish between the two bad cases.

Remember that a Byzantine Broadcast (BB) protocol must satisfy the two requirements:

Validity: If the leader is honest and sends the bit \(b\), then all honest parties should output \(b\)
Consistency: If two honest parties output \(b\) and \(b'\) respectively, then \(b = b'\).

Additionally, because we’re working with a synchronous network then we will add one more (natural) requirement:

Termination: All honest parties eventually complete the protocol and output something. That is, the protocol doesn’t go on forever.

Here is our first attempt at a protocol. It has two rounds of communication, after which each party decides whether to output the bit \(b = 0\) or \(b = 1\).

In the pseudocode that follows, we will use the notation sign(m, i) to mean that “a digital signature of the message \(m\) by party number \(i\).”

Input: leader has a bit b, other parties have no input

Round 1: leader (party 1) sends message <b, sign(b,1)> to all parties

Round 2: if party i received <b, sign(b,1)> from the leader in round 1,
            then add b to set B_i and send <b, sign(b,1)> to all parties

End of round 2: 
         if party i receives <b, sign(b,1)> from anyone
            then add b to B_i

Output:
         if the set B_i contains only a single value b,
            then output b
            else output a default value (say, 0)

In words, here’s another way to describe the protocol:

In the first round, the leader sends its message \(b\) to all parties. At least, the leader will do this if they’re honest… otherwise they might send different messages to different parties!
The other parties only listen in the first round. If they hear a message, then in round 2 they will gossip this message in the second round. Note that their round 2 message <b, sign(b,1)> is the message together with a signature by the leader, not by themselves!
After the 2 rounds are complete, each party checks to see which messages they have received with valid signatures from the leader.
- If they received a valid signature from the leader for a single message (either \(b = 0\) or \(b = 1\)) then they will output that bit.
- But if they have received no valid signatures, or valid signatures of both messages \(0\) and \(1\), then the leader is clearly malicious. In this case, the party defaults to the output of \(b = 0\). (The exact choice of this fallback selection doesn’t really matter; we could have said that everyone would pick \(b = 1\) by default instead. Since the leader is clearly malicious, all that matters is that everyone makes a consistent choice.)

Question. Does this protocol satisfy the Termination, Validity, and Consistency requirements of a Byzantine Broadcast protocol?

This protocol clearly terminates after two rounds. Also it satisfies validity, which says that if the leader is honest and wants to broadcast the bit \(b\), then everyone will hear it. In fact, everyone will hear this message directly from the leader in round 1.

Hence, all honest parties will gossip this message in round 2; the malicious parties might choose not to do this, but that’s okay. Thanks to the unforgeability of the digital signature scheme, the malicious parties cannot forge the honest leader’s signature on the wrong message \(1 - b\).

Ergo: at decision time, the honest parties only hear the leader’s bit \(b\), so they will output \(b\).

It seems like the protocol satisfies consistency too.

Suppose the leader tries to cheat and send two different messages \(b = 0\) and \(b = 1\) to two different honest parties in round 1.
Then in round 2, those honest parties will forward these signed messages to everyone else.
Hence, at decision time, everyone will know that the leader is malicious, and all of them will output the default bit 0.

However, the previous argument is flawed. Here is a counter example involving a malicious leader that will break consistency:

The malicious leader doesn’t send any messages in round 1.
In round 2, the leader sends the bit 1 only to half of the honest parties. Those honest parties will finish with bit 1, but the others will finish with the default bit 0!

The main problem here is that (some of) the honest parties only learned their value in the last round. They haven’t yet had any time to tell the honest parties about the leader’s message, because it was the last round and there’s no more time to send messages.

And we do need for there to be a “last round,” or else the protocol could go on forever and not terminate.

9.5 The Dolev-Strong Protocol

Dolev and Strong show a very elegant way to guarantee consistency, even if the value decided is revealed to some of the honest parties in the last round.

It is based on the idea that for the leader’s message to be approved by honest participants, they need to see the message signed by at least one honest party!

Remember that the parties have no idea who else is honest. The only thing we know for sure is that there are at moest \(f\) malicious parties. So if any message has been approved by \(f + 1\) other people, then by the pigeonhole principle, at least one of them is honest!

As a result, the Dolev-Strong protocol runs for \(f + 1\) rounds. When an honest party receives a message from the leader, it forwards it along and adds its own signature on top of it. In some sense, the honest party is endorsing any message that it receives.

By the last round, an honest party should only accept a message if it sees \(f + 1\) signatures on it.

More specifically: the objective is to (eventually) show all parties a signature chain where the message is initially signed by the leader and then by \(f\) other parties too.

You can think of a signature chain as though each party is appending its name to the bottom of a contract. So you could see:

a message that was initially signed by party 1,
and then that signed document is also signed by party 2,
and then that double-signed document is also signed by party 3, and so on.

Declaration of Independence (image source: Wikipedia)

The honest parties do not know which other people among them are honest.

Even so: if anyone sees a message with \(f+1\) distinct signatures, then they know for a fact that at least one signature comes from an honest party (by the pigeonhole principle). The key idea in the Dolev-Strong protocol is that everyone can rely on that honest party to have made sure that everyone else saw the message too.

The Protocol

To construct the signature chain in a way that satisfies the consistency rule, the Dolev-Strong protocol uses a powerful idea that we will see several times in protocols that use a synchronous network:

The validity of a message is a function also of the time it is received!

Here is the Dolev-Strong protocol. We describe the first round, all other rounds, and then the output decision rule.

Remember that as always, we can only prescribe the actions of the honest parties. Malicious parties can do whatever they want; they don’t have to obey this procedure.

The input and first round remain the same as with our initial attempt.

Input: leader has a bit b, other parties have no input

Round 1: leader (party 1) sends message <b, sign(b,1)> to all parties

Now let’s see how each party \(i\) acts in every subsequent round \(j\) from 2 to \(f+1\).

In round \(j\), an honest party will review the messages it has received from the previous round \(j-1\), and if any message has been validly signed by \(j-1\) other people (starting with the leader) then it will apply its own signature and forward along to everybody else. (Otherwise, an honest party sends nothing in this round.)

Round j: for each message m received by party i in the previous round j-1:
             if m is a valid (j-1) signature chain on v, then:
                    add b to the set B_i
                    send <m, sign(m, i)> to all parties

A \((j-1)\) signature chain is deemed to be valid by party \(i\) if all of the following are true:

All signatures are valid
The signature chain has length \(j-1\)
The first signer of the signature chain is the leader
All signers in the chain are distinct
Party \(i\) is not in the signature chain

Finally, after all \(f+1\) rounds are complete, the output decision rule is the same as in our initial attempt.

Output:
         if the set B_i contains only a single value b,
            then output b
            else output a default value 0

Necessity of \(f+1\) rounds

Question. If we were to modify the protocol above to stop at \(f\) rounds instead of \(f+1\) rounds, can you come up with an attack against it?

Answer. Assume that the input bit \(b\) is 1, and that the leader is malicious. Nobody sends anything for the first \(f - 1\) rounds. Just before round \(f\) finishes, all the malicious parties will send the message \(b = 1\), signed by all \(f\) of them, to one honest party \(p_i\).

This honest party would see it but it won’t be able to send it to any party because the round has finished! That honest party alone would finish with \(1\) while every other honest party finishes with \(0\), so the honest parties have not agreed on a consistent answer.

The actual Dolev-Strong protocol avoids this issue by having \(f + 1\) rounds. This way, an honest party must be part of a signature chain that is sent in the final round.

Analyzing Dolev-Strong

The real Dolev-Strong protocol (with \(f+1\) rounds) satisfies all the requirements of a Byzantine Broadcast protocol.

Termination and validity hold for the same reason as in the first attempt. The protocol clearly terminates after \(f+1\) rounds of interaction, and if the leader is honest then everyone will receive the same message in the first round and begin the process of co-signing it in future rounds.

Let’s reason about consistency.

If an honest party receives a \(k\) distinct signatures for a bit \(b\) at the end of round \(k\), where the first signature is from the leader, then it will send it to all honest parties in round \(k+1\). This holds for all rounds except the final round \(f+1\).
But since the honest party can only receive \(f+1\) distinct signatures for a given bit \(b\) in round \(f+1\) (and not send anything), and a there is least one signature from an honest party \(P_h\), then \(P_h\) must have already sent \(b\) along with the proper signatures to all other honest parties.
Thus, every bit received by one honest party will be received by all other honest parties. If two different bits have been propagated to two different honest parties by a malicious leader, then all honest parties will detect it and default to the bit \(0\).

Amazingly, consistency holds no matter how many malicious parties there are!

If \(f = n\), then everyone is malicious so there are no honest parties to protect. (This case is somewhat boring.)
If \(f = n - 1\), then consistency says that all honest parties must agree on the output. There is only one honest party, and obviously it agrees with itself. (This case is also rather trivial.)
If \(f = n - 2\), then consistency says that both of the two honest parties have agreed on the answer, even if they’re surrounded by \(n - 2\) adversaries all around them! That’s amazing!

In this way, Dolev-Strong’s protocol achieves an even stronger level of security than Bitcoin, which required an honest majority. But this is not an apples-to-apples comparison, since Bitcoin is doing much more than letting one person broadcast a single message to the world.

In order to solve the broadcast problem, the Dolev-Strong protocol did need to make some other strong assumptions:

Knowledge of who the other parties were, so that nobody could conduct a Sybil attack.
Public key infrastructure (PKI), so that signatures could be transferred from one party to another.
Synchronous network, so that parties would know when each round ends and the next one begins.

Next week, we will see other ways to construct a broadcast protocol with fewer of these requirements.