Lecture 11: From Broadcast to Blockchains

Announcements

  • Required reading: Shi, chapters 3-5
  • HW 4 is due on Friday 2/28 at 10pm
  • HW 5 will be released tomorrow and is due on Friday 3/7 at 10pm
  • Test 1 has been graded, and revisions are due on Monday 3/3 at 10pm
  • We will grade challenge problems 1-2 soon

Learning outcomes

By the end of today’s lecture, you will learn:

  • How randomness can improve Byzantine Broadcast, at least if we move away from a strict worst-case notion of security and allow a small probability of error.
  • How to design a permissioned blockchain, given any Byzantine Broadcast protocol and fair coin tossing protocol.

Review from last lecture

Bitcoin carefully constructs and combines two components:

  • Broadcast: A connected peer-to-peer network of Bitcoin nodes and miners that gossip any transaction or block they hear, in a way that is censorship-resistant.

  • Leader election: A randomized lottery system to decide which miner gets to create the next block.

In the Byzantine Broadcast problem:

  • The commanding general, called the leader, is trying to send a message to attack or to retreat (or 0 or 1, or any other binary choice).
  • The remaining generals, called parties or nodes, want to reach agreement on the strategy.

There are a total of \(n\) parties, including the leader.

Our goal is to design a protocol – that is, a method that each party can use to decide what messages to send and to whom – that protects the honest parties as follows.

  • Termination: All honest parties eventually complete the protocol and output something (i.e., the protocol doesn’t go on forever).
  • Validity: If the sender is honest and sends the bit \(b\), then all honest parties should output \(b\)
  • Consistency: If two honest parties output \(b\) and \(b'\) respectively, then \(b = b'\).

So far, we have two protocols that provide Byzantine Broadcast in the synchronous setting: the Dolev-Strong protocol and the phase-king protocol.


Dolev-Strong
Phase-King
Synchronous network
YES YES
PKI YES
NO
Number of malicious parties
\(f \leq n\) \(f \leq \frac{n-1}{3}\)
Number of rounds
\(f+1\) \(3(f+1)\)

Here’s a high-level description of how the phase-king protocol works:

  • There are \(f+1\) different phases. In each phase, parties \(P_1, P_2, \ldots, P_{f+1}\) each take a turn being the “king.” (The first party \(P_1\) is the initial sender, and it doesn’t matter which other parties get to be the king as long as everyone agrees which party is #2, #3, and so on.)
  • In each phase: the king sends a message to everyone. Then all parties use the Gradecast protocol to check whether they have reached consensus about the value.
  • If a party ever receives a value with a grade of 2 (full confidence), then they become “stubborn” and keep this value forever. Otherwise, they remain open-minded to any value sent by the next king.

11.1 The power of randomness

Let’s continue our exploration of Byzantine Broadcast in the synchronous model. Recall that we have constructed two protocols that we could use.

  • With a PKI: the Dolev-Strong protocol withstands \(f < n\) malicious parties. It completes in \(f+1\) rounds.
  • Without a PKI: the phase-king protocol withstands \(f < \frac{n}{3}\) malicious parties (and this is the best one can hope for). It completes in \(f+1\) phases, each of which contain 3 rounds, for a total of \(3(f+1)\) rounds.

Both of these protocols achieve our security guarantees of termination, consistency, and validity. But they are admittedly slow to reach consensus.

Can we do better?

Theorem. [Dolev-Strong] No deterministic protocol (even with a PKI) can achieve Byzantine Broadcast in fewer than \(f + 1\) rounds.

We’re not going to prove this theorem formally. But why do you think it might be true?

This issue is easier to see in the phase-king protocol, where each party in turn acts as a “king” for a phase of 3 rounds.

  • In the worst case, it could be the case that our adversary has corrupted the first \(f\) parties. As a result, we cannot be sure that we have learned anything until there is a phase that is led by an honest party.

  • Even more nefariously: our adversary might wait until we decide upon an ordering of the parties, and then adaptively corrupt the first \(f\) parties in the ordered list.

The Dolev-Strong protocol has a similar issue, even though it does not follow the style of electing one leader per phase. And it turns out that every deterministic protocol suffers from the same fate.

But what if we consider a protocol that makes random choices? Can we do better? How could we even take advantage of randomness to foil the adversary?


An assumption: CommonCoin

Let’s imagine that the \(n\) parties can get together to flip a coin.

In the physical world, a coin is a powerful object that provides some nice guarantees.

  • Common view. Once the coin is flipped, everyone can agree on the result.
  • Unpredictability. No coalition of \(f < n\) parties know what the result of the coin will be in advance.
  • Unbiasedness. No coalition of \(f < n\) parties can influence the outcome of the coin. That is, they cannot bias the result away from a 50-50 chance of heads vs tails.

If all \(n\) parties are colluding together, then they can break these properties since they could just set the coin to be the value that they all like. But in this case, there is no honest party to protect.

These properties are very cool! It would be great if we can build such a CommonCoin in the digital world.

If we could do this, it would solve many challenges involving agreement.

For instance: suppose you and your friends are talking on the telephone and trying to decide where to eat dinner. Each of you prefers a different restaurant, and you want to pick fairly between them. A CommonCoin protocol would let you “toss a coin” over the phone.

There is some good news and some bad news here.

  • The bad news: it is impossible to design any computer protocol (that we could run over the Internet) that guarantees these properties for any number of malicious parties \(f < n\).
  • The good news is that we can build a CommonCoin protocol that can withstand a minority of malicious parties, say \(f < n/3\).

We will discuss how to build a CommonCoin protocol next week. For now, let’s just assume that it exists. What can we do with a CommonCoin, beyond choosing a restaurant for dinner?


Integrating CommonCoin into the phase-king protocol

Returning back to our original question: can we build a faster Byzantine broadcast protocol that can complete in fewer than \(f+1\) rounds?

Recall that the phase-king protocol has the following structure.

  • It proceeds in \(f+1\) phases, each of which contain 3 rounds.
  • In each phase, one party is nominated as a “king” (or leader) and gets to broadcast a value \(b\). The parties adopt this value, unless they are already happy with a value sent in a previous phase (based on the Gradecast protocol).

This protocol satisfies:

  • Validity because the designated sender gets to be the first king, and if they are honest then everyone will agree upon their value.
  • Agreement because after \(f+1\) rounds it is guaranteed that there is an honest party.

Question. How can we improve upon this protocol using randomness?

Here is one option:

  • The leader gets to act as the king in the first phase, as usual.
  • In between phases, run the CommonCoin protocol to toss many coins, and use the sequence of heads/tails to elect a new king
  • Then run an iteration of the normal phase-king protocol. The randomly-elected king gets to broadcast a value, and parties will adopt it unless they were already happy in a previous phase.

Question. How many phases of this protocol do we need to run, in order to be convinced that there has been one phase led by an honest party with overwhelming probability?

Let’s consider this question in the setting where \(f = \frac{n}{3} - 1\). That is, there are approximately \(2/3\) honest parties and \(1/3\) malicious parties.

  • The CommonCoin protocol guarantees that the choice of leader is unbiased. So, each phase individually has a \(2/3\) chance of being led by an honest party.
  • If we run for \(k\) phases, then the probability that there are no honest leaders is \(\left(\frac{1}{3}\right)^k\).
  • Hence, the probability of having at least one phase with an honest leader is \(1 - \left(\frac{1}{3}\right)^k\).

This is a powerful observation!

For example: if we want to ensure that we reach agreement with \(n = 3\) million parties, then we have two options.

  • To be 100% certain of reaching agreement, we must run phase-king for about \(\frac{n}{3} = 1\) million phases.
  • To be 99.9999999999% certain of reaching agreement (i.e., up to 1 in a trillion error), we must run 26 phases.

11.2 Broadcast vs. Blockchains

Let’s take state of where we stand in this course so far.

  • We started the semester by exploring the question of how to design a replicated mechanism that reliably records financial transactions. We constructed a mechanism for doing so: a blockchain.
  • Then we delved into the question of how to broadcast a message to a set of parties.

Blocks connected in a chain (image source: Nakamoto whitepaper)

Question. How are these problems related?

There are some similarities between the problems:

  • Both broadcast and blockchains involve the dissemination of data.
  • Both have some notion of consistency: the idea that all parties (eventually) know the data, and they know that everyone else (who is honest) knows the data too.
  • Both protocols are built on top of a network with point-to-point connections; that is, it allows any party to send data to each other party.

But also, there are at least four differences between broadcast and blockchains.

One-shot vs long-running

  • A Byzantine Broadcast is a one-shot protocol: the sender has data, the parties talk for some number of rounds/steps, and then it terminates.
  • By contrast, a blockchain can add new content forever. Our goal is to guarantee that each atomic unit of data, which we call a transaction, is eventually known by everyone.

Computational power

  • Byzantine Broadcast protocols do not require that much computational power. The most computationally intensive operation that some of them perform is a digital signature, but that can be done in less than 1 millisecond on a laptop.
  • The Bitcoin protocol relies upon a proof of work puzzle to achieve eventual agreement. This puzzle consumes a lot of electricity in order to calculate about \(5.6 \cdot 10^{20}\) SHA-256 hash operations every second.

Bitcoin’s SHA-256 hash rate, where an “exahash” is \(10^{18}\) hashes (image source: blockchain.com)

Time to finalize a transaction

  • A Byzantine Broadcast protocol requires many rounds to reach consensus: \(f+1\) rounds for a 100% guarantee, or around 25 to 75 rounds for a probabilitic guarantee.
  • Within the Bitcoin blockchain, most people consider a block to be finalized (probabilistically) after about 6 or so blocks have been created afterward.

Assumptions about the network

  • A broadcast protocol might assume that the underlying network is synchronous (e.g., Dolev-Strong or phase-king), or it may be flexible enough to operate over an asynchronous network (which we will see next week).
  • The Bitcoin protocol works even if the underlying network is asynchronous.

That said, there is nothing that stops us from designing a blockchain that relies upon the assumption of a synchronous network. Let’s do that now.


11.3 Defining a Blockchain

So far we have only seen a blockchain in one context: within the Bitcoin protocol.

Let’s abstract away from what we have seen in the Bitcoin context, and define the objectives of a “blockchain” more generically.

A blockchain has three types of data:

  • A transaction is a single value. It could be a bit like 0 or 1, or a multi-bit message like “Alice pays Bob 1 coin, signed Alice.”
  • A block is an unordered set of several messages, typically ones that are sent around the same time.
  • A log is an ordered list of blocks. This chain is intended to be an append-only log; it only grows at the end, and never shrinks or changes in the middle.

Definition. Let’s define a blockchain protocol in the synchronous setting as any interactive protocol that distributes a log. It must satisfy the following two properties:

  • Safety (also called “consistency” in Shi’s textbook): all parties have the “same” view of the linearly-ordered log. But since the chain may take some time to propagate, it’s acceptable if some parties record a block in their chain before other ones do.

    Formally: for any pair of honest parties \(i\) and \(j\) and at any round \(r\), it must be the case that party \(i\)’s log is a subset of party \(j\)’s log, or the other way around. (Note: these need not be proper subsets; the two parties could fully agree on the log.)

  • Liveness: there is some upper bound \(\Delta\) on the number of rounds that it takes for a transaction to be included in the agreed-upon log.

    Formally: if an honest party receives a transaction \(tx\) at some round \(r\), then by the end of round \(r + \Delta\) all honest parties’ logs must include this transaction.

One more bit of terminology: once a single transaction \(tx\) reaches this point, we say that the transaction is finalized. This means that it has been confirmed and cannot be undone, aka double-spent, from then onward.


11.4 From Broadcast to Blockchains

Let’s consider the following question: starting from a Byzantine Broadcast protocol, can we construct a blockchain?

\(\Longrightarrow\)
[Image source: Wikipedia] [Image source: Nakamoto whitepaper]

For now, let’s limit consider the easiest options of all design parameters:

  • There are a fixed collection of \(n\) parties that participate in the blockchain protocol forever; no party ever enters or leaves the blockchain protocol in progress. And each of the \(n\) parties knows the identity of the other \(n-1\) parties. This is often called a permissioned blockchain. (By contrast, Bitcoin is a permissionless blockchain; anyone can become a node or a miner, and they don’t need to know each other.)

  • Assume the network is synchronous (e.g., it proceeds in rounds that each take ~1 second).

  • You can choose any threshold of malicious parties that you want (well, as long as you can tolerate at least \(f = 1\) party), and you can use a PKI setup if you want.

Question. How might we do this?

Here is one way to build a blockchain, starting from any \(R\)-round Byzantine Broadcast protocol. We will say that the protocol proceeds in several “epochs,” each of which last for \(R\) rounds.

Epoch 1 (containing \(R\) rounds):

  • Party 1 is chosen as the leader. This party collects any transactions that it has heard, and forms a block.
  • All parties execute an \(R\)-round Byzantine Broadcast, with Party 1 as the designated sender.
  • At the end of this epoch of \(R\) rounds: all honest parties have agreed upon some block, and furthermore if Party 1 is honest then the agreed-upon block will be the one that Party 1 chose. Honest parties add this block to their log.

Now repeat this cycle forever:

  • In epoch 2 we elect Party 2 as the designated sender of an \(R\)-round Byzantine Broadcast protocol to add a block of their choice to the chain. In epoch 3 we elect Party 3 as the designated sender to propose a block to add, and so on.
  • In epoch \(n+1\) we go back to the beginning and elect Party 1 as the designated sender. And so on.

Impressively, this simple construction works! You can instantiate it with Dolev-Strong if a PKI is available, with phase-king if there is no setup, or any other Byzantine Broadcast protocol that you want.

Theorem. Let BB denote any Byzantine Broadcast protocol in the synchronous setting, which can communicate messages from a particular set (e.g., one bit, or multi-bit messages) within \(R\) rounds for a network of \(n\) parties and tolerating up to \(f\) malicious parties.

Then, the above blockchain construction satisfies safety and liveness with \(\Delta = O(Rn)\), also for the same \(n\) and \(f\) and the same setup assumptions.

Proof of safety. This follows immediately from the consistency property of the underlying Byzantine Broadcast protocol.

  • The broadcast protocol guarantees that at the end of each epoch, “all honest parties decide the same value” for the block to add to the log in that epoch.
  • As a result, honest parties always have exactly the same log, because the protocol states that honest parties only append to the log at the end of each epoch.

Proof of liveness. To argue this claim, let’s think about how a transaction \(tx\) might be sent by someone in the world to one or more of the \(n\) parties in the permissioned blockchain.

  • If a transaction \(tx\) is sent to a single honest party, then there are at most \(n\) epochs until that party becomes the designated sender and can add the transaction to a block that gets included in the log.

  • If a transaction \(tx\) is sent to all parties, then there are at most \(f+1\) epochs until some honest party is nominated to the role of the designated sender. This party will include the transaction \(tx\) in its block (note: for now we are assuming that block sizes can be unbounded).

Since each epoch takes \(R\) rounds, and we typically think of \(f\) as being a constant fraction of \(n\): in either case it takes \(O(Rn)\) rounds to guarantee that a transaction gets added to the log.


Takeaways

Let’s step back and think about what we have just accomplished.

On the one hand: this is an amazing theorem! It states that if we know how to build Byzantine Broadcast, then in some sense we have solved the problem of making a blockchain and therefore a financial transaction system.

Moreover:

  • We were able to accomplish this goal without relying on energy-consuming computing power or needing massive economic incentives to keep the miners in line.
  • By designing the system in a modular fashion, we can reuse components like Byzantine Broadcast in other situations that require resilient computing, such as fault-tolerant systems in airplanes, cars, power plants, and other critical infrastructure.

On the other hand: this blockchain protocol not that great for a few reasons.

  • It is really slow; that is, it takes awhile for your transactions to get included in a block. If each round takes 1 second and there are \(n = 301\) parties running phase-king with \(f = 100\) malicious parties, then each broadcast requires \(303\) seconds and liveness takes \(30603\) seconds or about \(8.5\) hours for your transaction to be finalized.

    So if you want to buy lunch, the restaurant might force you to sit there all afternoon and evening to make sure your transaction settles before they will let you leave. One nice thing about Satoshi Nakamoto’s invention is that it carefully combines Broadcast and the CommonCoin leader election system in a way that outperforms the generic combination we’ve created today.

  • Also, we made a lot of assumptions along the way that are unrealistic or sub-optimal for a financial transaction system, such as network synchrony and the existence of a permissioned set of \(n\) parties that will handle all financial transactions for the rest of time.

    The good news is that we can withstand any \(f\) of them getting bored and deciding to quit the system… but what happens if more of them want to leave? Or what if they eventually become punch-drunk with power and decide that they want to use this power for nefarious purposes?

    The nice thing about asynchronous, permission-less systems like Bitcoin is that they are more stable in the face of such power. For instance, Bitcoin can tolerate up to 50% of computing power being malicious whereas Byzantine Broadcast can only tolerate 33% malicious parties without a PKI. Moreover, if some people become bad – for instance, if all miners decide to censor Alice – then in a permissionless system like Bitcoin, other honest people can join the system.