Lecture 13: Common Randomness

Announcements

HW 5 is due on Saturday 3/8 at 10pm
Test 1 re-attempts have been graded
Challenge 1-2 have been graded, resubmissions are due Friday 3/21
Enjoy your spring break!

Learning outcomes

By the end of today’s lecture, you will learn:

How your computer generates random numbers.
How a trusted randomness beacon can provide common randomness that everyone agrees upon.
How a collection of people can generate common randomness, without a trusted third party.

Review from last lecture

Last time, we studied Byzantine Broadcast in the asynchronous model, which eliminates the requirement for nodes to have their own clocks. We make two changes to the requirements:

There is no termination property
We add the word “eventually” to the consistency and validity definitions.

In this setting, we saw how to construct Bracha’s reliable broadcast protocol.

	Dolev-Strong	Phase-King	Bracha
Synchronous network	YES	YES	NO
PKI	YES	NO	NO
Number of malicious parties	$f \leq n$	$f \leq \frac{n-1}{3}$	$f \leq \frac{n-1}{3}$
Number of steps	$f+1$	$3(f+1)$	3

12.4 Byzantine Agreement in the Asynchronous Setting

Last week, we learned how to use synchronous Byzantine broadcast in order to build a permissioned blockchain.

Question. Can we use asynchronous Byzantine broadacast to build a permissioned blockchain in the asynchronous setting?

We already know Byzantine broadcast protocols in the asynchronous setting
So we can just use that to build an asynchronous blockchain protocol, right?

Actually, this doesn’t quite work. The problem is that Bracha’s broadcast protocol doesn’t guarantee termination.

The synchronous blockchain protocol that we designed last time relied on the idea that each party takes a turn being the designated leader and creating a new block (first party $P_1$, then party $P_2$, and so on). But in the asynchronous setting, we may never be certain when the epoch where Party 1 is the designated leader has ended, so that we can move on to the next epoch with Party 2 as the designated leader.

At first glance, this looks to be an insurmountable problem. It is actually impossible to build an asynchronous Byzantine Broadcast protocol that guarantees termination. The problem is that the designated leader might simply refuse to participate in the protocol, and the honest parties will wait forever to listen for a message that may never arrive.

In some sense, the problem here stems from the fact that we are putting all of our faith in a single party to serve as the designated leader.

So let’s back away from that. We can define a new, slightly more general question called Byzantine Agreement. This is like Byzantine Broadcast, except that now everyone gets to send a message at the same time.

For this new problem:

There is no longer a commanding general. Instead, every general has their own opinion on whether to attack or retreat.
The goal of the protocol is to ensure that all honest generals make the same decision. Furthermore, if the generals all happened to have the same opinion, then this common opinion should be the end result.

Byzantine Broadcast (image source: Wikipedia)

Let’s modify our properties from Byzantine Broadcast to work in the new, more sophisticated setting of Byzantine Agreement.

Consistency is the same as before:

all honest parties output the same value.

Validity only needs to hold if everyone starts with the same message… or at least, all the honest parties do, since we can never be sure what the malicious parties are doing. Formally, we guarantee that:

if all honest parties start with the same input value, then this shared value is guaranteed to become the output value.

Liveness is the asynchronous version of what we used to call termination. It states that:

all honest nodes output eventually (assuming that messages are delivered eventually, albeit with arbitrary delays)

In the synchronous setting, there are many ways to build Byzantine Agreement. One simple approach is to run $n$ copies of a Byzantine Broadcast protocol (e.g., Dolev-Strong) in parallel, where everyone is the designated leader in one of them, and then take a majority vote of the results.

This works as long as $f < \frac{n}{2}$, since otherwise the malicious parties are in the majority and can overrule the honest parties’ values. (And it turns out that $f < \frac{n}{2}$ is the best that one can do.)

But the asynchronous setting is where things really get interesting. It turns out that:

It is impossible to construct a deterministic protocol that achieves asynchronous Byzantine Agreement, even in the easiest possible setting where there is only $f = 1$ malicious party!

But: using a CommonCoin, it is possible to build a randomized protocol for asynchronous Byzantine Agreement that can withstand up to $f < \frac{n}{3}$ malicious parties!

In other words, we can achieve liveness with Byzantine Agreement, even though it was impossible to accomplish with Byzantine Broadcast. This is because we are no longer relying on a single leader who might just refuse to participate in the protocol; instead, everyone is allowed to contribute equally.

I don’t have time to describe the randomized protocol for asynchronous Byzantine Agreement in today’s class. You will read it for yourself in Chapter 13 of Elaine Shi’s textbook.

Hopefully this insight also provides better context for Satoshi Nakamoto’s contribution. The Bitcoin blockchain also achieves consistency and liveness in an asynchronous network. It does so in a permissionless setting where up to 1/2 of the computing power can be malicious (rather than up to $1/3$ of parties).

This is just one of many amazing applications of a CommonCoin! For the rest of today’s lecture, we will explore in more detail the problem of generating common randomness.

13.1 Importance of Randomness

The ability to generate unpredictable random numbers is going to be crucial to everything we will do in this class after spring break.

Let’s see what can happen when a computer’s source of randomness isn’t actually random.

First, bad randomness can lead to financial loss, for instance in casinos and lotteries.

“The man would parlay a \$20 to \$60 investment into as much as \$1,300 before cashing out and moving on to another machine, where he’d start the cycle anew. Over the course of two days, his winnings tallied just over \$21,000. The only odd thing about his behavior during his streaks was the way he’d hover his finger above the Spin button for long stretches before finally jabbing it in haste; typical slots players don’t pause between spins like that. … the operatives use their phones to record about two dozen spins on a game they aim to cheat. They upload that footage to a technical staff in St. Petersburg, who analyze the video and calculate the machine’s pattern based on what they know about the model’s pseudorandom number generator. Finally, the St. Petersburg team transmits a list of timing markers to a custom app on the operative’s phone; those markers cause the handset to vibrate roughly 0.25 seconds before the operative should press the spin button.” – Wired

“Prosecutors say they have unearthed forensic evidence that shows how a former computer security official for a US state lottery association let him rig drawings worth millions of dollars across five states using unauthorized code that tampered with a random number generator used to pick winning tickets. … A forensic examination found that the generator had code … that directed the generator not to produce random numbers on three particular days of the year if two other conditions were met. Numbers on those days would be drawn by an algorithm that [the computer security official] could predict.” – ArsTechnica

Second, bad randomness can lead to easily-guessable cryptographic keys, because (as we saw with RSA and DSA signatures) the key generation algorithm relies on randomness to create a key from a large space (e.g., 256 bits).

The software engineers of Debian Linux made this mistake from 2006 to 2008, and many Internet-of-Things embedded devices don’t properly generate random numbers.

“Luciano Bello discovered that the random number generator in Debian’s openssl package is predictable. This is caused by an incorrect Debian-specific change to the openssl package. As a result, cryptographic key material may be guessable.” – Debian Linux

“We perform the largest ever network survey of TLS and SSH servers and present evidence that vulnerable keys are surprisingly widespread. We find that 0.75% of TLS certificates share keys due to insufficient entropy during key generation, and we suspect that another 1.70% come from the same faulty implementations and may be susceptible to compromise. Even more alarmingly, we are able to obtain RSA private keys for 0.50% of TLS hosts and 0.03% of SSH hosts, because their public keys shared nontrivial common factors due to entropy problems, and DSA private keys for 1.03% of SSH hosts, because of insufficient signature randomness.” – Heninger, Durumeric, Wustrow, Halderman 2012

There are many other kinds of harm that can arise from bad randomness, such as bias and skewing of probabilistic methods.

For the rest of today’s lecture, let’s see how to generate randomness well.

13.2 Generating Local Randomness

Have you ever wondered how your computer generates random numbers? After all, a computer is just a fancy calculator, and it has been carefully designed to be deterministic and reproducible; if I run a computer program today and tomorrow, I expect to get the same results.

So, how does a computer create unpredictable random numbers?

This is not a trick question; computers do have the ability to generate truly random numbers. For instance, on a Unix-like operating system (e.g., Linux, MacOS, BSD) you can request random data by fetching from a special file called /dev/urandom.

It’s not really a file, but rather a operating system service that will create random data as often as you need, whenever you call for it. You can treat /dev/urandom as if it is a file, and read from it whenever you want.

Here is an example: the terminal command below reads 16 bytes of data from /dev/urandom and prints the result in hexadecimal format.

!xxd -u -l 16 -p /dev/urandom

2D171E40E862E7600D5413AA0221ADC3

If I run this exact line of code again in the terminal, I will get a different answer.

!xxd -u -l 16 -p /dev/urandom

16491C488E947583EFDA538F51126675

Question. If you were designing the computer hardware and operating system, how would you generate random numbers?

“Fortunately, it’s not hard to harvest truly unpredictable randomness by tapping the chaotic universe that surrounds a computer’s orderly, deterministic world of 1s and 0s.” – Taylor and Cox, IEEE Spectrum magazine

Modern computers follow a three-step process to generate randomness:

Harvest megabytes of data that “looks random.”
Extract ~128 bits of uniform randomness from the harvested data. (For now, you can think of this step as applying SHA-256 to the harvested data, although it’s slightly more complicated than this.)
Expand to create lots of pseudorandom data, using a similar technique as Bitcoin’s hierarchical deterministic wallets that leverage one secret to generate many secrets.

For today, I will just focus on the harvesting step. Your laptop and phone have many sensors that can capture random-looking noise.

Physics: EM radiation, temperature
Logical gates: clock drift, thermal noise
Quantumness: beam splitters & polarization, tunneling, entanglement
Measurement sensors: microphone, camera, gyroscope
You, the person using the computer: keystroke timings, mouse movements

13.3 Trusted Centralized Randomness

Okay, now we know how to generate random numbers on your own computer. But what if a group of us all want a CommonCoin between us? Recall from last week that we discussed several nice properties of a physical coin:

Common view. Once the coin is flipped, everyone can agree on the result.
Unpredictability. No coalition of $f < n$ parties know what the result of the coin will be in advance.
Unbiasedness. No coalition of $f < n$ parties can influence the outcome of the coin. That is, they cannot bias the result away from a 50-50 chance of heads vs tails.

One way we can agree about common randomness is to ask a trusted source. As long as the source is trustworthy and won’t collude with the attacker, then we should be okay.

This is exactly what NIST Randomness Beacon provides. Since September 5, 2013, the US National Institute of Standards and Technology (NIST) has been publishing a 512-bit, full-entropy random number every minute of every day.

Each new post is called a pulse. Here is a link to the latest pulse: https://beacon.nist.gov/beacon/2.0/pulse/last

The pulse contains several fields:

outputvalue: The random value that should be used.
timestamp: the time that this randomness has been generated at.
signature: the output value signed under the secret key of NIST (RSA2048).
listValues: The pulse contains references to the previous pulses, establishing a chain of random numbers.

Here is a link to the full NIST specifications.

Three independent different sources of randomness feed entropy into each pulse (another source is the state of the previous pulse).

Question. What’s wrong with using NIST’s random beacon to elect leaders, or to pick what restaurants we should go to?

Nothing is wrong as long as you trust NIST. That’s why it was build for.

Can NIST cheat? Can you think of a way where they can cheat?

NIST can try to precompute many different output using the randLocal values, and throw away the outputs that they don’t like. Thus, they can bias the output.

13.4 Two Party Common Coin

Next, let’s try to build a CommonCoin between two parties Alice and Bob, but this time without relying on a trusted third party.

Remember that limiting trust is a common theme of this course!

“Cryptography is how people get things done when they need one another, don’t fully trust one another, and have adversaries actively trying to screw things up.”

– Ben Adida

In all the topics we explore in this class: we want to lessen our dependency on trusted outsiders, and to the extent that we do require the assistance of others, we want to be able to understand exactly who we are relying upon, and for what.

Back to the CommonCoin problem: if the two parties are physically next to each other, then the problem is easy to solve!

Alice can throw the coin in the air, while Bob watches.

Question. Can we replicate coin tossing in the digital world? If so, how?

Remember the properties of a physical coin that we want to replicate in the digital world.

Common view. Once the coin is flipped, everyone can agree on the result.
Unpredictability. No coalition of $f < n$ parties know what the result of the coin will be in advance.
Unbiasedness. No coalition of $f < n$ parties can influence the outcome of the coin. That is, they cannot bias the result away from a 50-50 chance of heads vs tails.

Answer. This question is actually difficult to solve. Some initial ideas that sound promising do not work, such as saying that Alice should record a video of tossing a coin and send it to Bob. There are several ways for Alice to bias the output of the coin toss.

Alice can use a biased coin.
Alice can select the output she likes. Assume that Alice prefers an outcome of tails. She can create recordings of throwing the coin as many times as necessary until the outcome is tails. She can send only the final video to Bob.

In fact, it turns out that a perfectly fair coin toss between two parties is actually impossible! No matter how you design the protocol, there will always be some potential for bias by one of the two parties.

So let’s try to do the next best thing: find a way to provide a fair coin toss if the parties are honest, and to detect if they ever act maliciously.

Commit and Reveal method

The problem with the previous approach was that the result of the coin toss depends only on Alice. As a result, Alice can flip many coins and select the output that she likes.

What if the final coin output depends on both Alice and Bob’s input equally? Can Alice still cheat?

Let’s consider the following protocol:

Alice picks a bit $b_1$ and a secret salt $s$ at random. Alice sends $h = \text{SHA-256}(b_1, s)$ to Bob.
Bob picks a bit $b_2$ at random.
Alice reveals $b_1$ and the salt $s$ to Bob. Then, Bob checks that Alice told the truth by checking that \[h \overset{?}{==} \text{SHA-256}(b_1, s).\]
If the check passes, then Bob sends his bit $b_2$ to Alice.

Both parties output $b_1$ xor $b_2$.

Question. Is this a good two-party coin tossing protocol?

Let’s try to run this protocol together right now. I will play the role of Alice, and you play the role of Bob.

For step 1 of the protocol: here is the hash of a value I personally committed to previously: 497780967d5a3d0df654d2b9099f6db3f89067008b6e1b5ec02b7e36d63b58ce

Now, Bob needs to select a bit 0 or 1.

Finally, I as Alice need to reveal my bit and salt.

import hashlib
salt = b"this is a salt"
b = b"0"

print(hashlib.sha256(b + salt).hexdigest())

497780967d5a3d0df654d2b9099f6db3f89067008b6e1b5ec02b7e36d63b58ce

Can Alice bias the result of the two-party coin toss?

If she completes the protocol in a way that Bob’s verification check is successful, then no! The first step is a binding commitment to Alice’s bit $b_1$. Due to collision resistance, there is only one opening that Alice knows.
Alice could refuse to complete the protocol, in which case we have no output. The only good news here is that Alice hasn’t learned anything either: she committed to her random value and cannot change it before seeing Bob’s value.

Can Bob bias the result of the two-party coin toss?

If Bob is honest, then he chooses $b_2$ in step 2 before he sees Alice’s bit $b_1$. Bob doesn’t know what value Alice has chosen because the output of the hash function looks random.
But if Bob is malicious, he could wait and choose the bit in step 4, and therefore entirely control the result of the coin toss!
One way to fix this problem is to have Bob send a commitment to his bit in step 2, just like Alice did in step 1.
However, this introduces a new problem of selective failure: Bob knows the result of the coin toss after step 3, whereas Alice will not learn it until step 4. If Bob realizes the result of the coin toss is bad for him, then he can refuse to complete the protocol!

In this way, a malicious Bob can bias the coin toss. As a result, Alice and Bob could use this protocol for secure coin tossing over the telephone or internet, but it is crucial for Alice not to play again if Bob refuses to reveal $b_2$. Otherwise, Bob can bias the coin by refusing to complete the protocol when the output is not in his favor.

13.5 Generalizing Common Coin

We can generalize the commit and reveal method we used in the two party setting to the multi-party setting with $n$ parties. The protocol would work in the following way.

All $n$ parties commit to their random bit and salt.
All $n$ parties supply their input and salt.
The bit of all parties are aggregated using the xor function.

A version of this commit and reveal is used in Ethereum 2.0 as a source of randomness for the blockchain. The randomness is used to do leader election for block proposers among other things (think applications like lottery).

Question. Can you describe an attack on this new CommonCoin protocol?

Here is one problem: in Round 2, an attacker Mallory would wait for everyone else to send their salt. The attacker would compute the output with and without their own input bit. Based on that result, the attacker would choose whether to reveal their bit or abandon the protocol.

We will propose two solutions to this problem.

Economic solution. Let everyone who wants to participate in the randomness generation send money to a smart contract. They can only get their money back if they completed Round 2.

In case the attacker doesn’t reveal their salt properly. The attacker would get slashed and lose money. This approach will work… as long as the economic benefit of biasing the coin is less than the escrowed money that they stand to lose.

Cryptographic solution. A better approach is not to rely on the players to reveal their own commitments. Instead, it would be nice if the group of players can reveal any commitment of any single party. That way, no single party can choose not to reveal it’s own value.

After spring break, we will discuss ways to solve precisely this problem! Specifically, we’ll discuss a tool called Secure Multi-Party Computation that allows the set of $n$ parties to work together to compute over data that they cannot see!

In the specific case of CommonCoin, we will construct a way for each party to commit to its desired bit, such that any threshold of the group can open the commitment… even if the original party who made the commitment later refuses to participate in opening it!

	Dolev-Strong	Phase-King	Bracha
Synchronous network	YES	YES	NO
PKI	YES	NO	NO
Number of malicious parties	\(f \leq n\)	\(f \leq \frac{n-1}{3}\)	\(f \leq \frac{n-1}{3}\)
Number of steps	\(f+1\)	\(3(f+1)\)	3