Lecture 7: Bitcoin’s Security Guarantees

Announcements

Homework 3 is posted on Piazza, and is due on Gradescope on Saturday 2/15 at 10pm
Test 1 is next Tuesday 2/18
- Scope: all material through the end of this week that has been covered in lecture, lab, homework, and textbook reading
- Location: discussion lab at the usual time and place (note that next Tuesday follows a Monday schedule)
Optional challenge problems 1 and 2 are posted, and due on 2/23 at 10pm

Learning outcomes

By the end of today’s lecture, you will learn:

How nodes and miners can check the validity of the blockchain (and how light clients can partially check validity with some assistance from the nodes).
The value of reaching consensus and having common knowledge.
How we can formally specify the consensus of blockchains using two concrete guarantees: liveness and safety.

Review from last lecture

Bitcoin is a digital currency that does not require a single, trusted banking authority to manage everybody’s transactions and bank balances.

There are 3 types of data structures: transactions, blocks, and the entire blockchain (which acts as an append-only public ledger).
There are 3 types of participants: light clients, nodes, and miners.

Last time, we showed how to thwart 3 types of potential attacks on Bitcoin.

Censorship resistance is achieved through the randomness of the leader election protocol. All miners have a chance to create a block (proportional to their computing power), so eventually an honest miner will post a transaction.
Consistency is achieved through the rule that everyone should always choose the longest blockchain (or more precisely, the chain with the largest cumulative difficulty). In case there is a transient fork, this rule ensures that everyone eventually reaches consensus over the chain to use.
Double spending is thwarted through the proof of work puzzle. Because the miner’s work is computationally intensive, it is difficult for a miner to overwrite blocks from the distant past.

7.1 Blockchain Validity

Nodes and miners in Bitcoin maintain a local copy of the blockchain. As a result, we require that they check the validity of each new block that is created and gossiped around the peer-to-peer network.

Overall, here is the high-level idea of Bitcoin’s consensus algorithm (adapted from Abhishek Jain’s slides):

New transactions are broadcast to all nodes that happen to be online right now.
Each miner collects new transactions into a block.
After solving a proof of work puzzle, some miner broadcasts its block.
All nodes check that the block is valid (as we will discuss momentarily).
Miners express their acceptance of the block through their willingness to build the next block on top of it.

Transaction validity

For an individual transaction to be valid, it must:

Contain a valid signature (i.e., one that passes the digital signature scheme’s Verify function) from everyone who is sending money.
Not send out more money than the senders have.
Point to a previous transaction where: (a) the sender(s) actually received the money that they are now trying to spend, and (b) this transaction has not yet been spent. (This is why Bitcoin is said to follow an unspent transaction (UTXO) model.)

If I show a single transaction to a light client, then

They can check properties 1 and 2 on their own.
But they cannot check property 3! This requires knowledge of the previous transactions in the blockchain, so only nodes and miners can verify this.

Block validity

For a block to be valid, it must:

Only contain transactions that are valid, as described above.
Contain a winning nonce that satisfies the difficulty rule (i.e., where the hash of the entire block begins with some large number of 0s).
Link to blocks that are themselves valid, all the way back to the first (or “genesis”) block in the blockchain.

(There are some other constraints in the Bitcoin software beyond the ones we’ve described here, but these are the most important ones that we will focus on.)

Block header and transactions (image source: Nakamoto whitepaper)

Once again: if I hand an individual block to a light client, they can check property 2 on their own, but not the rest. Only full nodes and miners can check the validity of all transactions and blocks, since they have stored a local copy of the entire blockchain.

7.2 Common Knowledge

Remember that the most important property of the Bitcoin blockchain is that all participants eventually reach consensus over the state of its public ledger, called the blockchain.

But what exactly does “consensus” mean?

I want to spend some time explaining how this is a deceptively powerful property. It means that everyone in the Bitcoin network has common knowledge.

Common knowledge is a powerful concept in logic.

Consider the following statements. Each one of them is true, but note that each statement is strictly stronger than the previous ones!

I (the instructor) have black hair.
You know that I have black hair.
I know that you know that I have black hair.
You know that I know that you know that I have black hair.
(… and so on)

Muddy Children Puzzle

To see the power of common knowledge, let’s step away from Bitcoin for a moment and consider the following problem.

There are \(n\) children who gather in a circle, after playing in the playground.
Their teacher announces that “one or more of you have mud on your forehead.”
Each child can see if others have mud on their forehead, but cannot tell whether their own forehead is muddy.
The children are not allowed to communicate with each other.

The teacher says, “at this moment, if you know you have mud on your forehead, please step forward.”

Question. How will the children decide whether to respond?

After one minute passes, the teacher says again “at this moment, if you know you have mud on your forehead, please step forward.”

After yet another minute passes, the teacher issues the same call once again.

Question. What happens now? Remember that the children cannot communicate with each other. Their only form of communication is to step forward if they wish to answer that yes, they do have mud on their forehead.

Reasoning about the puzzle

In this puzzle, the children:

Start with the mutual knowledge that “one or more of you have mud on your forehead.”
Develop the common knowledge over time that nobody else is confident in answering the question just yet!

Let’s see what happens in each round.

In the first round:

If a child knows the mutual knowledge but looks around and sees nobody with mud on their forehead, they must be the only muddy child.
As a result, this child will step forward (but nobody else will).

In the second round:

The children have more information! Now they know the mutual knowledge and the fact that nobody was confident to speak in the first round. As a consequence, there must exist at least two children with a muddy forehead!
As a result, any child who only sees one muddy forehead can be convinced that they are the second muddy child, and they will step forward.

Continuing by induction, it turns out that if there are \(k\) muddy children, then:

Nobody will move for \(k-1\) rounds, and then
In round \(k\), all \(k\) muddy children will simultaneously step forward.
The non-muddy children will (correctly) never step forward.

Amazingly, the children have learned so much information from each other without saying anything at all! Remember that the children did not know the value of \(k\) at the start of the puzzle.

7.3 Consensus via an honest majority assumption

In the same way, if Alice sends 1 coin to Bob on the Bitcoin blockchain, then:

Bob eventually knows he received 1 coin, and
Bob eventually knows that everyone else also knows that he received 1 coin, and that they know that he knows that they know, and so on.
This common knowledge is what makes Bob confident that the rest of the world will allow him to spend the coin later. And this is true even though Bob doesn’t communicate directly with the entire world.

Bitcoin achieves this strong idea of consensus based on the following assumption:

The majority of the computational power used for mining (at any given time) is controlled by honest miners.

Here, the term “honest” means someone who runs the unmodified Bitcoin software, plays by the rules, and doesn’t try to double-spend or otherwise cheat the system. (Being greedy and selfish does not disqualify someone from being “honest,” at least in the sense that we will use the term in this course.)

If this assumption holds, then a leader election based on proof of work (eventually) stops double-spending attacks. That’s because a dishonest miner cannot keep up with the raw computing power of the honest majority.
On the other hand: if an attacker actually has the majority of the hash power, then they can solve proof of work puzzles faster than the honest miners and can therefore rewrite history!
Note that this honest majority assumption is different than the “anytrust” assumption that one of the miners is honest, which was all that we needed to ensure censorship resistance.

Another view of forking

To see why Bitcoin needs an honest majority of computing power, let’s use the same forking picture as before, but change the scenario as follows:

The time in the present is actually after block D has been posted.
A dishonest miner wants to revoke a transaction that occurred in the past, in block B.

Bitcoin’s consistency rule (image source: Mango Research)

To do this, the miner will need to post many proofs of work in a row, very quickly. It needs to create \(3 + N\) blocks faster than the real world produces \(N\) blocks.

This is a Markov chain problem!

Nakamoto’s whitepaper includes the following function (which I translated from C++ into Python). It calculates the probability that an attacker who controls a q fraction of the computing power of the network can execute a double-spend attack to overwrite the last z blocks in the blockchain.

# Code adapted from the Bitcoin whitepaper at https://bitcoin.org/bitcoin.pdf

from math import exp, pow

def AttackerSuccessProbability(q: float, z: int):
    p = 1.0 - q
    poisson_lambda = z * (q / p)
    sum = 1.0
    for k in range(z+1):
        poisson = exp(-poisson_lambda)
        for i in range(1, k+1):
            poisson = poisson * poisson_lambda / i
        sum = sum - poisson * (1 - pow(q / p, z - k))
    return sum

q = 0.1
print("q =", q)
for z in range(10):
    print("z =", z, "\t  Prob =", format(AttackerSuccessProbability(q, z),"1.7f"))
print("")
q = 0.3
print("q =", q)
for z in range(0, 55, 5):
    print("z =", z, "\t  Prob =", format(AttackerSuccessProbability(q, z),"1.7f"))

q = 0.1
z = 0     Prob = 1.0000000
z = 1     Prob = 0.2045873
z = 2     Prob = 0.0509779
z = 3     Prob = 0.0131722
z = 4     Prob = 0.0034552
z = 5     Prob = 0.0009137
z = 6     Prob = 0.0002428
z = 7     Prob = 0.0000647
z = 8     Prob = 0.0000173
z = 9     Prob = 0.0000046

q = 0.3
z = 0     Prob = 1.0000000
z = 5     Prob = 0.1773523
z = 10    Prob = 0.0416605
z = 15    Prob = 0.0101008
z = 20    Prob = 0.0024804
z = 25    Prob = 0.0006132
z = 30    Prob = 0.0001522
z = 35    Prob = 0.0000379
z = 40    Prob = 0.0000095
z = 45    Prob = 0.0000024
z = 50    Prob = 0.0000006

This code shows that even if a dishonest miner controls 10% of all the hash computing power on the blockchain, it will only have about a 1.3% chance of succeeding at double-spending a transaction from z = 3 blocks ago.

The attacker can increase its chance of success by buying or renting more computational power.
In response, defenders can resist this attack by waiting for more blocks to be confirmed in the public, honest chain.

An example

Suppose Mallory creates a transaction that sends money to Alice. But Alice doesn’t trust the sender and wants to be 99.9% confident that her new money cannot be taken away through a double spend attack.

The Nakamoto whitepaper includes the following table that shows how many blocks Alice needs to see mined after the block containing the Mallory \(\to\) Alice transaction, as a function of the dishonest miner’s computational power, so that a double-spend succeeds with at most 0.1% probability.

Adversary’s computational power	# blocks required
10%	5
15%	8
20%	11
25%	15
30%	24
35%	41
40%	89
45%	340

There are two important takeaways from this table:

The hash power required to execute a double-spend attack quickly becomes onerous enough to be not worth it, for all but the most expensive transactions. Even if the attacker controls even 10% of the Bitcoin network’s computing power, executing a double spend attack will likely take more than one hour. By contrast, if the miner uses this hash power honestly to create new blocks, it could have earned (in expectation) millions of dollars in mining fees.

On the other hand, if the attacker controls the majority of the hash power, then your money is never safe; even transactions made years ago could eventually be double-spent. This is called a 51% attack.

7.4 Finalization

When does the existence of a transaction become common knowledge? Suppose that Bob sees a transaction on the Bitcoin peer-to-peer network in which Alice sends 1 coin to him.

Bob shouldn’t believe yet that the money is his, because the transaction isn’t in a block yet.
Even after Bob sees the transaction in a block B, he still shouldn’t believe that the money is his because a fork could happen and this block might not wind up on the longest chain.

Image source: Mango Research
Even after Bob sees a few more blocks built on top of the block B containing his transaction, he still might not believe that the money is his because Alice might be trying to run a double-spend attack on the forked chain.

Image source: Mango Research
But surely at some point, Bob gets to believe that the money is his… right?

Definition. A node considers a block to be finalized if it is at least \(k\) blocks deep inside of the longest chain. Also, the node considers a transaction to be finalized if it is part of a finalized block.

In other words: a block is finalized when it becomes common knowledge that it will remain in the blockchain forever.

Overall, Bitcoin’s proof-of-work blockchain achieves the following security goals as long as a majority of the computing power is honest (i.e., follows the rules).

Liveness: Every transaction is eventually finalized by all honest nodes.
Safety: Honest nodes do not finalize different blocks at the same height.

Bitcoin achieves liveness and safety, and therefore blocks on the blockchain eventually do become finalized. The exact choice of \(k\) to use depends on one’s view of:

How much computing power the adversary has.
How low of a probability for double-spend attacks that one is willing to tolerate.

As a simple heuristic, many people consider Bitcoin transactions to be finalized once they are \(k = 6\) blocks deep in the blockchain. At that point, an attacker with 10% of the mining power has only probability \(p = 0.0002428\) of rewriting history.

In order to achieve these goals, Bitcoin is purposely designed to be:

Slow, in that transactions take awhile to become finalized.
Bounded, in that only a small number of transactions can be inserted in each block.
Computationally intensive, in order to distinguish the honest majority from the dishonest minority.

In Unit 2 of this course, we will explore alternative ways to satisfy the liveness and safety goals with different tradeoffs.