Lecture 16: Full Disk Encryption
Announcements
- This week’s reading:
- AMPH sections 2.1-2.2 (pages 9-19) (BU login required)
- Aumasson, pages 7-12
- HW6 is due Friday 3/28 at 10pm
- Challenge problems 3-4 are due Sunday 3/30 at 10pm
Learning outcomes
By the end of today’s lecture, you will learn:
- How to protect data confidentiality on a computer or phone with sector- or file-level encryption.
- How to derive a key from a password, and how to protect one key using another key.
Review from last lecture
“Cryptography is about communication in the presence of an adversary.”
– Prof. Ron Rivest
Encryption is one of the main tools that we use to protect data confidentiality. The goal of Part 3 of this course is to study encryption and its role in:
- Protecting data at rest on your laptop or phone
- Protecting data in transit over the Internet or through a messaging app
Last time, we studied pseudorandom permutations \(f: \{0, 1\}^n \to \{0, 1\}^n\), which are deterministic, fixed-length functions that act “as if” they are random.
The most common pseudorandom permutation used in practice is the Advanced Encryption Standard (AES).
We also studied the sponge function design, from which we constructed many symmetric crypto primitives:
- The SHA-3 hash function family.
- Password-based hash functions that are deliberately slow.
- Symmetric key encryption, which allows two parties to send messages to each other with confidentiality.
- A message authentication code, which allows two parties to send messages to each other with integrity. It is the symmetric-key version of a signature scheme, where each party can both sign and verify.
- A combined primitive called authenticated encryption that performs both of these tasks at once.
Definition. Authenticated encryption with associated data (AEAD) contains three algorithms:
- KeyGen: randomly choose \(k\), as usual
- Enc\((k, P, A, N)\): produces a ciphertext/tag \(C\) of length \(|C| > |P|\)
- Dec\((k, C, A, N)\): returns the original private message \(P\)
“If you have to perform any cryptographic operation before verifying the MAC on a message you’ve received, it will somehow inevitably lead to doom!”
– Moxie Marlinspike
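To make the AEAD interface and the "verify the MAC first" rule concrete, here is a minimal toy sketch. It is an assumption-laden illustration, not a standardized scheme: the keystream comes from SHAKE-256 (a sponge-based function), the tag is HMAC-SHA256, and the names `keygen`, `enc`, `dec`, and `TAG_LEN` are chosen here for illustration. Real systems should use a vetted AEAD such as AES-GCM or ChaCha20-Poly1305.

```python
import hashlib
import hmac
import secrets

TAG_LEN = 32  # bytes of HMAC-SHA256 tag appended to the ciphertext body


def keygen() -> bytes:
    """KeyGen: choose a uniformly random 256-bit key."""
    return secrets.token_bytes(32)


def _subkeys(k: bytes) -> tuple[bytes, bytes]:
    # Derive separate encryption and MAC keys from k (toy domain separation).
    k_enc = hmac.new(k, b"enc", hashlib.sha256).digest()
    k_mac = hmac.new(k, b"mac", hashlib.sha256).digest()
    return k_enc, k_mac


def _keystream(k_enc: bytes, nonce: bytes, n: int) -> bytes:
    # Toy keystream: squeeze n bytes out of SHAKE-256(key || nonce).
    return hashlib.shake_256(k_enc + nonce).digest(n)


def _tag(k_mac: bytes, nonce: bytes, ad: bytes, body: bytes) -> bytes:
    # Length prefixes keep (nonce, ad, body) unambiguous inside the MAC input.
    msg = (len(nonce).to_bytes(8, "big") + nonce
           + len(ad).to_bytes(8, "big") + ad + body)
    return hmac.new(k_mac, msg, hashlib.sha256).digest()


def enc(k: bytes, plaintext: bytes, ad: bytes, nonce: bytes) -> bytes:
    """Enc(k, P, A, N): encrypt-then-MAC, so |C| = |P| + TAG_LEN."""
    k_enc, k_mac = _subkeys(k)
    stream = _keystream(k_enc, nonce, len(plaintext))
    body = bytes(p ^ s for p, s in zip(plaintext, stream))
    return body + _tag(k_mac, nonce, ad, body)


def dec(k: bytes, ciphertext: bytes, ad: bytes, nonce: bytes) -> bytes:
    """Dec(k, C, A, N): check the tag before touching the ciphertext body."""
    if len(ciphertext) < TAG_LEN:
        raise ValueError("ciphertext too short")
    k_enc, k_mac = _subkeys(k)
    body, tag = ciphertext[:-TAG_LEN], ciphertext[-TAG_LEN:]
    if not hmac.compare_digest(tag, _tag(k_mac, nonce, ad, body)):
        raise ValueError("authentication failed")
    stream = _keystream(k_enc, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))
```

Note that `dec` refuses to decrypt anything until the tag verifies, and that the nonce must never repeat under the same key, since a repeated nonce would reuse the keystream.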
16.1 Protecting data at rest
Our goal for today is to study encryption of data on your laptop or phone.
All major operating systems support disk encryption, often by default.
Operating system | Product | On by default since… |
---|---|---|
Windows | Microsoft BitLocker | Windows 8 (2012) |
Mac OS X | Apple FileVault | OS X Yosemite 10.10 (2014) |
Linux | Linux Unified Key Setup (LUKS) | |
Multi OS 3rd party tools | TrueCrypt, SecureDoc, … | |
iOS | [built into OS] | iOS 8 (2014) |
Android | [built into OS] | Android 6.0 (2015) |
Scenario
Our objective is to protect data confidentiality against an adversary Mallory who temporarily obtains physical access to Alice’s laptop or phone.
Here is the scenario with Alice and Mallory that we will consider today.
Alice \(\longrightarrow\) Mallory \(\longrightarrow\) Future Alice
Alice’s actions
- Knows a symmetric key \(k\) (say, one that is derived from her password using a slow hash function)
- Encrypts each sector or file of her hard drive using \(k\)
- Inputs \(k\) every time she boots or unlocks the device
Mallory’s capabilities
- Steals the device while it is powered off
- Reads (and potentially modifies) any location of the encrypted drive
- Potentially we might even give Mallory the superpower to ask Alice to decrypt a few locations of the drive
Our objective: even with these powers, Mallory still cannot
- Read the plaintext data (except by asking Alice)
- Tamper with any sector or file of data, without detection
Non-objective: Mallory can replay earlier contents of the hard drive. We will not be able to use crypto to stop this.
We will consider two options.
- We control the hardware and can encrypt sectors of the hard drive.
- We control the operating system and can encrypt files on the hard drive.
16.2 Sector-level encryption
Imagine you are the manufacturer of the hard drive and other hardware. Your goal is to build a cryptosystem that automatically encrypts the drive.
Some terminology: the basic unit of data within a hard drive is called a “sector.” Sectors have a fixed length specified by the drive manufacturer.
- Common options are 512, 520, 2048, 4096 bytes
- Total number of sectors depends on the drive’s total storage
Here are the 4 types of hardware that we are going to consider today.
Remember that having the secret key gives someone the power to decrypt and read a message. With that in mind, whenever you see encryption used in practice, the first question you should ask is “where are the keys?”
Question. In sector-level encryption, where should we store the symmetric key?
An appealing, but ultimately incorrect, idea is to store the symmetric key on the hard drive. After all, we already use the hard drive to store everything else.
Question. Why is this a mistake?
Instead, a better workflow is to store the key in a tamper-resistant piece of hardware when the computer is off, and move the key into the RAM when the computer is on.
Sector-level encryption uses a special kind of symmetric encryption scheme, called AES-XTS. Why is that?
- Ideally, we would want to use authenticated encryption.
- But we cannot do so for space reasons: after we encrypt the sector, there’s no extra space to put the MAC.
- Still, we want to do something to mitigate tampering attacks.
- Within 1 sector: want encryption to provide some resistance to tampering.
- Between sectors: encryption should work “independently” for each sector, so Mallory cannot move ciphertexts between sectors.
- Recall our non-goal: Mallory can perform a replay attack to restore a single sector to an earlier version.
- One idea within AES-XTS is to pseudorandomly generate a one-time-pad-like mask as a function of the sector number and the block's position within the sector.
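The position-dependent masking idea can be sketched as follows. This is a simplified toy, not AES-XTS itself: a 4-round Feistel network built from HMAC-SHA256 stands in for AES, and the mask is computed directly with a PRF rather than with the finite-field multiplication that XTS uses. All names (`crypt_sector`, `_tweak`, `_toy_block_cipher`) are hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per cipher block (the same size as an AES block)


def _prf(key: bytes, data: bytes, n: int) -> bytes:
    """HMAC-SHA256 truncated to n bytes, standing in for a PRF."""
    return hmac.new(key, data, hashlib.sha256).digest()[:n]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _toy_block_cipher(key: bytes, block: bytes, decrypt: bool = False) -> bytes:
    """4-round Feistel permutation on 16-byte blocks (toy stand-in for AES)."""
    half = BLOCK // 2
    left, right = block[:half], block[half:]
    rounds = range(3, -1, -1) if decrypt else range(4)
    for i in rounds:
        if decrypt:
            left, right = _xor(right, _prf(key, bytes([i]) + left, half)), left
        else:
            left, right = right, _xor(left, _prf(key, bytes([i]) + right, half))
    return left + right


def _tweak(key2: bytes, sector_no: int, block_no: int) -> bytes:
    """Mask that depends on where the block lives on the drive."""
    position = sector_no.to_bytes(8, "little") + block_no.to_bytes(4, "little")
    return _prf(key2, position, BLOCK)


def crypt_sector(key1: bytes, key2: bytes, sector_no: int, data: bytes,
                 decrypt: bool = False) -> bytes:
    """Mask, apply the block cipher, mask again; output length equals input length."""
    if len(data) % BLOCK:
        raise ValueError("sector length must be a multiple of the block size")
    out = bytearray()
    for block_no in range(len(data) // BLOCK):
        t = _tweak(key2, sector_no, block_no)
        block = data[block_no * BLOCK:(block_no + 1) * BLOCK]
        out += _xor(_toy_block_cipher(key1, _xor(block, t), decrypt), t)
    return bytes(out)
```

Because the mask depends on the sector number, a ciphertext copied from sector 5 into sector 6 decrypts to unrelated bytes, and because the output has the same length as the input, no extra space is needed. There is still no MAC, so tampering is scrambled rather than detected.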
Password-based key derivation
How does the Trusted Platform Module work?
It computes the symmetric encryption key \(k\) at boot time from:
- The state of the machine, to prevent booting into a malicious OS that steals data
- Alice’s password, using a password-based key derivation function called PBKDF2
Importantly: when the device is powered off, it lacks the ability to derive the key!
Recall that in the last lecture, we discussed the idea of a password hash function that is deliberately slow to frustrate brute-force attackers.
Password hashing is useful when your goal is to compare equality of the current password attempt with the correct password provided during account registration.
By contrast, in today’s lecture we want to do something stronger.
- We do not merely want to check whether the password is correct.
- Instead we want to use it to derive a symmetric key!
Concretely, we want a function that maps a password \(P\) \(\mapsto\) a symmetric key \(K\).
- We call this primitive a password-based key derivation function.
- The current NIST standard is called PBKDF2.
This is the best way to derive symmetric keys from a login procedure that is non-interactive (i.e., runs on your computer, rather than with a web server) because the device never stores its own long-term secrets.
- Alice must choose a strong enough password to stop Mallory from brute-forcing \(K\) given enough CPU + RAM and the hashed password file in `/etc/passwd`
- Offline dictionary attack: Mallory can perform this brute-force attack on her own computer beforehand
PBKDF2 works as shown on the right. It uses a pseudorandom function (PRF) and has four inputs.
- \(P\): Alice’s secret password, which is used as the “secret key” for the PRF (even though it is not actually a key)
- \(S\): public salt (aka nonce), which is used for variety (the derived keys on two computers are different, even if they used the same password)
- \(C\): iteration count
- \(len\): desired key length
The output is the symmetric key \(MK\).
Common PRFs used in practice are AES and HMAC-SHA2 (which is just SHA-2 with some important but minor tweaks).
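As a sketch of how the four inputs fit together, the function below spells out the PBKDF2 iteration by hand, assuming HMAC-SHA256 as the PRF. The name `pbkdf2_sketch` is hypothetical; in practice one would simply call the standard library's `hashlib.pbkdf2_hmac`, which should produce the same output.

```python
import hashlib
import hmac


def pbkdf2_sketch(password: bytes, salt: bytes, count: int, length: int) -> bytes:
    """PBKDF2 with HMAC-SHA256 as the PRF, written out step by step."""
    output = b""
    block_index = 1
    while len(output) < length:
        # U_1 = PRF(P, S || INT(i)): the password plays the role of the PRF key.
        u = hmac.new(password, salt + block_index.to_bytes(4, "big"),
                     hashlib.sha256).digest()
        t = u
        # U_j = PRF(P, U_{j-1}); the output block is the XOR of all U_j.
        for _ in range(count - 1):
            u = hmac.new(password, u, hashlib.sha256).digest()
            t = bytes(a ^ b for a, b in zip(t, u))
        output += t
        block_index += 1
    return output[:length]


# Example inputs (illustrative values only).
mk = pbkdf2_sketch(b"correct horse battery staple", b"per-device salt", 1000, 32)
```

The iteration count \(C\) is the knob that makes each password guess expensive: doubling it doubles the work for Alice at login and for Mallory on every guess.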
Key wrapping
What if several keys should grant access to the data on the hard drive?
- Different accounts on one machine (each with their own password-derived key)
- Recovery key stored in a separate, safe place
However, we do not want to encrypt the entire drive several times.
\[ \operatorname{Enc}_{\text{K1}}(\text{sector}) \quad \operatorname{Enc}_{\text{K2}}(\text{sector}) \quad \operatorname{Enc}_{\text{K3}}(\text{sector}) \]
Question. What can we do instead?
Key wrapping is a special kind of encryption that allows us to protect one key under another.
- For each sector, we can generate an ephemeral (one-time) key \(ek\) to encrypt that sector.
- Then we can wrap this key under the password-derived keys of all accounts that are authorized to read the data.
\[ \operatorname{Wrap}_{\text{K1}}(ek) \quad \operatorname{Wrap}_{\text{K2}}(ek) \quad \operatorname{Wrap}_{\text{K3}}(ek) \quad \operatorname{Enc}_{\text{ek}}(\text{sector}) \]
There are two downsides to this approach:
- If a file is shared between different accounts, we will need public key encryption to encrypt to the other parties even if we do not know their password-derived secret keys.
- The encrypted data plus the keywraps no longer fit inside of the original sector.
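The wrapping idea above can be sketched with a toy construction. This is not a standardized key-wrap algorithm (such as AES Key Wrap): it masks the ephemeral key with an HMAC-derived pad and appends an HMAC tag, and the names `wrap`, `unwrap`, `k1`, `k2`, `k3`, and `ek` are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32


def wrap(kek: bytes, key: bytes) -> bytes:
    """Protect `key` (at most 32 bytes) under the key-encryption key `kek`."""
    if len(key) > 32:
        raise ValueError("this toy wrap handles keys of at most 32 bytes")
    nonce = secrets.token_bytes(NONCE_LEN)
    mask = hmac.new(kek, b"mask" + nonce, hashlib.sha256).digest()[:len(key)]
    body = bytes(a ^ b for a, b in zip(key, mask))
    tag = hmac.new(kek, b"tag" + nonce + body, hashlib.sha256).digest()
    return nonce + body + tag


def unwrap(kek: bytes, blob: bytes) -> bytes:
    """Recover the wrapped key, or raise ValueError if `kek` is wrong."""
    nonce, body, tag = blob[:NONCE_LEN], blob[NONCE_LEN:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(kek, b"tag" + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or corrupted wrap")
    mask = hmac.new(kek, b"mask" + nonce, hashlib.sha256).digest()[:len(body)]
    return bytes(a ^ b for a, b in zip(body, mask))


# Example: two password-derived keys plus a random recovery key all wrap one ek.
k1 = hashlib.pbkdf2_hmac("sha256", b"alice password", b"salt-alice", 1000)
k2 = hashlib.pbkdf2_hmac("sha256", b"bob password", b"salt-bob", 1000)
k3 = secrets.token_bytes(32)  # recovery key stored in a separate, safe place
ek = secrets.token_bytes(32)  # ephemeral key that actually encrypts the data
wraps = [wrap(k, ek) for k in (k1, k2, k3)]
```

The data is encrypted once under `ek`, and adding another authorized key costs only one more 80-byte wrap in this sketch rather than another full pass over the data.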
16.3 File-level encryption
For the rest of today’s lecture, we will explore how modern operating systems encrypt data at the filesystem level. This approach is more flexible because we don’t have to fit the encrypted data back into a fixed-size sector.
As a case study, we will examine the design of Apple’s data encryption system on iPhones.
This part of the lecture is based on the descriptions in Apple’s security guide.
Apple has provided full disk encryption since iOS 8, with TPM-like hardware protection of key material since the iPhone 5s (released in September 2013).
File-level encryption on other phones and laptops works similarly.
Core idea: data should only be decryptable in the right context.
- Encrypt each file before writing it to the drive (and decrypt before reading).
- Hardware provides some protection against brute-forcing passwords.
Bag of keys
The iOS file-level encryption system uses a lot of symmetric keys.
Let’s work our way through this image from right to left. There is (at least) one symmetric key for each file that is stored on the drive.
“Every time a file on the data volume is created, Data Protection creates a new 256-bit key (the per-file key) and gives it to the hardware AES Engine, which uses the key to encrypt the file as it’s being written to flash storage.
On A14 through A18 and M1 through M4 devices, the encryption uses AES-256 in XTS mode, where the 256-bit per-file-key goes through a Key Derivation Function (NIST Special Publication 800-108) to derive a 256-bit tweak and a 256-bit cipher key.”
– Apple Platform Security guide
Since Apple controls both the software and hardware:
- Their Secure Enclave includes a hardware component to generate keys at random.
- Their fast AES engine is placed between the RAM and hard drive.
Moving left, each of the file keys is itself key-wrapped using a “class key,” which determines when the file needs to be readable and writable.
Availability | Example | Key erased if phone is… |
---|---|---|
Always | SIM PIN | Wiped |
After 1st unlock | WiFi password | Shut down |
When unlocked | Browser bookmarks | Locked for 10s (without biometric) |
When locked | Incoming email | (works differently) |
This keywrap goes in the file’s metadata. Whenever the file is read, the filesystem is designed to retrieve this metadata along with the encrypted file contents.
Question. How does the phone get the class keys?
Remember our mantra for today: derive keys rather than storing them! Once again we use keywrapping to protect the class keys with a passcode key that is derived from:
- The alphanumeric PIN used to log into the phone.
- A unique string fused into the chip at manufacture time, unknown outside Secure Enclave.
Note that there exist other ways to derive the class keys, in order to enable:
- Biometric-based login (TouchID and FaceID)
- Overnight iCloud backups
- Corporate device management
- Customer service (i.e., the company helping you to regain control of your data if you forget your password)
- …
There are two countermeasures built into the phone in order to make brute forcing the PIN as difficult as possible.
- Crypto: use 10,000 iterations of PBKDF2 to derive the key (~80 ms per guess)
- Hardware: pause between tries, and optionally wipe the phone
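To see what the crypto countermeasure alone buys, here is a small worked calculation. It assumes the ~80 ms per guess figure above and ignores the hardware delays entirely; the function name and the three passcode formats are chosen for illustration.

```python
SECONDS_PER_GUESS = 0.08  # ~80 ms per PBKDF2 evaluation, as stated above


def worst_case_seconds(alphabet_size: int, length: int) -> float:
    """Time to try every passcode of the given length over the given alphabet."""
    return alphabet_size ** length * SECONDS_PER_GUESS


passcode_formats = [
    ("4-digit PIN", 10, 4),
    ("6-digit PIN", 10, 6),
    ("6 lowercase letters or digits", 36, 6),
]
for label, alphabet_size, length in passcode_formats:
    seconds = worst_case_seconds(alphabet_size, length)
    print(f"{label}: {seconds:,.0f} s = {seconds / 3600:,.1f} h "
          f"= {seconds / (365.25 * 86400):,.2f} years")
```

Under these assumptions, a 4-digit PIN falls in roughly 13 minutes and a 6-digit PIN in roughly 22 hours, while a 6-character passcode over lowercase letters and digits takes about 5.5 years. This is why the hardware-enforced pauses and the optional wipe matter so much for short PINs.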
Apple vs. FBI case
Remember that encryption is a powerful tool with immense social consequences.
“Cryptography rearranges power: it configures who can do what, from what.”
– Prof. Phillip Rogaway (UC Davis)
Let’s look at some of the legal questions raised by the use of encryption.
- Can the police force you to decrypt your device?
- Can the police force the device manufacturer to decrypt your device?
The latter question gained significant attention in early 2016, when the FBI wanted data on a locked iPhone 5c in its possession, but they did not have the PIN.
Remember that `key = pbkdf2(pin, uid)`, so to read the files they could:
- Brute-force the `pin` on the phone itself, or
- Pry open the phone to find the hardware `uid` and then brute-force the `pin` on a faster computer.
The FBI wanted Apple to modify its operating system to enable a brute-force search of the PINs:
- Allow PINs to be submitted via external interface, not by hand
- Remove the delay between incorrect guesses (not the crypto delay, but the other one)
- Remove the “poison pill” wiping of the phone after 10 misses
Note: the iPhone 5c does not have a Secure Enclave, so its delay is software-enforced rather than hardware-enforced.
Question. If these are software-only protections, why couldn’t the FBI change the operating system code on its own?
Here we need to discuss the crypto used for updates to the operating system. Since the code itself is not a secret, here the goal is integrity.
The server’s digital signature contains:
- Device ID to personalize the response to this particular phone
- Nonce to connect the response to the initial request, in order to prevent replay attacks
Many (though not all!) automatic software update systems for operating systems and individual programs work similarly.
The FBI wanted Apple to produce and digitally sign an updated “GovtOS” without the delay and poison pill. Only Apple could create this software update, or else the phone would reject it.
This court case was never fully resolved.
- The FBI withdrew its request after finding a company that could break into the phone without Apple’s assistance.
- While we do not know the exact methods used with that specific phone, here is a Vice magazine article that describes a device and procedure that law enforcement agencies in the United States use to decrypt other phones.
We will discuss more interactions between cryptography and the law, often informally dubbed the Crypto Wars, in Part 5 of the course.