25.08.01 · computer-science / security

Cybersecurity, encryption, and privacy

shipped3 tiersLean: none

Anchor (Master): Goldreich, Foundations of Cryptography Vol. 1-2; Boneh and Shoup, A Graduate Course in Applied Cryptography; Anderson, Security Engineering

Intuition Beginner

Cybersecurity is the practice of protecting systems, networks, and data from digital attacks. Every time you enter a password, your browser encrypts it before sending it. Every time you make an online purchase, encryption protects your credit card number from eavesdroppers. The padlock icon in your browser's address bar indicates that your connection to the website is encrypted using TLS (Transport Layer Security), ensuring that no one between you and the server can read or modify the data.

Encryption transforms readable data (plaintext) into an unreadable form (ciphertext) using a mathematical algorithm and a key. Only someone with the correct key can decrypt the ciphertext back into plaintext. There are two types of encryption. Symmetric encryption uses the same key for both encryption and decryption, like a shared password. Asymmetric encryption uses a pair of keys: a public key that anyone can use to encrypt messages, and a private key that only the recipient uses to decrypt them. The mathematical relationship between the public and private keys makes this possible: the private key can undo what the public key does, but deriving the private key from the public key is computationally infeasible.

Symmetric encryption is fast and efficient. AES (Advanced Encryption Standard), the most widely used symmetric algorithm, can encrypt gigabytes of data per second on modern hardware. The challenge with symmetric encryption is key distribution: both parties need to share the same secret key, but how do you share a secret key over an insecure channel without an eavesdropper intercepting it?

AES replaced DES (Data Encryption Standard), which was standardized in 1977 with a 56-bit key. By the late 1990s, DES was vulnerable to brute-force attacks: the Electronic Frontier Foundation built a machine called Deep Crack that could break a DES key in under 24 hours for $250,000. Triple DES (3DES), which applies DES three times with different keys, extended the life of the standard but was slow. NIST held a public competition to select AES, evaluating 15 candidates over three years. Rijndael, created by Vincent Rijmen and Joan Daemen, won due to its combination of security, performance, and simplicity. AES supports key sizes of 128, 192, and 256 bits.

Asymmetric encryption solves the key distribution problem. Alice wants to send Bob a private message. She looks up Bob's public key, encrypts the message with it, and sends the ciphertext. Only Bob's private key can decrypt it. Even if an eavesdropper intercepts the ciphertext and knows Bob's public key, they cannot decrypt the message without Bob's private key. RSA, the first practical asymmetric algorithm, was invented in 1977 and remains widely used.

In practice, asymmetric encryption is too slow for large volumes of data. Hybrid encryption combines the best of both approaches. When you connect to a website, TLS uses asymmetric encryption to establish a shared secret key, then switches to symmetric encryption (AES) for the actual data transfer. This gives the key distribution benefits of asymmetric encryption with the speed of symmetric encryption.

A hash function takes any input and produces a fixed-size output called a hash or digest. SHA-256, a common hash function, produces a 256-bit output regardless of whether the input is one byte or one terabyte. Hash functions have three important properties. They are one-way: given a hash, it is infeasible to find the original input. They are collision-resistant: it is infeasible to find two different inputs that produce the same hash. They are deterministic: the same input always produces the same hash. Additionally, they exhibit the avalanche effect: a tiny change in input (even flipping one bit) produces a completely different output hash, making it infeasible to find related inputs by studying the hash output.

Hash functions are used for password storage. A website does not store your actual password. It stores the hash of your password. When you log in, the site hashes the password you enter and compares it to the stored hash. If they match, you are authenticated. Even if the database is stolen, the attacker obtains hashes, not passwords, and cannot reverse the hash to recover the passwords.

Digital signatures combine asymmetric encryption with hash functions to provide authentication and integrity. When Alice signs a message, she hashes it and encrypts the hash with her private key. Bob verifies the signature by decrypting it with Alice's public key and comparing the result to his own hash of the message. If they match, Bob knows the message came from Alice (authentication) and was not modified in transit (integrity).

Digital signatures also provide non-repudiation: the signer cannot later deny having signed the message, because only they possess the private key used to create the signature. This property is essential for legally binding electronic contracts, software distribution (verifying that an update genuinely comes from the vendor), and financial transactions. Code signing certificates allow software developers to sign their applications, and operating systems warn users when running unsigned code.

The CIA triad summarizes the three goals of cybersecurity. Confidentiality: preventing unauthorized access to data. Integrity: preventing unauthorized modification of data. Availability: ensuring authorized users can access data when needed. Cybersecurity threats target one or more of these goals. Encryption protects confidentiality. Hashes and signatures protect integrity. Redundancy and DDoS protection ensure availability.

Common attack vectors threaten the CIA triad in different ways. Phishing attacks trick users into revealing credentials by impersonating trusted entities via email or fake websites. Malware (viruses, worms, Trojans, ransomware) compromises systems to steal data, disrupt operations, or extort payment. Man-in-the-middle attacks intercept communications between two parties, potentially modifying messages without detection. SQL injection exploits improper input validation in web applications to execute unauthorized database commands. Cross-site scripting (XSS) injects malicious scripts into trusted websites to steal user session tokens or redirect users to attacker-controlled sites.

Defense in depth is a security strategy that uses multiple layers of protection. No single security measure is sufficient; instead, organizations deploy firewalls (filter network traffic), intrusion detection systems (monitor for suspicious activity), antivirus software (detect known malware), access controls (restrict who can access what), encryption (protect data at rest and in transit), and security awareness training (educate users about threats). If one layer fails, the other layers provide backup protection. This approach recognizes that breaches will occur and aims to limit their impact.

Authentication and access control are fundamental security mechanisms. Authentication verifies identity through three factors: something you know (password), something you have (security token or phone), and something you are (biometric). Multi-factor authentication (MFA) combines two or more factors, dramatically reducing the risk of account compromise. Access control determines what authenticated users are allowed to do. Role-based access control (RBAC) assigns permissions based on job roles. The principle of least privilege grants users only the minimum permissions needed for their tasks.

Network security protects data as it travels across networks. Firewalls filter traffic based on rules (source/destination IP, port, protocol), blocking unauthorized access while permitting legitimate traffic. Virtual private networks (VPNs) create encrypted tunnels over public networks, ensuring that all traffic between your device and the VPN server is confidential. Intrusion detection systems (IDS) monitor network traffic for patterns associated with known attacks, alerting administrators when suspicious activity is detected.

Application security focuses on preventing vulnerabilities in software. Input validation ensures that user-supplied data conforms to expected formats, preventing injection attacks. Parameterized queries prevent SQL injection by separating SQL code from user data. Content security policies (CSP) restrict which resources a web page can load, mitigating cross-site scripting. Security headers (HSTS, X-Frame-Options) instruct browsers to enforce security policies. Regular security audits, code reviews, and penetration testing identify vulnerabilities before they can be exploited.

Visual Beginner

Concept Type Purpose Example
AES Symmetric encryption Fast encryption of data Encrypting files, disk drives
RSA Asymmetric encryption Key exchange, signatures TLS handshake, email signing
SHA-256 Hash function Integrity verification Password storage, blockchain
TLS Protocol Secure communication HTTPS, secure email
Digital signature Mechanism Authentication + integrity Software signing, contracts

Worked example Beginner

Alice wants to send Bob the message "MEET AT NOON" securely. They use RSA encryption.

Bob has previously generated a key pair: public key and private key . He has shared his public key with Alice.

Alice converts each letter to a number (A=1, B=2, ..., Z=26). The message becomes: 13, 5, 5, 20, 1, 20, 14, 15, 15, 14.

For each number , Alice computes . For example, the first letter M = 13: . (In practice, messages are encoded as larger blocks for security.)

Bob receives the ciphertext and decrypts each value by computing , recovering the original numbers and converting back to letters.

In a real RSA implementation, the modulus is at least 2048 bits (about 617 decimal digits), making brute-force factoring infeasible with current technology. The security of RSA rests on the mathematical difficulty of factoring the product of two large prime numbers.

Hash function example: verifying file integrity. Suppose you download a software installer and want to verify it has not been tampered with. The publisher provides the SHA-256 hash of the legitimate file: a3f2b7c1d9e8... (64 hexadecimal characters, representing 256 bits). After downloading, you compute the SHA-256 hash of your copy. If the hashes match, the file is almost certainly identical to the original.

The probability of a tampered file having the same hash as the original is approximately , which is less than . This probability is so small that the matching hashes provide practical certainty of integrity. Even a single-bit change in the file produces a completely different hash due to the avalanche effect: each bit of the input affects every bit of the output.

Digital signature example. Alice signs a contract document to prove she agreed to its terms. She hashes the document using SHA-256, producing a 256-bit digest. She then encrypts this digest with her RSA private key to create the signature. Anyone can verify by decrypting the signature with Alice's public key and comparing the result to their own SHA-256 hash of the document. If they match, the signature is valid. The signature provides non-repudiation: Alice cannot deny signing because only she possesses her private key.

Worked Diffie-Hellman example with realistic parameters. Alice and Bob agree on the prime and generator . Alice chooses secret . She computes using modular exponentiation (repeated squaring). Bob chooses secret and computes . They exchange and publicly.

Alice computes the shared secret as . Bob computes . Both arrive at the same value . An eavesdropper who sees , , , and cannot compute without solving the discrete logarithm problem to recover either from or from .

Check your understanding Beginner

Formal definition Intermediate+

Symmetric encryption. A symmetric encryption scheme consists of three algorithms. : generates a key . : encrypts message with key to produce ciphertext . : decrypts ciphertext with key to recover . Correctness: for all valid and .

Asymmetric encryption. An asymmetric scheme has producing a key pair (public and private). . . Correctness: .

Hash function. A cryptographic hash function satisfies: (1) Preimage resistance: given , infeasible to find with . (2) Second preimage resistance: given , infeasible to find with . (3) Collision resistance: infeasible to find any with .

RSA

RSA key generation: (1) Choose two large primes and , compute . (2) Compute . (3) Choose with and . (4) Compute . Public key: . Private key: .

Encryption: . Decryption: .

Correctness follows from Euler's theorem: for .

Diffie-Hellman key exchange

Alice and Bob agree on a prime and generator . Alice picks secret , sends . Bob picks secret , sends . Shared secret: . An eavesdropper sees , , , and but cannot compute without solving the discrete logarithm problem.

AES (Advanced Encryption Standard)

AES operates on 128-bit blocks using keys of 128, 192, or 256 bits. The algorithm processes data through multiple rounds (10, 12, or 14 depending on key size). Each round applies four transformations: SubBytes (a byte-by-byte substitution using an S-box), ShiftRows (cyclic shifts of rows in the state matrix), MixColumns (a linear transformation on columns), and AddRoundKey (XOR with the round key derived from the master key). The final round omits MixColumns.

AES security has been extensively analyzed since its standardization in 2001. The best known attack against full AES-256 reduces the key search space from to , which remains completely infeasible. Related-key attacks on AES-256 have been demonstrated in theoretical settings but require adversaries to encrypt messages under multiple related keys, which does not occur in practice. AES is implemented in hardware (AES-NI instructions on modern CPUs), enabling encryption speeds exceeding 10 Gbps.

Digital signatures and certificates

A digital signature scheme consists of three algorithms: produces a key pair , produces a signature , and outputs accept or reject. Correctness: . Security (existential unforgeability under chosen-message attack, EUF-CMA): no efficient adversary who can obtain signatures on messages of their choice can produce a valid signature on a new message.

Public key infrastructure (PKI) binds public keys to identities through digital certificates. A certificate authority (CA) signs a certificate containing a public key and the identity of its owner. The X.509 certificate standard is used in TLS, S/MIME email, and code signing. Certificate chains establish trust: a root CA (whose public key is pre-installed in browsers and operating systems) signs intermediate CA certificates, which sign end-entity certificates. This hierarchical model enables billions of devices to verify the authenticity of any TLS certificate without prior arrangement.

Password hashing and key derivation

Storing passwords securely requires more than a simple hash. Salting adds a random value (the salt) to each password before hashing: . The salt is stored alongside the hash. Salting prevents rainbow table attacks (precomputed tables of hash-to-password mappings) because each password has a unique hash even if two users choose the same password.

Key derivation functions like bcrypt, scrypt, and Argon2 are designed specifically for password hashing. They differ from general-purpose hash functions in two ways: they incorporate a salt to prevent rainbow table attacks, and they are intentionally slow (computationally expensive) to make brute-force attacks impractical. Argon2, the winner of the 2015 Password Hashing Competition, also includes a memory-hardness parameter that requires significant RAM, making GPU and ASIC-based attacks less efficient because memory is expensive relative to computation on these platforms.

Key result: semantic security of encryption Intermediate+

Theorem (Goldwasser and Micali, 1984). A public-key encryption scheme is semantically secure if and only if it is IND-CPA secure (indistinguishable under chosen-plaintext attack).

Definition (IND-CPA). Consider the following game between a challenger and an adversary. (1) The challenger generates and gives to the adversary. (2) The adversary submits two messages of equal length. (3) The challenger picks uniformly at random and gives to the adversary. (4) The adversary outputs .

The adversary's advantage is . The scheme is IND-CPA secure if no probabilistic polynomial-time adversary has non-negligible advantage.

Semantic security means that for any efficiently computable function , seeing the ciphertext does not help the adversary compute any better than they could without seeing the ciphertext. This is the gold standard for encryption security.

RSA without random padding is not IND-CPA secure because it is deterministic: encrypting the same message always produces the same ciphertext. OAEP (Optimal Asymmetric Encryption Padding) adds randomness before encryption, making RSA IND-CPA secure under appropriate assumptions.

IND-CCA (indistinguishability under chosen-ciphertext attack) is a stronger notion than IND-CPA. In the IND-CCA game, the adversary can request decryption of arbitrary ciphertexts (except the challenge ciphertext) before and after receiving the challenge. This models scenarios where the adversary can interact with a decryption oracle. RSA-OAEP is believed to be IND-CCA secure in the random oracle model. Cramer-Shoup (1998) constructed the first practical public-key encryption scheme provably IND-CCA secure without random oracles, a significant theoretical achievement.

The relationship between security notions forms a hierarchy: IND-CCA implies IND-CPA, but not vice versa. For symmetric encryption, IND-CPA requires randomized or stateful encryption (since deterministic encryption leaks whether the same message was encrypted twice). The notion of authenticated encryption (AE) combines IND-CPA with integrity (INT-CTXT), guaranteeing both confidentiality and that any modification of the ciphertext is detected. Modern protocols like TLS 1.3 mandate AEAD (authenticated encryption with associated data) cipher suites exclusively.

Exercises Intermediate+

Domain evidence Master

TLS and web security. Over 95% of web traffic is now encrypted via HTTPS (TLS). Google's transparency report shows that encrypted traffic grew from under 50% in 2014 to over 95% by 2023. This shift was driven by revelations about mass surveillance (the Snowden leaks in 2013), browser warnings for HTTP sites, and search ranking preferences for HTTPS. TLS 1.3, adopted widely by 2020, eliminated legacy cipher suites and reduced handshake latency.

Ransomware economics. Ransomware attacks encrypt an organization's data and demand payment for the decryption key. The average ransom payment exceeded 20 billion annually. The Colonial Pipeline attack (May 2021) forced a six-day shutdown of the largest fuel pipeline in the United States, causing fuel shortages across the East Coast. The attackers received 2.3 million was later recovered by the FBI. This incident demonstrated that cybersecurity failures can have physical-world consequences affecting millions of people.

Supply chain attacks. The SolarWinds attack (discovered December 2020) compromised the software build process of a widely used network management platform, inserting a backdoor into updates distributed to approximately 18,000 organizations, including US government agencies. The attack was sophisticated, patient (months of undetected access), and devastating. It highlighted the risk of trusting software updates and the challenge of securing the software supply chain.

Cryptographic standards. The AES competition (1997-2000) was an open, transparent process to select a new symmetric encryption standard. Fifteen candidate algorithms were publicly evaluated for security, performance, and implementation characteristics. Rijndael, submitted by Vincent Rijmen and Joan Daemen, was selected as AES in 2001. The open competition model was so successful that it was repeated for the SHA-3 hash function competition (2007-2012), won by Keccak. NIST's post-quantum cryptography standardization (2017-2024) followed the same model.

Advanced results Master

Zero-knowledge proofs

A zero-knowledge proof (ZKP) allows a prover to convince a verifier that a statement is true without revealing any information about why it is true. Formally, a ZKP must satisfy three properties: completeness (an honest prover convinces an honest verifier), soundness (a dishonest prover cannot convince the verifier of a false statement), and zero-knowledge (the verifier learns nothing beyond the truth of the statement).

The classic illustration is the "cave" example. Imagine a circular cave with a door in the middle that requires a secret password to open. Peggy wants to prove to Victor that she knows the password without revealing it. Victor stands outside the cave. Peggy enters and goes either left or right around the circle. Victor then enters and asks Peggy to exit from a specific side. If Peggy knows the password, she can always comply by opening the door if needed. If she does not know it, she has only a 50% chance of being on the correct side. Repeating this protocol multiple times reduces the probability of deception exponentially: after 20 rounds, the probability of Peggy fooling Victor is less than one in a million.

The Fiat-Shamir heuristic converts interactive ZKPs into non-interactive ones by replacing the verifier's random challenge with a hash of the statement and commitment. Non-interactive ZKPs (zk-SNARKs and zk-STARKs) are used in blockchain systems (Zcash, Ethereum) to prove transaction validity without revealing transaction details. zk-SNARKs (Succinct Non-interactive Arguments of Knowledge) produce very short proofs (hundreds of bytes) that can be verified quickly, making them practical for on-chain verification. zk-STARKs (Scalable Transparent Arguments of Knowledge) eliminate the need for a trusted setup but produce larger proofs.

Homomorphic encryption

Fully homomorphic encryption (FHE) allows computation on encrypted data without decrypting it. Given ciphertexts and , one can compute and without knowing or . Craig Gentry constructed the first FHE scheme in 2009, using a technique called bootstrapping to manage the noise that accumulates during computation.

Partially homomorphic schemes existed before FHE. RSA is multiplicatively homomorphic: . The Paillier cryptosystem is additively homomorphic: . These properties enable specific applications like electronic voting (Paillier can sum encrypted votes) and blinded signatures (RSA can multiply encrypted values). Fully homomorphic encryption generalizes these properties to support arbitrary computation.

FHE enables privacy-preserving cloud computing: a server can process your data without ever seeing it. Applications include privacy-preserving machine learning (train models on encrypted patient data), encrypted database queries (search without revealing the query or the data), and secure voting (tally votes without revealing individual choices). Current FHE schemes are orders of magnitude slower than unencrypted computation, but performance is improving rapidly. Microsoft's SEAL library and IBM's HELib provide practical FHE implementations.

Post-quantum cryptography

Shor's algorithm, running on a sufficiently large quantum computer, can factor integers and compute discrete logarithms in polynomial time, breaking RSA and Diffie-Hellman. While current quantum computers are too small to break practical RSA keys (which would require millions of logical qubits), the threat is real enough that NIST has standardized post-quantum cryptographic algorithms. The "harvest now, decrypt later" threat means that encrypted data captured today could be decrypted in the future when quantum computers become available, making the transition urgent for data with long confidentiality requirements.

Lattice-based cryptography (CRYSTALS-Kyber for encryption, CRYSTALS-Dilithium for signatures) forms the basis of NIST's post-quantum standards. Security rests on the hardness of lattice problems like the Learning With Errors (LWE) problem, which are believed to be hard even for quantum computers. The LWE problem asks: given a random matrix and a vector where is a secret vector and is a small error vector, find . This problem is as hard as solving certain worst-case lattice problems, providing a strong security foundation.

Code-based cryptography (McEliece cryptosystem) relies on the difficulty of decoding random linear error-correcting codes. Hash-based signatures (SPHINCS+) derive security solely from the properties of hash functions, requiring no additional mathematical assumptions. Isogeny-based cryptography (SIKE, though it was broken in 2022) explored the hardness of finding isogenies between elliptic curves. The diversity of mathematical foundations provides resilience: even if one approach is broken, others remain secure.

Multi-party computation

Secure multi-party computation (MPC) allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. Each party learns only the output, not the other parties' inputs. Yao's garbled circuits (1986) and secret sharing (Shamir, 1979) are the foundational techniques.

Yao's garbled circuits work as follows for the two-party case. Alice creates a Boolean circuit that computes the desired function. She "garbles" the circuit by encrypting each gate's truth table with random labels. She sends the garbled circuit to Bob along with the labels corresponding to her inputs. Bob obtains his labels through oblivious transfer (a protocol that lets him learn his labels without Alice learning which ones he chose). Bob evaluates the garbled circuit, obtaining the output without learning any intermediate values. The computation takes time proportional to the size of the circuit, which can be large for complex functions.

Shamir's secret sharing splits a secret into shares such that any shares can reconstruct the secret but fewer than shares reveal nothing. The scheme uses polynomial interpolation: the dealer chooses a random polynomial of degree with (the secret) and gives share to party . Any parties can interpolate to find ; fewer than parties have no information. This -threshold scheme is widely used in key management and distributed systems.

MPC enables applications like privacy-preserving data analysis (multiple hospitals collaboratively analyze patient outcomes without sharing patient records), secure auctions (determine the winner without revealing any bids), and collaborative fraud detection (banks detect fraudulent patterns across institutions without exposing customer data). The performance overhead has decreased dramatically, making MPC practical for real-world deployments.

Side-channel attacks

Even mathematically secure algorithms can be vulnerable to side-channel attacks that exploit physical characteristics of the implementation. Timing attacks measure how long operations take to deduce secret keys. Power analysis measures power consumption during cryptographic operations. Cache attacks exploit CPU cache behavior to extract keys.

The first practical timing attack, demonstrated by Paul Kocher in 1996, recovered RSA private keys by measuring the time taken by modular exponentiation. Square-and-multiply exponentiation processes each bit of the exponent sequentially: for a 1 bit, it performs both a square and a multiply; for a 0 bit, only a square. The extra multiply operation takes measurable time, leaking information about which bits of the private key are 1. Spectre and Meltdown (2018) exploited CPU speculative execution to read memory across security boundaries, affecting virtually every modern processor. These attacks demonstrated that hardware optimization can create security vulnerabilities that are invisible to software-level analysis.

Defenses include constant-time implementations (operations take the same time regardless of secret values), blinding (adding randomness to intermediate values), and hardware security modules (HSMs) that perform cryptographic operations in tamper-resistant environments. Constant-time code avoids data-dependent branches and memory accesses, ensuring that execution time reveals nothing about secret data. This is difficult to achieve in practice because compilers may reintroduce timing variability through optimization.

Elliptic curve cryptography

Elliptic curve cryptography (ECC) provides equivalent security to RSA with much smaller key sizes. A 256-bit elliptic curve key provides security comparable to a 3072-bit RSA key, making ECC more efficient for bandwidth-constrained environments like mobile devices and IoT. ECC operates on the algebraic structure of elliptic curves over finite fields.

An elliptic curve over a field is defined by the equation where . The points on this curve, together with a "point at infinity," form an abelian group under a geometrically defined addition operation. The elliptic curve discrete logarithm problem (ECDLP), finding given and , is believed to be much harder than the regular discrete logarithm problem, which is why smaller key sizes suffice. The NIST curves (P-256, P-384) and Curve25519, designed by Daniel Bernstein, are the most commonly used. Curve25519 is preferred in modern implementations because it was designed to be resistant to side-channel attacks and implementation errors.

Authenticated encryption and TLS

Authenticated encryption provides both confidentiality and integrity in a single primitive, ensuring that ciphertext has not been modified. AES-GCM (Galois/Counter Mode) combines CTR mode encryption with GHASH authentication, providing authenticated encryption at high speed. ChaCha20-Poly1305, designed by Bernstein, is an alternative used in TLS 1.3 and mobile applications.

TLS 1.3, finalized in 2018, simplified the protocol and improved security. It removed support for outdated algorithms (RSA key exchange, CBC mode ciphers, SHA-1) and reduced the handshake from two round trips to one (or zero for resumed connections). The TLS 1.3 handshake uses ephemeral Diffie-Hellman for forward secrecy and authenticates the server using digital certificates. The protocol is designed so that every handshake provides perfect forward secrecy by default, and the only supported cipher suites are authenticated encryption with associated data (AEAD) constructions.

Connections Master

Connections to mathematics

Modern cryptography is deeply connected to number theory (prime numbers, modular arithmetic, elliptic curves), information theory (Shannon's perfect secrecy, entropy), computational complexity (one-way functions, hardness assumptions), and algebra (finite fields, groups, rings). The security of every cryptographic system rests on mathematical assumptions about the hardness of specific problems.

Shannon's 1949 paper "Communication Theory of Secrecy Systems" established the information-theoretic foundation of cryptography. Shannon proved that perfect secrecy (where the ciphertext reveals no information about the plaintext) requires a key at least as long as the message. The one-time pad, which XORs the plaintext with a random key of equal length, achieves perfect secrecy but is impractical for most applications because of the key distribution problem. Modern cryptography relaxes perfect secrecy to computational security: the ciphertext may theoretically reveal information about the plaintext, but recovering that information requires computational effort that is infeasible for any realistic adversary.

Connections to law and policy

Encryption policy is a battleground between privacy advocates and law enforcement. Governments argue that encryption impedes criminal investigations and propose "backdoors" that would allow authorized access. Cryptographers and civil liberties organizations argue that backdoors inevitably create vulnerabilities exploitable by criminals and foreign intelligence. The "going dark" debate remains unresolved.

The Clipper Chip episode (1993) illustrates the backdoor debate. The US government proposed the Clipper Chip, a hardware encryption device with a built-in backdoor (key escrow) allowing law enforcement to decrypt communications with a court order. The proposal was withdrawn after widespread opposition from cryptographers (who showed the escrow system was vulnerable), industry (who feared loss of competitiveness), and civil liberties organizations (who objected to government access to private communications). The technical argument against backdoors is that any mechanism that allows authorized access can potentially be exploited by unauthorized parties.

Privacy regulations (GDPR, CCPA, HIPAA) impose legal requirements on data handling that intersect with cybersecurity practices. Encryption, access controls, and audit logging are technical measures that help organizations comply with these regulations. GDPR Article 32 requires "appropriate technical and organizational measures to ensure a level of security appropriate to the risk," explicitly mentioning encryption and pseudonymization.

Connections to blockchain and cryptocurrency

Blockchain technology uses cryptographic hashes for tamper-evident logging, digital signatures for transaction authentication, and Merkle trees for efficient verification. Cryptocurrencies like Bitcoin and Ethereum are applications of cryptographic primitives to distributed consensus. The intersection of cryptography, distributed systems, and economics creates a rich research area.

Bitcoin's proof-of-work mechanism uses SHA-256 hash computations to secure the blockchain. Miners compete to find a nonce such that , where the target is adjusted to maintain an average block time of 10 minutes. This computational puzzle makes it expensive to rewrite history: an attacker would need to outpace the combined hash rate of all honest miners. Ethereum's transition from proof-of-work to proof-of-stake (September 2022) eliminated the computational puzzle in favor of economic incentives, reducing energy consumption by over 99.9%.

Connections to distributed systems

Many cryptographic protocols are inherently distributed, requiring coordination between multiple parties over unreliable networks. Byzantine fault tolerance, a core problem in distributed systems, intersects with cryptography through protocols like practical Byzantine fault tolerance (PBFT) and their blockchain descendants. Consensus protocols use digital signatures to authenticate messages and hash functions to ensure data integrity.

Certificate transparency (RFC 6962) uses Merkle trees to create a public, append-only log of TLS certificates, enabling detection of improperly issued certificates. This is a direct application of cryptographic hash functions to a distributed systems problem: how to maintain a verifiable audit trail across multiple untrusted parties.

Historical and philosophical context Master

From Caesar ciphers to modern cryptography

The history of cryptography spans millennia. Julius Caesar used a simple substitution cipher (shifting letters by a fixed amount). Arab mathematician Al-Kindi developed frequency analysis in the 9th century to break substitution ciphers. The Enigma machine, used by Nazi Germany, was broken by Polish and British cryptanalysts (including Alan Turing) during World War II, shortening the war by an estimated two years.

The Enigma machine was an electromechanical rotor cipher device with approximately possible settings. The Polish Cipher Bureau, led by Marian Rejewski, made the first breakthroughs in 1932 using mathematical techniques and a replica of the machine. After the invasion of Poland, the work continued at Bletchley Park in England, where Turing developed the Bombe, an electromechanical device that rapidly tested Enigma settings. The intelligence derived from decrypted Enigma messages (code-named Ultra) provided the Allies with critical information about German military plans.

Modern cryptography began with the publication of the Data Encryption Standard (DES) in 1977, the invention of RSA by Rivest, Shamir, and Adleman in 1977, and the Diffie-Hellman key exchange in 1976. The formalization of cryptographic security notions (semantic security, IND-CPA, IND-CCA) by Goldwasser and Micali in the 1980s transformed cryptography from an art into a rigorous science.

The discovery of public-key cryptography was one of the most significant advances in the history of the field. Whitfield Diffie and Martin Hellman's 1976 paper "New Directions in Cryptography" proposed the concept of public-key cryptography and introduced the Diffie-Hellman key exchange. Their insight was that cryptographic keys could come in pairs: one for encryption (public) and one for decryption (private). This solved the key distribution problem that had plagued cryptography for millennia. Shortly after, Ron Rivest, Adi Shamir, and Leonard Adleman published the RSA algorithm, the first practical public-key encryption and signature scheme. These discoveries enabled secure communication on the internet, electronic commerce, and digital signatures.

The philosophical significance of encryption

Encryption creates a space that is mathematically inaccessible to anyone without the key. This raises fundamental questions about the relationship between mathematics and power. If a government cannot decrypt its citizens' communications, it cannot perform surveillance. If individuals can encrypt data that is mathematically impossible for anyone else to read, they have a form of privacy that no law can override through technology alone.

The debate between individual privacy and collective security is one of the defining tensions of the information age. Cryptography does not resolve this tension, but it shifts the balance by giving individuals tools that were previously available only to states. The widespread availability of strong encryption since the 1990s has given individuals and organizations the ability to communicate privately without depending on government-provided security.

The "crypto wars" of the 1990s pitted the US government against the technology industry and civil liberties groups. The government classified encryption software as a munition, restricting its export under the International Traffic in Arms Regulations (ITAR). This prevented widely used software like Netscape Navigator from including strong encryption in its international versions. Phil Zimmermann's PGP (Pretty Good Privacy), released in 1991, made strong encryption available to the public for the first time. The US government investigated Zimmermann for three years for allegedly violating export restrictions but ultimately did not prosecute. The export restrictions were gradually relaxed through the 1990s and early 2000s.

Privacy as a human right

The United Nations has recognized privacy as a fundamental human right (Article 12 of the Universal Declaration of Human Rights). In the digital age, privacy requires encryption. Without the ability to communicate privately, to store data securely, and to browse without surveillance, the right to privacy is theoretical rather than practical.

The challenge is balancing legitimate security needs (preventing terrorism, fighting cybercrime) with the right to private communication. Different societies strike this balance differently, and the conversation continues to evolve as technology advances.

The Snowden revelations (June 2013) revealed the scale of mass surveillance by the NSA and its allies. The documents showed that the NSA had been collecting phone records of millions of Americans, tapping internet backbone connections (the PRISM program), and working to weaken cryptographic standards. This catalyzed a global debate about surveillance, privacy, and the role of encryption. The immediate technical response was a rapid deployment of encryption across the internet: Google, Facebook, Apple, and others enabled encryption by default for their services. The IETF accelerated the development of TLS 1.3 to provide stronger security guarantees.

Bibliography Master

Primary sources

  • Diffie, W. and Hellman, M. (1976). "New directions in cryptography." IEEE Transactions on Information Theory, 22(6), 644-654.
  • Rivest, R., Shamir, A., and Adleman, L. (1978). "A method for obtaining digital signatures and public-key cryptosystems." Communications of the ACM, 21(2), 120-126.
  • Goldwasser, S. and Micali, S. (1984). "Probabilistic encryption." Journal of Computer and System Sciences, 28(2), 270-299.
  • Gentry, C. (2009). "Fully homomorphic encryption using ideal lattices." STOC, 169-178.
  • Bellare, M. and Rogaway, P. (1994). "Optimal asymmetric encryption padding." EUROCRYPT, 92-111.
  • Bernstein, D.J. (2006). "Curve25519: new Diffie-Hellman speed records." PKC, 207-228.
  • Kocher, P. (1996). "Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems." CRYPTO, 104-113.
  • Shor, P. (1994). "Algorithms for quantum computation: discrete logarithms and factoring." FOCS, 124-134.
  • Goldreich, O. (2001). Foundations of Cryptography, Vol. 1. Cambridge University Press.

Secondary sources

  • Katz, J. and Lindell, Y. (2014). Introduction to Modern Cryptography (2nd ed.). CRC Press.
  • Stallings, W. (2017). Cryptography and Network Security (7th ed.). Pearson.
  • Anderson, R. (2008). Security Engineering (2nd ed.). Wiley.
  • Boneh, D. and Shoup, V. (2020). A Graduate Course in Applied Cryptography. Online textbook.
  • Smart, N.P. (2016). Cryptography Made Simple. Springer.
  • Ferguson, N., Schneier, B., and Kohno, T. (2010). Cryptography Engineering. Wiley.