25.06.01 · computer-science / networks

Computer networks and internet architecture

Anchor (Master): Clark, The Design Philosophy of the DARPA Internet Protocols (1988); RFC 791, 793, 2616; Varghese, Network Algorithmics

Intuition Beginner

A computer network is a collection of computers connected to share information. The simplest network is two computers linked by a cable. The most complex is the internet, connecting billions of devices across every continent and into space. Understanding how networks work requires understanding layered abstraction: each layer provides services to the layer above and uses services from the layer below.

Networks can be classified by scale. A personal area network (PAN) connects devices within arm's reach (Bluetooth headphones to a phone). A local area network (LAN) connects devices within a building or campus (home Wi-Fi, office Ethernet). A metropolitan area network (MAN) covers a city (cable TV networks). A wide area network (WAN) spans countries or continents (the internet backbone). Each scale uses different technologies optimized for its distance and bandwidth requirements.

The internet works because every device agrees on a common set of rules called protocols. A protocol is like a language that computers use to communicate. Just as two people need to speak the same language to understand each other, two computers need to use the same protocols to exchange data. The most important internet protocols are IP (Internet Protocol), which handles addressing and routing, and TCP (Transmission Control Protocol), which ensures reliable delivery.

When you visit a website, a remarkable chain of events occurs in fractions of a second. Your browser constructs a request message. This message is broken into small chunks called packets, each labeled with a destination address. The packets travel through your local network to your internet service provider, then hop from router to router across the internet until they reach the server hosting the website. The server's response follows the same process in reverse. Each packet may take a different route, yet they all arrive and are reassembled into the complete webpage.

Packet switching vs. circuit switching. The telephone network uses circuit switching: when you make a call, a dedicated path is reserved for your conversation, even when neither party is speaking. The internet uses packet switching: data is divided into packets that share network links with packets from other communications. This sharing makes much more efficient use of network capacity, especially for bursty data traffic (like web browsing) where the connection is idle much of the time.

Delivering a web page this way involves multiple layers working together. At the application layer, your browser speaks HTTP to request a webpage. At the transport layer, TCP breaks the request into numbered segments and ensures they all arrive correctly. At the network layer, IP adds source and destination addresses and routes each packet toward its destination. At the link layer, a technology such as Ethernet or Wi-Fi carries the packet across each individual hop to the next node. At the physical layer, electrical signals, light pulses, or radio waves carry the data through cables, fiber optics, or the air.

As data moves down the stack it is encapsulated: each layer wraps the data from the layer above with its own header (control information). The result is a nested structure: the HTTP message is wrapped in a TCP segment, which is wrapped in an IP datagram, which is wrapped in an Ethernet frame. At the destination, each layer unwraps its corresponding header, passing the payload up to the next layer. This separation of concerns allows each layer to evolve independently.

The layered model is an abstraction that makes network design manageable. Each layer can be designed and implemented independently, as long as it provides the expected interface to the adjacent layers. You can change from copper wire to fiber optic cable at the physical layer without modifying any other layer. You can switch from IPv4 to IPv6 at the network layer without changing the transport or application layers.

IP addresses are the internet's equivalent of postal addresses. Every device connected to the internet has an IP address that identifies it uniquely (or at least uniquely within its local network). IPv4 addresses are 32-bit numbers, typically written as four decimal numbers separated by periods, like 192.168.1.1. With 32 bits, there are about 4.3 billion possible addresses, which the explosive growth of internet-connected devices has exhausted. IPv6 addresses are 128-bit numbers, providing approximately $3.4 \times 10^{38}$ addresses, enough for the foreseeable future.

The Domain Name System (DNS) translates human-readable names like example.com into IP addresses like 93.184.216.34. When you type a URL into your browser, the first step is a DNS lookup: your computer queries a DNS server, which returns the IP address of the destination. DNS is a distributed, hierarchical database, with root servers at the top, top-level domain servers (.com, .org, .net) in the middle, and authoritative servers for individual domains at the bottom.

TCP provides reliable, ordered delivery of data. It numbers every byte it sends, acknowledges received data, retransmits lost packets, and reorders out-of-sequence packets. If a packet is lost (a common occurrence on the internet), TCP detects the loss through a missing acknowledgment and retransmits the data. If packets arrive out of order (because they took different routes), TCP reassembles them in the correct sequence before delivering the data to the application.

UDP (User Datagram Protocol) provides a simpler, faster alternative to TCP. UDP sends packets without establishing a connection, without acknowledging receipt, and without retransmitting lost data. UDP is used when speed matters more than reliability: real-time video streaming, online gaming, and DNS queries all use UDP because the overhead of TCP's reliability mechanisms would introduce unacceptable delay.

Client-server and peer-to-peer are two fundamental network architectures. In client-server, a central server provides services to many clients (web servers, email servers, database servers). This model is simple to manage but can create a single point of failure and a scalability bottleneck. In peer-to-peer (P2P), every participant is both client and server. P2P networks (BitTorrent, blockchain networks) scale naturally because each new participant adds both demand and capacity. However, P2P networks are harder to manage and may have reliability and security challenges.

The socket API is the primary interface for network programming. A socket is an endpoint of a communication channel. To create a network connection, the server calls socket(), bind(), listen(), and accept(). The client calls socket() and connect(). Once connected, both sides use send() and recv() (or write() and read()) to exchange data. This API, introduced with BSD Unix in 1983, is supported by virtually every programming language and operating system. The simplicity and universality of the socket API contributed to the rapid growth of networked applications.
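The call sequence above can be sketched in Python, whose socket module mirrors the BSD calls closely. This is a minimal illustration rather than a production server: it handles a single client over the loopback interface, and the helper names run_echo_server and echo_once are made up for this example.

```python
import socket
import threading


def run_echo_server(server_sock: socket.socket) -> None:
    """Accept one client and echo back whatever it sends."""
    conn, _addr = server_sock.accept()        # accept(): wait for a client
    with conn:
        while True:
            data = conn.recv(1024)            # recv(): read up to 1024 bytes
            if not data:                      # empty read: the peer closed its side
                break
            conn.sendall(data)                # send the same bytes back


def echo_once(message: bytes) -> bytes:
    """Start a loopback server, send `message` from a client, return the echo."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket()
    server_sock.bind(("127.0.0.1", 0))        # bind(): port 0 lets the OS pick a free port
    server_sock.listen(1)                     # listen(): become a passive socket
    port = server_sock.getsockname()[1]

    worker = threading.Thread(target=run_echo_server, args=(server_sock,))
    worker.start()

    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as client:  # socket()
        client.connect(("127.0.0.1", port))   # connect(): the TCP handshake happens here
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)       # tell the server we are done sending
        chunks = []
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)

    worker.join()
    server_sock.close()
    return b"".join(chunks)
```

Calling echo_once(b"hello") should return the same bytes. The server side uses the four calls named above, and the client side uses only socket() and connect().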

Routing is the process of determining the best path for packets through the network. Routers are specialized computers that forward packets based on their destination addresses. Each router maintains a routing table that maps destination networks to the next hop. When a packet arrives, the router looks up the destination address in its routing table and forwards the packet to the appropriate next hop.
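A routing-table lookup can be sketched with Python's standard ipaddress module. The sketch assumes the longest-prefix-match rule that IP routers conventionally apply when several entries cover the same destination. The table contents and next-hop addresses are invented for illustration.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop). Values are illustrative.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),     # default route
    (ipaddress.ip_network("10.0.0.0/8"), "192.0.2.2"),
    (ipaddress.ip_network("10.1.0.0/16"), "192.0.2.3"),
]


def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route covering `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # Longest prefix wins: a /16 entry is more specific than a /8 or the default.
    _best_net, best_hop = max(matches, key=lambda m: m[0].prefixlen)
    return best_hop
```

Here next_hop("10.1.2.3") selects the /16 entry, next_hop("10.2.0.1") falls back to the /8 entry, and any other address uses the default route. Real routers use specialized data structures (tries, TCAM) instead of a linear scan.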

Routing algorithms must adapt to changing network conditions. Links fail, congestion builds, and new paths become available. The Border Gateway Protocol (BGP), which routes traffic between internet service providers, is the protocol that holds the internet together. BGP allows each network to advertise which IP addresses it can reach and to select the best path from among the advertised routes.

NAT (Network Address Translation) helps cope with IPv4 address exhaustion. A NAT device sits between a private network and the internet, translating private IP addresses (like 192.168.1.x) to a single public IP address. When a device on the private network sends a packet to the internet, the NAT replaces the private source address with its own public address and records the mapping in a table. When the response arrives, the NAT translates the destination back to the private address.

This allows many devices to share a single public IP address. NAT breaks the end-to-end principle because the NAT device must understand and modify the transport-layer headers, creating complications for protocols like FTP and SIP that embed IP addresses in their payloads.
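The translation table can be sketched as a pair of dictionaries. This toy model assumes port-based translation, the common way for many devices to share one address. The class name Nat, the starting port 40000, and the addresses are all illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Nat:
    public_ip: str
    next_port: int = 40000                        # arbitrary starting port (assumption)
    outbound: dict = field(default_factory=dict)  # (private ip, port) -> public port
    inbound: dict = field(default_factory=dict)   # public port -> (private ip, port)

    def translate_out(self, src_ip: str, src_port: int) -> tuple:
        """Rewrite the source of an outgoing packet, recording the mapping."""
        key = (src_ip, src_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return (self.public_ip, self.outbound[key])

    def translate_in(self, dst_port: int):
        """Map a reply's destination port back to the private endpoint, or None."""
        return self.inbound.get(dst_port)
```

Two private hosts that both use source port 5000 receive distinct public ports, so replies can be told apart. A reply to a port with no mapping returns None, which a real NAT would typically drop.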

HTTP and the web represent the most visible application-layer protocol. HTTP/1.1 (1997) established the persistent connections and pipelining that powered the web for two decades. HTTP/2 (2015) introduced multiplexing (multiple requests over a single connection), header compression, and server push. HTTP/3 (2022) replaced TCP with QUIC, eliminating head-of-line blocking at the transport layer. Each generation reduces latency and improves performance, but the fundamental request-response model remains the same: a client sends a request (method, path, headers, optional body) and receives a response (status code, headers, body).

Network performance metrics quantify network quality. Bandwidth measures the maximum data transfer rate a link can carry, typically in megabits per second; throughput is the rate actually achieved. Latency measures the time for a packet to reach its destination, typically in milliseconds; it is often reported as round-trip time (RTT), the time to travel to the destination and back. Jitter measures the variation in latency, which affects real-time applications like video calls. Packet loss rate measures the fraction of packets that fail to reach their destination. These metrics interact in complex ways: TCP's congestion control reduces throughput when it detects packet loss, and high latency amplifies the impact of loss because retransmissions take longer.

Visual Beginner

The table below shows the five layers of the internet protocol stack.

Layer	Protocol examples	Responsibility
Application	HTTP, DNS, SMTP, FTP	Application-specific communication
Transport	TCP, UDP	End-to-end reliability and multiplexing
Network	IP, ICMP, OSPF	Addressing and routing across networks
Link	Ethernet, Wi-Fi, PPP	Node-to-node delivery on local network
Physical	Copper, fiber, radio	Bit transmission over physical medium

Worked example Beginner

When you type https://example.com into your browser and press Enter, the following events occur.

Step 1: DNS resolution. The browser needs the IP address of example.com. It checks its local cache first. If not found, it queries the operating system's DNS resolver, which contacts a recursive DNS server (usually provided by your ISP). That server queries the root DNS server for .com, which refers it to the .com TLD server. The TLD server refers to example.com's authoritative DNS server, which returns the IP address 93.184.216.34. This entire process typically takes 20-120 milliseconds.

Step 2: TCP connection. The browser initiates a TCP connection to 93.184.216.34 at port 443 (the standard port for HTTPS). TCP performs a three-way handshake: the browser sends a SYN packet, the server responds with SYN-ACK, and the browser sends ACK. The browser can begin sending data after one round-trip time, because the final ACK does not have to wait for a reply.

Step 3: TLS handshake. Because the URL specifies HTTPS, the browser and server negotiate an encrypted connection using TLS (Transport Layer Security). They exchange certificates, agree on encryption algorithms, and establish shared secret keys. This takes one to two additional round trips.

Step 4: HTTP request. The browser sends an HTTP GET request through the encrypted connection. The request contains the path "/", headers specifying the browser type and accepted formats, and any cookies.

Step 5: Server processing. The server receives the request, determines the appropriate response, and sends back an HTTP response with a status code (200 OK), headers describing the content, and the HTML of the webpage.

Step 6: Rendering. The browser receives the HTML, parses it, and requests additional resources (images, stylesheets, JavaScript) referenced in the HTML. Each resource may require a separate HTTP request, though modern browsers use connection reuse and multiplexing (HTTP/2) to fetch many resources over a single connection.

The total time from pressing Enter to seeing the page is typically 200 milliseconds to several seconds, depending on the distance to the server, the size of the page, and network conditions.

Latency breakdown. For a typical webpage load, the DNS lookup takes 20-120ms, the TCP handshake takes one RTT (10-200ms depending on distance), the TLS handshake adds one to two RTTs, and the initial HTTP request-response takes another RTT. For a user in New York connecting to a server in California (RTT ~80ms), the total setup time before any content is received is roughly 260-440ms (the DNS lookup plus three to four round trips). The actual content transfer depends on the page size and available bandwidth. Modern optimization techniques (connection reuse, HTTP/2 multiplexing, preconnect, prefetching) reduce this latency by eliminating redundant round trips and predicting what resources will be needed next.
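The setup-time estimate can be reproduced by summing the components listed above. The function below is a back-of-the-envelope sketch that assumes the steps happen strictly one after another, with no caching or connection reuse. The name setup_time_ms is made up for this example.

```python
def setup_time_ms(rtt_ms: float, dns_ms: float, tls_round_trips: int) -> float:
    """Time before the first response byte arrives, assuming sequential steps."""
    tcp_handshake = rtt_ms                  # SYN / SYN-ACK, then data may flow
    tls_handshake = tls_round_trips * rtt_ms
    http_exchange = rtt_ms                  # request out, first response byte back
    return dns_ms + tcp_handshake + tls_handshake + http_exchange


# New York to California, RTT about 80 ms, using the ranges quoted above.
best_case = setup_time_ms(rtt_ms=80, dns_ms=20, tls_round_trips=1)    # 260
worst_case = setup_time_ms(rtt_ms=80, dns_ms=120, tls_round_trips=2)  # 440
```

The round trips dominate the total: even with an instant DNS answer, three round trips at 80 ms cost 240 ms before any content arrives.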

The critical-path latency, the minimum time before the user sees anything, is determined by the number of sequential round trips required. Each protocol layer adds its own round trips: DNS resolution, TCP handshake, TLS handshake, and HTTP request. Reducing any of these improves perceived performance. DNS prefetching resolves domains before the user clicks. TCP Fast Open allows data in the SYN packet, eliminating one RTT. TLS 1.3 reduces the handshake from two RTTs to one. HTTP/2 server push sends resources before the client requests them.

Formal definition Intermediate+

Protocol stack. The TCP/IP model organizes network functionality into five layers. Each layer $L_{i}$ provides services to $L_{i+1}$ through a well-defined interface and uses services from $L_{i-1}$.

Encapsulation. At each layer, a protocol data unit (PDU) consists of a header (control information added by that layer) and a payload (the PDU from the layer above). An HTTP message becomes the payload of a TCP segment, which becomes the payload of an IP datagram, which becomes the payload of a link-layer frame.
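The nesting of PDUs can be illustrated with a toy model in which each header is just a short byte-string tag. Real headers are binary structures with many fields, so the tags and function names here are placeholders only.

```python
# Placeholder headers, applied top-down: transport, network, link.
HEADERS = [b"[TCP]", b"[IP]", b"[ETH]"]


def encapsulate(payload: bytes, header: bytes) -> bytes:
    """Build a PDU: this layer's header followed by the PDU from the layer above."""
    return header + payload


def decapsulate(pdu: bytes, header: bytes) -> bytes:
    """Strip this layer's header and return the payload for the layer above."""
    if not pdu.startswith(header):
        raise ValueError("unexpected header")
    return pdu[len(header):]


def send(http_message: bytes) -> bytes:
    """Walk down the stack, wrapping the message once per layer."""
    pdu = http_message
    for header in HEADERS:
        pdu = encapsulate(pdu, header)
    return pdu


def receive(frame: bytes) -> bytes:
    """Walk up the stack, unwrapping in the reverse order."""
    pdu = frame
    for header in reversed(HEADERS):
        pdu = decapsulate(pdu, header)
    return pdu
```

The outermost header belongs to the lowest layer: send(b"GET /") yields b"[ETH][IP][TCP]GET /", and receive() recovers the original message.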

TCP: reliable data transfer

TCP provides a reliable byte stream between two endpoints using the following mechanisms.

Sequence numbers. Each byte is numbered. The sequence number in a TCP header identifies the first byte in the segment's payload.

Acknowledgments. The receiver sends an ACK with the sequence number of the next expected byte. Cumulative acknowledgments acknowledge all bytes up to the specified number.

Retransmission. The sender starts a timer when it transmits a segment. If an ACK is not received before the timer expires, the segment is retransmitted. The retransmission timeout (RTO) is dynamically estimated from observed round-trip times.

Flow control. The receiver advertises a window size in each ACK, indicating how many bytes of buffer space it has available. The sender must not send more data than the advertised window.

Congestion control. TCP adjusts its sending rate in response to observed congestion. The congestion window (cwnd) limits the amount of unacknowledged data the sender can have in flight. The effective window is $\min(\mathit{rwnd}, \mathit{cwnd})$, where rwnd is the receiver's advertised window.
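The text above does not say how the retransmission timeout is estimated. One widely used scheme is the smoothed estimator specified in RFC 6298, sketched below. The gains 1/8 and 1/4 and the one-second floor come from that RFC, not from this text. The class name is invented, and the RFC's clock-granularity term is omitted for brevity.

```python
class RtoEstimator:
    """Smoothed RTT estimator in the style of RFC 6298 (simplified sketch)."""

    ALPHA = 1 / 8   # gain for the smoothed RTT
    BETA = 1 / 4    # gain for the RTT variation

    def __init__(self, min_rto: float = 1.0):
        self.srtt = None      # smoothed round-trip time, seconds
        self.rttvar = None    # smoothed mean deviation, seconds
        self.min_rto = min_rto

    def update(self, sample: float) -> float:
        """Feed one RTT measurement (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement initializes both estimates.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # The variation is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        return max(self.min_rto, self.srtt + 4 * self.rttvar)
```

Adding four times the variation keeps the timer comfortably above typical RTTs, so ordinary jitter does not trigger spurious retransmissions.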

IP addressing and subnetting

An IPv4 address is a 32-bit number. The address is divided into a network prefix and a host identifier. Classless Inter-Domain Routing (CIDR) uses a variable-length prefix specified as address/prefix_length. For example, 192.168.1.0/24 means the first 24 bits identify the network, and the last 8 bits identify hosts within that network, allowing 254 hosts ($2^{8} - 2$, subtracting the network and broadcast addresses).

A subnet mask is a 32-bit number where the network prefix bits are 1 and host bits are 0. For /24, the mask is 255.255.255.0. The network address is computed as $\text{address} \mathbin{\&} \text{mask}$ (bitwise AND).
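The mask and network-address arithmetic can be checked directly with integer operations. The helper names below are illustrative. The standard ipaddress module is used only to convert between dotted-decimal strings and integers.

```python
import ipaddress


def subnet_mask(prefix_length: int) -> str:
    """Dotted-decimal mask with `prefix_length` leading 1 bits."""
    mask = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))


def network_of(address: str, prefix_length: int) -> str:
    """Network address: the bitwise AND of the address and the mask."""
    mask = (0xFFFFFFFF << (32 - prefix_length)) & 0xFFFFFFFF
    network_int = int(ipaddress.IPv4Address(address)) & mask
    return str(ipaddress.IPv4Address(network_int))


def usable_hosts(prefix_length: int) -> int:
    """Host count minus the network and broadcast addresses (prefixes up to /30)."""
    return 2 ** (32 - prefix_length) - 2
```

For the /24 example, subnet_mask(24) gives 255.255.255.0, network_of("192.168.1.100", 24) gives 192.168.1.0, and usable_hosts(24) gives 254.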

Routing algorithms

Distance-vector routing. Each router maintains a vector of distances (costs) to all destinations and shares this vector with its neighbors. The Bellman-Ford equation computes the shortest path: $D(x, y) = \min_{v} \{ c(x, v) + D(v, y) \}$, where $c(x, v)$ is the cost of the link from $x$ to $v$ and $D(v, y)$ is the distance from neighbor $v$ to destination $y$.
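The Bellman-Ford update can be sketched as repeated synchronous rounds in which every router recomputes its vector from its neighbors' previous vectors. This is a centralized simulation of what real routers do asynchronously. The function name and the three-node topology in the usage note are made up for illustration.

```python
INF = float("inf")


def distance_vectors(links: dict) -> dict:
    """links maps (x, v) -> cost for each undirected link, listed once."""
    cost = {}
    nodes = set()
    for (a, b), c in links.items():
        cost[(a, b)] = c
        cost[(b, a)] = c
        nodes.update((a, b))

    # D[x][y] is router x's current estimate of its distance to y.
    D = {x: {y: (0 if x == y else INF) for y in nodes} for x in nodes}

    changed = True
    while changed:
        changed = False
        # Vectors as advertised at the start of this round.
        snapshot = {x: dict(D[x]) for x in nodes}
        for x in nodes:
            neighbors = [v for v in nodes if (x, v) in cost]
            for y in nodes:
                if x == y:
                    continue
                # Bellman-Ford: D(x, y) = min over neighbors v of c(x, v) + D(v, y)
                best = min((cost[(x, v)] + snapshot[v][y] for v in neighbors), default=INF)
                if best < D[x][y]:
                    D[x][y] = best
                    changed = True
    return D
```

With links A-B of cost 1, B-C of cost 2, and A-C of cost 5, router A first learns the direct cost 5 to C and then, one round later, the cheaper path through B with cost 3.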

Link-state routing. Each router maintains a complete map of the network topology. Dijkstra's algorithm computes shortest paths from this map. OSPF (Open Shortest Path First) is the most common link-state protocol.
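Given the full topology map, each router can run Dijkstra's algorithm locally. The sketch below uses a binary heap from the standard library. The adjacency-dictionary format is an assumption chosen for this example.

```python
import heapq


def dijkstra(graph: dict, source: str) -> dict:
    """Shortest-path costs from `source`; graph[u][v] is the cost of link u -> v."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale entry: a shorter path was already found
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

For a topology with links A-B of cost 1, B-C of cost 2, and A-C of cost 5, the result from A is 0 to itself, 1 to B, and 3 to C. A routing protocol would also record the first hop of each path to fill the forwarding table.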

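Path-vector routing. BGP uses path vectors that record the sequence of autonomous systems (ASes) traversed to reach a destination. Each AS prepends its own AS number to a path before advertising it to its neighbors, and a router rejects any advertised path that already contains its own AS number, which prevents routing loops.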

BGP's path selection is more complex than simple shortest-path. BGP evaluates multiple attributes in order: weight (Cisco-specific, local to the router), local preference (within an AS), AS path length, origin type, multi-exit discriminator (MED), eBGP over iBGP preference, IGP metric to next hop, and router ID. This ordered attribute evaluation gives network operators fine-grained control over traffic engineering: they can prefer certain paths for cost, latency, or business reasons without modifying the underlying topology.

Ethernet and the link layer

Ethernet, the dominant wired link-layer technology, uses 48-bit MAC addresses to identify network interfaces. Switches maintain a MAC address table mapping each port to the MAC addresses of devices connected to it. When a frame arrives, the switch looks up the destination MAC address and forwards the frame only to the correct port (known unicast), to all other ports (broadcast frames, and unicast frames whose destination is not yet in the table), or to a subset (multicast).
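The forwarding decision can be sketched as a small class. The sketch assumes the usual self-learning behavior, in which the switch fills its table from the source addresses of frames it receives. Multicast and table aging are left out, and the class and method names are illustrative.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}               # MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        self.mac_table[src_mac] = in_port             # learn where the sender lives
        if dst_mac == BROADCAST or dst_mac not in self.mac_table:
            # Broadcast or unknown destination: flood to every port except the ingress.
            return [p for p in self.ports if p != in_port]
        out_port = self.mac_table[dst_mac]
        # A frame whose destination is on the ingress port needs no forwarding.
        return [] if out_port == in_port else [out_port]
```

The first frame toward an unknown address is flooded. Once the destination has sent a frame of its own, later traffic to it goes out of a single port.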

The Address Resolution Protocol (ARP) bridges the gap between network-layer IP addresses and link-layer MAC addresses. When a host needs to send a packet to an IP address on the local network, it broadcasts an ARP request: "Who has 192.168.1.1? Tell 192.168.1.100." The host with that IP address responds with its MAC address. The sender caches this mapping for future use. ARP operates entirely within the local network and is not routable.

Spanning Tree Protocol (STP) prevents loops in Ethernet networks by computing a tree topology and blocking redundant links. Without STP, broadcast frames would circulate endlessly in a looped topology (a broadcast storm), consuming all available bandwidth. STP elects a root bridge and computes least-cost paths from all switches to the root, blocking ports that would create loops. Rapid Spanning Tree Protocol (RSTP) reduces convergence time from 30-50 seconds to a few seconds.

DNS in depth

DNS is a hierarchical, distributed database. The root zone delegates authority for top-level domains (.com, .org, .net, country codes) to TLD servers. TLD servers delegate authority for second-level domains (example.com) to authoritative servers operated by domain registrants. This hierarchy enables the system to scale to billions of records without any single server maintaining the complete database.

DNS caching at multiple levels (browser, OS, recursive resolver) reduces load and latency. Time-to-live (TTL) values control how long cached records remain valid. Short TTLs (seconds to minutes) enable rapid failover but increase query volume. Long TTLs (hours to days) reduce load but slow propagation of changes. DNS propagation delay, the time for a DNS change to reach all caches worldwide, is bounded by the TTL that was set on the old record.

DNS Security Extensions (DNSSEC) add cryptographic signatures to DNS records, allowing resolvers to verify that responses have not been tampered with. DNSSEC does not encrypt queries or responses; it provides integrity and origin authentication. DNS over HTTPS (DoH) and DNS over TLS (DoT) add encryption to prevent eavesdropping on DNS queries, addressing privacy concerns.

Key result: TCP congestion control and AIMD Intermediate+

Theorem. TCP's Additive Increase, Multiplicative Decrease (AIMD) algorithm converges to a fair and efficient allocation of bandwidth among competing flows sharing a bottleneck link.

Proof sketch. Consider two TCP flows sharing a single bottleneck link with capacity $C$. Let $w_{1}$ and $w_{2}$ be their congestion windows. The total load is $w_{1} + w_{2}$.

Additive increase. Each flow increases its window by 1 per round-trip time, so $(w_{1}, w_{2})$ moves to $(w_{1} + 1, w_{2} + 1)$. The point moves parallel to the $45^{\circ}$ line $w_{1} = w_{2}$ (equal increase) toward the capacity line $w_{1} + w_{2} = C$.

Multiplicative decrease. When congestion is detected (a packet loss), each flow halves its window: $(w_{1}, w_{2}) \to (w_{1}/2, w_{2}/2)$. This moves the point toward the origin along the line through the origin and the current point, preserving the ratio $w_{1}/w_{2}$.

Convergence to fairness. Measure unfairness by the gap $|w_{1} - w_{2}|$. AI leaves the gap unchanged (both windows grow by the same amount) while moving the ratio $w_{1}/w_{2}$ closer to 1. MD leaves the ratio unchanged (both are halved) but halves the gap. After a loss at the capacity line, both flows halve, and the total drops below $C$, triggering additive increase until the next loss. Each increase/decrease cycle therefore halves the gap, so it shrinks geometrically toward zero, driving the system toward $w_{1} = w_{2} = C/2$, the fair allocation.

Chiu and Jain (1989) showed that AIMD is the only combination of linear increase/decrease strategies that converges to both efficiency (full utilization of the bottleneck) and fairness (equal bandwidth sharing). $\square$
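The convergence argument can be checked numerically with a small simulation of the two-flow model. It assumes both flows see every loss at the same time and react in the same round, as in the proof sketch. The function name and starting windows are arbitrary.

```python
def aimd(w1: float, w2: float, capacity: float, rounds: int) -> list:
    """Simulate two synchronized AIMD flows; return the (w1, w2) history."""
    history = [(w1, w2)]
    for _ in range(rounds):
        if w1 + w2 > capacity:
            # Congestion: multiplicative decrease, both windows halve.
            w1, w2 = w1 / 2, w2 / 2
        else:
            # No congestion: additive increase, both windows grow by 1.
            w1, w2 = w1 + 1, w2 + 1
        history.append((w1, w2))
    return history


# Start far from fair: one flow has almost the whole link.
trace = aimd(w1=1.0, w2=40.0, capacity=50.0, rounds=500)
final_w1, final_w2 = trace[-1]
```

The gap between the two windows stays constant during increase rounds and halves at every decrease, so after a few dozen loss events the windows are equal to within floating-point noise.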

TCP slow-start begins each connection (or restarts after idle) with a congestion window of 1-10 segments and doubles the window each round-trip time until a threshold (ssthresh) is reached or loss occurs. This exponential growth quickly discovers available bandwidth. When loss occurs, ssthresh is set to half the current window. TCP Tahoe then resets the window to 1 and re-enters slow-start; TCP Reno, when the loss is detected through duplicate ACKs, sets the window to ssthresh and continues in congestion avoidance. The repeated cycle of growth (exponential in slow-start, linear in congestion avoidance) followed by a loss-triggered reduction creates the characteristic sawtooth pattern of TCP congestion windows.

TCP CUBIC, the default congestion control algorithm in Linux since 2006, uses a cubic function $w(t) = C (t - K)^{3} + W_{\max}$ for window growth, where $t$ is the time since the last loss event, $W_{\max}$ is the window size at that loss event, $C$ is a scaling constant (unrelated to the link capacity $C$ above), and $K$ is the time the function takes to return to $W_{\max}$. Because growth depends on elapsed time and not on the number of round trips, the cubic function is independent of RTT, providing better fairness among flows with different round-trip times than the RTT-dependent linear growth of standard TCP. CUBIC is particularly effective on high-bandwidth, high-latency paths (long-fat networks, or LFNs) where traditional TCP takes many RTTs to fully utilize available bandwidth.
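The shape of the cubic growth function can be explored numerically. The sketch below assumes the constants suggested in the CUBIC RFC (scaling constant 0.4 and a multiplicative decrease factor of 0.7) and the RFC's expression for $K$. None of these values appear in the text above, and the function names are illustrative.

```python
def cubic_k(w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Time (seconds) for the window to climb back to w_max after a loss."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t: float, w_max: float, c: float = 0.4, beta: float = 0.7) -> float:
    """Window size t seconds after the last loss: C * (t - K)^3 + W_max."""
    k = cubic_k(w_max, c, beta)
    return c * (t - k) ** 3 + w_max
```

With w_max = 100, the window starts at 70 right after the loss (the 0.7 reduction), flattens as it approaches 100 near t = K (about 4.2 seconds), and then accelerates again to probe for more bandwidth.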

BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Google in 2016, takes a fundamentally different approach. Instead of using packet loss as the congestion signal, BBR explicitly models two parameters: the bottleneck bandwidth (maximum delivery rate observed) and the minimum RTT (propagation delay). BBR sends at exactly the bottleneck bandwidth paced over the minimum RTT, avoiding the buffer filling that causes loss in traditional TCP. BBR can achieve significantly higher throughput on paths with large buffers or random loss, but it raises fairness concerns when competing with loss-based TCP flows that back off when buffers fill.

Exercises Intermediate+

Exercise 4 (medium, short answer).

Explain how TCP's three-way handshake works and why it is necessary (what problem does it solve?).

Hint

Consider what could go wrong if a connection were established without synchronization.

Answer

The three-way handshake is: (1) client sends SYN with initial sequence number $x$, (2) server responds with SYN-ACK acknowledging $x$ and providing its own initial sequence number $y$, (3) client sends ACK acknowledging $y$. This ensures both sides know each other's initial sequence numbers, which is necessary for reliable data transfer (sequence numbers are used to order and acknowledge data). The handshake also prevents old, delayed connection requests from being mistaken for new ones: if a delayed SYN arrives at the server, the server responds with SYN-ACK, but the client (which did not initiate the connection) replies with a reset (RST) instead of the final ACK, and the server discards the half-open connection.

Exercise 5 (medium, short answer).

Compare distance-vector and link-state routing algorithms. What are the trade-offs?

Hint

Consider what information each router maintains and how it computes routes.

Answer

Distance-vector (DV) algorithms (RIP, IGRP) are simple: each router maintains only a distance table and exchanges it with direct neighbors. They converge slowly and can suffer from count-to-infinity problems (where a routing loop causes distances to increment indefinitely). Link-state (LS) algorithms (OSPF, IS-IS) are more complex: each router maintains a complete network topology map and runs Dijkstra's algorithm locally. They converge faster and avoid count-to-infinity but require more memory (full topology) and bandwidth (flooding link-state advertisements to all routers). DV scales better in very large networks because each router's state is proportional to the number of destinations, not the number of links. LS provides faster convergence and more accurate routing.

Exercise 6 (hard, short answer).

Explain why QUIC chose to build on UDP rather than modifying TCP. What advantages does this approach provide?

Hint

Consider the ecosystem of middleboxes (firewalls, NAT devices) that exist between clients and servers.

Answer

TCP implementations are embedded in operating system kernels, making changes slow to deploy (requiring OS updates on billions of devices). More critically, middleboxes (firewalls, NAT devices, traffic shapers) have evolved to understand TCP headers and reject packets with unknown TCP options or behaviors. Deploying new TCP features requires bypassing these middleboxes, which is often impossible. QUIC builds on UDP because UDP is simple enough that middleboxes pass it through without interference. QUIC's transport logic runs in user space (in the browser or application), enabling rapid updates without OS kernel changes. This also allows QUIC to encrypt transport-layer headers (using TLS 1.3), preventing middlebox inspection and modification. The result is a protocol that can evolve quickly while TCP remains frozen by middlebox compatibility requirements.

Domain evidence Master

Internet scale and growth. As of 2024, the internet connects over 5.3 billion users and 20 billion IoT devices. Global internet traffic exceeds 4 zettabytes per year and doubles approximately every three years. The average round-trip time between continents is 100-200ms over fiber optic cables, limited by the speed of light. Submarine cables carry over 99% of intercontinental data traffic, with total capacity exceeding 1 petabit per second.

BGP incidents. BGP route leaks and hijacks cause significant disruptions. In 2008, Pakistan Telecom accidentally announced more specific routes for YouTube's IP ranges, redirecting global YouTube traffic to Pakistan and making the site inaccessible worldwide for several hours. In 2018, a small Nigerian ISP announced routes for Google's IP space, briefly redirecting Google traffic. These incidents demonstrate the fragility of BGP's trust model: any network can announce any prefix, and the global routing table accepts it.

The DNS infrastructure. The DNS root zone is served by 13 logical root server addresses (labeled A through M), operated by 12 organizations. Despite the "13 server" limit, each logical address is backed by hundreds of physical servers distributed globally using anycast routing. The root zone contains approximately 1,500 top-level domains. DNS handles over 2 trillion queries per day. The 2016 Dyn DNS DDoS attack, which targeted a major DNS provider, rendered Twitter, Reddit, Netflix, and many other major websites inaccessible for several hours, demonstrating the critical dependency on DNS infrastructure.

TCP performance in practice. TCP's congestion control has been remarkably successful at preventing internet collapse. The slow-start algorithm initially sends a small window of data and doubles the window each round trip until loss occurs, then switches to additive increase. Modern TCP variants (CUBIC, BBR) improve performance over high-bandwidth, high-latency paths. CUBIC, the default in Linux, uses a cubic function for window growth that is independent of RTT, providing fairness across flows with different latencies. BBR (Bottleneck Bandwidth and Round-trip propagation time), developed by Google, models the network path and adjusts the sending rate to match the bottleneck bandwidth, avoiding the packet loss that traditional congestion control relies on as a signal.

Advanced results Master

Software-defined networking (SDN)

Traditional networks configure each router independently, using distributed protocols to converge on consistent forwarding tables. SDN separates the control plane (deciding how to forward packets) from the data plane (actually forwarding them). A centralized controller computes forwarding rules and pushes them to switches via a southbound protocol such as OpenFlow.

SDN enables network programmability: forwarding decisions can be based on any combination of packet header fields, not just destination address. This enables traffic engineering, network virtualization, and dynamic policy enforcement that would be difficult or impossible with distributed routing protocols.

The SDN architecture has three layers. The infrastructure layer consists of forwarding devices (switches) that process packets according to flow tables. The control layer runs the network operating system (the SDN controller) that maintains a global view of the network and computes forwarding rules. The application layer runs network applications (firewalls, load balancers, monitoring) that interact with the controller through northbound APIs. The separation of control and data planes allows each layer to evolve independently: switches can be simple and fast (optimized for packet forwarding), while the controller can be sophisticated (running complex algorithms on a general-purpose server).

Content delivery networks (CDNs)

CDNs replicate content at geographically distributed edge servers, reducing latency for end users. When a user requests content from a CDN-backed website, DNS redirects the request to the nearest edge server. This reduces round-trip time and avoids bottlenecks at the origin server.

CDNs like Cloudflare, Akamai, and AWS CloudFront serve a significant fraction of all internet traffic. The technical challenges include cache management (determining what to cache and when to invalidate), request routing (directing users to the optimal edge server), and content consistency (ensuring cached copies are up-to-date).

Akamai, founded in 1998 by MIT mathematician Tom Leighton and graduate student Danny Lewin, pioneered the CDN concept. The key insight was that web performance is limited by latency (the speed of light through fiber optic cables, approximately 200 km/ms), not bandwidth. Replicating content close to users reduces the physical distance data must travel, cutting latency regardless of available bandwidth. Modern CDNs also provide security services (DDoS protection, web application firewalls) and compute capabilities (edge functions that execute code at the CDN edge).

Multipath TCP and QUIC

Multipath TCP (MPTCP) allows a single TCP connection to use multiple network paths simultaneously. A smartphone can use both Wi-Fi and cellular data for a single download, increasing throughput and providing seamless handover when one connection becomes unavailable.

QUIC, originally developed at Google and standardized by the IETF as RFC 9000, is the transport protocol underneath HTTP/3. It combines the roles of TCP and TLS, along with the stream multiplexing that HTTP/2 performs above TCP, in a single protocol built on UDP. QUIC eliminates transport-level head-of-line blocking across streams (in TCP, a single lost packet delays delivery of all data behind it, even data belonging to unrelated requests), reduces connection establishment latency (1-RTT handshakes, or 0-RTT when resuming a previous connection), and supports connection migration (continuing a connection when the client's IP address changes).
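Head-of-line blocking can be illustrated with a simplified model. The sketch assumes each packet carries data for exactly one stream and shows what a receiver can hand to the application before the lost packet is retransmitted. The function names and the packet tuples are hypothetical, and real QUIC packets can carry frames from several streams.

```python
def tcp_delivered(packets, lost):
    """One ordered byte stream: nothing after the first gap can be delivered."""
    delivered = []
    for seq, _stream, data in packets:
        if seq in lost:
            break
        delivered.append(data)
    return delivered


def quic_delivered(packets, lost):
    """Independent streams: a gap stalls only the stream it belongs to."""
    stalled = set()
    delivered = []
    for seq, stream, data in packets:
        if seq in lost:
            stalled.add(stream)
        elif stream not in stalled:
            delivered.append(data)
    return delivered


# (sequence number, stream name, payload); packet 2 is lost in transit.
packets = [
    (1, "css", "c1"),
    (2, "img", "i1"),
    (3, "css", "c2"),
    (4, "img", "i2"),
    (5, "js", "j1"),
]
lost = {2}

print(tcp_delivered(packets, lost))   # ['c1']
print(quic_delivered(packets, lost))  # ['c1', 'c2', 'j1']
```

In the single-stream model the loss of one image packet holds back the stylesheet and the script; with per-stream ordering only the image stream waits.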

QUIC's design addresses decades of accumulated TCP limitations. TCP was designed in the 1970s when networks were simple and slow. Middleboxes (firewalls, NAT devices) have evolved to understand TCP headers, which makes changes to TCP difficult to deploy: many middleboxes drop or rewrite packets carrying options they do not recognize. By implementing the transport layer in user space on top of UDP, QUIC sidesteps most middlebox interference and enables rapid iteration. QUIC packets are encrypted by default (using TLS 1.3), which prevents middleboxes from inspecting or modifying most transport-layer header fields.

Network function virtualization (NFV)

NFV replaces dedicated hardware appliances (firewalls, load balancers, WAN accelerators) with software running on commodity servers. Network functions are deployed as virtual machines or containers that can be instantiated, scaled, and migrated dynamically.

NFV reduces capital expenditure (commodity servers are cheaper than specialized hardware), enables rapid service deployment (software can be updated in minutes), and improves resilience (failed instances can be replaced automatically).

The combination of SDN and NFV enables network slicing in 5G networks: multiple virtual networks running on the same physical infrastructure, each with different performance characteristics. A network slice for autonomous vehicles might guarantee ultra-low latency, while a slice for video streaming provides high bandwidth with higher latency tolerance. This flexibility transforms the network from a one-size-fits-all service into a customizable platform.

The Internet of Things (IoT) and edge computing

IoT devices (sensors, actuators, smart appliances) connect to the internet over constrained networks with limited bandwidth, power, and processing capability. Protocols like MQTT (Message Queuing Telemetry Transport) and CoAP (Constrained Application Protocol) are designed for these environments, using lightweight messaging and minimal overhead.

Edge computing moves processing from centralized data centers to nodes close to the data source. This reduces latency for real-time applications (autonomous vehicles, industrial control), decreases bandwidth requirements (processing data locally instead of sending it all to the cloud), and improves privacy (sensitive data need not leave the local network).

The scale of IoT creates unique network challenges. Billions of devices generate massive volumes of data, much of it time-sensitive. Traditional cloud architectures, where all data is sent to centralized data centers for processing, cannot meet the latency requirements of real-time IoT applications. Edge computing addresses this by processing data at or near the source, sending only aggregated results or anomalies to the cloud. Fog computing extends this concept by creating a hierarchy of processing nodes from the edge to the cloud.
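The idea of sending only aggregated results or anomalies can be sketched as a small filter running on an edge node. The function name, the summary fields, and the acceptable range are hypothetical choices for illustration, not part of any IoT standard.

```python
from statistics import mean


def summarize_at_edge(readings, low, high):
    """Reduce a batch of sensor readings to a summary plus out-of-range values.

    Only the returned summary and anomalies would be forwarded to the cloud;
    the raw readings stay on the edge node.
    """
    if not readings:
        return None, []
    anomalies = [r for r in readings if not (low <= r <= high)]
    summary = {
        "count": len(readings),
        "mean": mean(readings),
        "min": min(readings),
        "max": max(readings),
    }
    return summary, anomalies


# Hypothetical temperature batch with one spike.
batch = [20.0, 21.0, 19.0, 80.0, 20.0]
summary, anomalies = summarize_at_edge(batch, low=0.0, high=50.0)
print(summary["count"], summary["mean"], anomalies)  # 5 32.0 [80.0]
```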

Wireless networks and 5G

Wireless networking presents unique challenges due to the shared medium (air) and signal propagation effects (fading, interference, multipath). Wi-Fi (IEEE 802.11) uses carrier sense multiple access with collision avoidance (CSMA/CA): devices listen before transmitting and use random backoff to reduce collisions. Cellular networks (4G LTE, 5G) use scheduled access: a base station allocates time-frequency blocks to each device, providing more predictable performance.
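The random backoff in CSMA/CA can be sketched as follows, assuming binary exponential growth of the contention window. The values `CW_MIN = 15` and `CW_MAX = 1023` are assumptions typical of some 802.11 physical layers, and the function names are hypothetical.

```python
import random

CW_MIN = 15    # assumed initial contention window, in slots
CW_MAX = 1023  # assumed upper bound on the contention window


def contention_window(retries: int) -> int:
    """Window roughly doubles after each failed attempt, capped at CW_MAX."""
    return min((CW_MIN + 1) * 2 ** retries - 1, CW_MAX)


def backoff_slots(retries: int, rng=random) -> int:
    """Pick a uniformly random number of idle slots to wait before sending."""
    return rng.randint(0, contention_window(retries))


print([contention_window(r) for r in range(8)])
# [15, 31, 63, 127, 255, 511, 1023, 1023]
```

Because two stations that collide draw their waits independently, a wider window after each failure makes it less likely that they pick the same slot again.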

5G introduces three categories of service: enhanced mobile broadband (eMBB) for high data rates, ultra-reliable low-latency communications (URLLC) for critical applications like autonomous driving and remote surgery, and massive machine-type communications (mMTC) for IoT deployments with millions of devices. These diverse requirements are served by a single infrastructure through network slicing, which creates isolated virtual networks on shared physical resources.

Peer-to-peer networks

Peer-to-peer (P2P) networks distribute functionality across all participating nodes, eliminating the need for centralized servers. Each node (peer) acts as both client and server. BitTorrent, the most widely used P2P protocol, splits files into small pieces and allows peers to download different pieces from multiple sources simultaneously, maximizing download speed.

The Chord DHT (distributed hash table), proposed by Stoica et al. in 2001, provides efficient key lookup in a P2P network. Keys and node identifiers are mapped to the same identifier space (a circle of $2^{160}$ points). Each node maintains a finger table with $O(\log N)$ entries, enabling lookups in $O(\log N)$ hops, where $N$ is the number of nodes. This logarithmic scalability makes DHTs suitable for very large networks. BitTorrent uses a similar approach (Kademlia) for its distributed tracker, allowing peers to find each other without a central tracker server.
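A finger-table lookup can be sketched on a small ring. The code below assumes 6-bit identifiers (64 points) instead of 160 bits so the tables stay readable, builds all tables from a global node list rather than through the join protocol, and uses hypothetical names (`ChordRing`, `in_interval`, `lookup`). The node identifiers are arbitrary example values.

```python
M = 6            # identifier bits for this demo (the text's ring uses 160)
RING = 2 ** M


def in_interval(x, a, b, inclusive_right=False):
    """True if x lies clockwise in (a, b), or in (a, b] when inclusive_right."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    # Interval wraps past zero; a == b denotes the whole ring.
    return x > a or x < b or (inclusive_right and x == b)


class ChordRing:
    def __init__(self, node_ids):
        self.nodes = sorted(node_ids)
        # finger[i] of node n is the first node at or after n + 2^i on the ring.
        self.fingers = {
            n: [self.successor((n + 2 ** i) % RING) for i in range(M)]
            for n in self.nodes
        }

    def successor(self, ident):
        for n in self.nodes:
            if n >= ident:
                return n
        return self.nodes[0]  # wrap around the ring

    def lookup(self, start, key):
        """Return (node responsible for key, list of nodes visited)."""
        node, path = start, [start]
        while not in_interval(key, node, self.fingers[node][0], True):
            nxt = self.fingers[node][0]  # fall back to the immediate successor
            for f in reversed(self.fingers[node]):
                if in_interval(f, node, key):
                    nxt = f              # farthest finger that precedes the key
                    break
            node = nxt
            path.append(node)
        owner = self.fingers[node][0]
        if owner != path[-1]:
            path.append(owner)
        return owner, path


ring = ChordRing([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])
print(ring.fingers[8])      # [14, 14, 14, 21, 32, 42]
print(ring.lookup(8, 54))   # (56, [8, 42, 51, 56])
```

Each hop jumps to the farthest known node that still precedes the key, which at least halves the remaining distance around the ring; this is where the logarithmic hop count comes from.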

Connections Master

Connections to operating systems

Most network protocols are implemented in the operating system kernel. The TCP/IP stack typically runs in kernel space, providing socket APIs to user-space applications. Network device drivers manage network interface cards (NICs). The kernel handles packet buffering, segmentation, and reassembly. The performance of network-intensive applications depends heavily on how efficiently the OS manages network buffers, interrupt handling, and system call overhead.

The Berkeley sockets API, introduced in BSD Unix in 1983, became the standard interface for network programming. The socket(), bind(), listen(), accept(), connect(), send(), and recv() system calls provide a uniform interface to network communication regardless of the underlying protocol. This abstraction was so successful that virtually every operating system adopted it. Modern high-performance networking uses techniques like io_uring (Linux), which reduces the overhead of system calls by allowing applications to submit I/O requests through shared ring buffers, and kernel bypass (DPDK), which lets user-space applications access NIC hardware directly.
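The call sequence can be seen in a minimal loopback exchange, written here with Python's `socket` module, which wraps the same system calls. This is a sketch: the uppercase reply and the helper name `echo_once` are illustrative choices, and a production server would loop on `recv()` and handle errors.

```python
import socket
import threading


def echo_once(server_sock):
    conn, _addr = server_sock.accept()   # accept(): wait for one client
    with conn:
        data = conn.recv(1024)           # recv(): read up to 1024 bytes
        conn.sendall(data.upper())       # sendall() calls send() until done


server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # socket(): TCP/IPv4
server.bind(("127.0.0.1", 0))    # bind(): port 0 lets the OS pick a free port
server.listen(1)                 # listen(): start queuing incoming connections
port = server.getsockname()[1]

worker = threading.Thread(target=echo_once, args=(server,))
worker.start()

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))   # connect(): TCP handshake with the server
client.sendall(b"hello")
reply = client.recv(1024)
client.close()

worker.join()
server.close()
print(reply)
```

The server side uses socket, bind, listen, and accept, while the client side uses socket and connect; both then exchange data with the same send and receive calls.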

Connections to security

Network security is a vast field that includes encryption (TLS, IPsec), authentication (certificates, Kerberos), firewalls (packet filtering, deep packet inspection), intrusion detection, and denial-of-service mitigation. The original internet protocols were designed for a trusted environment and lacked security. Security was bolted on later, leading to the patchwork of solutions we use today.

IPsec (IP Security), first standardized in 1995, provides network-layer encryption and authentication. Unlike TLS, which runs above the transport layer and typically protects individual TCP connections, IPsec can protect all IP traffic (including UDP and ICMP). IPsec is widely used for VPNs, creating secure tunnels between corporate networks over the public internet. Transport mode encrypts only the payload, while tunnel mode encrypts the entire original packet and adds a new IP header, hiding the original source and destination addresses.

Connections to distributed systems

Networks are the foundation of distributed systems. The CAP theorem, consensus algorithms (Paxos, Raft), and distributed hash tables all assume an underlying network. Network properties like latency, bandwidth, reliability, and partition behavior directly affect the design and correctness of distributed algorithms.

The fallacies of distributed computing, attributed to Peter Deutsch (1994), list assumptions that programmers new to distributed systems often make: the network is reliable, latency is zero, bandwidth is infinite, the network is secure, topology does not change, there is one administrator, transport cost is zero, and the network is homogeneous. Every one of these assumptions is false in practice, and failing to account for them leads to fragile systems.

Connections to economics

Network infrastructure involves significant economic considerations. Internet exchange points (IXPs), where networks connect to exchange traffic, reduce costs by allowing direct peering instead of routing through transit providers. The economics of peering and transit shape internet topology: large content providers (Google, Netflix, Facebook) build their own fiber networks and peer directly with ISPs to reduce latency and avoid transit costs. Net neutrality debates center on whether ISPs should be allowed to charge different rates for different types of traffic, creating economic incentives that shape network architecture.

Historical and philosophical context Master

From ARPANET to the internet

The ARPANET, created by the U.S. Department of Defense's Advanced Research Projects Agency (ARPA, later renamed DARPA) in 1969, was the first operational packet-switched network. It initially connected four sites (UCLA, the Stanford Research Institute, UC Santa Barbara, and the University of Utah) and demonstrated that packet switching was a viable alternative to circuit switching for data communication.

Packet switching, proposed independently by Paul Baran (RAND Corporation, 1964) and Donald Davies (National Physical Laboratory, UK, 1965), was a radical departure from the circuit-switched telephone network. In circuit switching, a dedicated path is established between sender and receiver for the entire communication. In packet switching, data is divided into small packets that are routed independently through the network, sharing links with packets from other communications. Baran's motivation was survivability: a distributed packet-switched network could route around damage, making military communications resilient to nuclear attack. Davies's motivation was efficiency: sharing links among many communications was more economical than dedicating circuits.

The development of TCP/IP by Vint Cerf and Bob Kahn in 1974 was the key innovation that enabled the modern internet. Their design allowed heterogeneous networks (using different link-layer technologies) to interconnect through a common network-layer protocol (IP). This "internetting" concept, connecting networks of networks, gave the internet its name.

The transition from ARPANET's NCP protocol to TCP/IP on January 1, 1983, known as "flag day," was one of the largest coordinated technology migrations in history. Every host on the ARPANET had to switch simultaneously, because NCP and TCP/IP were incompatible. This successful transition demonstrated the viability of the TCP/IP protocol suite and paved the way for the broader internet.

The World Wide Web, invented by Tim Berners-Lee at CERN in 1989, was an application layer innovation that transformed the internet from a research tool into a global communication platform. The web's key inventions, HTML (HyperText Markup Language), HTTP (HyperText Transfer Protocol), and URLs (Uniform Resource Locators), created a user-friendly interface to the internet's underlying infrastructure.

The end-to-end principle

The end-to-end principle, articulated by Saltzer, Reed, and Clark in 1984, argues that application-level functions should not be built into the network itself. The network should provide a simple, general-purpose transport service, and applications should implement their own reliability, security, and other features at the endpoints.

This principle guided the design of the internet: the network (IP) provides best-effort delivery, and endpoints (TCP) add reliability. This separation of concerns allowed the internet to support a vast range of applications that its designers never imagined, from video streaming to cryptocurrency.

The principle has been challenged by modern network requirements. NAT devices violate end-to-end by modifying addresses in transit. Firewalls inspect and filter traffic based on application-layer content. CDNs cache content at intermediate points rather than retrieving it from the origin. Deep packet inspection is used for traffic management, security, and surveillance. These middleboxes improve performance and security but reduce the generality of the network, creating complications for new protocols and applications.

Net neutrality and the politics of networks

Network architecture has political implications. Net neutrality, the principle that internet service providers should treat all traffic equally, is a debate about whether the network should remain a general-purpose infrastructure or evolve toward a differentiated service model.

The end-to-end principle supports net neutrality: if the network is simple and general, innovation happens at the edges. If the network makes application-specific decisions, it favors established services over new ones. This debate continues to shape internet policy worldwide.

The social impact of connectivity

The internet has transformed nearly every aspect of human society: communication (email, messaging, social media), commerce (e-commerce, digital payments), education (online courses, digital libraries), entertainment (streaming, gaming), government (e-government services, digital citizenship), and healthcare (telemedicine, health informatics). The COVID-19 pandemic demonstrated the critical importance of internet infrastructure: billions of people relied on it for work, education, and social connection during lockdowns.

The digital divide, the gap between those with and without internet access, remains a significant challenge. By ITU estimates from 2022, approximately 2.7 billion people were still offline, primarily in developing countries. Even in developed countries, rural areas often lack broadband access. The quality of connectivity (bandwidth, latency, reliability) varies enormously, creating a second-level digital divide where connected individuals in developing regions have access that is orders of magnitude worse than in developed regions.

Bibliography Master

Primary sources

Cerf, V.G. and Kahn, R.E. (1974). "A protocol for packet network intercommunication." IEEE Transactions on Communications, 22(5), 637-648.
Clark, D.D. (1988). "The design philosophy of the DARPA Internet protocols." ACM SIGCOMM Computer Communication Review, 18(4), 106-114.
Saltzer, J.H., Reed, D.P., and Clark, D.D. (1984). "End-to-end arguments in system design." ACM Transactions on Computer Systems, 2(4), 277-288.
Chiu, D.M. and Jain, R. (1989). "Analysis of the increase and decrease algorithms for congestion avoidance in computer networks." Computer Networks and ISDN Systems, 17(1), 1-14.
Jacobson, V. (1988). "Congestion avoidance and control." ACM SIGCOMM, 314-329.
Stoica, I. et al. (2001). "Chord: A scalable peer-to-peer lookup service for internet applications." ACM SIGCOMM, 149-160.
Iyengar, J. and Thomson, M. (2021). "QUIC: A UDP-based multiplexed and secure transport." RFC 9000.

Secondary sources

Kurose, J.F. and Ross, K.W. (2016). Computer Networking: A Top-Down Approach (7th ed.). Pearson.
Tanenbaum, A.S. and Wetherall, D.J. (2011). Computer Networks (5th ed.). Pearson.
Peterson, L.L. and Davie, B. (2011). Computer Networks: A Systems Approach (5th ed.). Morgan Kaufmann.
Varghese, G. (2005). Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices. Morgan Kaufmann.
Perlman, R. (2000). Interconnections: Bridges, Routers, Switches, and Internetworking Protocols (2nd ed.). Addison-Wesley.
RFC 791 (Internet Protocol), RFC 793 (Transmission Control Protocol), RFC 2616 (HTTP/1.1), RFC 7540 (HTTP/2).

Prerequisites

25.05.01

Tier anchors

beginner: Kurose and Ross, Computer Networking: A Top-Down Approach (7e), Ch. 1-3; Peterson and Davie, Computer Networks: A Systems Approach (5e), Ch. 1-2
intermediate: Kurose and Ross, Computer Networking: A Top-Down Approach (7e), Ch. 4-6; Tanenbaum and Wetherall, Computer Networks (5e), Ch. 3-5
master: Clark, The Design Philosophy of the DARPA Internet Protocols (1988); RFC 791, 793, 2616; Varghese, Network Algorithmics

References

computer-science · Ch. 0
Kurose, J.F. and Ross, K.W., Computer Networking: A Top-Down Approach (7e, Pearson, 2016) · Ch. 1-6 · source being verified
Tanenbaum, A.S. and Wetherall, D.J., Computer Networks (5e, Pearson, 2011) · Ch. 1-5 · source being verified
Clark, D.D., The Design Philosophy of the DARPA Internet Protocols, ACM SIGCOMM 1988 · pp. 106-114 · source being verified

Estimated time

beginner: 30m
intermediate: 55m
master: 80m