
Internet Infrastructure Decoded

Core Idea: The internet is not a cloud. It is not wireless. It is not immaterial. It is approximately 1.5 million kilometers of undersea fiber-optic cable, a few hundred massive data centers, a layered stack of protocols designed in the 1970s, and a naming system ultimately overseen by a single nonprofit in Los Angeles. Everything you do online—every search, every stream, every message—depends on physical objects in physical locations owned by specific organizations. Understanding the internet means understanding those objects, those owners, those chokepoints, and who has the power to cut the wire.

You open your laptop. You type "google.com" into your browser and press Enter. In about 300 milliseconds, a page appears. It feels instantaneous. It feels like magic. But in that fraction of a second, your request has been translated into a numerical address, routed through a dozen or more networks, possibly crossed an ocean floor via a glass fiber thinner than a human hair, reached a server in a data center the size of a football stadium, triggered a response, and made the return trip—all governed by protocols designed by a handful of engineers in the 1970s and 1980s. The internet is not magic. It is engineering. And like all engineering, it has specific constraints, specific failure modes, and specific people who control its critical junctions.

Most people use the internet every waking hour without any mental model of what it physically is. That's by design—the entire architecture is built to be invisible. But invisibility is not the same as invulnerability. When you don't understand the infrastructure, you can't evaluate the risks: who can see your data, where the system can break, who has the power to shut it down, and why the dream of a single, global, open internet may already be over.

The Physical Layer: Glass, Light, and the Ocean Floor

Let's start at the bottom. The internet's physical foundation is fiber-optic cable—thin strands of ultra-pure glass that carry data as pulses of laser light. A single strand of optical fiber can carry staggering amounts of information. Modern submarine cables use a technique called wavelength-division multiplexing, or WDM, which splits a single fiber into dozens of channels, each carrying a different wavelength of light. Think of it as turning one lane of highway into forty parallel lanes, each carrying a different color of car. A single modern submarine cable can carry over 200 terabits per second—enough to transmit the contents of the entire Library of Congress in under a second.
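The capacity arithmetic behind WDM is straightforward multiplication. The figures below are illustrative assumptions (a modern coherent channel at roughly 400 Gbit/s, around 80 wavelengths per fiber pair, several fiber pairs per cable), not the specification of any particular cable:

```python
def cable_capacity_tbps(gbps_per_channel: float,
                        channels_per_pair: int,
                        fiber_pairs: int) -> float:
    """Total cable capacity in terabits per second:
    per-channel rate x wavelengths per fiber pair x fiber pairs."""
    return gbps_per_channel * channels_per_pair * fiber_pairs / 1000

# Assumed, roughly realistic figures for a modern system:
capacity = cable_capacity_tbps(400, 80, 8)
print(f"{capacity:.0f} Tbit/s")  # 256 Tbit/s with these assumptions
```

With those assumed parameters the total lands in the same 200+ Tbit/s range quoted above; real cables vary widely in channel count and fiber-pair count.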

There are roughly 550 active submarine cable systems crisscrossing the world's oceans right now. They carry approximately 99% of all intercontinental data. Not 50%. Not 80%. Ninety-nine percent. Satellites—despite what most people assume—handle less than 1% of international traffic. Elon Musk's Starlink and other satellite constellations are impressive for reaching remote areas, but for the bulk of global communications, everything runs through glass on the ocean floor.

These cables are surprisingly modest physical objects. On the deep ocean floor, a submarine cable is typically about 17-21 millimeters in diameter—roughly the width of a garden hose. Near shore, they're armored and buried to protect against anchors and fishing trawls, but in the deep ocean they simply rest on the seabed. They're laid by a small fleet of specialized ships—maybe a dozen or so worldwide capable of the job—and repairs, when cables break, require sending one of those ships to the location, grappling the cable from the ocean floor, bringing it to the surface, and splicing it. This can take weeks.

The cables come ashore at specific locations called cable landing stations. There are roughly 1,200 of these globally. And this is where the physical vulnerability becomes clear. Many of the world's most critical data routes converge at a remarkably small number of landing points. Egypt's Mediterranean coast near Alexandria and the Suez corridor carries an estimated 17% of global internet traffic—a consequence of the geographic reality that cables connecting Europe and Asia naturally route through the Red Sea and Suez Canal. Marseille, France is another massive node. Singapore. Mumbai. A handful of points in the United States and United Kingdom.

Once ashore, data travels through terrestrial fiber-optic networks—the "backbone"—owned by a relatively small number of Tier 1 carriers: companies like Lumen, NTT, Arelion (formerly Telia Carrier), and a few others. These carriers have peering agreements—essentially handshake deals—that allow traffic to cross from one network to another. The last mile, the connection to your home or phone, uses a mix of fiber, coaxial cable, old copper phone lines, or wireless. This final stretch is almost always the bottleneck. When your video call stutters, the problem is rarely in the backbone and almost never in the undersea cables. It's in the last few kilometers.

And here's something worth internalizing: "wireless" is not wireless. Your phone connects to a cell tower via radio waves—that part is wireless. But the cell tower connects to a fiber-optic line. Your Wi-Fi router connects to a cable or fiber line. Even satellite internet requires ground stations connected to the terrestrial fiber network. "Wireless" refers to the last few hundred meters. Everything else is glass and copper, buried in the ground or lying on the ocean floor.

The Protocol Stack: Rules of the Road

Physical cables move raw pulses of light. But to turn those pulses into websites, emails, and video calls, you need protocols—agreed-upon rules for how data gets packaged, addressed, routed, and delivered. The internet's protocol architecture is layered, with each layer handling one specific job.

At the bottom is the physical and link layer—the raw transmission of bits and the rules for moving data between directly connected devices (things like Ethernet and Wi-Fi protocols). Above that sits the network layer, dominated by the Internet Protocol, or IP. IP does one thing: it assigns addresses to devices and routes packets—small chunks of data—from one address to another across networks. Every device on the internet has an IP address. The original scheme, IPv4, used 32-bit addresses, giving us about 4.3 billion possible addresses. That seemed like a lot in 1981. It wasn't. IPv4 addresses are now exhausted, and the aftermarket price for an IPv4 address block has climbed to $30-50 per address. The replacement, IPv6, uses 128-bit addresses—providing a number of possible addresses so large it's effectively infinite. The transition has been happening for over two decades and is still incomplete.
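The address-space gap between the two schemes follows directly from the bit widths, and Python's standard `ipaddress` module handles both. The specific prefixes below are the reserved documentation ranges, used purely for illustration:

```python
import ipaddress

# Address-space sizes follow directly from the bit widths.
ipv4_space = 2 ** 32    # 4,294,967,296 addresses
ipv6_space = 2 ** 128   # ~3.4 x 10^38 addresses

# Reserved documentation prefixes, one per scheme:
net4 = ipaddress.ip_network("203.0.113.0/24")
net6 = ipaddress.ip_network("2001:db8::/32")

print(ipv4_space)           # 4294967296
print(net4.num_addresses)   # 256 addresses in an IPv4 /24
print(net6.num_addresses)   # 2**96 addresses in an IPv6 /32
```

A single IPv6 /32, the typical allocation to one ISP, contains 2^96 addresses: many billions of times the entire IPv4 internet.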

IP is deliberately simple. It makes no guarantees about delivery, order, or integrity. Packets can arrive out of order, get duplicated, or vanish entirely, and IP doesn't care. That reliability is handled by the layer above: the Transmission Control Protocol, or TCP. When you hear "TCP/IP," you're hearing the names of the two protocols that together make the internet work. Vint Cerf and Bob Kahn, often called the "fathers of the internet," published the original TCP specification in 1974. TCP adds what IP lacks: it ensures packets arrive, arrive in order, and arrive intact. It manages congestion so that senders don't overwhelm the network. It's the reason the internet feels reliable even though the underlying infrastructure is anything but. The alternative, UDP, sacrifices reliability for speed and is used where a dropped packet is acceptable—live video, online gaming, voice calls. Better a tiny glitch than a half-second delay.
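What TCP adds on top of IP's best-effort delivery can be sketched in miniature. The toy function below is not TCP (real TCP uses sliding windows, acknowledgments, and retransmission timers), but it shows the core idea: sequence numbers let a receiver put out-of-order packets back in order, discard duplicates, and detect gaps:

```python
def reassemble(packets):
    """Toy model of TCP-style reassembly. Packets are (seq, payload)
    pairs that may arrive out of order or duplicated, as IP permits.
    Returns the in-order byte stream, or None if a gap means data is
    missing (real TCP would wait and have the sender retransmit)."""
    by_seq = {}
    for seq, payload in packets:    # deduplicate by sequence number
        by_seq.setdefault(seq, payload)
    stream, expected = b"", 0
    for seq in sorted(by_seq):
        if seq != expected:         # a hole in the byte stream
            return None
        stream += by_seq[seq]
        expected += len(by_seq[seq])
    return stream

# Out-of-order, duplicated delivery -- the receiver still recovers:
arrived = [(5, b"world"), (0, b"hello"), (5, b"world")]
print(reassemble(arrived))          # b'helloworld'
```

UDP, by contrast, hands each arriving packet straight to the application: no reordering, no gap detection, no waiting.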

At the top sits the application layer: HTTP for the web (Tim Berners-Lee's 1989 invention that transformed the internet from an academic network into a global medium), SMTP for email, and dozens of other protocols for specific tasks.

But the protocol that's perhaps most critical—and most vulnerable—is one most people have never heard of: BGP, the Border Gateway Protocol. The internet is not one network. It's approximately 75,000 independently operated networks called Autonomous Systems, or ASes. BGP is how these networks tell each other what addresses they can reach and negotiate the paths between them. If TCP/IP is the postal system, BGP is the map that tells the postal trucks which routes exist.

Here's the problem: BGP was designed in an era when the internet was small and its operators trusted each other. It runs on trust. When a network announces "I can reach these addresses," neighboring networks generally believe it. There is no built-in authentication. This means that any network can, either by accident or malice, announce false routes—claiming to be the best path to addresses it doesn't actually serve. This is called BGP hijacking, and it has happened repeatedly. In 2008, Pakistan Telecom accidentally hijacked YouTube globally while trying to block it domestically. In October 2021, Facebook's own engineers accidentally withdrew their BGP routes, effectively erasing Facebook, Instagram, and WhatsApp from the internet for nearly six hours. The entire company was unreachable—not because of a hack, but because of a routing configuration mistake. A fix called RPKI (Resource Public Key Infrastructure) adds cryptographic verification to BGP announcements, but adoption remains incomplete, and the internet's routing system remains disturbingly fragile.
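One reason hijacks work is mechanical: routers forward traffic along the most specific matching prefix (longest prefix match), so a bogus announcement of a smaller, more specific block beats the legitimate broader one. The sketch below is a deliberately simplified model of that selection rule, ignoring everything else in BGP path selection, with illustrative addresses and AS names:

```python
import ipaddress

def best_route(routes, destination):
    """Pick the route whose prefix matches the destination most
    specifically (longest prefix match), as routers do."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, origin) for net, origin in routes
               if dest in ipaddress.ip_network(net)]
    # Longer prefix = more specific = preferred.
    return max(matches, key=lambda m: ipaddress.ip_network(m[0]).prefixlen)

# Legitimate announcement of a /22 (addresses are illustrative):
routes = [("198.51.100.0/22", "AS-legit")]
print(best_route(routes, "198.51.100.10"))  # ('198.51.100.0/22', 'AS-legit')

# A hijacker announces a more-specific /24 inside the same space.
# With no authentication, the more-specific route simply wins:
routes.append(("198.51.100.0/24", "AS-hijack"))
print(best_route(routes, "198.51.100.10"))  # ('198.51.100.0/24', 'AS-hijack')
```

This is essentially what happened in the Pakistan Telecom incident: a more-specific announcement for YouTube's address space pulled global traffic into a network that then dropped it.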

The Naming System: Who Controls the Names

When you type "google.com" into your browser, your computer doesn't know what to do with that name. It needs a numerical IP address. So it asks the Domain Name System—DNS—to translate the name into a number. DNS is the internet's phone book, and its structure reveals a great deal about who ultimately controls the network.

DNS is hierarchical. At the top are the root servers—13 identities, labeled A through M, operated by 12 organizations including Verisign, the U.S. Department of Defense, NASA, and ICANN (the Internet Corporation for Assigned Names and Numbers). These 13 identities are distributed across over 1,700 physical server instances worldwide using a technique called anycast, which routes queries to the nearest instance. The root servers don't store the addresses of every website—they point to the next level down, the top-level domain servers (.com, .org, .uk, and so on), which in turn point to the authoritative nameservers for individual domains.
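The referral chain can be modeled with a few dictionaries. The sketch below mimics iterative resolution (root, then TLD server, then authoritative server); all server names and the final address are invented for illustration:

```python
# A toy zone hierarchy: each level only knows who to ask next.
ZONES = {
    ".":           {"com.": "tld-server"},
    "tld-server":  {"example.com.": "auth-server"},
    "auth-server": {"www.example.com.": "203.0.113.7"},
}

def resolve(name):
    """Walk the hierarchy the way an iterative resolver does: start
    at the root, follow each referral until a server returns the
    actual record instead of pointing further down."""
    server = "."
    while True:
        zone = ZONES[server]
        for suffix, value in zone.items():
            if name.endswith(suffix):
                if value in ZONES:   # a referral to the next server
                    server = value
                    break
                return value         # the final address record
        else:
            return None              # no server claims this name

print(resolve("www.example.com."))   # 203.0.113.7
```

Real resolvers add caching at every step, which is why most queries never actually reach the root: the answer for `.com` is already cached nearly everywhere.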

The organization at the center of all this is ICANN—a nonprofit headquartered in Los Angeles that coordinates the global DNS root, allocates IP address ranges, and manages protocol parameters. ICANN was created in 1998 under a contract with the U.S. Department of Commerce. For its first eighteen years, the U.S. government retained direct contractual oversight of the root zone—the master file that defines the top of the DNS hierarchy. In October 2016, this oversight was transferred to a multi-stakeholder governance model within ICANN. The U.S. government no longer has formal control.

But ICANN remains incorporated under California law, subject to U.S. jurisdiction. This structural fact—that the organization coordinating the global naming system is a U.S. legal entity—gives the United States residual leverage that other nations find uncomfortable and have repeatedly challenged. It's one of the tensions that fuels the fragmentation discussed later.

Below ICANN, IP addresses are distributed by five Regional Internet Registries: ARIN for North America, RIPE NCC for Europe and the Middle East, APNIC for Asia-Pacific, LACNIC for Latin America, and AFRINIC for Africa. These organizations allocate address blocks to ISPs and large organizations. They are, in effect, the landlords of the internet's numerical real estate.

Content Delivery: Why Netflix Doesn't Buffer

Here's a question worth asking: if Netflix has roughly 280 million subscribers worldwide, all streaming high-definition video simultaneously during peak hours, why doesn't the internet collapse? The answer reveals one of the most important architectural innovations of the modern internet: content delivery networks, or CDNs.

When you watch a show on Netflix, the video doesn't stream from some central server farm in Silicon Valley. It streams from a custom server called an Open Connect Appliance that's been physically installed inside your ISP's data center or at a nearby Internet Exchange Point. Netflix has deployed over 18,000 of these appliances in roughly 6,000 locations across 175 countries. During off-peak hours—late at night—popular content is quietly pre-loaded to these local servers. So when you hit play at 8 PM, your "stream" might travel 50 kilometers, not 5,000. The experience feels centralized, but the infrastructure is radically distributed.

CDN providers like Cloudflare, Akamai, and Fastly generalize this approach. Cloudflare alone operates servers in over 310 cities across 120+ countries. When you visit a website behind Cloudflare, you're connecting to a server that might be in your city, serving a cached copy of the page. This reduces latency (the speed of light is fast but finite—a round trip across the Atlantic through fiber takes roughly 60 milliseconds at the physical minimum, and real-world latency is higher due to routing and processing), reduces load on backbone links, and provides resilience against distributed denial-of-service attacks by absorbing malicious traffic across thousands of edge locations.
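That latency floor is pure physics: light in glass travels at roughly two-thirds of its vacuum speed, so distance sets a hard minimum no hardware can beat. A quick sketch, using assumed but roughly realistic cable distances:

```python
C_VACUUM_KM_S = 299_792   # speed of light in vacuum, km/s
FIBER_FACTOR = 0.67       # refractive-index slowdown in glass (approx.)

def one_way_ms(distance_km: float) -> float:
    """Minimum one-way propagation delay through fiber, in ms."""
    return distance_km / (C_VACUUM_KM_S * FIBER_FACTOR) * 1000

# Illustrative distances (assumed):
for label, km in [("transatlantic (~6,000 km)", 6000),
                  ("same metro (~50 km)", 50)]:
    print(f"{label}: {one_way_ms(km):.1f} ms one way, "
          f"{2 * one_way_ms(km):.1f} ms round trip")
```

A transatlantic round trip comes out near 60 ms at the physical minimum, while a cache 50 km away answers in a fraction of a millisecond: that ratio is the entire business case for CDNs.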

The evolution continues with edge computing—moving not just content but actual computation to locations near users. Services like Cloudflare Workers run application code at edge nodes worldwide. For anything latency-sensitive—real-time ad auctions that must complete in under 100 milliseconds, IoT sensor processing, game state management—pushing computation to the edge is becoming standard. The trend is clear: the internet is decentralizing at the content layer even as it concentrates at the infrastructure layer.

Cloud Computing: Renting Someone Else's Computer

The phrase "the cloud" is one of the most successful pieces of marketing in technology history. It evokes something ethereal, distributed, weightless. The reality is concrete: the cloud is large rooms full of servers in specific buildings in specific places, owned by specific companies. When you store files "in the cloud," you're storing them on someone else's hard drive. When you run software "in the cloud," you're renting time on someone else's processor.

Three companies dominate cloud infrastructure: Amazon Web Services with roughly 31% market share, Microsoft Azure with about 25%, and Google Cloud Platform with approximately 11%. Together, these three control roughly two-thirds of the global market. AWS alone operates over 100 data centers across 33 geographic regions. A single large data center might house 50,000-80,000 servers, consume 30-100+ megawatts of electricity, and require millions of gallons of water daily for cooling.

This concentration creates a specific kind of systemic risk. When AWS's us-east-1 region suffered an outage in December 2021, it didn't just affect Amazon—it took down or degraded Disney+, Slack, Venmo, Ring doorbells, Roomba vacuums, and even some of Amazon's own warehouse operations. The failure of a single company's systems in a single region cascaded across seemingly unrelated services worldwide. This is what infrastructure dependency looks like: dozens of companies that appear independent but share a hidden common foundation.

There's a deeper structural tension here. Amazon operates the largest e-commerce platform in the world while simultaneously selling the cloud infrastructure that its e-commerce competitors depend on. Microsoft sells cloud services to companies that compete with Microsoft's own products. The companies that provide the foundational infrastructure of the digital economy are also participants in that economy, creating conflicts of interest that regulators are only beginning to grapple with.

Chokepoints: Where the Internet Breaks

Paul Baran, one of the internet's intellectual ancestors, designed the concept of packet-switched networking in the 1960s specifically to create a communication system that could survive a nuclear attack. The original vision was a network with so many redundant paths that no single failure—no single bomb—could bring it down. The design succeeded brilliantly at the protocol level. TCP/IP really does route around damage.

But the physical reality has diverged from the design philosophy. Traffic concentrates. Economics favor efficiency over redundancy. And the result is a network with specific, identifiable chokepoints.

Submarine cable landing points are the most obvious. The narrow corridor near Alexandria and Suez carries a disproportionate share of global traffic. In 2013, Egyptian authorities arrested three divers attempting to cut an undersea cable off Alexandria. In February 2024, Houthi-related conflict in the Red Sea damaged multiple cables, disrupting traffic between Europe and Asia. The Strait of Malacca, the Strait of Luzon, and the English Channel represent similar concentrations. The geography of the ocean floor and the economics of cable laying create natural convergence points that cannot easily be routed around.

Internet Exchange Points—physical buildings where networks interconnect—are another concentration. DE-CIX Frankfurt, AMS-IX Amsterdam, and LINX London each handle peak traffic measured in tens of terabits per second. They're efficient—direct peering is faster and cheaper than routing through transit providers—but they create geographic points of vulnerability. The TLS certificate authority system, which underpins the security of nearly all encrypted web traffic, is yet another: a compromise of a major certificate authority (as happened with DigiNotar in 2011) can undermine trust across the entire web.

Surveillance Infrastructure: Who Can See What

Every layer of the internet's architecture exposes different information to different observers, and understanding this is essential to understanding privacy in the digital age.

At the physical layer, fiber-optic cables can be tapped. The NSA's programs revealed by Edward Snowden in 2013—PRISM for data from major platforms, and Upstream for direct taps on backbone fiber—demonstrated that intelligence agencies routinely access traffic at cable landing points and major network junctions. The UK's GCHQ operated a program called TEMPORA that buffered all traffic flowing through certain cables, storing full content for three days and metadata for thirty days. Cable tapping is invisible to end users—there is no way to detect it from your device.

At the network layer, IP addresses are visible to every router in the path. Even when content is encrypted, the metadata—who is communicating with whom, when, for how long, and how much data flows—remains visible. Metadata is extraordinarily revealing. As former NSA and CIA director Michael Hayden stated bluntly: "We kill people based on metadata." You don't need to read the message if you can see that someone in a conflict zone made a call to a known weapons supplier, followed by a call to a known bomb-maker, followed by a trip to a specific location.

At the DNS layer, traditional DNS queries are sent as unencrypted plaintext. Your ISP—and anyone with access to the network between you and your DNS resolver—can see every domain name you look up. Encrypted DNS protocols (DNS-over-HTTPS and DNS-over-TLS) address this, but they shift trust rather than eliminating it: instead of your ISP seeing your queries, now your chosen DNS resolver (often Google at 8.8.8.8 or Cloudflare at 1.1.1.1) sees them. You're choosing your observer, not removing observation.
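How exposed a traditional DNS query is becomes obvious when you build one. The sketch below assembles a minimal A-record query of the kind sent unencrypted over UDP port 53; the domain labels sit in the packet as readable plaintext for any on-path observer (the transaction ID is an arbitrary illustrative value):

```python
import struct

def build_dns_query(domain: str) -> bytes:
    """Build a minimal DNS query for an A record: a 12-byte header
    followed by the question section (per the DNS wire format)."""
    # Header: ID, flags (recursion desired), 1 question, 0 answers,
    # 0 authority records, 0 additional records.
    header = struct.pack(">HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(bytes([len(p)]) + p.encode() for p in domain.split("."))
    question = qname + b"\x00" + struct.pack(">HH", 1, 1)  # type A, class IN
    return header + question

packet = build_dns_query("example.com")
# Every label of the domain is visible as plaintext in the raw bytes:
print(b"example" in packet, b"com" in packet)   # True True
```

DNS-over-HTTPS wraps this same message in an encrypted HTTPS session, which is exactly why it hides the labels from the ISP but not from the resolver that decrypts it.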

At the application layer, HTTPS encrypts data in transit—an eavesdropper between you and the server sees encrypted gibberish. But the server itself sees everything. Your emails in Gmail are accessible to Google. Your files in Dropbox are accessible to Dropbox. End-to-end encryption—used by Signal, and selectively by iMessage and WhatsApp—is the only model where even the service provider cannot read your content. This is precisely why governments worldwide repeatedly push for "lawful access" backdoors to end-to-end encryption, and why cryptographers consistently warn that a backdoor accessible to governments is a backdoor accessible to everyone.

The structural reality is this: the internet's architecture makes comprehensive surveillance technically straightforward for any entity with access to backbone infrastructure or major platform servers. Privacy is not a property of the network. It exists only where strong encryption has been deliberately deployed at the application layer—and that encryption exists only because specific engineers and advocates fought to keep it legal and widely available.

Fragmentation: The Splinternet

The internet was built on the assumption that it would be one network—global, interoperable, borderless. That assumption is fracturing.

China's Great Firewall is the most mature model of what a nationally bounded internet looks like. Using deep packet inspection, DNS poisoning, IP blacklisting, and increasingly sophisticated VPN detection, China blocks access to Google, Facebook, Twitter, Wikipedia, and thousands of other foreign services. Domestic alternatives—Baidu for search, WeChat for messaging and payments, Weibo for social media, Douyin (the domestic version of TikTok) for short video—create a parallel information ecosystem serving approximately one billion users. The system works. It demonstrates, definitively, that a national internet boundary is technically feasible at scale.

Russia is following a similar but less mature path. The 2019 "Sovereign Internet" law mandated the infrastructure for disconnecting Russia from the global internet. Deep packet inspection equipment has been deployed at ISPs and exchange points. Russia has conducted disconnection drills. The invasion of Ukraine in 2022 accelerated the program: VPN blocking intensified, Western social media access was restricted, and content filtering expanded. Russia's system is less comprehensive than China's but is evolving rapidly.

The trend extends beyond authoritarian states. India has imposed over 700 internet shutdowns between 2012 and 2023—more than any other country—mostly in Kashmir and during protests. Iran has built a National Information Network as a domestic internet alternative. The European Union's GDPR and Digital Services Act create regulatory boundaries around data flows and content standards. The United States passed a law requiring TikTok's Chinese parent company to divest the app or face a nationwide ban. Even democracies are establishing digital borders—not through firewalls, but through regulation, data localization requirements, and platform governance mandates.

The trajectory points toward a world of interconnected but nationally regulated networks rather than one open global system. The underlying protocols—TCP/IP, BGP, DNS—still interoperate across borders. But the content, access, and governance layers are diverging. Three loose blocs are emerging: the U.S.-aligned internet (relatively open but with increasing surveillance and platform regulation), the Chinese model (state-controlled, walled, with parallel domestic services), and a contested middle ground of nations borrowing elements from both approaches.

The original vision—articulated by pioneers like Vint Cerf and Bob Kahn, and embodied in the architecture they designed—was a network that treated national borders as irrelevant. That vision was always partly idealistic, partly a reflection of the internet's initial user base (mostly American, mostly academic, mostly trustful). As the internet became a critical infrastructure carrying trillions of dollars in commerce, sensitive government communications, and the daily information diet of billions of people, the idea that it should exist beyond the reach of national governments became increasingly untenable. The splinternet isn't a failure of the technology. It's a consequence of the technology becoming too important to leave ungoverned.

How This Was Decoded

This analysis traced the physical path of data from the photon in the fiber to the pixel on the screen, mapping infrastructure, protocols, governance structures, and vulnerabilities at each layer. Primary sources included TeleGeography's submarine cable database, ICANN governance and IANA transition documents, BGP routing data from RIPE RIS and RouteViews, cloud market analyses from Synergy Research and Gartner, the Snowden archive as documented by The Intercept and The Guardian, and Freedom House's annual Freedom on the Net reports. The decoding method: follow the physical infrastructure, identify who owns it and who can access it, then map the governance and power structures that determine how it operates. The core pattern that emerges: the internet is not a cloud or an abstraction. It is a physical system with physical owners, physical chokepoints, and physical vulnerabilities. The architecture of the network determines the architecture of power over information. Understanding one requires understanding the other.
