AI Interconnects: The infrastructure behind intelligence

An array of technologies, including copper, LPO, CPO, PAM4, coherent-lite, and coherent ZR, can address AI data center needs.
July 26, 2025

By Kishore Atreya / Marvell

As artificial intelligence (AI) transforms industries, its insatiable appetite for compute power is reshaping the data center from the ground up. AI workloads, especially large-scale training and inference, push the boundaries of GPUs and accelerators, placing unprecedented demands on the interconnects that bind compute, memory, and storage together. These threads of fiber and copper are now as vital as the processors they connect.

Inside the world’s most advanced data centers, AI infrastructure is no longer about standalone servers; it requires massive, tightly coupled clusters of compute acting as one cohesive system. This shift from isolated processing to interconnected intelligence puts a spotlight on the next evolution: AI interconnects.

Mapping the AI interconnect hierarchy

AI architectures organize their interconnects in a hierarchy much like a memory hierarchy, tiered by reach, bandwidth, latency, and power consumption. At the base lies scale-up connectivity: the high-performance, ultra-low-latency links that directly connect GPUs and AI accelerators (XPUs) within a tray or rack. As AI models grow and training clusters expand, scale-out networks take over, linking racks, rows, and entire facilities together. Moving beyond the traditional walls of a single data center, data center interconnects (DCIs) connect campuses and geographies.

Each layer comes with a unique set of challenges and requirements. Meeting these requirements demands a range of optical and electrical technologies purpose-built to deliver performance, efficiency, and scalability at every tier.
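To make the hierarchy concrete, the sketch below captures the tiers described in this article as a simple lookup table, using only the reach figures cited here (in-rack scale-up, tens to hundreds of meters for scale-out, 2-20 km for coherent-lite campus links, and up to 2,500 km for coherent ZR DCI). The reach thresholds in the helper function are illustrative assumptions, not specifications.

from dataclasses import dataclass

@dataclass
class InterconnectTier:
    name: str             # tier in the AI interconnect hierarchy
    technologies: tuple   # interconnect options discussed in this article
    typical_reach: str    # reach as described in this article

HIERARCHY = (
    InterconnectTier("scale-up",  ("copper", "LPO", "NPO/CPO"),    "within a tray or rack"),
    InterconnectTier("scale-out", ("PAM4 optical DSPs",),          "tens to hundreds of meters"),
    InterconnectTier("campus",    ("coherent-lite, O-band",),      "2-20 km"),
    InterconnectTier("DCI",       ("coherent ZR/ZR+ with DWDM",),  "up to 2,500 km"),
)

def tier_for_link(length_km: float) -> str:
    # Illustrative thresholds only: pick the tier that plausibly covers a link length.
    if length_km < 0.1:
        return "scale-up"    # within a tray or rack
    if length_km < 2:
        return "scale-out"   # across racks, rows, and pods
    if length_km <= 20:
        return "campus"      # coherent-lite between buildings
    return "DCI"             # coherent ZR between sites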

Scaling up: From copper to integrated optics

The scale-up domain connects GPUs and XPUs at the lowest latency possible. Traditionally served by copper-based solutions, such as passive cables or traces on a PCB, this layer is hitting its physical limits. As interconnect speeds jump to 200 Gbps and beyond, copper is increasingly constrained by reach, signal integrity, and power.

The solution lies in optics. New technologies like linear pluggable optics (LPO) extend bandwidth within the rack while maintaining the energy efficiency of copper. By shifting signal processing tasks to the host silicon and tightly co-designing the electrical and optical elements, LPO enables a drop-in optical replacement with lower power and latency than traditional optical modules.

For even tighter integration, near-packaged optics (NPO) and co-packaged optics (CPO) bring optics directly next to or inside the accelerator package. These approaches virtually eliminate electrical traces between the XPU and optical engine, reducing power and increasing bandwidth density. CPO, in particular, offers the promise of scaling cluster sizes from tens of XPUs to hundreds, even thousands, with predictable performance and lower total system power.

Scaling out: The optical fabric of AI

As AI clusters grow from a single rack to multiple rows and pods, scale-out interconnects form the optical fabric that weaves everything together. These links, often built on PAM4 optical DSPs, must support massive bandwidths, low latency, and high reliability across distances of tens to hundreds of meters.

Today’s PAM4 DSPs power the world’s most advanced Ethernet and InfiniBand networks, enabling AI workloads to move data seamlessly across switches and nodes. With bandwidth demands doubling every two years, DSP innovation keeps pace by moving to 3nm processes and 200 Gbps per-lane signaling, enabling 1.6 Tbps modules and beyond.
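The math behind those module speeds is simple lane aggregation; the quick sketch below assumes a common eight-lane module configuration, which is an illustrative assumption rather than a figure from this article.

def module_bandwidth_gbps(lanes: int, per_lane_gbps: int) -> int:
    # Aggregate module bandwidth is lane count times per-lane signaling rate.
    return lanes * per_lane_gbps

# 8 lanes at 200 Gbps per lane yields a 1.6 Tbps module.
assert module_bandwidth_gbps(8, 200) == 1600
# The prior 100 Gbps-per-lane generation yields 800 Gbps from the same lane count.
assert module_bandwidth_gbps(8, 100) == 800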

For distributed AI campuses, coherent-lite technology is gaining traction. Offering longer reach (2–20 km) than PAM4 but lower cost and power than traditional coherent systems, coherent-lite DSPs operating in the O-band connect buildings within a data center campus. This helps operators overcome power and space limitations within a single facility.

Across the campus and beyond: Data center interconnects

Once AI clusters outgrow the boundaries of a campus, DCIs carry the load, linking geographically distributed compute clusters across cities or continents. Coherent ZR optics dominate here, with solutions like 800G ZR/ZR+ modules enabling multi-terabit connections between sites up to 2,500 kilometers apart.1

These coherent links maximize fiber utilization by using dense wavelength division multiplexing (DWDM) and advanced modulation. This makes them indispensable for scaling AI workloads across facilities while maintaining real-time performance and redundancy.
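The fiber-utilization benefit of DWDM comes down to multiplying per-wavelength capacity by channel count. A minimal sketch, with a hypothetical channel count chosen purely for illustration:

def fiber_capacity_tbps(dwdm_channels: int, per_channel_gbps: int) -> float:
    # Total capacity of one fiber path is channel count times per-channel data rate.
    return dwdm_channels * per_channel_gbps / 1000.0

# Hypothetical example: 64 DWDM channels of 800G ZR/ZR+ on a single fiber path.
print(fiber_capacity_tbps(64, 800))  # 51.2 Tbps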

The future is integrated, optical, and diverse

No single interconnect technology can meet all the demands of the AI era. Instead, a spectrum of technologies, including copper, LPO, CPO, PAM4, coherent-lite, and coherent ZR, spans the hierarchy, each tailored to the specific challenges of its tier.

What ties them together is a common goal: enabling a scalable, efficient, high-performance AI infrastructure. The move to optics is accelerating across all layers, not just for bandwidth, but for power, thermal, and density benefits that are now mission-critical at AI scale.

Vendors, developers, and cloud operators must partner to co-design systems where interconnects are no longer an afterthought, but a core architectural pillar.

As AI continues its meteoric rise, it’s clear that the future of AI isn’t just about faster chips; it’s about better connections. And in this race, the winners will be those who think beyond silicon and embrace the fiber—and the architecture—that binds it all together.

Kishore Atreya is the senior director of cloud platform marketing at Marvell.

1. Marvell, Serve the Home, February 2025: 800G ZR/ZR+ modules operate at full bandwidth up to 1,000 km and at 400G up to 2,500 km.

 
