In AI infrastructure, the network has evolved from a supporting role to center stage. As AI clusters demand unprecedented bandwidth and razor-thin latency, traditional networking approaches are breaking down. Broadcom’s response? Three specialized Ethernet switch chips, each designed to solve a specific piece of the AI networking puzzle.

At a recent AI Infrastructure Field Day event, Broadcom executives provided in-depth technical briefings on these new chips, showcasing how their features directly address the complex demands of modern AI infrastructure. They emphasized a full-stack approach, providing not just the switch silicon but also network interface cards (NICs), NIC chiplets for custom XPU integration, physical layer products like retimers and optics, and a framework for using Ethernet efficiently for scale-up.

The Scale Challenge: Why One Chip Can’t Rule Them All

AI workloads don’t follow a single pattern. Training massive models requires different network characteristics than running real-time inference. Connecting processors within a rack demands ultra-low latency, while linking data centers across continents prioritizes reliability and distance.

Broadcom has segmented these challenges into three distinct domains:

  • Scale-up handles high-speed connections within single racks or small clusters, where every nanosecond of latency matters.
  • Scale-out connects these rack-level clusters within a data center, requiring massive bandwidth.
  • Scale-across links entire data centers into distributed AI supercomputers, where distance and reliability become critical factors.

Rather than forcing a single solution across all scenarios, Broadcom built purpose-designed silicon for each challenge.

Meet the Specialists: Three Chips, Three Missions

Tomahawk 6: The Bandwidth Beast

The Tomahawk 6 delivers 102.4 Tbps of switching capacity, double that of any competing switch. But raw bandwidth is only part of the story.

This chip’s real innovation lies in network simplification. Traditional large-scale networks require three tiers of switches, creating complexity and consuming enormous amounts of power through additional optics. Tomahawk 6 enables two-tier networks supporting up to 131,000 processors, eliminating that costly third tier entirely.

The chip comes in two configurations: one with 512 lanes at 200 Gbps each, and another with 1,024 lanes at 100 Gbps. This flexibility lets network architects adapt as processor and optical technologies evolve without requiring complete infrastructure overhauls.
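
Both of those numbers check out with back-of-the-envelope math. The sketch below uses the textbook leaf-spine assumption that half of each switch’s ports face hosts and half face the spine layer; it reproduces the roughly 131,000-endpoint figure, and the note on the larger radix is an extrapolation, not a Broadcom claim.

```python
# Endpoints supported by a non-blocking two-tier (leaf/spine) Clos
# fabric built from radix-R switches. Assumes half of each leaf's
# ports face hosts and half face spines -- a textbook Clos layout,
# not necessarily Broadcom's exact topology.

def two_tier_endpoints(radix: int) -> int:
    leaves = radix               # each spine can fan out to R leaves
    hosts_per_leaf = radix // 2  # the other half of the ports go up
    return leaves * hosts_per_leaf

# Tomahawk 6 as a 512-port x 200 Gbps switch:
print(two_tier_endpoints(512))   # 131072 -> the ~131,000 figure
print(512 * 200, 1024 * 100)     # both configs total 102,400 Gbps
# The 1,024 x 100 Gbps configuration would trade port speed for
# radix: two_tier_endpoints(1024) = 524288 lower-speed endpoints.
```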

Perhaps most importantly, Tomahawk 6 features “cognitive routing” with Global Load Balancing: a system that shares congestion information across the entire network in real time, responding to failures 10,000 times faster than centralized management systems.
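
Broadcom did not detail the algorithm, but the idea behind congestion-aware routing can be shown in a few lines. In this toy sketch every name and data structure is hypothetical: each switch keeps a per-uplink congestion estimate, refreshed by telemetry from elsewhere in the fabric, and steers traffic to the least-congested path instead of relying on a static hash.

```python
# Toy model of congestion-aware path selection. All names are
# illustrative; this is not Broadcom's implementation.

congestion = {"uplink0": 0.10, "uplink1": 0.85, "uplink2": 0.30}

def pick_uplink(links: dict) -> str:
    # Choose the least-congested path rather than hashing statically.
    return min(links, key=links.get)

def on_telemetry(link: str, load: float) -> None:
    # In-band updates from peer switches refresh local state directly,
    # so reactions happen in the data plane -- orders of magnitude
    # faster than a round trip to a centralized controller.
    congestion[link] = load

print(pick_uplink(congestion))   # uplink0
on_telemetry("uplink0", 0.95)    # congestion reported downstream
print(pick_uplink(congestion))   # uplink2
```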

Tomahawk Ultra: The Latency Killer

Built from the ground up for ultra-low-latency applications, the Tomahawk Ultra addresses Ethernet’s traditional weakness in high-performance computing environments. Its monolithic design achieves “ball-to-ball” (input pin to output pin) latency below 250 nanoseconds, a figure that holds constant regardless of traffic load.

This chip’s standout feature is in-network collectives, in which the switches themselves perform computational operations such as data reductions on the fly. Instead of forcing processors to handle these operations through multiple data transfers, the network does the work, dramatically reducing bandwidth requirements and improving overall performance.
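
A toy model makes the bandwidth argument concrete. In the sketch below (illustrative Python, not Broadcom’s implementation), each worker sends one vector to the switch, which sums them and multicasts a single result back, so per-worker traffic stays constant no matter how many workers participate.

```python
# Toy in-network reduction: the switch sums worker contributions
# element-wise and returns one result, instead of the workers
# shuffling data among themselves. Illustrative only.

def switch_allreduce(worker_vectors):
    # Element-wise sum across all contributions, done "in the switch".
    return [sum(vals) for vals in zip(*worker_vectors)]

workers = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(switch_allreduce(workers))  # [16.0, 20.0]

# Each worker sends one vector and receives one, regardless of N.
# A host-based ring all-reduce instead moves about 2*(N-1)/N vector
# volumes per worker across several rounds of transfers.
```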

This approach transforms network switches from passive data movers into active computational participants—a fundamental shift that could reshape how AI training workloads operate.

Jericho 4: The Distance Champion

Building AI clusters that span multiple buildings or cities requires solving problems that local networks never face. Jericho 4 tackles these challenges through four core capabilities: scale, robustness, distance, and security.

The chip’s most critical innovation is its deep buffer system built on high-bandwidth memory (HBM). Deep buffers allow lossless data transmission over distances of up to 100 kilometers, where the volume of data in flight during a single round trip would otherwise overwhelm on-chip memory and force packet loss.
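
The buffer requirement follows directly from physics: a lossless link must absorb roughly bandwidth times round-trip time of in-flight data once the receiver signals pause. A quick calculation, assuming a single 800 Gbps port and the usual ~5 microseconds per kilometer of fiber, shows why on-die SRAM is not enough:

```python
# Buffer needed to keep a long-haul link lossless: the sender must
# hold everything in flight during one round trip after the far end
# asserts flow control. Light in fiber covers ~1 km per 5 us.

distance_km = 100
rtt_s = 2 * distance_km * 5e-6     # ~1 ms round trip
port_bps = 800e9                   # one 800 Gbps link (assumed)

buffer_bytes = port_bps * rtt_s / 8
print(buffer_bytes / 1e6, "MB")    # ~100 MB for this single port
# Multiplied across a full switch worth of ports, only off-die HBM
# offers that kind of capacity at reasonable cost.
```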

For handling large AI data flows, Jericho 4 introduces 3.2 Tbps HyperPorts that aggregate four 800 Gbps links without traditional load-balancing hash functions, improving link utilization by over 70%.
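
The utilization problem with hashed link aggregation is easy to demonstrate. The simulation below is a toy model with made-up flow sizes, not Jericho 4’s actual scheduler: it sprays eight 100 Gbps “elephant” flows across four member links by random hash, and the busiest member routinely carries well above the ideal 200 Gbps while others sit half-empty. That imbalance is exactly what a single aggregated HyperPort avoids.

```python
import random
random.seed(7)  # deterministic toy run

def max_member_load(n_flows=8, flow_gbps=100, members=4):
    # Static hashing pins each flow to one member link; model the
    # hash as a uniform random pick. Returns the busiest link's load.
    load = [0] * members
    for _ in range(n_flows):
        load[random.randrange(members)] += flow_gbps
    return max(load)

trials = [max_member_load() for _ in range(10_000)]
print(sum(trials) / len(trials))  # averages well above the ideal 200
# A HyperPort presents the four links as one 3.2 Tbps pipe, so the
# same 800 Gbps of demand never concentrates on a single member.
```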

An embedded security engine provides line-rate encryption without performance penalties—crucial for distributed clusters that may cross public networks.

The Ethernet Advantage: Open Infrastructure for AI’s Future

Broadcom’s strategy represents more than just faster silicon—it’s a bet on Ethernet as the foundation for AI networking. While proprietary fabrics like InfiniBand have dominated high-performance computing, Ethernet offers compelling advantages for AI infrastructure.

The multi-vendor Ethernet ecosystem drives costs down and accelerates innovation. Organizations avoid vendor lock-in while benefiting from competition among suppliers. Using common networking technology across scale-up and scale-out domains simplifies operations—the same tools and expertise apply throughout the entire cluster.

Real-World Impact: Beyond the Specifications

These efficiency gains translate directly into business value. Eliminating network tiers reduces both capital expenses and ongoing power consumption. In-network processing cuts bandwidth requirements. Simplified architectures reduce operational complexity and the specialized expertise needed to manage them.

For organizations building AI infrastructure, this specialized approach offers a clear path forward. Rather than compromising with general-purpose solutions, network architects can now select optimal components for each part of their cluster—ultra-low latency where it matters most, massive bandwidth where throughput is king, and robust long-distance connectivity where reliability is paramount.

The Path Forward

As AI workloads continue growing in size and complexity, networking infrastructure must evolve to match these demands. Broadcom’s three-chip strategy represents a fundamental shift toward specialized solutions that address specific AI networking challenges rather than attempting universal fixes.

This approach signals a broader transformation in how the industry thinks about AI infrastructure. Rather than accepting one set of compromises across an entire cluster, network architects can now optimize each layer of their systems independently for the workload it carries.

The implications extend beyond individual data centers. Organizations can now design truly distributed AI systems that span geographic regions while maintaining the performance characteristics needed for cutting-edge research and applications. This opens new possibilities for collaborative AI development and democratizes access to supercomputing-class resources.

You can watch all of the Broadcom presentations on the Tech Field Day website.
