Beyond the Chip: How AI's Next Battleground Moves to Rack and POD Scale

Inside NVIDIA, CSPs, and ASIC Designers' Strategies

Jul 03, 2026

AI computing demand is growing exponentially, far outpacing Moore’s Law, while power consumption and cost pressures are escalating rapidly. The core metric for AI computing efficiency will no longer be individual chip performance, but rack-level performance per watt and performance per total cost of ownership (TCO). This signals a broader industry shift: semiconductor players must evolve from chip suppliers into providers of rack- and system-level solutions. This transformation is already unfolding across every layer of AI data center infrastructure, from the expansion of SSD PODs at the storage layer to Rack/POD-scale competition at the compute and interconnect layer.
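To make the two rack-level metrics concrete, here is a minimal Python sketch of how performance per watt and performance per total cost of ownership might be computed for a single rack. It is an illustration under stated assumptions, not TrendForce's methodology: the `RackProfile` class, its field names, the electricity price, and the service life are all hypothetical, and the cost model counts only purchase price plus energy, leaving out cooling, facilities, networking, and staffing.

```python
from dataclasses import dataclass


@dataclass
class RackProfile:
    """Hypothetical description of one AI rack (all values are illustrative)."""
    name: str
    pflops: float              # sustained rack throughput in PFLOPS
    power_kw: float            # average rack power draw in kW
    capex_usd: float           # purchase price of the rack in USD
    usd_per_kwh: float = 0.08  # assumed electricity price
    years: float = 4.0         # assumed service life

    def perf_per_watt(self) -> float:
        # PFLOPS / kW is numerically equal to TFLOPS / W.
        return self.pflops / self.power_kw

    def tco_usd(self) -> float:
        # Simplified TCO: purchase price plus energy over the service life.
        hours = self.years * 365 * 24
        return self.capex_usd + self.power_kw * hours * self.usd_per_kwh

    def perf_per_tco(self) -> float:
        # PFLOPS delivered per million USD of simplified TCO.
        return self.pflops / (self.tco_usd() / 1e6)


if __name__ == "__main__":
    # Placeholder numbers, not vendor specifications.
    rack = RackProfile("example-rack", pflops=100.0, power_kw=100.0,
                       capex_usd=1_000_000.0, usd_per_kwh=0.10, years=1.0)
    print(f"{rack.name}: {rack.perf_per_watt():.2f} TFLOPS/W, "
          f"{rack.perf_per_tco():.1f} PFLOPS per $1M TCO")
```

Comparing two racks on these figures rather than on per-chip throughput is the shift the paragraph above describes: a rack built from slower chips can still come out ahead if it draws less power or costs less to own.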

Related article: How AI Inference Is Creating New Memory Demand (TrendForce, Jun 15)

Against this backdrop, the competitive landscape for ASIC design and service providers is also changing. This article examines how NVIDIA and major CSPs are positioning themselves at the AI rack/POD scale, and how ASIC designers, including Broadcom, Marvell, MediaTek, Global Unichip (GUC), and Alchip, are leveraging interconnect technologies such as CPO (co-packaged optics) to expand their reach from chip scale into the rack/POD-scale arena.

Related report: AI Interconnect Outlook: NVIDIA Leads the Transition to CPO and Silicon Photonics Architectures

NVIDIA Expands Beyond the Chip to Every Layer of AI Infrastructure

First, a quick definition: an AI POD (Point of Delivery) is a standalone AI computing unit composed of compute nodes, networking, storage, and software. Scale-up interconnect (vertical scaling within a single system) is the technology that enables multiple chips, and even cross-rack configurations, to operate as a single unified system with ultra-low latency, forming the foundation for Rack/POD-scale integration.

As the primary unit of AI computing continues to expand, NVIDIA unveiled at its March 2026 GTC not only seven new Rubin-series chips but also five MGX-series racks that integrate these chips at rack scale. These racks can in turn be interconnected via switches to form a large-scale Vera Rubin POD.

To further strengthen its AI data center hardware portfolio, NVIDIA has in recent years secured key talent and technology licenses from memory-interconnect switch IC vendor Enfabrica and AI inference LPU vendor Groq, while also investing in quantum computing and optical-communication component suppliers, all in pursuit of pushing POD-scale performance to the extreme.

Seven chips of NVIDIA's Rubin series announced at GTC 2026: Rubin GPU, Vera CPU, ConnectX-9 SuperNIC, BlueField-4 DPU, NVLink-6 Switch, Spectrum-X CPO, and Groq 3 LPU. Source: NVIDIA

Five MGX rack-scale systems in NVIDIA's Vera Rubin platform: NVL72, Groq 3 LPX, Vera CPU, BlueField-4 STX Storage, and Spectrum-6 SPX. Source: NVIDIA

Related report: NVIDIA FY1Q27 AI Server Outlook: GB/VR Rack Leads

Google, AWS and Microsoft Build Proprietary AI Rack/POD around Efficiency

To improve AI computing performance, CSPs are not only adopting large on-chip SRAM in their inference chips, but also designing proprietary AI racks and PODs. Google has been the most aggressive, adopting its own scale-up technology ICI (Inter-Chip Interconnect) as early as 2017, introducing OCS (Optical Circuit Switch) in 2021, and planning to deploy its in-house Axion CPU and Boardfly network architecture in the TPU v8 series in 2026.

AWS plans to deploy its proprietary NeuronLink and Graviton CPUs in the 2026 Trainium3 UltraServers, while Microsoft plans to deploy an Ethernet scale-up network and Cobalt 200 CPUs in its Maia 200 POD. Meta takes a different approach entirely, adopting the open OCP (Open Compute Project) standard to build a standardized data center hardware ecosystem aimed at reducing overall costs.

TrendForce's performance comparison of AI chips from NVIDIA, Google, AWS, Meta, and Microsoft, covering compute power, HBM capacity, bandwidth, TDP, and efficiency.

Rack-level comparison of AI systems from NVIDIA, Google, AWS, Meta, and Microsoft, showing chip counts per rack, compiled by TrendForce.

Related report: 2026 AI Server Outlook: CSP Rack Power Scales Up

ASICs: Broadcom and Marvell Lead; MediaTek, GUC and Alchip Bet on CPO

The trend of CSPs independently designing AI racks/PODs has pushed ASIC design and service providers to expand beyond chip-scale offerings into rack/POD-scale solutions. For ASIC designers, this creates an entirely new incremental market.

Broadcom and Marvell have already built out comprehensive product lines spanning switches and NICs, while also taking an early lead in CPO technology. MediaTek, GUC, and Alchip are initially focusing on establishing strong track records in AI chip development, with plans to leverage CPO integration platforms as their path into the rack/POD scale market.

As shown in TrendForce’s product line overview, CPO is the one technology track, aside from AI chips, in which NVIDIA and all five ASIC designers have committed to invest. This signals broad industry recognition that CPO has become one of the critical differentiators for next-generation AI rack/POD performance.

TrendForce's overview of AI data center product lines across NVIDIA and five ASIC designers, covering AI chips, switches, CPO, CPC, and AOC.

For an in-depth analysis of each vendor’s strategy and product roadmap, access our full report: ASIC Design and Service Competition Shifts from Chip Scale to Rack/POD Level.
