Google TPU (Trillium / Ironwood) Competitive Threat to NVIDIA

Google's TPU program represents the most mature and vertically integrated custom silicon threat to NVIDIA's data center GPU dominance. The 7th-generation TPU v7 (Ironwood), announced April 2025 with limited availability by late 2025, delivers 4,614 FP8 TFLOPS per chip -- slightly exceeding the NVIDIA B200's 4,500 TFLOPS -- with 192 GB of HBM3E and 7.4 TB/s of memory bandwidth. Ironwood's defining advantage is pod scale: 9,216 chips linked via Google's Inter-Chip Interconnect (ICI) form a single superpod delivering 42.5 ExaFLOPS, versus NVLink's 72-GPU domain ceiling at roughly 0.36 ExaFLOPS.
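The pod-scale figures can be sanity-checked from the per-chip numbers cited above. This is a back-of-envelope sketch using only the note's own figures; the NVL72 aggregate is computed from the rounded 4,500 TFLOPS B200 number, so it lands slightly below the ~0.36 EF cited (which likely reflects a different per-GPU rating):

```python
# Aggregate FP8 throughput from the per-chip figures cited above.
TFLOP = 1e12
EXAFLOP = 1e18

ironwood_chip_tflops = 4614   # FP8 TFLOPS per Ironwood (TPU v7) chip
ironwood_pod_chips = 9216     # chips per ICI superpod

b200_chip_tflops = 4500       # FP8 TFLOPS per B200 (rounded figure from this note)
nvlink_domain_gpus = 72       # GPUs per NVLink domain (GB200 NVL72)

ironwood_pod_ef = ironwood_pod_chips * ironwood_chip_tflops * TFLOP / EXAFLOP
nvl72_ef = nvlink_domain_gpus * b200_chip_tflops * TFLOP / EXAFLOP

print(f"Ironwood superpod: {ironwood_pod_ef:.1f} EF")  # ~42.5 EF, matching Google's claim
print(f"NVL72 domain:      {nvl72_ef:.2f} EF")
```

The ~130x gap in aggregate pod throughput is the core of Google's scale argument, even though per-chip throughput is roughly at parity.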

Key figures

- $10B (Anthropic announcement; Broadcom earnings): Anthropic signed a deal for up to 1 million Google TPUs with well over 1 GW of compute.
- $11.25B (Fubon Securities via GlobeNewsWire, Inve…): Google TPU shipments projected at 2.5 million units for full year 2025 (1.8M through …).

The Anthropic deal (Oct 2025) -- up to 1M TPUs, comprising $10B in Broadcom-manufactured Ironwood racks plus an $11B follow-on, with the remaining capacity rented via GCP for a total of ~$52B -- is the largest cloud compute deal to date. In January 2026, Google confirmed TPUs outshipped GPUs by volume for the first time. Google and Meta's TorchTPU collaboration (announced Dec 2025) directly targets CUDA switching costs by enabling native PyTorch execution on TPUs, though production readiness is 12-18 months away. Key limitations: Ironwood is cloud-only (it cannot be purchased), ICI bandwidth per chip (1.2 TB/s) trails NVLink (1.8 TB/s), there is no FP4 support (a Blackwell advantage for inference), and the software stack -- historically JAX-only with limited external tooling -- remains inferior to CUDA's two-decade ecosystem. For NVIDIA investors, the TPU threat is most acute in inference (where Google claims 4.7x price-performance vs the H100) and in capturing frontier-lab spend (Anthropic, potentially Meta). It is less threatening for training, where NVLink's low-latency interconnect and CUDA's flexibility remain advantages.
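A rough decomposition of the Anthropic deal figures follows from the numbers above. Two assumptions are mine, not from the source: that the ~$52B total includes the two hardware tranches, and that "well over 1 GW" is treated as a flat 1 GW lower bound:

```python
# Back-of-envelope on the Anthropic deal figures cited above.
# Assumption: the ~$52B total includes the $10B and $11B hardware tranches.
total_deal_usd = 52e9
hardware_usd = 10e9 + 11e9
gcp_rental_usd = total_deal_usd - hardware_usd
print(f"Implied GCP rental: ${gcp_rental_usd / 1e9:.0f}B")  # ~$31B

# Assumption: "well over 1 GW" treated as a 1 GW lower bound across 1M chips.
chips = 1_000_000
power_w = 1e9
watts_per_chip = power_w / chips
print(f"Implied power budget: {watts_per_chip:.0f} W/chip")  # >=1 kW incl. facility overhead
```

The ~1 kW-per-chip floor is consistent with liquid-cooled accelerator racks; the implied ~$31B rental component is what flows through GCP revenue rather than Broadcom hardware sales.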

Competitive pressure is real but bounded

Custom ASICs and AMD offer cheaper alternatives for specific workloads, but only a handful of companies can afford multi-billion-dollar chip programs. The competitive threat is structural but limited in scope.

The key question

What is the actual MLPerf benchmark performance of Ironwood vs Blackwell for common LLM inference workloads (Llama 3, Gemini, Claude)? Google has not submitted Ironwood MLPerf results, preventing apples-to-apples comparison.

Open questions

- When will TorchTPU reach production readiness, and will Google open-source it? The 12-18 month estimate (from Dec 2025) implies availability between late 2026 and mid-2027.
- What percentage of Anthropic's total compute runs on TPUs vs AWS Trainium vs NVIDIA GPUs? Anthropic maintains a multi-cloud strategy, but the TPU/Trainium split is unclear.
- Will Ironwood be competitive for large-scale training (not just inference)? Google claims training capability, but the per-chip ICI bandwidth disadvantage vs NVLink suggests training remains GPU-favorable.
- How does NVIDIA's Vera Rubin (2H 2026) change the competitive dynamic? If Vera Rubin delivers the claimed 10x inference cost reduction, it could neutralize the TPU's price-performance advantage.