Custom ASICs represent the single largest structural threat to NVIDIA's GPU dominance, particularly in inference. The five largest buyers of AI compute (Google, Amazon, Microsoft, Meta, OpenAI) are all building custom silicon. ASIC server shipments are expected to triple by 2027, growing 44.6% in 2026 versus 16.1% for GPU servers.
Key inflection points: Anthropic closed the largest TPU deal in Google's history (hundreds of thousands of Trillium TPUs scaling toward 1M by 2027, worth tens of billions of dollars); Midjourney cut inference costs 65% by migrating from NVIDIA A100/H100 to TPU v6e; Broadcom has 5+ hyperscaler XPU customers, a $60-90B AI revenue SAM by FY2027, and a $73B order backlog; and OpenAI and Broadcom are co-developing 10GW of custom 'Titan' accelerators. Meta is pursuing a multipronged strategy -- NVIDIA GPUs plus AMD MI450 (a 6GW deal) plus Google TPUs plus its own MTIA chips -- backed by $115-135B of AI capex in 2026. However, NVIDIA's total addressable market is expanding faster than its share is declining: absolute revenue continues growing even as percentage share erodes from the 87% peak (2024) toward 70-75% by 2026-2028. Custom silicon is also hard: Intel's Gaudi failed, Microsoft's Maia slipped 6+ months, and only ~5-10 companies can justify the investment. NVIDIA's strategic responses (NVLink Fusion, the Groq inference licensing deal, CUDA Tile IR) show active moat defense rather than passive decline.
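To put the gigawatt-denominated deals in perspective, here is a minimal sketch converting announced capacity into rough accelerator counts. The ~1.4 kW all-in power per deployed accelerator (device power plus cooling and networking overhead) is an illustrative assumption, not a disclosed figure:

```python
# Rough scale check: converting announced gigawatt commitments into
# accelerator counts. The 1.4 kW-per-accelerator all-in figure is an
# assumption for illustration only, not a disclosed number.
deals_gw = {
    "OpenAI/Broadcom 'Titan'": 10.0,  # announced 10 GW program
    "Meta + AMD MI450": 6.0,          # announced 6 GW deal
}
ASSUMED_KW_PER_ACCELERATOR = 1.4

for name, gw in deals_gw.items():
    kw = gw * 1_000_000  # GW -> kW
    chips = kw / ASSUMED_KW_PER_ACCELERATOR
    print(f"{name}: {gw:.0f} GW is roughly {chips / 1e6:.1f}M accelerators")
```

Under that assumption, 10GW implies on the order of 7M deployed accelerators, which is why gigawatt deals, not unit prices, are the headline metric for these programs.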
Competitive pressure is real but bounded
Custom ASICs and AMD offer cheaper alternatives for specific workloads, but only a handful of companies can afford multi-billion-dollar chip programs. The competitive threat is structural but limited in scope.
What is the actual cost-per-FLOP comparison between Vera Rubin NVL72 and Google TPU v7 (Ironwood) for inference workloads? This determines whether NVIDIA can close the ASIC cost gap.
Google's TPU v7 (Ironwood) represents the most credible single-vendor ASIC threat to NVIDIA's GPU dominance, particularly for inference. With 4,614 FP8 TFLOPS per chip, 192 GB HBM3e at 7.37 TB/s bandwidth, and pods scaling to 9,216 chips (42.5 exaFLOPS), Ironwood nearly matches NVIDIA B200's per-chip compute (4,500 FP8 TFLOPS) while offering significantly better cost-performance for large-scale inference workloads. Anthropic's decision to expand TPU usage to up to 1 million chips (tens of billions of dollars, 1+ GW capacity in 2026) for training AND serving next-generation Claude models is the strongest validation that frontier AI models do NOT require NVIDIA GPUs.
Through its TorchTPU initiative (12-18 months from production readiness), which aims to eliminate PyTorch-to-TPU switching friction, and its multibillion-dollar TPU rental deal with Meta, Google is systematically attacking both the hardware cost gap and the CUDA software moat simultaneously. However, NVIDIA retains critical advantages: single-chip compute density leadership (B300 at 14,000 FP4 TFLOPS), ecosystem flexibility for research and experimentation, multi-vendor availability, and a roughly one-year time-to-market lead per generation. TPUs remain Google Cloud-exclusive, limiting adoption by enterprises that want on-premises or multi-cloud deployments. The bear case for NVIDIA is that inference demand will outgrow training demand by ~118x by 2026, and TPUs already deliver 4x better price-performance for the inference workloads where 60-70% of future AI compute dollars will flow.
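Vera Rubin NVL72 per-chip pricing is not yet public, so the cost-per-FLOP question above cannot be answered definitively; the sketch below shows the shape of the calculation using the per-chip FP8 figures cited in this section, with placeholder hourly prices and a placeholder utilization factor (both assumptions, not sourced numbers):

```python
# Shape of the cost-per-FLOP comparison. Peak FP8 TFLOPS come from the
# spec figures cited in this section; the $/chip-hour prices and the
# utilization factor are placeholder assumptions -- real comparisons
# hinge on negotiated pricing and achieved (not peak) throughput.
chips = {
    #                  peak FP8 TFLOPS, assumed $/chip-hour (hypothetical)
    "TPU v7 (Ironwood)": (4614, 3.20),
    "NVIDIA B200":       (4500, 5.50),
}
UTILIZATION = 0.40  # assumed fraction of peak FLOPS achieved in serving

for name, (tflops, usd_per_hour) in chips.items():
    effective_pflop_hours = tflops * UTILIZATION / 1000  # per chip-hour
    print(f"{name}: ${usd_per_hour / effective_pflop_hours:.2f} "
          f"per effective PFLOP-hour")
```

With near-identical peak FLOPS, the comparison collapses almost entirely to price and achieved utilization, which is why reported TPU cost advantages (Midjourney's 65% savings, the 4x price-performance claim) are about economics and scheduling, not raw silicon.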
Broadcom is the dominant enabler of custom AI silicon, holding ~60% of the AI server compute ASIC design market through 2027 (Counterpoint Research). It now serves 6 hyperscaler XPU customers -- Google (TPU), Meta (MTIA), ByteDance, OpenAI (Titan), Anthropic (via Google TPU), and one undisclosed -- with each of the three original customers planning 1-million-XPU clusters by 2027. AI semiconductor revenue has followed a steep trajectory: from $12.2B in FY2024 to ~$20B in FY2025 (+63%) to a projected ~$40-50B in FY2026 (~100-150% YoY).
The $73B AI order backlog (deliverable over 18 months) provides strong near-term revenue visibility, and CEO Hock Tan has stated a 'clear line of sight' to $100B+ in annual AI chip revenue by FY2027. For NVIDIA, Broadcom's XPU business is the primary channel through which custom silicon threatens GPU market share: every dollar hyperscalers spend on Broadcom-designed ASICs is a dollar not spent on NVIDIA GPUs. However, Broadcom's networking portfolio (Tomahawk switches, Jericho routers, optical components) also grows alongside GPU deployments, creating a partially hedged relationship.
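A quick consistency check on these figures: annualizing the backlog over its stated delivery window lands inside the FY2026 projection, and the FY2024-to-FY2025 step reproduces the cited growth rate within rounding:

```python
# Consistency check on the Broadcom figures cited above.
backlog_b = 73.0      # $B AI order backlog
window_months = 18    # stated delivery window
run_rate = backlog_b * 12 / window_months
print(f"Backlog run-rate: ${run_rate:.1f}B/yr")  # ~$48.7B/yr, inside ~$40-50B

fy2024, fy2025 = 12.2, 20.0  # AI semiconductor revenue, $B
growth = (fy2025 / fy2024 - 1) * 100
print(f"FY2025 YoY growth: {growth:.0f}%")  # ~64%, matching the cited +63%
```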
Every major hyperscaler is now building custom AI accelerators: Google (TPU, the most mature program with 1.6M+ Trillium chips in 2026), Amazon (Trainium, 500K+ Trn2 chips for Anthropic's Project Rainier, now a multi-billion-dollar business), Microsoft (Maia 200, 3nm inference chip with 10+ PFLOPS FP4 serving GPT-5.2 and Copilot), Meta (MTIA 300-500, four generations in 24 months on RISC-V, hundreds of thousands deployed for inference), and OpenAI (Titan ASIC with Broadcom, targeting 90% inference cost reduction, deploying H2 2026). Combined custom silicon represents ~10-15% of the AI accelerator market by revenue in 2026, projected to reach 15-25% by 2030. However, all hyperscalers are SIMULTANEOUSLY increasing NVIDIA GPU orders -- the trajectory is additive, not substitutive.
NVIDIA's training share exceeds 90%, while its inference share is 60-75% and declining. Google runs nearly all internal AI compute on TPUs; AWS runs the majority of Bedrock inference on Trainium; Meta and Microsoft are inference-first with custom silicon while relying on NVIDIA for training. The key dynamic: NVIDIA's absolute revenue continues growing ($150B+ data center in 2026E) even as its percentage share declines from the 87% peak to ~75%, because the total addressable market is expanding from $115B (2024) to $200B+ (2026E).
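The share-versus-TAM arithmetic checks out with the figures cited here, as the short sketch below shows:

```python
# Share-versus-TAM arithmetic from the paragraph above: a shrinking
# percentage of a faster-growing market still yields rising absolute
# revenue. All figures are the ones cited in this section.
scenarios = [
    ("2024",  115, 0.87),  # TAM $B, peak share
    ("2026E", 200, 0.75),  # TAM $B, projected share
]
for year, tam_b, share in scenarios:
    print(f"{year}: {share:.0%} of ${tam_b}B = ${tam_b * share:.0f}B")
# 2024:  87% of $115B = $100B
# 2026E: 75% of $200B = $150B  (consistent with the $150B+ 2026E figure)
```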