For three years, Nvidia has operated with near-monopoly comfort in the AI training and inference datacenter market. The Blackwell GB300 Ultra, which entered mass production in February 2026, continues to be the default choice for hyperscalers building out AI infrastructure. But AMD’s Instinct MI350 — now in volume shipment as of late April 2026 — is mounting the strongest challenge to that dominance the industry has yet seen.
The MI350: What’s Actually Different
The MI350 is built on TSMC’s 3nm N3P process node, the same foundry generation as Nvidia’s Blackwell, but with a die architecture AMD claims is optimized for inference-heavy workloads. AMD is reporting 1.5 petaFLOPS of FP8 inference throughput per card at a 750W TDP, compared to roughly 1.1 petaFLOPS for Nvidia’s H200 in the same power envelope. The MI350 also ships with 288GB of HBM3e memory, roughly double the H200’s 141GB, addressing the memory capacity bottleneck that has constrained large language model serving at scale.
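A back-of-envelope comparison of those headline figures, sketched in Python. The numbers are the vendor-reported specs quoted above, not independent measurements:

```python
# Back-of-envelope comparison of the vendor-reported specs quoted above.
# These are headline claims, not independent measurements.
mi350 = {"fp8_pflops": 1.5, "tdp_w": 750, "hbm_gb": 288}
h200 = {"fp8_pflops": 1.1, "tdp_w": 750, "hbm_gb": 141}

def pflops_per_kw(card: dict) -> float:
    """FP8 petaFLOPS delivered per kilowatt of board power."""
    return card["fp8_pflops"] / (card["tdp_w"] / 1000)

print(f"MI350: {pflops_per_kw(mi350):.2f} PFLOPS/kW, {mi350['hbm_gb']} GB HBM3e")
print(f"H200:  {pflops_per_kw(h200):.2f} PFLOPS/kW, {h200['hbm_gb']} GB HBM3e")
print(f"HBM capacity ratio: {mi350['hbm_gb'] / h200['hbm_gb']:.2f}x")
```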
In independent benchmark runs conducted by MLCommons and reported in late March 2026, an eight-MI350 node achieved 94% of a comparable eight-H100 system’s throughput on the MLPerf Inference v4.2 suite while consuming 18% less power. For operators paying $0.08–$0.12 per kWh in European colocations, those efficiency gains translate directly to margin.
“The energy cost per billion tokens served is the number datacenter operators care about now,” said Dr. Elena Marchetti, head of infrastructure at a major European cloud provider who requested anonymity. “MI350 changes that calculus in a way no AMD product has before.”
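To make that metric concrete, here is a minimal sketch of the cost-per-billion-tokens calculation. The node power and token throughput are illustrative assumptions; only the per-kWh price range comes from the figures above:

```python
def cost_per_billion_tokens(node_power_kw: float,
                            tokens_per_second: float,
                            usd_per_kwh: float) -> float:
    """Electricity cost in USD to serve one billion output tokens at steady state."""
    seconds = 1e9 / tokens_per_second        # time needed to emit 1B tokens
    kwh = node_power_kw * seconds / 3600.0   # energy drawn over that period
    return kwh * usd_per_kwh

# Illustrative assumptions: an 8-GPU inference node drawing ~7 kW at the wall
# and sustaining ~50,000 output tokens/s in aggregate. Only the per-kWh range
# below comes from the article.
for price in (0.08, 0.12):
    cost = cost_per_billion_tokens(node_power_kw=7.0,
                                   tokens_per_second=50_000,
                                   usd_per_kwh=price)
    print(f"${price:.2f}/kWh -> ${cost:.2f} per billion tokens")
```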
Hyperscaler Traction — and the ROCm Problem
AMD has confirmed purchase orders from three of the top five US hyperscalers, though the company declined to name specific customers at its April 22 investor briefing. Microsoft Azure and Google Cloud have publicly listed MI350-based compute instances in their upcoming availability previews. Meta, which has historically favored Nvidia hardware for its Llama training runs, has not yet disclosed MI350 commitments.
The persistent challenge AMD faces remains software. Nvidia’s CUDA ecosystem has two decades of optimization, tooling, and developer inertia behind it. ROCm, AMD’s open-source compute platform, has improved dramatically under CEO Lisa Su’s renewed software investment — ROCm 7.0, released in January 2026, closed roughly 70% of the API compatibility gap with CUDA 12.x according to AMD’s own developer documentation. But porting production inference stacks built on CUDA is still non-trivial, and startups without legacy code are the fastest adopters.
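In practice, much of the porting story runs through PyTorch, whose ROCm builds expose the familiar torch.cuda API backed by HIP. A minimal sketch of what "runs unchanged" means at that level; the tensor sizes and dtypes are arbitrary:

```python
import torch

# On a ROCm build of PyTorch, the torch.cuda namespace is backed by HIP, so
# device-agnostic code like this runs unchanged on Nvidia or AMD accelerators.
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

x = torch.randn(4096, 4096, device=device, dtype=dtype)
y = x @ x.T  # a matmul dispatched to the GPU BLAS library of whichever build is installed

backend = torch.cuda.get_device_name(0) if device == "cuda" else "CPU fallback"
print(f"{tuple(y.shape)} matmul ran on: {backend}")
```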
“For greenfield inference deployments using vLLM or TensorRT-LLM alternatives, the MI350 is genuinely competitive today,” said Arjun Desai, CTO of inference optimization firm QuantumServe. “For teams with millions of lines of CUDA kernels, it’s still a two-to-three quarter migration project.”
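For the greenfield case Desai describes, a serving framework such as vLLM abstracts the backend away entirely: the same application code works whether the installed package was built against CUDA or ROCm. A minimal sketch, with the model name as an example placeholder:

```python
from vllm import LLM, SamplingParams

# vLLM dispatches to whichever backend the installed package was built for
# (CUDA or ROCm); the application code does not mention either one.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # example model, not prescriptive
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Summarize HBM3e in one sentence."], params)
print(outputs[0].outputs[0].text)
```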
Pricing and Market Positioning
AMD has not published official MI350 list pricing, but channel sources report enterprise contract pricing in the range of $28,000–$32,000 per card, compared to $35,000–$40,000 for the Nvidia H200 at similar volumes. The GB300 Ultra commands a premium above that. If AMD can close the software gap to within acceptable limits for common workloads, the 15–20% price differential alone may be enough to capture meaningful share of net-new AI infrastructure spending, which analysts at IDC project will reach $185 billion globally in calendar year 2026.
Nvidia has not stood still. The company’s NVLink Switch 5 fabric and the GB300 NVL72 rack-scale system create a switching cost that individual card comparisons don’t fully capture. At cluster scale — 512 GPUs and above — Nvidia’s interconnect advantage remains substantial.
The AI infrastructure market is large enough for AMD to claim a significant position without displacing Nvidia, and that appears to be the practical ambition. A 15–20% revenue share in AI accelerators would represent roughly $27–37 billion in annual revenue for AMD by 2027 — a meaningful reshaping of the company’s business mix and a genuine check on Nvidia’s pricing power.
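The arithmetic behind that range is easy to check against the IDC figure cited above; a quick sketch, where the share percentages are the scenario described in this article rather than a forecast:

```python
# Quick check of the revenue-share scenario against the IDC figure cited above.
market_usd_b = 185                # IDC projection for 2026 AI infrastructure spend
for share in (0.15, 0.20):        # the 15-20% share scenario discussed in the article
    print(f"{share:.0%} of ${market_usd_b}B is roughly ${market_usd_b * share:.1f}B per year")
```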