DeepSeek today released R2, the long-anticipated successor to the R1 reasoning model that upended global AI pricing in early 2025. The Hangzhou-based lab is positioning R2 as a direct competitor to OpenAI’s GPT-5.5 and Anthropic’s Claude Opus 4.7 on math, code, and agentic benchmarks — but at a list price of $0.27 per million input tokens and $1.10 per million output tokens. That is roughly one-twentieth of what OpenAI charges for GPT-5.5 on premium reasoning traffic, and it lands squarely on top of last week’s Mistral Large 3 launch at $3 per million tokens.
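To make the gap concrete, here is a quick cost sketch using the list prices above. The GPT-5.5 figure is not a published price; it is simply 20× R2, back-solved from the article’s "one-twentieth" claim, and the 30k-in/8k-out request size is an arbitrary illustrative workload.

```python
# R2 list prices from the article (USD per million tokens).
R2_IN, R2_OUT = 0.27, 1.10
# Hypothetical GPT-5.5 prices, implied only by the ~20x ratio claimed above.
GPT_IN, GPT_OUT = 20 * R2_IN, 20 * R2_OUT

def request_cost(in_tokens, out_tokens, price_in, price_out):
    """Cost in USD of one request at per-million-token prices."""
    return (in_tokens * price_in + out_tokens * price_out) / 1_000_000

# An illustrative long-context request: 30k input tokens, 8k output tokens.
r2 = request_cost(30_000, 8_000, R2_IN, R2_OUT)
gpt = request_cost(30_000, 8_000, GPT_IN, GPT_OUT)
print(f"R2: ${r2:.4f}  GPT-5.5 (implied): ${gpt:.4f}  ratio: {gpt / r2:.0f}x")
# → R2: $0.0169  GPT-5.5 (implied): $0.3380  ratio: 20x
```

At these prices a long agentic session that loops through dozens of such requests stays under a dollar on R2, which is the arithmetic driving the enterprise reaction described below.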
According to the model card published on DeepSeek’s Hugging Face page, R2 scores 92.1% on AIME 2025, 87.4% on SWE-bench Verified, and 74% on GPQA Diamond. Those numbers put it within a few points of Claude Opus 4.7 on coding and within striking distance of GPT-5.5 on competition math, while remaining open weights under a permissive license.
A deliberate pricing shock
DeepSeek’s pricing is not a mistake. The lab has repeatedly argued that its Mixture-of-Experts architecture, in which R2 activates roughly 37 billion of its 671 billion total parameters per token, lets it serve inference at costs Western labs cannot currently match without absorbing heavy losses. According to filings reviewed by The Information, DeepSeek is running R2 inference on domestic clusters built around Huawei Ascend 910C accelerators, sidestepping US export controls that block access to Nvidia H200 and Blackwell parts.
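The sparsity claim reduces to simple arithmetic: only the routed experts run for each token. The parameter counts below come from the article; the ~2 FLOPs per active weight per token is a standard back-of-envelope estimate for decoder inference, not a DeepSeek-published figure.

```python
# Why sparse MoE inference is cheap: per-token compute scales with
# active parameters, not total parameters.
TOTAL_PARAMS = 671e9    # R2 total parameters (from the article)
ACTIVE_PARAMS = 37e9    # parameters active per token (from the article)

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
# Rule of thumb: ~2 FLOPs per active weight per generated token.
flops_per_token = 2 * ACTIVE_PARAMS

print(f"active fraction: {active_fraction:.1%}")            # → 5.5%
print(f"forward FLOPs/token: ~{flops_per_token / 1e9:.0f} GFLOPs")  # → ~74 GFLOPs
```

On this rough estimate, each generated token costs about as much compute as a dense ~37B model, despite the 671B total footprint.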
The company also published a technical report claiming a 43% improvement in training efficiency over R1 through a new reinforcement learning loop it calls GRPO-v2. If those figures are accurate, R2’s full training run cost under $12 million, an order of magnitude below the reported budgets for GPT-5.5 or Gemini 3 Ultra.
Western labs face a second round of price compression
The timing is uncomfortable for Silicon Valley. Anthropic’s $73 billion infrastructure deal with Google and Amazon, announced earlier this week, was premised on scaling compute-heavy frontier models. OpenAI has spent most of April defending its enterprise premium by pointing to GPT-5.5’s agentic capabilities, but R2’s closed-loop planner scores 68% on AgentBench Hard, less than five points behind GPT-5.5, while costing a fraction of the price per task.
Early reaction from enterprise buyers has been swift. “We are running R2 evals this weekend,” one Fortune 100 head of AI told Bloomberg on background. “If the benchmarks hold in our internal suite, we will rebalance our routing by Q2 end.” Cursor and Windsurf both confirmed on X that R2 is already live in their model selectors as of publication time.
Policy fallout to follow
The US Commerce Department has not commented on R2’s launch, but the model’s apparent dependence on Huawei silicon will intensify the debate over whether current export controls are working as intended. Meanwhile, European regulators now face a parallel question: the EU AI Act’s high-risk compliance deadline lands in August, and an open-weights model released by a Chinese lab under a permissive license complicates the enforcement picture considerably.
For now, the practical takeaway for builders is simple. The frontier is no longer a two-vendor race, and the floor on reasoning-model pricing just dropped again.