Google Resets the Enterprise AI Stack at Cloud Next 2026
San Francisco — Google used its annual Cloud Next conference this week to announce the most sweeping overhaul of its enterprise AI product line in years, unveiling the Gemini Enterprise Agent Platform alongside a new generation of custom silicon that the company says is 80% more cost-efficient than its predecessor.
CEO Sundar Pichai opened by noting that Gemini Enterprise recorded 40% quarter-over-quarter growth in paid monthly active users during Q1 2026. “Every employee can become a builder,” Pichai said, framing the shift as organizations moving from isolated AI experiments to managing fleets of hundreds or thousands of AI agents (Google Blog).
The Gemini Enterprise Agent Platform
The centerpiece announcement is the Gemini Enterprise Agent Platform, an expansion of the existing Vertex AI infrastructure that Google Cloud CEO Thomas Kurian described as “the connective tissue between your data, your people, and all of your apps and AI agents.”
The platform offers access to more than 200 AI models, including Gemini 3.1 Pro and third-party models — notably Anthropic’s Claude Opus 4.7. New capabilities include an Agent Designer, an Inbox for monitoring agent activity, support for long-running agents, and Skills-based composability. The platform also introduces tighter governance controls, a requirement as enterprises begin managing agents at scale.
Google Cloud reported that 75% of its customers are now using at least one of its AI products. Usage has reached striking scale: 330 customers each processed more than one trillion tokens over the past year, and 35 of them exceeded 10 trillion. Google's APIs now serve 16 billion tokens per minute, up from 10 billion the previous quarter (TechWire Asia).
Real-world deployments include NASA, which is using AI agents for flight-readiness operations on the Artemis II mission; Virgin Voyages, which has deployed more than 1,000 AI agents, one of which cut campaign production time by 40%; and Mars Incorporated, which has automated marketing and enterprise-search workflows.
TPU 8: Training and Inference Get Dedicated Silicon
Google Cloud announced the eighth generation of its Tensor Processing Units, this time split into two purpose-built variants: TPU 8t for model training and TPU 8i for inference.
TPU 8i is the more architecturally novel of the two. It connects 1,152 TPUs in a single pod, carries 3× more on-chip SRAM than its predecessor, and is engineered specifically for the low-latency, high-throughput demands of running millions of AI agents concurrently. The full TPU 8 family delivers up to 80% better performance per dollar compared to the previous generation, according to Google Cloud.
TPU 8t, meanwhile, focuses on reducing development iteration time through higher compute throughput and memory bandwidth — critical as frontier model training runs continue to scale.
What It Means for the Enterprise Market
The announcements confirm a strategic thesis Google has been building toward: that the future of enterprise software is orchestrated agent systems, not point AI features. The Gemini Enterprise platform directly competes with Microsoft Copilot Studio, Salesforce Agentforce, and AWS Bedrock Agents.
The inclusion of Anthropic’s Claude in the model catalog is a notable signal. Rather than building a walled garden, Google Cloud is positioning itself as the infrastructure layer for multi-model agent deployments — betting that enterprises will pay for orchestration, governance, and infrastructure rather than for any single model’s output.
With TPU 8i designed specifically to run “millions of agents cost-effectively,” the infrastructure economics may prove to be Google’s strongest differentiator in a market where inference costs remain a primary constraint on agentic deployment at scale.