◼ NVIDIA GTC 2026

Jensen Huang Keynote

The Inference Inflection, Vera Rubin, and the Agent Revolution

March 17, 2026 · SAP Center, San Jose · ~2h keynote


Key Numbers

Metric	Value	Context
Demand Pipeline	$1T+	Through 2027
Token Throughput	350×	In 2 years (per GW)
Rev/GW vs Blackwell	5×	Vera Rubin upgrade
Groq Premium Tier	35×	Decode throughput

TL;DR: Jensen's biggest GTC yet. He unveiled the Vera Rubin platform (the successor to Blackwell: 7 chips, 5 rack-scale computers, 3.6 exaflops), announced Groq LPU integration for 35× token decode throughput, declared $1 trillion+ in visible demand through 2027, and essentially crowned OpenClaw the “Linux of agents”: every company needs an OpenClaw strategy. Oh, and a Disney Olaf robot walked on stage and roasted Jensen's height.

Vera Rubin Platform

Architecture

7 chips, 5 rack-scale computers: one unified AI supercomputer
NVLink 72 @ 260 TB/s all-to-all bandwidth
3.6 exaflops of compute
100% liquid cooled with 45°C hot water, which takes pressure off data center cooling
Install time: 2 days → 2 hours (structured cables, no spaghetti)

The Five Racks

NVLink 72 Rack: GPU compute with 6th-gen NVLink scale-up
Vera CPU Rack: LPDDR5, extreme single-thread perf for agentic tool use
STX Rack: BlueField-4 AI-native storage (KV cache, cuDF, cuVS)
Groq LPX Rack: 8 LP30 chips, massive SRAM for ultra-fast decode
Spectrum-X CPO: co-packaged optics scale-out (in production with TSMC)

Generational Performance Leap

Groq LPU Integration

How It Works

NVIDIA acquired the Groq team and licensed the technology. The key insight: disaggregated inference via Dynamo.

Vera Rubin handles prefill: massive math, context processing, KV cache (288 GB HBM)
Groq handles decode: deterministic dataflow, compiler-scheduled, massive SRAM (500 MB/chip) for ultra-low-latency token generation
Connected via Ethernet with a special low-latency mode (2× reduction)
Unified by Dynamo, the operating system for AI factories
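
To make the prefill/decode split concrete, here is a minimal Python sketch of the hand-off described above. It is a toy model, not the Dynamo API: the `Request` and `DisaggregatedRouter` names, their fields, and the logging are all hypothetical, and no real hardware or networking is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical request record; field names are illustrative only.
    request_id: str
    prompt_tokens: int
    max_new_tokens: int
    kv_cache_ready: bool = False
    generated: int = 0


@dataclass
class DisaggregatedRouter:
    """Toy router: a prefill pool builds the KV cache, a decode pool emits tokens."""
    log: list = field(default_factory=list)

    def prefill(self, req: Request) -> Request:
        # Compute-heavy phase: process the whole prompt once and build the KV cache.
        req.kv_cache_ready = True
        self.log.append(("prefill", req.request_id, req.prompt_tokens))
        return req

    def decode(self, req: Request) -> Request:
        # Latency-sensitive phase: generate tokens one at a time from the cache.
        if not req.kv_cache_ready:
            raise RuntimeError("decode requires a KV cache from the prefill pool")
        while req.generated < req.max_new_tokens:
            req.generated += 1
        self.log.append(("decode", req.request_id, req.generated))
        return req

    def serve(self, req: Request) -> Request:
        # Hand-off point: in a real system the KV cache would cross the network here.
        return self.decode(self.prefill(req))


router = DisaggregatedRouter()
done = router.serve(Request("r1", prompt_tokens=4096, max_new_tokens=128))
print(router.log)
```

The point of the split is that the two phases stress different resources, so each pool can be sized and scheduled on its own.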

Jensen's Recommendation

For most data centers:

75% Vera Rubin: handles the vast majority of workloads (high throughput)
25% Groq: for premium tier, coding, high-value token generation
If your workload is mostly batch/throughput → 100% Vera Rubin
Groq extends performance beyond NVLink 72's bandwidth limits for 1000+ tokens/sec
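
As a rough illustration of that rule of thumb, the split can be written as a small helper. The function name `split_racks`, its parameters, and the use of a single "latency sensitive" flag are assumptions made for this sketch; the keynote only gave the 75/25 and 100/0 guidance.

```python
def split_racks(total_racks: int, latency_sensitive: bool = True,
                groq_share: float = 0.25) -> dict:
    """Split a rack budget between Vera Rubin and Groq racks (illustrative only)."""
    if total_racks < 0:
        raise ValueError("total_racks must be non-negative")
    if not 0.0 <= groq_share <= 1.0:
        raise ValueError("groq_share must be between 0 and 1")
    # Mostly batch/throughput workloads get no Groq racks at all.
    groq = round(total_racks * groq_share) if latency_sensitive else 0
    return {"vera_rubin": total_racks - groq, "groq": groq}


print(split_racks(100))                          # mixed workload
print(split_racks(40, latency_sensitive=False))  # batch-only workload
```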

Token Economics

The Token Factory Pricing Spectrum

Jensen's key thesis: “Every CEO in the world will be studying their token factory throughput chart. This year's decisions show up precisely as next year's revenues.”

Tier	Price/M Tokens	Characteristics	Use Case
FREE	$0	High throughput, small models, high latency	Customer acquisition, basic queries
BASIC	$3	Medium models, reasonable speed	Consumer chatbots, content generation
PRO	$6 – $45	Larger models, higher speed, long context	Professional work, analysis, code assist
PREMIUM	$150	Frontier models, max intelligence, fast decode	Research, critical path decisions, deep reasoning
ULTRA	$150+	Ultra-fast tokens, Groq-accelerated decode	Real-time coding, long research runs
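
To see what these tiers mean for a bill, here is a small Python sketch that turns the table into a cost range for a given token count. The prices are taken from the table above; the `PRICE_PER_M_TOKENS` structure and the `cost_range` helper are hypothetical, and the open-ended ULTRA tier is modelled with no upper bound because the table only says "$150+".

```python
# (low, high) price in USD per million tokens; None means no stated upper bound.
PRICE_PER_M_TOKENS = {
    "FREE": (0.0, 0.0),
    "BASIC": (3.0, 3.0),
    "PRO": (6.0, 45.0),
    "PREMIUM": (150.0, 150.0),
    "ULTRA": (150.0, None),
}


def cost_range(tier: str, tokens: int) -> tuple:
    """Return the (low, high) cost in USD for `tokens` tokens at a given tier."""
    low, high = PRICE_PER_M_TOKENS[tier]
    millions = tokens / 1_000_000
    return (low * millions, None if high is None else high * millions)


print(cost_range("PRO", 2_000_000))
print(cost_range("ULTRA", 1_000_000))
```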

OpenClaw: The Linux of Agents

Why It Matters

Jensen devoted a significant portion of the keynote to OpenClaw, calling it:

“As big as HTML” (started the internet)
“As big as Linux” (powered cloud computing)
“As big as Kubernetes” (enabled mobile cloud)
Most popular open-source project in history; exceeded Linux in weeks
“OpenClaw has open-sourced the operating system of agent computers”

Enterprise Stack (Nemo Claw)

Open Shell: security/privacy guardrails for corporate agents
Privacy Router: prevents sensitive data exfiltration
Policy Guard Rails: connects to existing SaaS policy engines
Every SaaS company → GaaS company (agentic-as-a-service)
Reference design downloadable and optimizable
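
The keynote did not detail how the Privacy Router works. As a purely illustrative sketch of the general idea (scrub sensitive values before a prompt leaves the company), something like the following could apply. The patterns, the destination names, and the `route_prompt` function are all assumptions for this example and are not part of any NVIDIA or OpenClaw API.

```python
import re

# Hypothetical patterns; a real deployment would use a proper policy engine.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def route_prompt(prompt: str, destination: str,
                 internal_destinations=frozenset({"on_prem"})):
    """Return (prompt_to_send, labels_redacted) for a given destination."""
    # Traffic that stays inside the company boundary is passed through untouched.
    if destination in internal_destinations:
        return prompt, []
    redacted, hits = prompt, []
    for label, pattern in SENSITIVE_PATTERNS.items():
        if pattern.search(redacted):
            hits.append(label)
            redacted = pattern.sub(f"[{label.upper()} REDACTED]", redacted)
    return redacted, hits


print(route_prompt("Email jane@example.com about 123-45-6789", "external_api"))
```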

Nemotron Coalition

Partnering to build Nemotron 4, NVIDIA's frontier open model:

Cursor

Mistral

Perplexity

LangChain

Black Forest Labs

Reflection

Sarv (India)

Mirror

Architecture Roadmap

Generation	Highlights	Status
Blackwell	NVLink 72 · FP4 · Dynamo	Shipping
Vera Rubin	3.6 EF · LP30 Groq · CPO	Sampling Now
Rubin Ultra	NVLink 144 · LP35 · NVFP4	Taping Out
Feynman	LP40 · Rosa CPU · BF5 · CX10	Next Gen

Brand new architecture every single year. Both copper and optical scale-up going forward.

Scale-Up: Copper vs Optical

Vera CPU Standalone

Physical AI & Robotics

Robo-Taxi Partnerships

New partners announced (18M cars/year combined):

BYD

Hyundai

Nissan

Joining existing partners: Mercedes, Toyota, GM. Plus Uber for multi-city deployment.

“The ChatGPT moment of self-driving cars has arrived.”

The Three Computers of Robotics

Training computer: Isaac Lab for RL policy training at scale
Simulation computer: Newton physics + Cosmos world models for synthetic data
Robot computer: Jetson, runs on the robot itself

110 robots on the GTC show floor. Every major robotics company is working with NVIDIA.

A walking, talking Olaf robot walked on stage, trained using the Newton physics simulator, Isaac Lab, and Omniverse, and running on Jetson. It had a full comedy exchange with Jensen (“I thought you'd be taller”). The future of Disneyland is AI-powered characters roaming the park.

Disney's Olaf Moment at GTC 2026

More Highlights

DLSS 5 / Neural Rendering

Fusion of 3D graphics + generative AI: controllable structured data meets probabilistic generation
“One is completely predictive, the other probabilistic yet highly realistic”
The pattern of structured data + generative AI will repeat in every industry
“Structured data is the foundation of trustworthy AI”
Computer graphics literally comes to life. Jensen showed a jaw-dropping demo

Notable Quotes

“If you have the wrong architecture, even if it's free, it's not cheap enough.”

— On why token cost per watt matters more than chip cost

“Dylan Patel accused me of sandbagging. He says it's actually 50×. And he's not wrong.”

— On Blackwell inference benchmarks (SemiAnalysis study)

“Every engineer's recruiting package will include a token budget. Tokens are the new compensation.”

— On the future of knowledge work

“Every single SaaS company will become a GaaS company, an agentic-as-a-service company.”

— On the enterprise IT transformation

“Computing demand has increased by 1 million times in the last two years.”

— On the inference inflection (10,000× per-task × 100× usage)

“We see through 2027 at least $1 trillion.”

— On the demand pipeline (up from $500B last year)
