NVIDIA unveils Vera Rubin platform at CES 2026 keynote, says next-gen chips in full production

On January 5, NVIDIA chief executive Jensen Huang opened CES 2026 with a keynote at the Fontainebleau Las Vegas, unveiling the Vera Rubin AI computing platform — a six-chip, extreme-codesigned architecture positioned as the successor to Blackwell. Reuters, The Verge, Yahoo Finance and the official NVIDIA CES blog reported the announcement. Huang said the new generation of chips is in full production, framing Rubin as the platform that will scale data-center capacity through the second half of 2026 and into 2027.

Rubin fits NVIDIA's annual release cadence. Blackwell B200 and B300 ramped through 2024 and 2025; Rubin had been on the public roadmap since GTC 2024 but was first commercially detailed at CES 2026. The platform combines the new Vera CPU (an 88-core Arm design replacing Grace), the Rubin GPU, NVLink 6, and HBM4 memory. Reporting by The Verge, Tom's Hardware and SiliconAngle highlights agentic AI as the explicit thesis: Huang positioned Rubin as built around long-context, multi-step reasoning workloads rather than the chat-style throughput Blackwell was tuned for.

In an on-stage segment widely cited by trade press, Huang described NVIDIA's automotive partnership and its new reasoning model Alpamayo, used in the Mercedes-Benz CLA, as the "ChatGPT moment for physical AI." Reuters separately quoted him stating that Rubin is in "full production." NVIDIA used the same framing in its subsequent March 16 GTC release, where the company stated that "the agentic AI inflection point has arrived with Vera Rubin kicking off the greatest infrastructure buildout in history."

Architecturally, the Rubin GPU is reported to carry 336 billion transistors, up from roughly 208 billion on Blackwell, and 288 GB of HBM4 memory with peak per-GPU bandwidth above 20 TB per second; Blackwell used HBM3e. NVLink 6 doubles per-GPU interconnect bandwidth to 3.6 TB per second. The Vera Rubin NVL72 rack packages 72 Rubin GPUs and 36 Vera CPUs and is claimed to train large mixture-of-experts models with one-fourth the GPU count required on Blackwell, while delivering up to 10x higher inference throughput per watt. NVIDIA confirmed roughly 1 trillion US dollars in combined Blackwell and Rubin orders booked through 2027, with first volume shipments scheduled for the second half of 2026.
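To make the headline figures easier to compare, here is a minimal sketch that derives a few rack-level numbers from the per-GPU specifications reported above. The constant and function names are ours, the bandwidth figure is treated as a lower bound because the source only says "above 20 TB per second," and the 1,024-GPU job size in the usage note is purely hypothetical.

```python
# Figures as reported in the article; everything derived is simple arithmetic.
RUBIN_TRANSISTORS_B = 336       # billions, as reported
BLACKWELL_TRANSISTORS_B = 208   # billions, "roughly" per the article
HBM4_PER_GPU_GB = 288
HBM4_BW_FLOOR_TBPS = 20         # "above 20 TB/s", so treated as a lower bound
NVLINK6_TBPS = 3.6              # per GPU, described as a doubling
GPUS_PER_RACK = 72
CPUS_PER_RACK = 36


def rack_summary() -> dict:
    """Derive rack-level and ratio figures from the reported per-GPU specs."""
    return {
        "transistor_ratio": RUBIN_TRANSISTORS_B / BLACKWELL_TRANSISTORS_B,
        "rack_hbm4_gb": GPUS_PER_RACK * HBM4_PER_GPU_GB,
        "rack_hbm4_bw_floor_tbps": GPUS_PER_RACK * HBM4_BW_FLOOR_TBPS,
        "implied_prev_nvlink_tbps": NVLINK6_TBPS / 2,
        "gpus_per_cpu": GPUS_PER_RACK / CPUS_PER_RACK,
    }


def rubin_gpus_for(blackwell_gpus: int) -> int:
    """Apply the claimed one-fourth GPU count, rounding up to whole GPUs."""
    return -(-blackwell_gpus // 4)


if __name__ == "__main__":
    for name, value in rack_summary().items():
        print(f"{name}: {value:.2f}")
    print("Rubin GPUs for a 1,024-GPU Blackwell job:", rubin_gpus_for(1024))
```

On these inputs, a full NVL72 rack carries 20,736 GB of HBM4, and the transistor count is about 1.6 times Blackwell's. The one-fourth figure is NVIDIA's own claim for large mixture-of-experts training, so the last function restates that claim rather than verifying it.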

Jensen Huang, founder and CEO of NVIDIA, 2025. Source: White House photo / Wikimedia Commons (public domain).

Industry reaction split along familiar lines. Hyperscaler buyers — AWS, Google Cloud, Microsoft Azure, Oracle — and the NVIDIA Cloud Partner cohort confirmed that Rubin-based instances will be made available starting in the second half of 2026. AMD and Intel responses are expected at their own keynotes; AMD's MI400 family is the closest comparable competitor and remains roughly a generation behind. The principal supply-side concern, raised by Bloomberg and Reuters analysts, is HBM4: Samsung, SK Hynix and Micron are all racing to certify production, and HBM allocation is already a gating factor on AI capex. A second concern is whether the trillion-dollar order book reflects sustainable demand or pulls forward future capex.

For us at Enpo Sekai, NVIDIA's roadmap matters mostly as a downstream cost signal, not as a direct purchase decision. Our cost structure is dominated by inference for conversational characters, voice agents, and persona-driven gameplay, not by training. Rubin's 10x inference-throughput-per-watt headline matters only insofar as it filters into per-token API pricing at Anthropic, Google, and OpenAI, and into the mid-tier cloud GPU instances that we and other indie studios actually rent. We do not expect a step-down in API prices within 2026; the more likely path is gradual context-length expansion at constant or rising token cost.
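A rough sketch of why longer contexts at a constant per-token price raise our per-request cost, as described above. Every number here is a placeholder chosen for illustration, not a quoted price from any vendor, and `request_cost_usd` is a hypothetical helper rather than part of any API.

```python
def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
) -> float:
    """Cost of one API call, given per-million-token prices in US dollars."""
    total = input_tokens * input_price_per_mtok + output_tokens * output_price_per_mtok
    return total / 1_000_000


if __name__ == "__main__":
    # Placeholder prices (assumptions): 3 USD per million input tokens,
    # 15 USD per million output tokens.
    IN_PRICE, OUT_PRICE = 3.0, 15.0

    # Same 500-token reply, but the prompt context grows from 8k to 32k tokens.
    short_ctx = request_cost_usd(8_000, 500, IN_PRICE, OUT_PRICE)
    long_ctx = request_cost_usd(32_000, 500, IN_PRICE, OUT_PRICE)

    print(f"8k context:  {short_ctx:.4f} USD per request")
    print(f"32k context: {long_ctx:.4f} USD per request")
    print(f"cost multiple: {long_ctx / short_ctx:.2f}x")
```

Under these assumed prices, quadrupling the context roughly triples the cost of a request even though the per-token price never moves, which is the scenario we are budgeting for.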

We will be watching three things over the next twelve months. (1) Whether the second-half-2026 Rubin shipping cadence survives HBM4 supply constraints, or slips into 2027 in a way that re-tightens the AI compute market. (2) How much of Rubin's efficiency gain flows into per-token API price reductions that consumer studios actually see, versus being absorbed by frontier-lab capex on still-larger training runs. (3) Whether the 1 trillion US dollar order book materializes as actual purchase orders rather than letters of intent. Reuters and Bloomberg analysts have flagged this as a watch item, and any softening will move sentiment across the AI infrastructure stack.
