NVIDIA unveiled its groundbreaking Rubin platform at CES 2026, marking a major leap in AI infrastructure. The new system features six innovative chips designed to power massive AI factories with dramatically lower costs and higher efficiency.
Company founder and CEO Jensen Huang highlighted the platform during his keynote, calling it perfectly timed for surging AI demands in training and inference. Rubin promises up to 10 times lower cost per inference token and four times fewer GPUs needed to train mixture-of-experts models compared to the prior Blackwell setup.
Six Chips Form One Supercomputer
The Rubin platform stands out through extreme co-design across its components. It includes the NVIDIA Vera CPU with 88 custom Olympus cores for agentic reasoning and data orchestration, the Rubin GPU delivering 50 petaflops of NVFP4 compute for inference, and the NVLink 6 switch, which provides 3.6 TB/s of GPU-to-GPU bandwidth to each GPU.
Additional chips cover networking and infrastructure needs. The ConnectX-9 SuperNIC handles high-throughput scale-out connections, while the BlueField-4 DPU manages security, storage, and control tasks. Rounding it out, the Spectrum-6 Ethernet switch supports efficient, reliable scale-out fabrics with co-packaged optics.
This integrated approach treats the entire rack as a unified accelerator. The flagship Vera Rubin NVL72 system combines 72 Rubin GPUs and 36 Vera CPUs in a cable-free, modular design for easier assembly and maintenance.
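Treating the rack as one accelerator makes the headline math easy to check. A back-of-envelope calculation from the figures above (50 petaflops of NVFP4 per Rubin GPU, 72 GPUs per NVL72 rack), not an official spec sheet:

```python
# Per-rack NVFP4 compute implied by the article's figures.
gpus_per_rack = 72                 # Vera Rubin NVL72
nvfp4_pflops_per_gpu = 50          # Rubin GPU inference compute (stated above)

rack_pflops = gpus_per_rack * nvfp4_pflops_per_gpu
rack_exaflops = rack_pflops / 1000

print(f"{rack_pflops} PFLOPS = {rack_exaflops} EFLOPS NVFP4 per rack")
# → 3600 PFLOPS = 3.6 EFLOPS NVFP4 per rack
```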
Performance Boosts for AI Factories
Rubin targets always-on AI production environments that process vast token volumes for reasoning and multimodal tasks. It sustains high utilization across compute, memory, and communication phases, avoiding bottlenecks that plague traditional setups.
Key advances include a third-generation Transformer Engine with adaptive compression, doubled NVLink bandwidth, and HBM4 memory that nearly triples bandwidth over its predecessor. Rubin also introduces Inference Context Memory Storage, a flash-based tier for key-value caches that accelerates agentic AI with up to five times higher tokens per second and better power efficiency.
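The flash-based KV-cache tier can be pictured as a two-level cache: hot context stays in GPU memory, cold context is evicted to cheaper storage and promoted back on reuse. The toy sketch below illustrates that idea only; `TieredKVCache` and its tier names are hypothetical, not NVIDIA's actual API.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier key-value cache: a small 'HBM' hot tier backed by a
    larger 'flash' cold tier. Illustrative sketch, not NVIDIA's design."""

    def __init__(self, hbm_capacity):
        self.hbm_capacity = hbm_capacity
        self.hbm = OrderedDict()   # hot tier, kept in LRU order
        self.flash = {}            # cold tier (stand-in for flash storage)

    def put(self, key, value):
        self.hbm[key] = value
        self.hbm.move_to_end(key)
        # Evict least-recently-used entries to the flash tier.
        while len(self.hbm) > self.hbm_capacity:
            old_key, old_val = self.hbm.popitem(last=False)
            self.flash[old_key] = old_val

    def get(self, key):
        if key in self.hbm:
            self.hbm.move_to_end(key)
            return self.hbm[key]
        if key in self.flash:
            # Promote cold context back to the hot tier on reuse.
            return_value = self.flash.pop(key)
            self.put(key, return_value)
            return return_value
        return None

cache = TieredKVCache(hbm_capacity=2)
for i in range(4):
    cache.put(f"ctx{i}", f"kv{i}")
# ctx0 and ctx1 were evicted to flash; get() promotes ctx0 back to HBM.
assert "ctx0" in cache.flash and cache.get("ctx0") == "kv0"
```

The payoff of such tiering is that long agentic contexts need not be recomputed from scratch when they no longer fit in HBM, which is how a storage tier can raise tokens per second.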
Independent tests show Rubin training three and a half times faster and running inference five times faster than Blackwell. It also delivers eight times more inference compute per watt, helping operators scale intelligence production reliably.
Ecosystem Momentum Builds
Leading AI players quickly endorsed Rubin. OpenAI CEO Sam Altman noted it enables scaling compute for smarter models benefiting everyone. Anthropic’s Dario Amodei praised efficiency gains for longer memory and reliable outputs.
Meta’s Mark Zuckerberg expects it to deploy advanced models to billions. xAI’s Elon Musk called it a rocket engine for frontier models. Microsoft plans Vera Rubin NVL72 in Fairwater AI superfactories scaling to hundreds of thousands of superchips.
CoreWeave will offer Rubin instances via Mission Control for flexible workloads. Dell integrates it into AI Factory solutions. Partners like AWS, Google Cloud, Oracle, HPE, Lenovo, and Supermicro prepare Rubin-based servers for late 2026.
Rack-Scale to SuperPOD Scale
The Vera Rubin NVL72 redefines rack-scale computing as a single massive accelerator. NVIDIA also offers HGX Rubin NVL8 for eight-GPU servers suited to x86 generative AI and high-performance computing.
These feed into DGX SuperPOD references for cluster-scale deployments. Full production is underway, with cloud providers like AWS, Google, Microsoft, and Oracle rolling out instances in the second half of 2026.
Rubin builds on Blackwell’s rack innovations, pushing toward predictable performance in power-constrained data centers. Features like confidential computing across CPU, GPU, and NVLink protect proprietary models, while enhanced RAS engines ensure uptime.
Huang emphasized Rubin's role in modernizing trillions of dollars in computing infrastructure. As AI shifts to industrial-scale factories, the platform equips builders to convert power and data into intelligence efficiently.
Red Hat expanded its collaboration with NVIDIA to deliver an optimized AI stack spanning Enterprise Linux, OpenShift, and Red Hat AI on Rubin, while storage firms such as DDN, NetApp, and WEKA are aligning their platforms with next-generation AI data needs.
Pushing AI Frontiers
Beyond inference, Rubin supports the convergence of AI and scientific workloads with strong FP32 and FP64 performance. The Vera CPU's 1.2 TB/s memory bandwidth and NVLink-C2C interconnect enable coherent execution across devices.
Spectrum-X Ethernet Photonics delivers five times better power efficiency and higher network uptime, while BlueField-4's ASTRA architecture secures multi-tenant operations with isolated control planes.
NVIDIA positions Rubin as the foundation for agentic AI, massive mixture-of-experts inference, and video generation. Huang's CES presentation also spotlighted open models and physical AI integrations, but Rubin anchors the hardware story.
The launch underscores how NVIDIA's annual release cadence is accelerating AI hardware evolution. With broad adoption ahead, Rubin sets the stage for gigascale factories that bring intelligence production into the mainstream.
