Category: Technology / Artificial Intelligence
Published: March 2026
At CES 2026, Nvidia introduced its next-generation AI supercomputing platform, the Vera Rubin NVL72, marking a major leap in data center performance and efficiency.
Announced by CEO Jensen Huang during the company’s keynote, the new system is designed to power the next wave of artificial intelligence applications, from advanced chatbots to robotics and autonomous systems.
A New AI Powerhouse
The Vera Rubin NVL72 is built as a rack-scale architecture, combining six specialized components: the Vera CPU, Rubin GPU, NVLink 6 interconnect, ConnectX-9 SuperNIC, BlueField-4 data processing unit, and Spectrum-6 Ethernet switch. Together, these technologies form a tightly integrated system optimized for large-scale AI workloads.
Each Rubin GPU delivers up to 50 petaflops of inference performance, up to five times that of Nvidia’s previous Blackwell architecture. Training performance also sees a significant boost, reaching 35 petaflops per GPU.
Built for the AI Boom
To support increasingly complex AI models, the system includes advanced HBM4 memory with massive bandwidth and capacity, enabling faster data processing and improved efficiency. Nvidia has also introduced NVLink 6, dramatically increasing communication speeds between GPUs, an essential feature for modern AI models that rely on distributed computing.
The NVL72 rack itself delivers an enormous 3.6 exaflops of inference performance, positioning it among the most powerful AI systems ever announced.
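The rack-level figure follows from the per-GPU number by simple multiplication. The sketch below is a rough sanity check, not an official calculation. It assumes that "NVL72" denotes 72 Rubin GPUs per rack, which the article does not state explicitly. The variable and function names are illustrative, and the rack-level training figure is a derived estimate rather than a number quoted by Nvidia here.

```python
# Back-of-the-envelope check of the rack-level performance figure.
# Assumption (not stated in the article): "NVL72" means 72 Rubin GPUs per rack.
GPUS_PER_RACK = 72

# Per-GPU figures quoted in the article, in petaflops.
INFERENCE_PFLOPS_PER_GPU = 50
TRAINING_PFLOPS_PER_GPU = 35

PFLOPS_PER_EXAFLOP = 1000  # 1 exaflop = 1,000 petaflops


def rack_exaflops(pflops_per_gpu, gpus=GPUS_PER_RACK):
    """Scale a per-GPU petaflops figure to a whole rack, in exaflops."""
    return pflops_per_gpu * gpus / PFLOPS_PER_EXAFLOP


rack_inference_eflops = rack_exaflops(INFERENCE_PFLOPS_PER_GPU)
rack_training_eflops = rack_exaflops(TRAINING_PFLOPS_PER_GPU)  # derived estimate

print(f"Inference: {rack_inference_eflops:.2f} exaflops per rack")
print(f"Training (derived): {rack_training_eflops:.2f} exaflops per rack")
```

Under that assumption, 72 GPUs at 50 petaflops each come to 3,600 petaflops, which matches the 3.6 exaflops quoted for the rack.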
Lower Costs, Higher Efficiency
One of the standout claims is cost efficiency. Nvidia says the Vera Rubin platform can cut the cost per token for AI inference to as little as one-tenth that of previous systems. It also requires fewer GPUs to train large models, potentially lowering infrastructure costs for companies building AI at scale.
Designed for the Future of AI
The system introduces a new approach to handling AI memory through what Nvidia calls an “Inference Context Memory” layer. This helps manage the growing demand for storing and reusing data in large AI models, improving speed and responsiveness.
Security has also been enhanced, with protections extended across the entire system, from chips to networking, so that sensitive AI models remain protected.
Launch Timeline
Nvidia confirmed that all core components of the Vera Rubin platform have already been developed and tested. The company expects full-scale production of the NVL72 systems to begin in the second half of 2026.
Author: Adigun Adedoye
