Shahin Farshchi, Lux Capital
In this interview from the NVIDIA GTC AI Conference and Expo in San Jose, Charlie Boyle, vice president of DGX at NVIDIA, joins theCUBE's Dave Vellante to discuss how the Vera Rubin platform is redefining AI factory economics through extreme co-design and a one-year silicon cadence. Boyle explains how two decades of CUDA software compatibility enable NVIDIA to deliver 35x generational performance leaps, a target that independent testing on the prior generation exceeded at 50x. He traces the critical role of fabric from the Mellanox acquisition through InfiniBand and Spectrum-X Ethernet, and marks the 10th anniversary of DGX by recalling that, in its early days, the most common customer question was what anyone could possibly need eight GPUs for.

The conversation also explores NVIDIA's expansion into agentic infrastructure, including a new STX storage reference architecture that places a Vera processor and BlueField-4 DPUs directly alongside the drives, moving data processing closer to where the data physically resides. Boyle unpacks the DSX data center design and its Max-Q dynamic power controls, which allow operators to reclaim the roughly 40% of provisioned power that today's data centers typically waste. Within the same power envelope, that reclaimed power translates into more GPUs and dramatically lower token costs.

He also details the emergence of a new business model mapped to a Pareto curve of throughput versus responsiveness, where a single AI factory can serve everything from free-tier consumers to premium low-latency coding workloads. From OpenClaw democratizing agent creation through natural-language prompts to a fundamental shift in how CEOs view CapEx, as a revenue accelerator rather than a cost center, Boyle outlines why every enterprise leader needs to understand where they sit on that curve.