In this segment from theCUBE + NYSE Wired’s “AI Factories – Data Centers of the Future” series, Chris Stephens, field CTO at Groq, joins theCUBE’s John Furrier at the NYSE to unpack how AI factories are reshaping enterprise infrastructure and sovereign compute strategy. Furrier notes Groq’s momentum (~$750M raised, a valuation approaching ~$7B and a new partnership with McLaren) as Stephens outlines why inference is the “killer app” and now a market of its own.

Stephens details rapid standups of sovereign AI inference “factories” (~51 days in the Middle East, just over 40 days with Bell Canada and ~30 days in Helsinki) and explains how telcos are leveraging trust, data and national footprints to deliver AI-scale services. The discussion explores where value is accruing across the stack, from the physical build-out to the software layer that operationalizes Groq’s LPU-based system. Stephens highlights GroqCloud (launched ~18 months ago), native MCP service support and a Compound-powered research product, all aimed at simplifying deployment and enabling secure, standards-driven agent communications.

He also digs into real-world use cases (customer-facing agents and workflow automation), cross-site and sovereign interconnect considerations, and why “joules per token” is becoming a defining metric for scaling reliable, low-latency inference within power-constrained data center designs.
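The arithmetic behind the “joules per token” metric is straightforward: divide sustained power draw by token throughput, then use the result to see how much inference a fixed facility power budget can support. The sketch below illustrates the idea with placeholder figures; none of the numbers are Groq benchmarks or values from the segment.

```python
# Illustrative sketch of the "joules per token" efficiency metric.
# All power and throughput figures here are hypothetical placeholders,
# not measurements from Groq or the discussion above.

def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy cost per generated token (1 W = 1 J/s)."""
    return power_watts / tokens_per_second

# Hypothetical rack: 20 kW sustained draw at 50,000 tokens/s.
rack_power_w = 20_000.0
throughput_tps = 50_000.0
jpt = joules_per_token(rack_power_w, throughput_tps)
print(jpt)  # 0.4 J/token

# In a power-constrained data center, aggregate throughput scales
# inversely with J/token at a fixed facility budget:
facility_budget_w = 1_000_000.0  # hypothetical 1 MW power envelope
max_tps = facility_budget_w / jpt
print(max_tps)  # 2,500,000 tokens/s within the 1 MW budget
```

The second calculation is why the metric matters for data center design: halving joules per token doubles the tokens a fixed power envelope can serve, without adding a single watt.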