In this segment from theCUBE + NYSE Wired’s “AI Factories – Data Centers of the Future” series, Chris Stephens, field CTO at Groq, joins theCUBE’s John Furrier from the NYSE to unpack how AI factories are reshaping enterprise infrastructure and sovereign compute strategy. Furrier notes Groq’s momentum (~$750M raised, a valuation approaching ~$7B, and a new partnership with McLaren) as Stephens outlines why inference is the “killer app” and now a market of its own. He details rapid standups of sovereign AI inference “factories” (~51 days in the Middle East, ~40-some days with Bell Canada, and ~30 days in Helsinki) and explains how telcos are leveraging trust, data and national footprints to deliver AI-scale services.
The discussion explores where value is accruing across the stack: from the physical build-out to the software layer that operationalizes Groq’s LPU-based system. Stephens highlights GroqCloud (launched ~18 months ago), native MCP service support and a Compound-powered research product, all aimed at simplifying deployment and enabling secure, standards-driven agent communications. He also digs into real-world use cases (customer-facing agents and workflow automation), cross-site/sovereign interconnect considerations, and why “joules per token” is becoming a defining metric for scaling reliable, low-latency inference within power-constrained data center designs.
Anne Hecht, NVIDIA
>> Hello, I'm John Furrier with theCUBE, your host here at our New York Stock Exchange CUBE Studios, part of our AI Factories series and the NYSE Wired program and community as well. A lot of buzz around AI factories. We're going to break that down and what it means for the enterprise. Anne Hecht is here, senior director of enterprise for NVIDIA, leading the way in the GPU era, the modern era of computing. Anne, thank you so much for coming on theCUBE.
Anne Hecht
>> Yeah, John, it's a pleasure to be here. I watched your show, so I'm excited to be a part of it today.
>> We've really been loving the enterprise action. Sixteen years doing theCUBE, and I got to say the enterprise right now is going through a massive transition to large-scale systems. We're seeing all the mega data centers being built, but the enterprise is the high-value area where we're expecting to see a lot of value unleashed as they move from IT systems to the AI factory. On-premise activity, some cloud, some on-premise distributed computing, hybrid, all kind of come into play. All the top enterprises, large, medium, small, are looking at new architectures. They want the applications they want, they want the AI productivity, they want the AI code assist, and they need the AI factory. So we are in this era, and it's developing very, very fast, highly accelerated. Explain what the NVIDIA AI factory is for the enterprise, because this is something that's not really a debate. It's more of how fast can we turn it on and what do we run on it.
Anne Hecht
>> That's right. That's right. And I think Jensen coined this a couple years ago. At one of our GTCs, he started talking about AI factories, and it's been this great metaphor, and it is a metaphor, to help us describe really the modern data center and how it needs to change and evolve to become that sort of center of excellence for AI that an enterprise can rely on to develop their IP and their AI and really drive results for their business. And so I like to describe the AI factory as, obviously, it's infrastructure: it's the accelerated computing, networking, storage. Obviously you need models, and there's a layer of software in there. But it really doesn't become a factory until you add that enterprise's data, their IP that is embodied in that data, and you start to actually do inference and processing of that data, gaining insights from that data and creating those tokens, which is what we talk about as results, generating results based on that data. And then you really have a factory.
>> Yeah, I ask that question a lot on theCUBE. What is an AI factory? And the answers can range from simply "it's producing a lot of tokens" to the more complex "it's a storage fabric, it's a network fabric connecting a lot of high-powered chips and software together." So I have to ask you, for the enterprise in this era, what is the playbook? Because they love this computing revolution and evolution. It depends on how you look at it. It's a revolution on one hand, but an evolution from pre-existing IT technologies, methodologies, mechanisms. So you're starting to see that. What are some of the build factors in this era that people should pay attention to? What is different? I mean, it produces tokens, that's the lingua franca for AI. What are they looking at? What's resonating with them?
Anne Hecht
>> Yeah, that's a great question, and I'm really talking to that sort of Fortune 2000, Fortune 3000 enterprise that is really trying to start its journey. And they really are moving from this model of primarily CPU compute infrastructure, right? x86-based data centers, which they have reliably refreshed every three to five years. It's a very predictable model, but now they realize AI is here and they need to change that model. And how do they do that in a really strategic and pragmatic way? And even enterprises that we've worked with that maybe were early adopters, and used accelerated computing for training, are realizing with the onset of generative AI and agentic AI that even the inference, even the production workload, needs to be on accelerated systems. So one of the things we work with Dell on, for example, is going through that refresh cycle hand-in-hand with an enterprise customer, and Dell will help carve out with that customer how much of that x86 body of systems should actually be upgraded to accelerated systems. So it's not necessarily even an increase in infrastructure spend, it's just spending your infrastructure budget in a slightly more strategic way and gradually building in room for accelerated computing so you can take advantage of these AI workloads. And we have very broad offerings of accelerated computing. So we have PCIe GPU systems, they understand that modern data center, they run the x86 applications, so you can still do data processing, rendering, basic workloads on these systems, as well as start doing AI on these systems.
>> That brings up the question around adoption patterns. You mentioned they're comfortable with x86 and they want to migrate in the capabilities. A lot of the adoption we see is, "Okay, I see the future, I got the data." They see the value of the data. That's their intellectual property. They want to have these systems close to the data, that's one. Are there other patterns you're seeing in terms of sequence of events for these customers, and how do they progress? What's the playbook?
Anne Hecht
>> Yeah, yeah. What we see, and this is the journey we went through, is the first thing most enterprises do is they do an audit. How are we using AI? Who's using it? Are we building it ourselves? A lot of teams will start DIY sort of chatbots and they build their own, or they have teams that are actually using third-party services, but a lot of times that can put data at risk, because teams might be taking data that's proprietary to the enterprise and really protected IP and sharing it outside of the firewall. So the IT team then has this sort of hodgepodge, if you will, of AI applications and workloads that are running. We actually had the same situation here at NVIDIA. And so we invested in building out, and are continuing to build out, a really standardized platform for AI practitioners so that they can actually use low-code/no-code tools. We have a whole ecosystem of partners that we've enabled with our libraries, which they can expose through the developer environments that our third-party partners have created. So an AI practitioner can quickly create their own application or chatbot, even a deep research agent, but they can do it safely and in a standard way as well. So it's supported on our data center infrastructure, because once this AI workflow is actually created and the AI app is created, it needs to be deployed in the data center and managed like every other application in the data center. It needs to be secured, orchestrated, managed. You need to have role-based access, identity access to that application. And so the enterprise team, if they have a common standard development environment, can be assured that what's created on that platform can be deployed and supported in the enterprise data center. And that was one of the problems we had. We have a lot of different applications that teams have created, but they can't actually be supported in our data center. So the way to fix that is a standardized platform.
>> A lot of misinformation out there. I want to get your thoughts on it, because I think pilots, production workloads, we see on our side, in the research on the market, that there's a lot of activity. There's a lot of misinformation around failed projects, and you can look at AI and the enterprise and you could squint through and slice data any way you want, but we're seeing a lot of activity because there are low-hanging-fruit use cases where people can get going immediately. And so I think people tend to want the big bang: I want to see the magic pixie dust instantly and see instant transformation. In the enterprise, it's nuanced. So can you share your thoughts on this? Because I think this is one thing we want to clarify, that we're seeing massive activity. Yeah, I mean, experimentation's going on. What did Thomas Edison say? "I failed to create electricity 2000 times." So there's a lot going on, but what's the real story? What do you guys see in the enterprise? Because there's proof of concept, there's pilots, there's production workloads. I mean, RAG is pretty much in full-tilt mode right now. It's going hard, escape velocity on search, and that's easy for text. So what are you seeing?
Anne Hecht
>> Yeah, so we're seeing, like I mentioned, this big transition. As enterprises do an inventory, if we go back to that, they very quickly realize where AI is being used and where it's actually having the biggest impact on their enterprise. And I think it also helps enterprises to look at which of their processes and workloads would benefit the most. So here at NVIDIA, obviously we do a lot of chip design. Well, guess what? That was the first workload we applied AI to: let's use AI to see if we can build new architectures for GPUs faster. And we were able to do that, and that's actually how we are now able to introduce new architectures every year, because we applied AI. But that's for NVIDIA. Every enterprise needs to look at their business and see, okay, what is our core competency? What is our most strategic process, and how do we use AI to really make that faster, better, smarter? And if you approach it from that lens, you will get better results, actually really diving in and using AI in a very strategic way. You mentioned failed projects. One of the things that I think enterprises end up getting trapped in is if they do it themselves, they take an open model, and they deploy it, sometimes without actually doing post-training. They just take a model that's been trained on the internet and they deploy it into a chatbot, for example. The challenge with doing that, and why sometimes these projects don't succeed, is that the model doesn't really understand their business. It hasn't been trained on the lingua franca of that enterprise or on understanding the processes of that enterprise. And so you need to do this sort of post-training, which is similar to hiring a new employee: you don't expect them to be productive in the first six, even nine, months because they're on a learning curve. They have to understand the business. That's what post-training is. It's onboarding your AI, it's educating it about your business and really doing a new-hire training for your AI workload, if you will. And if you do that and you invest in that, then when you deploy that into production, you actually have a really smart workflow that's going to be able to contribute results back to your business.
>> Yeah. I'd just add that the observation we're seeing in this series and our other coverage is that once they get that discovery, it opens up new opportunities for them and they see more innovation, and that comes grounded in the data. You mentioned data at the top of this interview, and I want to get into the data gravity side of it, because a lot of people are seeing this is not cloud versus on-prem. It's more that the data's on-prem and we want to keep it on-prem. So you have data gravity on premises, and then you've also got cloud workloads too. So you have this hybrid environment. Could you share your thoughts on that piece of it? Because in some cases it's not an either/or, in some cases it's both because of the data. So it's a unique situation where beauty's in the eye of the beholder, depending upon what they have. So can you share your thoughts? Because the on-prem and cloud debate, or discussion, or dialogue, is really nuanced. It's situational.
Anne Hecht
>> It really is. And it's also being influenced by sovereign AI as well, right? We're working with an enterprise partner right now, a software platform partner, who is standing up data centers in the regions of their customers because their customers want their compute to be where their workload is running. So I think there's a new ecosystem of cloud providers that is developing, the neoclouds, if you will. And Dell works with many of these providers to build out their infrastructure. And of course a lot of these neoclouds, we know them because they're helping the model builders and the AI-native companies, the new big AI innovators, but they're also really interested in building out capacity and the full stack of software and tooling to help enterprises run their workloads on these more regional neoclouds. And like you said, enterprises are going to be hybrid. Every enterprise is going to be somewhat of a snowflake. Each factory is going to be a snowflake. And that's why NVIDIA has really made sure that our software stack and our tooling run everywhere. So things like our NIM microservice, for example, that optimized runtime for when you deploy a model in production: you can run it in the cloud, you can run it on-prem, and it's supported on a neocloud. So an enterprise could actually move their workloads, because maybe you start in the cloud but you deploy on-prem, or maybe you want to burst to the cloud because you're doing some post-training. So I think it depends on the economics, where the data is, and then some of the security priorities of the enterprise, like you mentioned, where they're going to actually want to run and deploy those workloads.
>> You mentioned Dell Technologies. I'm old enough to remember when servers were great and you'd buy a server and you'd load Linux on it: "Hey, I got an OS." This whole full stack is a huge software opportunity. I want you to share more on that, because when I talk to customers of Dell and NVIDIA, they love the AI factory, but then the question is, what are we running on it? It's kind of an open question. And of course they want to have their pre-existing software estate on there. Then I talked to Kevin Deierling at NVIDIA. It's like, "Oh yeah, we were talking, KV cache is the new operating system." Jensen said it on stage at GTC, that KV cache is the operating system for the AI factory. He said it. Okay, that's networking. So you have the combination of storage fabrics, network fabrics and compute, all kind of working together. It's not your old-school "just load Linux on it." The operating system of a factory is super important. What's your view on this piece of it with the enterprise? Because they want these factories, there's demand, and now it's not as simple as saying just load Linux anymore. Linux is kind of everywhere. So what is the full stack? What runs the AI factory?
Anne Hecht
>> Yeah, so I'll just start layer by layer. So at your foundation layer, obviously accelerated computing, GPU servers. We work with partners like Dell. They have a number of certified servers across our whole lineup of GPUs: air-cooled, liquid-cooled, graphics-capable, or just LLM-inference-focused like our B200, but a whole range of GPUs. And then of course networking, like Kevin obviously already spoke about, is really important for performance and moving those workloads across the enterprise, across that data center. And then storage is really important. And recently we announced a validated design for integrating compute into a storage rack so that you could actually bring RAG to the storage rack and turn a storage rack into a RAG system. So you can talk to your data through that interface to a retrieval-augmented generation architecture integrated into that stack, or an agent that's running outside of that stack can actually call on that storage rack, and all of the processing of that data, vectorizing it, tagging it, embedding, and then serving it up to a model, is done really right there next to the data. So it's super secure, really efficient, and you're not replicating data across the data center, which is also really important. But that's just the foundation. Then on top of that, you need full orchestration and management. So we make sure we support the full range of Kubernetes and orchestration partners, from Red Hat to VMware, Canonical, Nutanix, and most enterprises are running one of those. And then we've really worked with the ecosystem to enable them to take our libraries and integrate them into their AIOps tooling and security tooling. There's a lot of great options out there for actually building your own provisioned library of models, for example, and managing those libraries. So we work with those partners with our NIM microservice so that the enterprise can curate their own development environment, a low-code/no-code development environment if you will, through these AIOps partners to enable their AI practitioners to develop and create AI. And again, like I was saying earlier, in a way that it will be supported, secured, and have all of the great enterprise guardrails, if you will, through identity management, et cetera. So it can actually be run and deployed in the data center safely. Because at the end of the day, for an enterprise, that's the holy grail. If they can't do AI safely, they're not going to do AI, they're not going to risk their business. They need to really make sure they do it in a safe way.
>> Security and safety is huge. Anne, that's a great point about that. And I got to tell you, Dave and I and our team have been loving the innovation at NVIDIA: the systems architecture, the memory, all the speeds and feeds. We love that stuff, but it's an enablement. So I have to ask you, as you go into the enterprise, it's an ecosystem world, right? So how do customers deal with the ecosystem? I know you guys have had a lot of advancements, partnerships. If I'm a customer, I'm going to need to run other things with NVIDIA. What is the state of the market for you guys on the ecosystem? Can you share a few points?
Anne Hecht
>> So enterprises have heterogeneous data centers. Some of them are running, if you think about it, different Kubernetes platforms, different operating systems. So we really try to support all of the enterprise options out there that are already being used by our enterprise customers. So let's just take orchestration and management. We support VMware, Red Hat, Canonical, Nutanix, and we work with those partners to make sure that they are validated and tested on our platforms. And we even took the next step of producing a validated design that publishes and documents the partners that we're working with, not just at that orchestration and management layer, but also up-stack in terms of AIOps partners. So these are partners that are building low-code/no-code development environments as well as guardrailing solutions and security solutions, and they're leveraging our underlying libraries that are in our software product called NVIDIA AI Enterprise, and they're leveraging Nemotron and our NIM microservice for running those models, but they're exposing them in a way that makes it super easy for an enterprise AI developer to leverage them and actually build out a full workload in their enterprise, but do it in a really safe way. The other way we're doing this is really open-sourcing a lot of our work. What we have been doing lately is, we have a whole great family of models, our Nemotron models. We open-source those models and share the data sets, so it's really easy for enterprises to do that post-training exercise on them. We are sharing the recipes on how to do that, and we're opening up the weights as well. So we talked about the need to customize AI. This makes it really easy and risk-free for an enterprise to do some of that customization work, because they can see the data set.
>> I mean, having a platform or an operating system for AI, like you guys are becoming, is a really key enablement to have an ecosystem flourishing. I think it's a super important benchmark because it's not just partners, it's entrepreneurs, it's people building the software that will be net new or, in some cases, enhanced versions of pre-existing stuff. So I think that is going to be a big factor as we look at the growth of the ecosystem side of your business, which is phenomenal in the enterprise. So I have to couch that as a lead-in to the final question, which is: agents are hot. OpenAI just announced an agentic feature. You start to see acceleration on the road to superintelligence, however you want to define that. People have different definitions. But clearly the world is changing super fast, and the reasoning workloads that are coming out of these factories are significant, game-changing, altering markets. So we're seeing a lot of activity. What are some of the workloads on the agentic infrastructure side, the applications you're seeing? Obviously we see the low-hanging fruit, search, code development, but this is getting better faster.
Anne Hecht
>> That's right, and it's super exciting. We went from a world where AI was, "Here's a data set of cats, tell me the species of all these cats," and it was like a basic inference workload, which by the way, in most cases was probably done on CPU infrastructure. And then we went to generative AI, where the workload was "produce an image of a cat on a horse," and now we're at the ability to do deep research. So now you can prompt your AI, "Give me a report on cats and tell me which cat I should buy if I have kids that are allergic." And then the AI goes off and works and researches that. And in the context of an enterprise, that agentic system can not just call the data and do research on your enterprise data, but it can call external systems if it's given permission to, which is where the enterprise capabilities are super important. Just like you give a person role-based access, the agent has role-based access as well and can be audited and supervised, so you can manage what that agent is actually doing and how it's actually coming back and giving results to the human. And it can call on tools, it can do calculations that maybe the human didn't think about, and it can get feedback from the human. And then that data, the result of that deep research, can go back into the enterprise data system and continually be used to train the model and make that AI workflow even better and more intelligent. So it's a super exciting area, and like I had mentioned earlier, we took AI and we applied it to how we do our architecture. So our engineers are saving a tremendous amount of time by using AI, getting to insights faster. We also use it to do security of our software stack. So, if there's a breach that comes in on one of the open-source containers that we've leveraged, we can very quickly, within minutes, identify the risk to our customers and then apply a patch, fix the risk, and actually generate new code for that piece of software. So we're able to automate supply chain, code development, and communications to employees through our internal chatbots. And I think those are the workloads where we're seeing the most promise. Code gen is really popular, as are supply chain and chatbots, and of course customer service is a really great area, because a really good deep research agent can actually look at all of the history of a customer, make recommendations to a teller or a sales rep on what the next step should be for that customer in real time, and provide them with insights that they would otherwise have to work out manually.
>> Anne, it's been great to have you on sharing your perspective. I know you've seen many cycles of innovation. So I guess my final, final question would be for the folks watching that are in the middle of their journey, starting it or leaning into it, Fortune 2000, Fortune 3000. They see the advantages of productivity gains and the things you mentioned. What would be your advice to them? What would you say to the folks watching who ask, "Hey, what's the future look like? How do you do this?" What's your message?
Anne Hecht
>> I think the message is just get started. Start using it and work with your IT teams to figure out how to build out a standard way to bring it into your enterprise. But Jensen has said this himself, and he's obviously a great inspiration for many of us here. Just get started. Don't wait, because it is changing so fast. Reasoning models dropped in December and now everybody's using reasoning. So who knows what's going to drop this Christmas, right? Who knows what's going to happen.
>> You got to make a move. You can go faster.
Anne Hecht
>> Just bend your knees, be agile and get ready for the serve, but just get ready and start going. But don't be afraid. There's a lot of really great partners out there that you can leverage. We work with Dell and with many of them, and the most important thing is to start on the journey.
>> Well, it's a lot of fun too. The upside is there, upside potential. The AI factory is awesome. Thank you for sharing. As senior director and head of enterprise products, you have your finger on the pulse. Thanks for coming on our program. We really appreciate it.
Anne Hecht
>> Thanks, John. It's great being here.
>> Anne Hecht from NVIDIA coming on, talking about the importance of AI factories and the importance of getting started and really taking advantage, because it's moving very, very fast. And folks that can get on the right side of history here will take advantage and reap the rewards. Of course, we're doing our part to bring you all the data we can here on theCUBE. Thanks for watching.
>> Hello, I'm John Furrier with theCUBE, your host here at our New York Stock Exchange. CUBE Studios, part of our AI Factory series, the NYSC-Wired Program and community as well. A lot of buzz around AI factories. We're going to break that down and what it means for the enterprise. Anne Hecht is here, senior director of Enterprise for NVIDIA leading the way in the GPU era, the modern era of computing. Anne, thank you so much for coming on theCUBE.
Anne Hecht
>> Yeah, John, it's a pleasure to be here. I watched your show, so I'm excited to be a part of it today.>> We really been loving the enterprise action, 16 years doing theCUBE, and I got to say the enterprise right now is going through massive transition, large scale systems. We're seeing all the mega data centers being built, but the high value area we're expecting to see unleashed a lot of value extractions. The enterprise as they move from IT systems to AI factory on-premise activity, some cloud, some on-premise distributed computing, hybrid, all kind of come into play. All the top enterprises, large, medium, small, are all looking at new architectures. They want the applications they want, they want the AI productivity, they want the AI code assist and they need the AI factory. So we are in this era, it's developing very, very fast, highly accelerated, explain what the Nvidia AI factory is for the enterprise because this is something that's not really a debate. It's more of how fast can we turn it on and what do we run on it.
Anne Hecht
>> That's right. That's right. And I think Jensen coined this a couple of years ago at one of our GTCs. He started talking about AI factories, and it's been this great metaphor, and it is a metaphor, to help us describe the modern data center and how it needs to change and evolve to become that center of excellence for AI that an enterprise can rely on to develop their IP and their AI and really drive results for their business. So I like to describe the AI factory this way: obviously it's infrastructure, it's the accelerated computing, networking, storage. Obviously you need models, and there's a layer of software in there. But it really doesn't become a factory until you add that enterprise's data, the IP that is embodied in that data, and you start to actually do inference and processing of that data, gaining insights from it and creating those tokens, which is what we call generating results based on that data. And then you really have a factory.
>> Yeah, I ask that question a lot on theCUBE: what is an AI factory? And the answers range from the simple, "it's producing a lot of tokens," to the more complex, "it's a storage fabric, it's a network fabric connecting a lot of high-powered chips and software together." So I have to ask you, for the enterprise in this era, what is the playbook? Because they love this computing revolution, or evolution, depending on how you look at it. It's a revolution on one hand, but an evolution from pre-existing IT technologies, methodologies, mechanisms. So you're starting to see that. What are some of the build factors in this era that people should pay attention to? What is different? I mean, it produces tokens; that's the lingua franca for AI. What are they looking at? What's resonating with them?
Anne Hecht
>> Yeah, that's a great question. I'm really talking to those Fortune 2000, Fortune 3000 enterprises that are trying to start their journey. They are moving from a model of primarily CPU compute infrastructure, right? x86-based data centers, which they have reliably refreshed every three to five years. It's a very predictable model, but now they realize AI is here and they need to change that model. How do they do that in a really strategic and pragmatic way? Even enterprises we've worked with that were early adopters and used accelerated computing for training are realizing that, with the onset of generative AI and agentic AI, even the inference, even the production workload, needs to be on accelerated systems. So one of the things we work with Dell on, for example, is going through that refresh cycle hand-in-hand with an enterprise customer, and Dell will help carve out with that customer how much of that x86 body of systems should actually be upgraded to accelerated systems. So it's not necessarily even an increase in infrastructure spend. It's just spending your infrastructure budget in a slightly more strategic way and gradually building in room for accelerated computing so you can take advantage of these AI workloads. And we have a very broad offering of accelerated computing systems. We have PCIe GPU systems that suit that modern data center and run the x86 applications, so you can still do data processing, rendering, and basic workloads on these systems as well as start doing AI on them.
>> That brings up the question around adoption patterns. You mentioned they're comfortable with x86 and they want to migrate in the capabilities. A lot of the adoption we see is, "Okay, I see the future, I've got the data." They see the value of the data. That's their intellectual property. They want to have these systems close to the data, that's one. 
Are there other patterns you're seeing in terms of sequence of events for these customers and how do they progress? What's the playbook?
Anne Hecht
>> Yeah, yeah. What we see, and this is the journey we went through, is that the first thing most enterprises do is an audit. How are we using AI? Who's using it? Are we building it ourselves? A lot of teams will start with DIY chatbots and build their own, or they have teams that are using third-party services. That can sometimes put data at risk, because teams might be taking data that's proprietary to the enterprise, really protected IP, and sharing it outside of the firewall. So the IT team then has this hodgepodge, if you will, of AI applications and workloads that are running. We actually had the same situation here at NVIDIA. So we invested in building out, and are continuing to build out, a really standardized platform for AI practitioners so that they can use low-code/no-code tools. We have a whole ecosystem of partners that we've enabled with our libraries, which they expose through the developer environments those third-party partners have created. So an AI practitioner can quickly create their own application or chatbot, even a deep research agent, but they can do it safely and in a standard way. And it's supported on our data center infrastructure, because once this AI workflow is created and the AI app is created, it needs to be deployed in the data center and managed like every other application in the data center. It needs to be secured, orchestrated, managed. You need to have role-based access, identity access, to that application. So if the enterprise team has a common, standard development environment, they can be assured that what's created on that platform can be deployed and supported in the enterprise data center. And that was one of the problems we had. We had a lot of different applications that teams had created, but they couldn't actually be supported in our data center. 
So the way to fix that is a standardized platform.
>> There's a lot of misinformation out there, and I want to get your thoughts on it. Looking at pilots and production workloads, what we see in our research on the market is that there's a lot of activity, and there's a lot of misinformation around failed projects. You can look at AI in the enterprise and squint and slice the data any way you want, but we're seeing a lot of activity because there are low-hanging-fruit use cases where people can get going immediately. I think people tend to want the big bang: I want to see the magic pixie dust and instant transformation. In the enterprise, it's nuanced. So can you share your thoughts on this? Because one thing we want to clarify is that we're seeing massive activity. Yeah, I mean, experimentation's going on. What did Thomas Edison say? "I failed to create electricity 2000 times." So there's a lot going on, but what's the real story? What do you guys see in the enterprise? Because there's proof of concept, there's pilots, there's production workloads. I mean, RAG is pretty much in full-tilt mode right now. It's going hard, escape velocity on search, and that's easy for text. So what are you seeing?
Anne Hecht
>> Yeah, so we're seeing, like I mentioned, this big transition. As enterprises do an inventory, if we go back to that, they very quickly realize where AI is being used and where it's actually having the biggest impact on their enterprise. I think it also helps enterprises to look at which of their processes and workloads would benefit the most. So here at NVIDIA, obviously we do a lot of chip design. Well, guess what? That was the first workload we applied AI to: let's use AI to see if we can build new architectures for GPUs faster. And we were able to do that, and that's actually how we are now able to introduce new architectures every year, because we applied AI. But that's for NVIDIA. Every enterprise needs to look at their business and see, okay, what is our core competency? What is our most strategic process, and how do we use AI to really make that faster, better, smarter? If you approach it from that lens, you will get better results from really diving in and using AI in a very strategic way. You mentioned failed projects. One of the things I think enterprises end up getting trapped in is going do-it-yourself: they take an open model and deploy it, sometimes without actually doing post-training. They just take a model that's been trained on the internet and they deploy it into a chatbot, for example. The challenge with doing that, and why sometimes these projects don't succeed, is that the model doesn't really understand their business. It hasn't been trained on the lingua franca of that enterprise or on the processes of that enterprise. So you need to do this post-training, which is similar to hiring a new employee: you don't expect them to be productive in the first six, even nine, months because they're on a learning curve. They have to understand the business. That's what post-training is. 
It's onboarding your AI, it's educating it about your business and really doing new-hire training for your AI workload, if you will. And if you do that and you invest in that, then when you deploy that into production, you actually have a really smart workflow that's going to be able to contribute results back to your business.
>> Yeah. I'd just add an observation from this series and our other coverage: once they get that discovery, it opens up new opportunities for them, they see more innovation, and that comes grounded in the data. You mentioned data at the top of this interview, and I want to get into the data gravity side of it, because a lot of people are seeing that this is not cloud versus on-prem. It's more that the data's on-prem and we want to keep it on-prem. So you have data gravity on premises, and you've also got cloud workloads too. So you have this hybrid environment. Could you share your thoughts on that piece of it? Because in some cases it's not an either/or; in some cases it's both, because of the data. So it's a unique situation where beauty's in the eye of the beholder, depending upon what they have. Can you share your thoughts? Because the on-prem/cloud debate, or discussion, or dialogue, is really nuanced. It's situational.
Anne Hecht
>> It really is. And it's also being influenced by sovereign AI as well, right? We're working with an enterprise partner right now, a software platform partner, who is standing up data centers in the regions of their customers because their customers want their compute to be where their workload is running. So I think there's a new ecosystem of cloud providers developing, the neoclouds, if you will. Dell works with many of these providers to build out their infrastructure. And of course we know a lot of these neoclouds because they're helping the model builders and the AI-native companies, the new big AI innovators, but they're also really interested in building out capacity and the full stack of software and tooling to help enterprises run their workloads on these more regional neoclouds. And like you said, enterprises are going to be hybrid. Every enterprise is going to be somewhat of a snowflake. Each factory is going to be a snowflake. That's why NVIDIA has really made sure that our software stack and our tooling run everywhere. So take our NIM microservice, for example, the optimized runtime for when you deploy a model in production: you can run it in the cloud, you can run it on-prem, it's supported on a neocloud or [...]. So an enterprise can actually move their workloads, because maybe you start in the cloud but deploy on-prem, or maybe you want to burst to the cloud because you're doing some post-training. So I think where they're going to want to run and deploy those workloads depends on the economics, where the data is, and then some of the security priorities of the enterprise, like you mentioned.
>> You mentioned Dell Technologies. I'm old enough to remember when servers were great and you'd buy a server and load Linux on it: "Hey, I got an OS." This whole full stack is a huge software opportunity. 
I want you to share more on that, because when I talk to customers of Dell and NVIDIA, they love the AI factory, but then the question is, what are we running on it? It's kind of an open question. And of course they want to have their pre-existing software estate on there. Then I talked to Kevin Deierling at NVIDIA, and it's like, "Oh yeah, KV cache is the new operating system." Jensen said it on stage at GTC, that KV cache is the operating system for the AI factory. He said it. Okay, that's networking. So you have the combination of storage fabrics, network fabrics and compute, all working together. It's not your old-school "just load Linux on it." The operating system of a factory is super important. What's your view on this piece of it with the enterprise? Because they want these factories, there's demand, and now it's not as simple as saying just load Linux anymore. Linux is kind of everywhere. So what is the full stack? What runs the AI factory?
Anne Hecht
>> Yeah, so I'll just start layer by layer. At your foundation layer, obviously, is accelerated computing, GPU servers. We work with partners like Dell. They have a number of certified servers across our whole lineup of GPUs: air-cooled, liquid-cooled, graphics-capable, or just LLM-inference-focused like our B200, a whole range of GPUs. And then of course networking, like Kevin already spoke about, is really important for performance and for moving those workloads across the enterprise, across that data center. And then storage is really important. Recently we announced a validated design for integrating compute with a storage rack, so that you can bring RAG to the storage rack and actually turn a storage rack into a RAG system. So you can talk to your data through that interface to a retrieval-augmented generation architecture integrated into that stack, or an agent that's running outside of that stack can call on that storage rack, and all of the processing of that data, vectorizing it, tagging it, embedding it, and then serving it up to a model, is done right there next to the data. So it's super secure, really efficient, and you're not replicating data across the data center, which is also really important. But that's just the foundation. Then on top of that, you need full orchestration and management. So we make sure we support the full range of Kubernetes and orchestration partners, from Red Hat to VMware, Canonical, and Nutanix; most enterprises are running one of those. And then we've really worked with the ecosystem to enable them to take our libraries and integrate them into their AIOps tooling and security tooling. There are a lot of great options out there for building your own provisioned library of models, for example, and managing those libraries. 
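[Editor's illustration] The storage-side retrieval flow Hecht describes (vectorize, tag, embed, then serve context to a model next to the data) can be sketched in a few lines of Python. This is a toy under stated assumptions, not NVIDIA's validated design: `StorageSideIndex`, `embed`, and `build_prompt` are hypothetical names, the bag-of-words "embedding" stands in for a real embedding model, and using tags as an access filter is an assumption added for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Chunk:
    doc_id: str
    text: str
    tags: set[str]
    vector: Counter = field(default_factory=Counter)


class StorageSideIndex:
    """Hypothetical index that lives beside the data: ingest once, query in place."""

    def __init__(self) -> None:
        self.chunks: list[Chunk] = []

    def ingest(self, doc_id: str, text: str, tags: set[str]) -> None:
        # Split into sentences, then tag and embed each one where the data lives.
        for sentence in filter(None, (s.strip() for s in text.split("."))):
            self.chunks.append(Chunk(doc_id, sentence, tags, embed(sentence)))

    def retrieve(self, query: str, allowed_tags: set[str], k: int = 2) -> list[Chunk]:
        # Assumption: only chunks sharing a tag with the caller are candidates.
        q = embed(query)
        candidates = [c for c in self.chunks if c.tags & allowed_tags]
        ranked = sorted(candidates, key=lambda c: cosine(q, c.vector), reverse=True)
        return ranked[:k]


def build_prompt(query: str, chunks: list[Chunk]) -> str:
    """Assemble the retrieved context and the question into one model prompt."""
    context = "\n".join(f"[{c.doc_id}] {c.text}" for c in chunks)
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In this sketch only the retrieved sentences leave the index, which mirrors the point about not replicating data across the data center; a production system would swap in a real embedding model and vector store.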
So we work with those partners with our NIM microservice so that the enterprise can curate their own development environment, a low-code/no-code development environment if you will, through these AIOps partners, to enable their AI practitioners to develop and create AI. And again, like I was saying earlier, in a way that will be supported and secured and have all of the great enterprise guardrails, if you will, through identity management, et cetera. So it can actually be run and deployed in the data center safely. Because at the end of the day, for an enterprise, that's the holy grail. If they can't do AI safely, they're not going to do AI. They're not going to risk their business. They need to really make sure they do it in a safe way.
>> Security and safety are huge. Anne, that's a great point. And I've got to tell you, Dave and I and our team have been loving the innovation from NVIDIA: the systems architecture, the memory, all the speeds and feeds. We love that stuff, but it's an enablement. So I have to ask you, as you go into the enterprise, it's an ecosystem world, right? So how do customers deal with the ecosystem? I know you guys have had a lot of advancements and partnerships. If I'm a customer, I'm going to need to run other things with NVIDIA. What is the state of the market for you guys on the ecosystem? Can you share a few points?
Anne Hecht
>> So enterprises have heterogeneous data centers. Some of them are running, if you think about it, different Kubernetes platforms, different operating systems. So we really try to support all of the enterprise options out there that are already being used by our enterprise customers. Let's just take orchestration and management. We support VMware, Red Hat, Canonical, and Nutanix, and we work with those partners to make sure that they are validated and tested on our platforms. We even took the next step of producing a validated design that publishes and documents the partners we're working with, not just at that orchestration and management layer but also up the stack, in terms of AIOps partners. These are partners that are building low-code/no-code development environments as well as guardrailing solutions and security solutions. They're leveraging the underlying libraries in our software product called NVIDIA AI Enterprise, and they're leveraging Nemotron and our NIM microservices for running those models, but they're exposing them in a way that makes it super easy for an enterprise AI developer to leverage them and build out a full workload in their enterprise, and do it in a really safe way. The other way we're doing this is by open-sourcing a lot of our work. What we have been doing lately is this: we have a whole great family of models, our Nemotron models. We open-source those models and share the data sets, so it's really easy for enterprises to do that post-training exercise on them. We are sharing the recipes on how to do that, and we're opening up the weights as well. So we talked about the need to customize AI. This makes it really easy and risk-free for an enterprise to do some of that customization work, because they can see the data set.
>> I mean, having a platform or an operating system for AI, like you guys are becoming, is a really key enablement to have an ecosystem flourishing. 
I think it's a super important benchmark because it's not just partners, it's entrepreneurs, it's people building the software that will be net new or, in some cases, enhanced versions of pre-existing stuff. So I think that is going to be a big factor as we look at the growth of the ecosystem side of your business, which is phenomenal in the enterprise. So I'll use that as a lead-in to the final question, which is: agents are hot. OpenAI just announced an agentic feature. You start to see acceleration on the road to superintelligence, however you want to define that. People have different definitions. But clearly the world is changing super fast, and the reasoning workloads that are coming out of these factories are significant, game-changing, altering markets. So we're seeing a lot of activity. What are some of the workloads on the agentic infrastructure and applications side that you're seeing? Obviously we see the low-hanging fruit, search and code development, but this is getting better faster.
Anne Hecht
>> That's right, and it's super exciting. We went from a world where AI was, "Here's a data set of cats, tell me the species of all these cats," and it was a basic inference workload, which, by the way, in most cases was probably done on CPU infrastructure. Then we went to generative AI, where the workload was "produce an image of a cat on a horse," and now we have the ability to do deep research. So now you can prompt your AI, "Give me a report on cats and tell me which cat I should buy if I have kids that are allergic," and the AI goes off and works and researches that. In the context of an enterprise, that agentic system can not only call the data and do research on your enterprise data, it can also call external systems if it's given permission to, which is where the enterprise capabilities are super important. Just like you give a person role-based access, the agent has role-based access as well and can be audited and supervised, so you can manage what that agent is actually doing and how it's coming back and giving results to the human. It can call on tools, it can do calculations that maybe the human didn't think about, and it can get feedback from the human. And then the result of that deep research can go back into the enterprise data system and continually be used to train the model and make that AI workflow even better and more intelligent. So it's a super exciting area, and like I mentioned earlier, we took AI and applied it to how we do our architecture. Our engineers are saving a tremendous amount of time by using AI, getting to insights faster. We also use it for security of our software stack. So if there's a breach that comes in on one of the open-source containers that we've leveraged, we can very quickly, within minutes, identify the risk to our customers and then apply a patch, fix the risk, and actually generate new code for that piece of software. 
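[Editor's illustration] The idea that an agent gets role-based access "just like a person" and can be audited might look something like the following minimal Python sketch. It assumes a hypothetical `ToolGateway` that every agent tool call passes through; the role names, tool names, and log shape are invented for illustration and do not describe any NVIDIA or partner product.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AuditRecord:
    agent: str
    tool: str
    allowed: bool


@dataclass
class ToolGateway:
    """Hypothetical gateway: each tool call is role-checked and logged."""

    permissions: dict[str, set[str]]          # role -> tool names it may call
    tools: dict[str, Callable[..., object]]   # tool name -> implementation
    audit_log: list[AuditRecord] = field(default_factory=list)

    def call(self, agent: str, role: str, tool: str, *args: object) -> object:
        allowed = tool in self.permissions.get(role, set())
        # Record every attempt, including denials, so a supervisor can review it.
        self.audit_log.append(AuditRecord(agent, tool, allowed))
        if not allowed:
            raise PermissionError(f"role {role!r} may not call {tool!r}")
        return self.tools[tool](*args)


gateway = ToolGateway(
    permissions={"analyst": {"sum_invoices"}},
    tools={
        "sum_invoices": lambda amounts: sum(amounts),
        "export_customers": lambda: "restricted data",
    },
)
total = gateway.call("research-agent", "analyst", "sum_invoices", [10, 20])
```

Routing calls through one choke point is a design choice made here for clarity: the permission check and the audit trail cannot drift apart, and denied attempts are as visible as successful ones.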
So we're able to automate supply chain, code development, and communications to employees through our internal chatbots. And I think those are the workloads where we're seeing the most promise. Codegen is really popular, along with supply chain and chatbots, and of course customer service is a really great area, because a really good deep research agent can look at all of the history of a customer, make recommendations to a teller or a sales rep on what the next step should be for that customer in real time, and provide them with insights that they would otherwise have to dig up manually.
>> Anne, it's been great to have you on sharing your perspective. I know you've seen many cycles of innovation. So I guess my final, final question would be for the folks watching who are in the middle of their journey, starting it or leaning into it, Fortune 2000, Fortune 3000. They see the advantages of productivity gains and the things you mentioned. What would be your advice to them? What would you say to the folks watching, like, "Hey, what's the future look like? How do you do this?" What's your message?
Anne Hecht
>> I think the message is just get started. Start using it, and work with your IT teams to figure out how to build out a standard way to bring it into your enterprise. Jensen has said this himself, and he's obviously a great inspiration for many of us here: just get started. Don't wait, because it is changing so fast. Reasoning models dropped in December, and now everybody's using reasoning. So who knows what's going to drop this Christmas, right? Who knows what's going to happen.
>> You've got to make a move. You can go faster.
Anne Hecht
>> Just bend your knees, be agile and get ready for the serve, but just get ready and start going. And don't be afraid. There are a lot of really great partners out there that you can leverage. We work with Dell and many of them, and the most important thing is to start on the journey.
>> Well, it's a lot of fun too. The upside is there, the upside potential. AI factories are awesome. Thank you for sharing. As senior director and head of enterprise products, you have your finger on the pulse. Thanks for coming on our program. We really appreciate it.
Anne Hecht
>> Thanks, John. It's great being here.
>> Anne Hecht from NVIDIA coming on, talking about the importance of AI factories and the importance of getting started and really taking advantage, because it's moving very, very fast. And folks who can get on the right side of history here will take advantage and reap the rewards. Of course, we're doing our part to bring you all the data we can here on theCUBE. Thanks for watching.