RAISE Summit 2025 | Matt Hicks, Red Hat


Matt Hicks

President & CEO

Red Hat




In this interview from the RAISE Summit 2025, Matt Hicks, chief executive officer at Red Hat, joins theCUBE’s John Furrier to discuss how open source and enterprise AI are converging to reshape infrastructure. Speaking live from Paris, Hicks explains Red Hat’s mission to drive down token costs with smaller, more efficient models, making large-scale agentic workloads economically viable for organizations of every size.

The conversation dives into Red Hat’s latest releases: RHEL for business developers, the Red Hat Inference Server built on vLLM, RHEL AI...


>> Welcome back everyone to theCUBE's live coverage here in Paris, France. I'm John Furrier, your host of theCUBE. We are here for the RAISE Summit 2025. This is where the global community's meeting to really discuss the future of AI infrastructure, AI software, AI development for software, app dev, agent building. And of course, the entrepreneurial scene is hot and the leaders are all here as they expand their business opportunities in the enterprise and also out in the consumer space. AI is front and center. Matt Hicks is here. He's the CEO of Red Hat, CUBE alumni. Matt, great to see you in Paris. Normally it's at Red Hat Summit or at your offices.

Matt Hicks

>> Yeah.

>> Great to see you.

Matt Hicks

>> Good to see you halfway around the world for us.

>> Yeah. I just got to say, first of all, I'm super excited and grateful for all the work you guys do at Red Hat. The performance of the company's been phenomenal. I've been watching the success just in the past three years, just the needle moving across the board. I saw some of the Red Hat Linux work you guys have done with the mainframe at IBM. All the work that you're doing with IBM just made IBM stock probably one of the fastest performing stocks. Some will say Red Hat's a big part of it. It is a big part of it. Not the only thing, but a super significant part. You guys are continuing to innovate and we're here in Paris where it's like mainstream cloud meets on-prem, distributed computing, sovereign AI, sovereign cloud, a lot of mixing of these themes, but all point to one thing: a shift is happening. What's your vision here in France? What are you speaking about? What are some of the conversations you're having here?

Matt Hicks

>> I think a lot of it is, it's AI focused of how do enterprises, individuals, startups, entrepreneurs, how do they think about open source and AI, has been a major theme. Open source has those advantages in sovereignty and other areas as an enabler to it. And then AI can amplify open source, but it also changes how code's developed. It introduces some other challenges with it. And so, that has been the major topic of how do we get here? How do we do this well whether it's at an enterprise scale or a three-person startup.

>> Yeah. And you guys are continuing to release... I saw the news: Enterprise Linux for developers, RHEL for business developers launched in line with RHEL 10's GA, AI Inference Server, Red Hat AI, expanded support, multiple models, protocols, OpenShift Lightspeed GA, Advanced Developer Suite, In-Vehicle OS close to GA, partnerships with AMD, NVIDIA, Meta, Google Cloud, Azure, Oracle, Red Hat, all strengthening your leadership. Got that out of the way. So that's all the announcements that you guys have been doing with-

Matt Hicks

>> Yeah. That was good. That was good.

>> Was that good? Okay. Took some notes. I was ready. Also, we've had the Red Hat Summit, so that was good. Okay. A lot going on there, right? So I guess zooming out, the big picture is you got a lot of entrepreneurial activity. The developer market is so hot. Open source has really opened up and democratized a lot of the AI development. Still some experimentation, huge wave coming into the enterprise, we're seeing enterprises doing a lot, but it's not yet full throttle, and open source is going to be the driver there. So as agents are coming, what's your view on the models as models become programmable and they interact? You've got MCP server, you've got A2A. These are tech features going to bring people together, agents are coming. How do you see that preparation needed for model integration into the software? How is AI helping? And from a Red Hat perspective, you're at the kernel. So if you think like a kernel developer, you look up, you see all that innovation, what's your view?

Matt Hicks

>> So for us, our role in this is pretty simple. We are incredibly passionate about making each one of those questions to an AI model or token counts as cheap as possible. That's why we focus on smaller models and open source models. The reason is when you make one call, you know what you're dealing with, but reasoning has been introduced, which does a lot of calls and a lot of tokens behind the scenes. And agentic work is going to add another order of magnitude to that. If the unit price isn't small enough, we don't want enterprises to hit a ceiling in what they can do because the costs just balloon and explode. And so that is where we play in the infrastructure layer. It's a really exciting time.
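The scaling argument above can be made concrete with a little arithmetic. What follows is a minimal sketch, not anything Red Hat published: the unit price and the per-request token counts are hypothetical placeholders chosen only so that reasoning uses ten times the tokens of a single call and agentic work uses ten times more again.

```python
# Illustrative only: every number below is an assumption, not a figure from the interview.
ASSUMED_PRICE_PER_1K_TOKENS = 0.002  # hypothetical unit price in dollars

# Hypothetical tokens consumed per user request in each mode.
ASSUMED_TOKENS_PER_REQUEST = {
    "single_call": 1_000,   # one question, one answer
    "reasoning": 10_000,    # extra calls and tokens behind the scenes
    "agentic": 100_000,     # another order of magnitude on top of reasoning
}


def cost_per_request(tokens: int, price_per_1k: float = ASSUMED_PRICE_PER_1K_TOKENS) -> float:
    """Dollar cost of one request that consumes `tokens` tokens."""
    return tokens / 1_000 * price_per_1k


def monthly_cost(mode: str, requests_per_month: int,
                 price_per_1k: float = ASSUMED_PRICE_PER_1K_TOKENS) -> float:
    """Dollar cost of a month of requests in the given mode."""
    return cost_per_request(ASSUMED_TOKENS_PER_REQUEST[mode], price_per_1k) * requests_per_month


if __name__ == "__main__":
    # Same request volume, three modes: the bill grows with the token multiplier.
    for mode in ASSUMED_TOKENS_PER_REQUEST:
        print(f"{mode:12s} ${monthly_cost(mode, 1_000_000):,.0f} per month")
```

Because the total is a straight product of unit price, tokens per request and request volume, any reduction in the unit price carries through unchanged to the agentic case, where the token multiplier is largest.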

>> And what are you doing there? Because obviously, the reasoning, multi-step reasoning and then reinforcement learning with human feedback all is going to... And then obviously agents there, it's going to create a massive demand for tokens. So obviously, more tokens as you pointed out, which means more infrastructure. I might have to over-provision or buy a bunch of gear. You're saying get in front of that. What are you guys doing specifically? Because that's a good mission.

Matt Hicks

>> It's simple. Red Hat Inference Server, which is built on vLLM, is to allow any open source model to run on any GPU provider you want and solve the enterprise problem, which is, I bought some great AMD or NVIDIA cards, I'm running this model and I can only get to 20% utilization and I don't know why. That's a kiss of death for a nascent AI project. We want to get you to 90% or 100% so you can use what you bought. And then the scaling is we want to be able to let you make models smaller to make them cheaper just with math and drive more efficiency techniques. Inference Server is that: any card, any model. And then we announced the llm-d work with a bunch of other partners in this to make this work across a cluster as well, which you'll need to ask a lot of questions and get a lot of answers out of an enterprise.
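The utilization point can be sketched the same way. This is a rough illustration under stated assumptions, not a measurement: the hourly GPU cost and peak throughput are hypothetical, the function name is invented for the example, and delivered throughput is assumed to scale linearly with utilization, which real serving stacks only approximate.

```python
def cost_per_million_tokens(gpu_cost_per_hour: float,
                            peak_tokens_per_second: float,
                            utilization: float) -> float:
    """Effective dollar cost per million tokens on one GPU.

    Simplifying assumption: delivered throughput is peak throughput
    multiplied by utilization.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * utilization * 3600
    return gpu_cost_per_hour / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical card: $2.00 per hour, 1,000 tokens per second at full load.
    for utilization in (0.20, 0.90):
        cost = cost_per_million_tokens(2.00, 1_000.0, utilization)
        print(f"{utilization:.0%} utilization: ${cost:.2f} per million tokens")
```

Under these assumptions the hardware bill is identical in both cases; only the number of tokens it buys changes, so moving from 20% to 90% utilization cuts the effective cost per token by a factor of 4.5.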

>> I wish I had an hour with you because one of the things I've been saying on theCUBE with Dave Vellante on theCUBE Pod is it was easy. Back in the old days, you get a server, you load Linux on it. All right? Okay. That's what you do. Now you have Linux everywhere. So I'm guessing where you're going with that. So you're really thinking about distributed Linux kernels around. Maybe we're oversimplifying it or maybe getting it wrong, but it's not one thing. It's not like the old days where I'm running a server, everything's got servers in them.

Matt Hicks

>> Yeah.

>> So is that where you're going with this?

Matt Hicks

>> You could think of there are two. There's the old days, which was servers that we know and love, CPU servers, and you would put Linux on them and you could run any app. And then a ton was built in that middleware, Kubernetes, all of these things. And then servers, we added things like Kubernetes so that we can take a thousand servers and make them operate as a unit. We envision the same thing for AI where instead of CPU servers, you'll have GPU servers and different architectures. Instead of RHEL like we know it, you'll have Red Hat Inference Server or RHEL AI that is purpose-tuned for GPU models. Instead of middleware, there are going to be LLMs that you're running and you will still need to run them in a cluster. So there are a lot of parallels. The technology just happens to be different at almost every tier. But we like that the de facto standards are close enough that it's in our wheelhouse. We know this space, we've been through it for 20-plus years in processes and CPUs. Being able to extend that to AI models and GPUs, it's exciting to see what people will build on that.

>> And also, one big theme. I was talking to someone at an event, here outside the event at an event party or dinner, and they said they think that we're maxing it, not maxing out, but close to getting the step-up function on the hardware. Yeah, NVIDIA's getting better, CPUs are getting better. The software innovation is where the action's going to come from and then it points to DeepSeek and many other things. So software is predicted to be the innovative area. How do you see that interfacing with Red Hat's Inference Server? And by the way, power is a bounding function, bounded by power. So if I have some x86 or compute, I might want to manage workloads across those two, their resource.

Matt Hicks

>> Yeah. Well, and your x86 estate is running the applications that will need to make the calls into these models. So these estates have to coexist. When you look at the Red Hat side, vLLM is to make those models as efficient from a software perspective as possible. That's new forms of memory management, KV cache capabilities. But then llm-d, when you expand that across a cluster, it's only a cluster in that the word is the same. Learning how to do that well with AI models and just a vastly different structure, those are the two areas where software will unlock for enterprises, a typical enterprise, what companies like OpenAI have been doing, and we're passionate about that, of being able to let a company achieve the same results that the larger-

>> Where could people go or code with, to connect the dots to take what you said? Because I think a lot of people are in discovery mode right now trying to figure out, okay, I'm building a generational infrastructure. I know Linux, everyone's comfortable with it. They love where Kubernetes has become de facto on orchestration, platform engineering, check, check, check. Now, the AI stack is looking a little bit different. How could people understand it differently? Is there an open source project? Are there certain things at Red Hat that people can get involved in to learn more?

Matt Hicks

>> vLLM is a technical project and you have to have NVIDIA cards and those, but it's a great starting point. If you're a sysadmin and I have infrastructure and I can deploy this, it's an incredible community. It's a really good starting point there. If you're an OpenShift user and you're trying to figure out how to make that next tier up, starting at OpenShift AI makes for a much simpler introduction for a lot of these pieces across the cluster. That's where we'll land this technology. vLLM or llm-d is a technical entry point to it. RHEL AI as a single server or OpenShift AI would be the two starting points from a user perspective on it.

>> Got it. All right, so I have to go back in time because a lot of the trends, AI Ops was hot a few years ago. A lot of the conversation here at this show, I would say, is about more AI-native; no one's really talking about a lot of the infrastructure other than the neoclouds and the picks-and-shovels vendors and the GPU clouds. So ops have to run everything. So the big question amongst the hallway conversations is will these companies ever stay alive? Will they be durable? Because operationally, it's not just AI talking to prompts, you have to run stuff on it. As an operator, not an operator. You're CEO, of course you're not an operator in that sense.

Matt Hicks

>> I was an operator.

>> But as someone operating infrastructure, this is where I see platform engineering bringing a lot to the table. What's the operating infrastructure requirements from Red Hat's perspective and your perspective that needs to be in place? Because a lot's coming on top. One is the calls, you mentioned that. Is there anything else you can share?

Matt Hicks

>> We're starting to reimagine even our core products for this world. RHEL 10, to oversimplify to a really important feature set, it runs immutable as image mode. That's a really big step: if you're running thousands of these in containers and they're volatile, being able to roll back and forward and controllers will be critical. So rethinking the admin perspective of RHEL has been a really important area there. And then second, we announced we will be adding MCP capabilities on top of our products so that if you have orchestrators of MCP, you can talk to those RHEL instances and do things with them. I think that's a level of maturity we know of. We know you will run a lot, it'll be volatile. We're going to take what we learned in OpenShift and amplify it. It'll be different, but we know containers are a common path. We think MCP will be a very strong standard in talking across these. And then part of it is just keeping up with the pace, seeing what people do, learn from it, build it into the products, enable the next level.

>> Yeah, MCP's been a nice surprise this year. It's almost become a de facto standard from the community. And I think that's a rallying point and that's what we're trying to tease out: do you see a rallying point in the developer community above Red Hat, in the AI stack above you? MCP was one. A2A's getting some traction. There's some difference between state and stateless. It'll be one of those things that they'll let Jim Zemlin at the Linux Foundation figure out. I know he's going to probably add a project there. What's the rallying point in the enterprise around the up stack more? How do you see that evolving? Because for startups and enterprises, there's not yet a Kubernetes-like vibe. Is there something that you see or that needs to be in place that could motivate the community to, "Hey, let's just all decide, let's do this and then we'll move to the next thing."

Matt Hicks

>> I'll be honest. I think the thing that will motivate communities is the first real hybrid successes of enterprises saying, "I can use frontier models, but I can also do these pieces myself on small models." And that's very early right now in enterprises being able to solve really concrete problems with the infrastructure bets, the running them at capacity. That's where we are, is in that early stage of getting the first wins. Because I'm pretty confident if you can solve simple problems on one model with singular calls, you will move into multi-step reasoning. You will move into agentic and you will solve more and more problems. But I think most enterprises, they're not past that first phase yet of, I know my architecture, I know how to run OpenShift AI. I know my balance of NVIDIA and AMD or Groq going forward. Those are the pieces, I think, we need in place that will then let agentic work really thrive on it.

>> Yeah. Leverage the foundational bedrock, if you will. Could you explain the one call thing? Because I think that's worth revisiting and I thought that was good. So your thesis is minimize the cost per token through calls. Number of calls. Explain that.

Matt Hicks

>> It's minimize the cost per token just on a raw software perspective, because for you to do more powerful things, which we know works, you will make exponentially more calls and more token usage. And so, if we can't keep the token costs at a rock-bottom efficiency level, you will never get to agentic because to solve it, your GPU costs will be higher than solving another one.

>> It's a utilization. It's just efficiency.

>> It's just good engineering. Matt, I know you got a hard stop. Thanks for coming in theCUBE, great to see you.

Matt Hicks

>> Awesome. It's great to see you too.

>> Congratulations on a very successful Red Hat Summit recently. Of course, theCUBE was there. theCUBE's in Paris right now getting all the action. We've got to get those token costs down because the context windows, the reasonings required, there is a demand for compute. There's demand for GPUs. The software layers are coming. Of course, theCUBE is here. Thanks for watching.