NVIDIA GTC 2026 | Andy Pernsteiner, VAST & Ace Stryker, Solidigm & Brennen Smith, Runpod

This panel at NVIDIA GTC '26 examines high-density storage, disaggregated architectures and AI infrastructure. Hosted by John Furrier, co-founder and co-CEO of SiliconANGLE Media Inc., the discussion brings together industry leaders to explore how storage shapes modern AI stacks and data center architecture. Andy Pernsteiner, field CTO of VAST Data; Ace Stryker, director of AI and ecosystem marketing at Solidigm; and Brennen Smith, CTO of Runpod, examine storage density, energy efficiency, SSD innovation and the role of software in enabling high-density commodity hardware to meet demanding AI workloads. The panel addresses disaggregation, global scale and developer-driven use cases. Speakers discuss SSD developments such as Solidigm’s 122TB SSD and technologies such as KV Cache, Parquet, Engram and DeepSeek-R1, and they consider the implications for unified fabrics and AI cloud deployments.

Key takeaways highlight measurable business and technical impact. Smith reports that pairing GPUs with high-quality storage increases Runpod's margins by 12% and attributes the improvement to the additional use cases that attached storage unlocks. Pernsteiner emphasizes that storage is central to AI and that software enables high-density commodity hardware to deliver performance while reducing power consumption and rack footprint. Stryker highlights Solidigm’s density roadmap and discusses its implications for energy usage, rack space and infrastructure complexity.

Watch the full panel for detailed technical insights and practical guidance for architects, storage engineers and AI infrastructure teams.


Ace Stryker

Director of AI & Ecosystem Marketing

Solidigm

Andy Pernsteiner

Field CTO

VAST Data

Brennen Smith

CTO

Runpod


In this interview from the Nvidia GTC AI Conference and Expo, Andy Pernsteiner, field chief technology officer of VAST Data, joins Ace Stryker, director of AI ecosystems marketing at Solidigm, and Brennen Smith, chief technology officer of Runpod, to talk with theCUBE's John Furrier about why high-density storage has moved from afterthought to the critical enabler of AI infrastructure at scale. Smith reveals that pairing GPUs with VAST storage increases Runpod's margins by 12%, a repeatable result that proves GPUs generate more revenue when backed by fast, sc...


John Furrier

>> Hello, I'm John Furrier with theCUBE. We are here at GTC. We're in the VAST Data VIP lounge, a big display and activation here where all the storage deals are being done. The big infrastructure providers are all getting ready to build out that next wave of AI infrastructure. We're going to talk about the high-density storage that's powering the next-generation cloud, as we hear about the stack in AI: memory, storage in real time, AI agents, all enabling this next value proposition. Andy Pernsteiner, field CTO of VAST, is here. Ace Stryker, director of AI ecosystems marketing at Solidigm. And Brennen Smith, CTO of Runpod. Guys, thanks for this panel. Let's riff a little bit. Brennen, we'll start with you. What are some of the challenges you're seeing as you're building out the infrastructure and looking at what's going on in the market? Everybody wants to squeeze as much performance out of things as possible. Energy is the bounding function. It's the lower end of the cake, as they say, the five layer cake. What are some of the challenges that you're trying to solve?

Brennen Smith

>> Yeah, it's a really good question. I mean, at the end of the day, everyone's looking for an edge. Everyone wants to find that way to get an extra byte out of a drive to be able to make that happen. We spend a lot of time talking about the shortages, but I think the other side to look at it is what does storage unlock in the AI space? At Runpod, we see that GPUs paired with high quality storage, we use VAST very heavily, that increases our margins by 12%. That's a very repeatable playbook. It's very consistent in our data. GPUs make more money when they have storage attached to them. And that's because it unlocks so many different use cases that realistically are very difficult to achieve otherwise. So this is one where what we look at is how do we work with the right partners in making sure that as we're scaling, as we're growing, we're able to leverage our storage techniques, which makes it a better experience for the customers, which at the end of the day results in a better product.

John Furrier

>> Andy, why is storage becoming so critical? I mean, you got to feed the memory, you got to feed the data.

Andy Pernsteiner

>> When we started out, our main priority was to ensure that customers could get the most out of whatever hardware purchases that they were going to make across the entire stack, because obviously storage is more than just SSDs. You have to have networking, you have to have memory, CPU, you have to have interconnects, you have to have everything to bring it all the way to the application layer. And so we focused most of our energy on reducing the cost that people would have to spend to deploy the hardware infrastructure. Because we're a software company and we're not incentivized to sell hardware, but we need it in order for things to function. And so one of the things that we've been seeing over the course of the last several years as AI has gained more and more mainstream adoption is customers are realizing that they need access to more of their corpus of data to gain insight and they need a place to put it that's extremely fast, scalable, and cost effective. And so we rely on our partners to make sure that we're getting the best in breed, not only from an efficiency standpoint for cost, but also from a density standpoint, because part of the layer cake is power and people don't want to spend money on power if they don't have to. And so we're trying to reduce the footprint required to deploy infrastructure, not only at the sort of SSD layer, but also the power required to deliver-

John Furrier

>> And you guys are doing really well. What'd you think of the keynote? Because you guys have had real great rocket ship success, high scale velocity, as we say on theCUBE. Jeff gives us updates and won't say numbers, but we know that you guys are doing some good business. You're in the flow. Jensen talked a lot more about storage this year than ever before. I mean, you always have the storage, but this GTC, he nailed it. He's like, "We need to make that data. Humans can wait a second, but AI can't." Talk about what the impact of storage is.

Andy Pernsteiner

>> So our lens is obviously the one where we're, to be honest, scrambling to make sure our customers can get the allocations they need, right? The number of exabytes deployed last year is going to be set to triple from our standpoint this year. And the manufacturers are doing their best to keep up with everything, but the demand is through the roof. So again, our goal is let's reduce the footprint that they need to get the most that they can. But we've always felt that storage was the center of AI. It just took a little while for everybody to sort of focus energy on it. And now I think it's obvious, not only because-

John Furrier

>> And that pushes the density too. I mean, as it becomes closer, it's denser.

Andy Pernsteiner

>> Yeah.

John Furrier

>> Everything's dense.

Andy Pernsteiner

>> Well, and let's also look at it this way. A lot of times when people are deploying, let's say a gigawatt data center to deploy lots and lots of GPUs, storage used to be one of the main afterthoughts. It was something that people thought about after they were deploying, but they realize now they can't do that. But the way we can make it easier for data center providers, for builders, is to make sure that it can fit into the smallest footprint necessary and provide the most access and performance possible so that they can make sure those GPUs are continually humming.

John Furrier

>> Ace, where does Solidigm fit into this architecture? Because you guys, again, you have great product success. It's well known that the demands are high, saw the wafer, I saw the 122 terabyte small drives. What's your piece in here?

Ace Stryker

>> Well, we make the storage device itself, right? Which means our job is really to enable these guys to do the kind of work that really delights their customers, right? In my view, Runpod does a great job of making GPUs consumable and removing all the friction in that process, making it seamless for the end user, right? VAST has a particular strength in making sure those GPUs are highly utilized by feeding them data across this sort of disaggregated approach, right? And making sure that you don't have expensive compute resources that are sort of stranded and starving for data and running at 50% utilization, right? Underpinning that is the capability of the storage device itself, right? And so what our engineers are doing is in the lab battling the laws of physics to try to get to that next density point. We launched a 122 terabyte SSD last year. We've announced our ambitions to double that in the near future here, right? That has very real implications on energy efficiency, which we hear from a lot of our customers is the key constraint and the key concern. Also, rack space, infrastructure complexity. There's a lot of ways in which that benefit shows up. And so whether it's the stuff that lives in a GPU server where you care about really high bandwidth and getting that hot data served to the GPUs as quickly as possible, or whether it's more of a shared storage across the network where density is the name of the game, Solidigm's always worried about those things and about how folks are using these kinds of solutions in the real world to make sure that our products can support and enable those to the extent possible.
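
As a rough illustration of the density point, the sketch below counts how many drives a raw capacity target needs at 122 TB per drive versus a doubled 244 TB drive. The 10 PB target is a hypothetical figure chosen for the example, and the calculation ignores redundancy, spares and formatting overhead.

```python
import math


def drives_needed(capacity_tb: float, drive_tb: float) -> int:
    """Whole drives required to reach a raw capacity target (decimal TB)."""
    return math.ceil(capacity_tb / drive_tb)


# Hypothetical target: 10 PB of raw capacity, i.e. 10,000 TB.
TARGET_TB = 10_000

for drive_tb in (122, 244):
    count = drives_needed(TARGET_TB, drive_tb)
    print(f"{drive_tb} TB drives needed for {TARGET_TB} TB: {count}")
```

Halving the drive count for the same capacity is where the rack space and power savings discussed here would come from, although per-drive power draw is not modeled.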

John Furrier

>> Brennen, how does this impact you on the partnership? How does it affect Runpod?

Brennen Smith

>> I mean, it's a big one. Talking about density, it's common that there will be cases where we meet the physical requirements or the entire maximum of a DC space. If we can get more dense, more compact, that lets us be able to have a better monetization. But I think zooming out a little bit, the common theme here is speed. The speed to access, the speed to load. Something that I think is going to become more and more pressing is just all the IO stall time. What can we do to cut that down? These GPUs are expensive to run. They're expensive to have sit cold as models are loaded in. What tricks can we do, especially we have the application layer, we have the physical layer, we have the layer that connects it all together? What can we do to make it a better experience? One thing I'm excited about that VAST is doing is building out a number of ... I don't know what the proper term you guys call it is, but the features on top which let you actually meta analyze the data. So Parquet data that then you can query as like actual SQL. That's really valuable if I can offer that to a customer and say, "Hey, load your structured data in. Not only can you use it for AI, but you can also do traditional analysis on it." That bridges gaps. So I think looking at, A, how do we cut that down, but also B, how do we bring better value to customers at scale? 500,000 users, they want these types of features. That's what I'm personally really excited about.
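
To make the IO stall point concrete, here is a minimal back-of-the-envelope sketch of what idle GPUs cost while a model loads. Every number in it (model size, read throughput, GPU count, hourly rate) is a hypothetical assumption for illustration, not a Runpod or VAST figure.

```python
def load_seconds(model_gb: float, read_gb_per_s: float) -> float:
    """Time to read model weights from storage at a sustained throughput."""
    return model_gb / read_gb_per_s


def stall_cost_usd(gpu_hourly_usd: float, gpus: int, stall_seconds: float) -> float:
    """Cost of GPUs that sit idle for stall_seconds while waiting on I/O."""
    return gpu_hourly_usd * gpus * stall_seconds / 3600.0


# Hypothetical scenario: a 140 GB model loaded onto 8 GPUs billed at $2/hour each.
for label, throughput in (("slow storage", 2.0), ("fast storage", 20.0)):
    seconds = load_seconds(140.0, throughput)
    cost = stall_cost_usd(2.0, 8, seconds)
    print(f"{label}: {seconds:.0f} s stalled, ${cost:.3f} of idle GPU time per load")
```

The per-load cost is small, but it multiplies by every cold start across a fleet, which is why load speed matters at scale.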

Andy Pernsteiner

>> That's awesome you mentioned things that I probably forgot about. But basically, one of the things that we started out with was let's use the highest density possible commodity grade hardware to give the best experience to customers. And you wouldn't typically think of commodity hardware. You wouldn't think of high density SSDs as being the way that people would go for a performance required use case. That's again, where our software comes into play to ensure that people can get the best possible experience. If you think about speed, there's a lot of other elements too. Not only is it about feeding the GPUs quickly, but it's also about allowing the GPUs to offload certain things to storage so that they don't have to do extra work and can serve more users, right? KV Cache in conjunction with working with Nvidia on their Dynamo project is a good example of how we're leveraging storage to give the GPUs more time back to be able to do more inference, right? If you're building out inference as a service as a provider, the amount that you spend on GPUs is correlated to the number of users, to the amount of context that they have, the number of turns. But if you can allow some of that to be offloaded in a highly accelerated, highly performant way and free up the GPUs, well, that's more users you can support. That's more density you can get from a user to watt standpoint. These are terms that probably five years ago people didn't think about. Now you think about user per watt, you think about user per gigabyte, et cetera.
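
The offload idea can be sketched as a two-tier cache: a small fast tier standing in for GPU memory and a large slow tier standing in for SSD-backed storage. This is a toy model only. The class and method names are hypothetical, it is not the VAST or Nvidia Dynamo API, and real systems move blocks of attention key/value tensors, not opaque byte strings.

```python
import hashlib
from collections import OrderedDict
from typing import Dict, Optional, Tuple


class TieredKVCache:
    """Toy cache: least-recently-used entries spill to a slow tier instead of being dropped."""

    def __init__(self, fast_capacity: int) -> None:
        self.fast_capacity = fast_capacity
        self.fast: "OrderedDict[str, bytes]" = OrderedDict()  # stand-in for GPU memory
        self.slow: Dict[str, bytes] = {}                      # stand-in for SSD storage

    @staticmethod
    def key_for(prompt_prefix: str) -> str:
        """Identify a cached prefix by the hash of its text."""
        return hashlib.sha256(prompt_prefix.encode("utf-8")).hexdigest()

    def put(self, prompt_prefix: str, kv_blob: bytes) -> None:
        key = self.key_for(prompt_prefix)
        self.slow.pop(key, None)          # drop any stale copy in the slow tier
        self.fast[key] = kv_blob
        self.fast.move_to_end(key)        # mark as most recently used
        while len(self.fast) > self.fast_capacity:
            old_key, old_blob = self.fast.popitem(last=False)
            self.slow[old_key] = old_blob  # offload rather than discard

    def get(self, prompt_prefix: str) -> Tuple[Optional[bytes], str]:
        key = self.key_for(prompt_prefix)
        if key in self.fast:
            self.fast.move_to_end(key)
            return self.fast[key], "fast"
        if key in self.slow:
            blob = self.slow.pop(key)
            self.put(prompt_prefix, blob)  # promote back into the fast tier
            return blob, "slow"
        return None, "miss"                # caller must recompute from scratch


cache = TieredKVCache(fast_capacity=2)
for session in ("session-a", "session-b", "session-c"):
    cache.put(session, session.encode("utf-8"))

print(cache.get("session-a")[1])  # served from the slow tier, then promoted
print(cache.get("session-a")[1])  # now served from the fast tier
print(cache.get("session-z")[1])  # never cached
```

A hit in the slow tier costs a storage read, while a miss costs a full recomputation on the GPU. Avoiding that recomputation is how offload gives GPU time back for more users.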

John Furrier

>> I mean, it's basically a huge contrast from the traditional infrastructure stacks. I mean, because you're talking about density, you talk about unified fabrics, you're talking about speed. What should people know about this new stack that's emerged? Because I mean, Nvidia is basically saying, "Look, this is what we're building. It's brand new. This gear is great. It's fast. Got to get the software stacks right." What is it about the software stack that makes it unique?

Brennen Smith

>> I mean, I think the big one that I'll be pushing VAST on is how to scale beyond the DC. I think that's going to be the big question, and I say that jokingly, I think it's going to be across the board is how do we actually get past a singular DC? How do we make it so we're bonding many, many DCs together into regions, and then at a global scale? You have to fight physics. You have to fight economics. It's arguably the hardest problem to solve. But in my mind, those who figure out how to do global scale storage to be able to maximize resources around the globe, those are going to be the winners in this game.

John Furrier

>> I mean, density doesn't go away. Certainly the AI factories, the big ones will be highly dense. Jensen said to us privately in the analyst meeting, we weren't allowed to record it, but I did write it down. He goes, "I show boring slides that no one is clapping on, but that's the future. And everyone wants to clap for the good stuff like the new Vera Rubin." And one of the things that's happened here that points to this distributed infrastructure is they showed the AI grid. It was kind of a telecom in the side corner across the street. It speaks to distributed factories. That's going to come down the pike super fast. I mean, it's not mainstream now. There's a lot of reference architectures, but this goes to distributed computing. Any thoughts, guys, on where this goes with that whole idea of fast, speed, dense, make more of it, build more of it, build it everywhere?

Andy Pernsteiner

>> Yeah. I think what we've found, and especially this week, talking with customers and potential customers, geo distribution is part of their strategy because power right now is a challenge to find in the right locations. And so kind of people are going where they need to go for power, but that isn't always so useful if they have to distribute all their data centers to random places. And so they need a way to coordinate how that data is moved. And so we do have a global namespace which allows for customers to be able to process on data, which may have originated somewhere else. A few of the clients I was talking to this week were talking about factories that would have locally sourced data, but they need their training to run wherever the GPU capacity is the cheapest at that particular moment in time. And so stitching these things together in a way that allows customers to not have to worry about where the data sits, but rather where maybe the compute costs are the lowest, that's the kind of thing that we see people doing more and more. And that kind of ties into a lot of the conversations happening.
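
The trade-off described here, running training wherever GPU capacity is cheapest even though the data originated elsewhere, can be sketched as a simple cost comparison. The site names, prices and transfer costs below are hypothetical, and the sketch says nothing about how a global namespace is implemented. It only shows why the cheapest site depends on both compute price and the amount of data that has to move.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Site:
    name: str
    gpu_usd_per_hour: float   # hypothetical price of GPU capacity at this site
    ingest_usd_per_tb: float  # hypothetical cost to move remote data to this site


def job_cost(site: Site, gpu_hours: float, data_tb: float, data_home: str) -> float:
    """Compute cost plus data movement cost if the data lives at another site."""
    transfer = 0.0 if site.name == data_home else data_tb * site.ingest_usd_per_tb
    return gpu_hours * site.gpu_usd_per_hour + transfer


def cheapest_site(sites: Iterable[Site], gpu_hours: float, data_tb: float, data_home: str) -> Site:
    """Pick the site with the lowest total cost for this job."""
    return min(sites, key=lambda s: job_cost(s, gpu_hours, data_tb, data_home))


sites = [
    Site("factory", gpu_usd_per_hour=3.00, ingest_usd_per_tb=20.0),
    Site("remote", gpu_usd_per_hour=2.00, ingest_usd_per_tb=20.0),
]

# A long training run amortizes the data movement; a short job does not.
print(cheapest_site(sites, gpu_hours=1000, data_tb=10, data_home="factory").name)
print(cheapest_site(sites, gpu_hours=50, data_tb=10, data_home="factory").name)
```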

John Furrier

>> I guess my next question, our last question is, what does this mean for the future for AI, AI clouds? Because neoclouds are just turning into hyperscalers. I mean, you're going to see more and more providers. You guys are building it out. I mean, the demand curve is so high, but there's also use cases that are different. Where's the future of AI clouds?

Brennen Smith

>> Well, that's the trillion dollar, multi-trillion dollar question. I think the most fun part about the industry is we're all figuring it out in real time. That's why it's fun being at these types of events because you hear some incredible use cases. I think at the end of the day, we are in the exciting Wild West time. This is where the opportunities lie. This is where the most action to be made is right now. I think the biggest question that we'll have to contend with is, frankly, it gets back to what I agree with, is locality, is how do you make the entire world's resources available? And that's an incredibly difficult problem that no single layer can solve on its own. So those who do it well. Great opportunity.

John Furrier

>> Take a stab at it.

Andy Pernsteiner

>> One more stab. I mean, what I'll say-

John Furrier

>> Philosophical....

Andy Pernsteiner

>> is that we'd love for a 244 terabyte drive tomorrow. One of the things that we started out to do early on was to make sure that once something was available from a manufacturer, we could use it. People could use it at scale. We can mix it with previous generations. So Solidigm's been making drives for years and we're able to use all of the different capacities and versions of them in the same cohesive system without losing performance and without giving anything up. So give us something new and we'll use it right away.

John Furrier

>> Ace, close us out. You're in high demand. You're the one holding all the cards. You got a lot of friends. I mean, ice cream guy's less popular than Solidigm right now here today. You guys are in high demand. Big smiles.

Ace Stryker

>> It's an interesting time to be in the business because I think to an extent we are used to being an afterthought, honestly, right? Storage historically has been a relatively small part of the bill of materials in a data center. And I think Andy put it well when he said data's always been at the center of AI, but that's really showing up in newer and bigger ways in call it the last 12 months than ever before. And there are definite developments on the inference side that you can point to as demand drivers. And we've talked about a lot of them here at this show, RAG data and the need to vectorize that to make it searchable quickly. All that has to live somewhere. That's incremental bit demand. KV Cache from these models with huge context windows and agents pounding them constantly with prompts. All that's got to live somewhere. Another one on the horizon that we're kind of keeping our eye on is this Engram thing from DeepSeek that appears to be a way to build sort of a cheat sheet lookup table for common token patterns. All that is data that's going to have to live somewhere, right? And so we don't anticipate the need for fast storage to wane at any point in the near future. In fact, it's headed in the opposite direction.
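
For a sense of scale on the KV Cache demand mentioned here, the sketch below applies the standard size estimate for transformer attention caches: one key and one value vector per layer, per KV head, per token. The model shape used (32 layers, 8 KV heads, head dimension 128, 16-bit values) is a hypothetical configuration for illustration, not any specific model discussed on the panel.

```python
def kv_cache_bytes(
    tokens: int,
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    bytes_per_value: int = 2,  # 2 bytes for 16-bit floating point
) -> int:
    """Bytes of KV cache for one sequence; the leading 2 covers keys plus values."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value * tokens


# Hypothetical model shape.
SHAPE = dict(n_layers=32, n_kv_heads=8, head_dim=128)

per_token = kv_cache_bytes(1, **SHAPE)
full_context = kv_cache_bytes(131_072, **SHAPE)  # a 128K-token context window

print(f"per token: {per_token / 1024:.0f} KiB")
print(f"128K-token context: {full_context / 2**30:.0f} GiB per sequence")
```

Multiplied across many concurrent users and agent sessions, a per-sequence footprint of this size quickly exceeds GPU memory, which is the pressure that pushes the cache out to storage.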

John Furrier

>> I mean, if you're a technologist in this era right now, it's a dream scenario. Whether you're solving energy problems, AI infrastructure, if you're a data nerd or a developer, I mean, it's a really perfect storm of innovation. I mean, I've never seen anything like it. The developer angle is so huge. That thing is coming on top of the demand curve for tokens. So you got a lot of people hungry for AI right now.

Brennen Smith

>> My capacity planning team is not sleeping right now.

John Furrier

>> You guys are doing a good job. Go faster. All right. Thanks for coming. Thanks for this panel. Appreciate it.

Brennen Smith

>> Thank you.

John Furrier

>> The demand is exploding and the developers are hungry for all the action. This is what's happening. And again, Nvidia is just getting started. The ecosystem is changing and evolving super fast of the AI stacks up and down the five layer cake. I'm John Furrier, your host of theCUBE. Thanks for watching.