theCUBE + NYSE Wired: Physical AI & Robotics Leaders QA2 | Manvinder Singh, Redis

Clips
More from theCUBE + NYSE Wired: Physical AI & Robotics Leaders QA2

Manvinder Singh

VP of AI Product

Redis

play_circle_outline Maximizing AI Performance: Redis's Essential Role in Real-Time Data Management and Modernization for Scaling Applications

play_circle_outline Overcoming Challenges in Productionizing AI: Infrastructure, Inference, and Onboarding in Enterprise Applications

play_circle_outline Empowering Developers in AI: Building Robust Architectures with Redis's Flexible APIs for Lower-Level Design

play_circle_outline Trends in AI deployment on-premises and its comparison to cloud solutions.

Info
Transcript

Manvinder Singh, Redis

Manvinder Singh

VP of AI Product Redis

Manvinder Singh, vice president of AI product at Redis Inc., joins theCUBE’s John Furrier and Dave Vellante during theCUBE + NYSE Wired: Robotics & AI Infrastructure Leaders 2025 event to explore how high-performance databases are evolving to meet the needs of modern AI. The conversation focuses on Redis’s work with semantic caching, vector search and AI-ready infrastructure.

Singh outlines how Redis is optimizing latency, inference speed and real-time data retrieval to support scalable AI applications. As demand for intelligent agents and operationa... Read more

explore Keep Exploring

What is the role of Redis in the context of AI applications and how does it relate to large language models? add

What are the current challenges customers face when productionizing AI applications? add

What is the company's approach to empowering developers with their AI architecture and product offerings? add

What experiences before joining Redis did someone have with AI and cloud technology, and how is AI being implemented on-premise? add

bolt Powered by CUBE AI

Manvinder Singh, Redis

search

>> Welcome back to theCUBE coverage here in Palo Alto of the Robotics and AI Leader series. I'm John Furrier with Dave Vellante, host of theCUBE. Got a great lineup, three days of coverage, really digging deep into the future of AI as it understands the infrastructure benefits and scales up the agents and all the data. A lot of open source content. Manvinder Singh is here, VP of AI product management at Redis, big part of the community and also enabling a lot of the value. Great to see you. Thanks for coming on the program.

Manvinder Singh

>> Thank you, John. Great to be here with you.

>> You guys have been leading the infrastructure side. We've been a customer, I think only back a decade, open source side. And then you guys have been in all the cloud native, all the development cycles. You've seen kind of that wave of SAS hit cloud SAS, you've seen the workflows, you're embedded. AI is changing the game, we just had VAST Data on. They're talking about things that they're doing. You have other leaders on robotics thinking, hey, I just want the data. I want to get the machines to work. So data and orchestration, making sure things are persistent, this is like your wheelhouse. So I have to ask you, what has changed for Redis in the past, say 18 months? Because now the game is still the same, but it's shifted a lot. What's different?

Manvinder Singh

>> Yeah, no, absolutely. So as you mentioned, we have seen the previous waves that happened. Redis has been a core part of the web application stack, the mobile application stack. And what we're doing primarily there was basically enabling real-time data and making things fast, getting rid of the spinning wheel and removing latency. And we kind of see ourselves doing the same thing with the AI wave as well, because as you know, the AI applications, whether it's agents, it's RAG applications, they largely rely on large language models, LLMs, as the main engine powering these applications. But the large language models as impressive as they are, they are generally stateless. And so you need a layer of memory that is fast that can add context to these applications, that can add long-term memory, user preferences, things like that, that you can store and achieve really fast. So that's really what we're doing in this wave as well.

>> It's interesting, the word memory has been kind of, I won't say kicked around, but changed nuance. But we see things like on chat, speed thinking or memory mode. So there's state there and then we start thinking about real memory. Memory that's like where the data is stored close to the GPUs. It's integrating into the workflow specifically around how applications are interacting with the data. What's the core challenge customers are having that you see in, I won't say migrating or I'll just say modernizing because that's kind of the key word, as they transform into AI without missing a beat? What are your customers doing? What are they implementing? Can you share your thoughts on where they're at right now?

Manvinder Singh

>> Yeah. So I would say there's probably two or three really big challenges that customers have started to see as they're productionizing AI applications. One is just supporting inference at scale. I think people have started realize the AI inference we're seeing with different large language models, be it from OpenAI, Anthropic or wherever, it's going to grow 1000 fold from here. And how do you manage the latency that comes with that inference? How do you manage the cost that comes with that inference? And we at Redis do that through a product for semantic caching we just launched, where you can cache the LLM responses. But that is one big challenge, which is managing inference growth. The second big one is how do you actually onboard these AI applications or agents? Not necessarily a challenge for Redis alone to solve, but how do you give these agents access to your ERP systems, to your HR systems? How do you maintain chain of custody? So we're seeing that the technology for building agents is almost there or it's getting better, but onboarding these agents is a challenge for companies still. And the third thing I would say is how do you make these agents act like humans where they remember things like humans? That's the memory piece that you just mentioned. How do you store the memories? What to store, how do you figure out what to store and how do you retrieve them really fast?

Dave Vellante

>> You mentioned how you guys get rid of the spinning wheel. You're known for blazing fast in memory, sub-millisecond. As you add things like vector search and embeddings, how do you maintain that sort of SLA when you have to add in those capabilities? What are you doing with the core product and architecture to accommodate that without compromising your heritage?

Manvinder Singh

>> Yeah, no, so we brought our DNA of performance to what we do in the AI stack as well. We have mentioned semantic caching, the goal there is to take out latency from inference. Our other big product is vector search, you mentioned that, Dave. And there again, we focused on performance and speed as one of our big differentiators. The way we do that is, A, well, Redis is in memory so that comes with advantages of speed of course. But also at the same time, we've made sure that we can scale our vector search horizontally really, really well. So you can take advantage from multiple shards if you have scale, you can scale vertically really, really well. So you can add more vCPUs if you need more performance, more throughput. And also there's nuances around the algorithms you use for vector search. How do you do filtering? How do you do post-filtering, pre-filtering? And really the DNA of the company is around speed and performance, and we implement that in all the different stages when we're building our products.

Dave Vellante

>> I want to ask you, it was interesting the snowbricks, I guess Tony Bear calls it. And we always learn some things and put some things into perspective. Sam Altman at Snowflake said some things that were somewhat contradictory and then Ali Goetz... And I'll share and I would love to get your feedback. You must be thinking about this from your long-term roadmap. Sam said something like, "Well, you think about today, it's LLMs talking to a database. That's not the ideal architecture. I'm not saying that there's anything coming soon, but we know the limitations of LLMs and your ability to query and the like." Ali said, "Well, nothing's really changed in database in years since I was at Berkeley and studied the AR papers."

>> A lot of hate makers being prone.

Dave Vellante

>> Yeah, a lot of hate makers. So we of course love database. So how are you guys thinking about the future of database and the interactions with LLMs and agents? Do you feel like-

>> Query engines are out there.

Dave Vellante

>> Do you feel like it's under pressure and you need to evolve or is it like a tailwind for you guys? How do you think about that?

Manvinder Singh

>> So we think of it as a tailwind for us, primarily because we don't think of ourselves as being the offline or operational data business really. Redis is really a data delivery platform that can complement whatever your data infrastructure is in the background, whether it's Snowflake, it's Databricks. Redis is seen primarily as the platform that can deliver data to your agents. One of the interesting things about agents, I was talking to a startup that was building agents, and agents rely on iterative loops to find things, to query things. Imagine giving these agents, which is a non-deterministic system, access to your operational database, think of the QPS that could lead to. So Redis can be that layer in the middle that can be in simple terms a cache, but really a delivery data delivery platform that can shield your operational systems. At the same time deliver data to your agents that they need.

>> And this is where performance comes in. And this is why I want to come back to the performance because the agent's ability to get data at the right time becomes super important on the delivery piece. So all the buzz right now, if you look at all the things that we love to talk about is KV cache. These are networking concepts. These are delivery mechanisms, moving data from point A to point B, having memory, understanding the cache. Obviously there's a power issue going on, or how do you reduce the loops it takes to whether you're jumping on across hops, whether it's storage or networking. These are like technical things, but this matters because if the agents don't get the data it's like being slow or they miss out. So that this is where hallucinations can come in or out-of-context decisions. Share your thoughts on why that's important. And two, what are customers doing to kind of inject the QA and the delivery or the quality? Because evaluation's hot right now, I got to evaluate what I'm doing first and then understand the delivery piece.

Manvinder Singh

>> Well, there's a lot of work going on in the world of guardrails for your agents. There's multiple ways people are doing this. You're implementing LLMs that can actually check the output from other LLMs. We at Redis have a product we call the Semantic Router, which basically does a vector search to figure out if certain kind of queries match a pattern that you want to avoid. I will say it's still early stages as far as that stack is concerned. But we have started to see agents, despite the concerns around hallucinations and things like that. We have started to see killer apps amongst agents start to emerge. And people are putting those in front now.

>> And I want to get your thoughts on this because obviously we've been around the block, we've seen these waves a bunch of times. There's a generational shift happening. A lot of young developers are coming in, they might not know Redis or "Hey, my dad used Redis." So how do you talk to that generation about the relevance of Redis and why you're not modern legacy or, hey, we've been there, done that and we have this today? What's the message to the young guns out there, "Ah, Redis is old school?" Talk about that because I'm not saying you are old school, definitely-

Manvinder Singh

>> No, absolutely. It's a great question. Thank you for asking that. So first of all, I would say Redis is still relevant today for the same reasons it was in the past, which is to add speed and performance to your applications. It still has been consistently ranked as either the number one or number two most admired database by Stack Overflow survey consistently through the years. I would also say it is not your dad's Redis still, there's a lot more innovation we have done. I talked about the products we build in AI with vector search, with Langchain which is our semantic caching product. There's products we're building for agent memory. So there's the whole AI side of Redis. But even on your cache side, we have a fully modernized cache stack now available for you, which includes things like Redis Flex, which is bringing Redis onto SSDs as well. So you have a true two-tier cache. We also have RDI, which is data integration for Redis. So if you're using a PostgreSQL database, we can keep your Redis in sync with a product that allows you to build these pipelines.

>> Sorry, one quick question follow up. Talk about the two-tier cache because this is where... And it's in the weeds, but I just want to get there. Why is the two-tier important?

Manvinder Singh

>> So in 2009 when Salvatore Sanfilippo he built Redis, back then the speed that you had with DDR RAM, now you can get faster speeds with SSDs. So-

>> As a storage tier?

Manvinder Singh

>> As a storage tier. So with NVMe storage you can get faster speeds.

Dave Vellante

>> For better economics, of course.

Manvinder Singh

>> Exactly. So it just made sense to bring Redis onto disk as well and offer a true two-tier cache. With Redis Flex, that's the name of our product, we keep the hot data on RAM and the less frequently accessed data on SSDs. So you get a true two-tier cache.

>> And that's important because as reasoning comes in multi-step, that's where the memory efficiency is important, is that right?

Manvinder Singh

>> You may have terabyte-scale use cases, you may want to store more memories, and so storing on disk is cheaper than storing on RAM. So it enables that whole side for you.

Dave Vellante

>> So I want to come back to shifts in architecture. And when you think three to five years out... Maybe we shouldn't be thinking that far, but thinking out and you think about the AI architecture, and I want to map that to the Redis core design principles. The GenAI was an oh crap moment for some companies, for some architectures. X86 is the most obvious one. But as you think about change in model size, both large and small, as you think about distributed inference, something you were talking about before or edge deployments, how do you think about the evolution of the core design of Redis and mapping to the AI architecture changes? How are you thinking about that?

Manvinder Singh

>> So look, our philosophy has always been to empower developers and that translates into offering them lower level APIs, for example, and giving them the chance to use Redis the way they want. So we are really building building blocks essentially or Lego blocks that the developers can then use to build the architecture of the agent that they want. And so with our semantic cache, our vector search, our agent memory products, we launched a new data type within core Redis called vector sets. So again, built by Salvatore who rejoined the company, these are all building blocks that developers can place wherever they see fit, because I feel like the architecture of the agent itself will go through a lot of evolution and developers will experiment. They'll find new ways of doing things. And so our job is really to give them that power to use Redis the way they want.

Dave Vellante

>> You and I were talking earlier this morning about the whole trend. You guys are in the cloud, you're on-prem, you've got a hybrid architecture, what are you seeing in terms of the trends for organizations? The term repatriation is something John and I never really bought into. It was a very small sort of trend. We are not repatriates, but we are definitely seeing a trend to bring AI to the data on-prem. What are you seeing? I know you've shared that you've done some benchmark work on-prem, and what's going on there?

Manvinder Singh

>> So I would say I spent 11 years working in the cloud before I joined Redis, and I was really surprised by the scale of AI actually happening on-prem as well. Some of our largest customers, large banks, for example, they are running AI on-premise. They're using NVIDIA GPUs with open weights models, and they're using Redis as a vector database. And you can replicate pretty much everything you're doing in the cloud on-prem as well. So that's a whole robust ecosystem that I think is here to stay.

Dave Vellante

>> Well, that's interesting because I would agree with you to a point, to me the weak spot there is the data platform. When you think about Snowflake, Databricks I guess has on-prem, but Snowflake is like, "We're never going on-prem." I'm like, "Never say never." So that seems to be an area. Now, maybe this means the resurgence of a company like Cloudera who's actually got a great stack but ran into some troubles. But do you agree that seems to be the one linkage, that missing link, if you will?

Manvinder Singh

>> So fair point, I think some of the leading, I would say analytics companies, the data lake companies, they haven't invested in on-premise that much. Having said that, when it comes to building customer-facing or agentic applications with AI, the stack that you need a lot of it is available on-prem. You need a vector database, you need a way to run inference. There's a whole ecosystem around technologies like vLLM, which is just enabling all the best AI models to run on-premise or whatever you want. So a lot of those components are there for the AI inference and application side, maybe not the data lake-

Dave Vellante

>> Is the expertise there? I guess it certainly is probably in the big bank, we had JPMC on the other day and were talking about how many people they hired, the best people in the world, how they were writing big checks. But most enterprises, would you agree, don't have the level of skill? So they're looking for companies like yours and your partners to do the integration and make it easy to consume. And that's going to take some time.

Manvinder Singh

>> So there's definitely high variability in the skill level. And what we've done as Redis is invest in making sure we have a team of forward-deployed engineers, who can go in and can help the customers with the help they need in defining the right architectures. We have a professional services team that can come in and help with, for example, fine-tuning of models. We have a whole practice of helping you get the RAG accuracy you need because often a challenge. So we see customers who just tell us, just spin up a database for us and we'll take it from here. On the other hand, we have customers who tell us, I need this accuracy from my chatbot, help me get there.

Dave Vellante

>> And so just to put a finer point on it, you are comfortable based on some of the reference architectures and the benchmarking that you've done, that what you have on-prem is substantially similar to what's in the cloud?

Manvinder Singh

>> For us it's exactly the same stack. So that's been one of the philosophies is we offer Redis Enterprise on-premise available as software, and then we offer Redis cloud services on AWS and GCP, and there's Azure managed Redis in Azure. So it's fundamentally the same stack. Now, there are characteristics like elasticity that differ from on-prem to the cloud, but if you're building an application with Redis on-premise, you can take that app tomorrow and run it in AWS. It'll get the same thing.

Dave Vellante

>> John, I was talking to Gee Rittenhouse who he was in here for supercloud. He just said, "How would that supercloud premise work out?" I go, "It's happening now."

>> Now AI multi-cloud. This is the advice, distributed computing basically has happened and it's happened always. And AI kind of highlights in cloud, on-premise edge, it's all one thing. It's hybrid. That game is over. Pretty much everyone there's no debate there anymore. But the culture has changed, you mentioned Salvatore coming back. I saw end of the year, you mentioned he's back. It's interesting, theCUBE started the same year Redis was kind of born, I think 2009, '10 timeframe. So we've been on that journey. So Dave and I always talk on theCUBE like, "Man, I wish I was 20 years old again, and we'd be back in coding AI." Because it is hot. If you're in computer science or you're a builder, this is the best time. So you even see Sergey Brin coming back to the office. He's a billionaire. He doesn't have to work. So you have this intoxication around the tech right now where it's just exciting.

Dave Vellante

>> FOMO.

>> It's like, well, the tooling's great, the delivery mechanisms are getting faster, the chips are better. You got software chips, geography, sovereign cloud. This is societal change. The culture inside Redis, what's changed since? You mentioned the creator's back, I'm sure he is not just bored. He's probably eager to chomp his teeth into some AI. What's the vibe there? What's the culture like at Redis, can you share? Because you're seeing a lot of this going on, people are energized by the opportunity.

Manvinder Singh

>> So no, there's a lot of energy at Redis. So first of all, everybody at the company sees this new opportunity to do what we did with the web stack, do the same with the AI stack and with our vector database product, our semantic caching product, we're doing that. You mentioned Salvatore, he's not just back, he's leading a lot of the innovation. He built vector sets, which is a completely new way of doing vector search. I would say it's the easiest way of doing vector search, and that's a core part of Redis and he built that. So there's a lot more to come on our roadmap, but we're super excited.

Dave Vellante

>> What's unique about that architecturally, can you just give us a sort of high level?

Manvinder Singh

>> Yeah. So for developers who are used to Redis, they like those simple, intuitive commands to do all kinds of operations. And with vector sets, it provides you those simple commands to do complex vector search, and it manages to hide all of that complexity behind the scenes. So it's really that intuitive interface that developers really loved with Redis and Salvatore has now brought that to vector sets as well.

>> Yeah, I see. Old school, new school all coming together. It's a systems thinking and you got the young guns coming in and the seasoned veterans who have been doing operating systems, networks at scale. It's kind of a mashup of talent, isn't it?

Manvinder Singh

>> Yeah. Which is why it's exciting .

>> Thank you for coming on. Again, love the pride, love the open source mission from day one, continues to thunder away. I'll say developers are hungry. They want to get their hands on building stuff fast, reliably, at scale. Thanks for coming on. Appreciate it, Singh.

Manvinder Singh

>> Absolutely. John, thank you so much for having me. Thank you, Dave.

>> We're here, the robotics and AI infrastructure leaders. It's theCUBE bringing you three days of coverage. I'm John Furrier, Dave Vellante, your host. Thanks for watching.