Sid Sheth, d-Matrix
In this theCUBE + NYSE Wired: Mixture of Experts segment from the New York Stock Exchange, theCUBE's John Furrier sits down with Sid Sheth, founder and CEO of d-Matrix, fresh off the company's newly announced $275 million Series C at a $2 billion valuation, which brings total funding to roughly $450 million. Sheth details the round's co-leads, Bullhound Capital, Triatomic Capital and Temasek, along with participation from Qatar Investment Authority, EDBI and existing investors including Microsoft, and explains why the company was founded on the principle that inference will be the largest computing opportunity of our time.
Sheth argues the industry is entering a heterogeneous-compute era in which latency, energy, throughput and cost constraints demand different silicon for different applications, and describes how d-Matrix attacks the memory wall by bringing compute and memory together with in-memory computing engines and chiplets. He introduces JetStream, an AI-focused I/O accelerator card that sits beside the company's Corsair compute card to scale inference up and out with low latency, and charts the business's shift from pre-product to post-product, with pilots under way at multiple customers, hiring across the board and a first-principles culture focused squarely on inference.
>> Hello, I'm John Furrier with theCUBE. We are here at theCUBE Studios on the East Coast at the NYSE, the New York Stock Exchange, with our Mixture of Experts and AI factories coverage, where we break down the market and talk to the leaders making it happen. Sid is here. He's the founder and CEO of d-Matrix. Welcome back to theCUBE. Great to see you. It's like our sixth time. You've been on so many times.
Sid Sheth
>> It's been many times. I'm so happy to be back.
John Furrier
>> Well, a lot to talk about. I want to get into kind of the market activity around the tech, the business model you guys have, and updates. But first, big news: you guys got some fresh funding. Love to see these big rounds. Congratulations. A Series C. Tell us about it.
Sid Sheth
>> Yes, so we just announced our Series C round of funding, which was a $275 million round at a $2 billion valuation for the company. So a good outcome for the company, and this brings our total funding to about $450 million cumulatively, and well, that gives us plenty of dry powder to go out and do the things that we want to do and service the massive opportunity in front of us.
John Furrier
>> I'm just curious, what's the valuation?
Sid Sheth
>> It's going to be 2 billion. Yeah.
John Furrier
>> 2 billion. Not bad. Double unicorn.
Sid Sheth
>> Double unicorn.
John Furrier
>> What's the use of funds? What's the thinking in new investor share? Who's participating?
Sid Sheth
>> Yeah, so this round was led by a consortium. We were fortunate; we had global interest in this round of funding. It was co-led by three investors: Bullhound Capital from the UK, Triatomic Capital from Silicon Valley, and Temasek from Singapore, which joined as a co-lead. We also had other investors like Qatar Investment Authority, QIA, and EDBI from Singapore join this round, besides having existing investors like Microsoft and others participate.
John Furrier
>> Well, congratulations. It's great validation.
Sid Sheth
>> Indeed. Indeed, yeah.
John Furrier
>> Again, we've been following you for a long time, and you made a big bet on inference that is paying off, so congratulations.
Sid Sheth
>> Thank you. No, thank you. It's been... Very excited about what the future holds.
John Furrier
>> We're in the hottest area in infrastructure right now. I was just talking to the folks over at GSA. They have their awards coming up in December. It's sold out and the waiting list is a mile long, not like the semiconductor industry. It's like Supercomputing, the show that started in 1998 when I graduated college, still around, hanging around doing high-performance computing. Now it's essentially a supercomputing show where supercomputing is actually happening.
Sid Sheth
>> Yeah, yeah.
John Furrier
>> It's an AI show. We're planning for NRF, which is a retail event that's now an AI show. Everything is an AI show, and the AI infrastructure is making it all happen. Clearly the stock market here is recognizing that, but I want to get your thoughts on where you see this going, because that part is clear. What it's enabling is scale for data, and you're starting to see this agentic layer, and then the next path after that: physical AI, where the full convergence of physical and digital comes together. So as we're in this AI build-out on the infrastructure side, there's now this next level that it's going to enable. What's your reaction to that? What's your view?
Sid Sheth
>> Well, I think the big revelation or the big understanding that people are finally getting to is inference, which is something we've been talking about ever since we started the company. The company was built on a foundational principle that inference is going to be the largest computing opportunity of our time. It's a generational opportunity, and I think people are finally coming to that realization, and every time they try to put a target on the amount of compute that will be needed to serve all of humanity, I think it just surprises them that they need a lot more than what they originally anticipated.
John Furrier
>> Yes. And on siliconangle.com we've been featuring a lot of the content around this. The word silicon, Silicon Valley: the hardware game of silicon has become more prominent, and we help people understand some of the state-of-the-art technology, because clearly we talk about hardware and software. We're not a hardware company, we're a software company, but silicon and software kind of come together. There's still coding involved.
Sid Sheth
>> That's right.
John Furrier
>> What's the real advancement for the folks that aren't in the details of the latest silicon advancements? What's the current state of the innovation?
Sid Sheth
>> Yeah, I think the key thing is that we are moving into a world that is going to be heterogeneous, where you're not going to have just one class of compute dominating all the infrastructure build-outs that are happening right now. So I think you're going to have different forms of compute. It's not a one-size-fits-all world that we are moving into. We are moving into a world where inference is going to be used by all of humanity, and what that really means is you need different strokes for different folks. You're going to need different types of compute depending on the application. You're going to need different types of compute depending on what the constraints are on the application. Is this something that requires low latency? Is this something that needs to be energy efficient? Is this something that needs to be more throughput oriented? Is this something that is just very cost sensitive? You have all these different constraints that need to come together, and how do you build a single platform or single type of computing solution that addresses all of these diverse needs? Very difficult. So we are moving into a heterogeneous world where I think you'll see the coexistence of different forms of compute, and when you enter that world, with different types of architectural solutions, it basically means anyone who is writing software really needs to understand the underlying infrastructure that they're writing the software for. It's not like the unifying computing solution we had in the CPU age, where you had one CPU, it was x86, you went and wrote software, the software ecosystem was very mature, so the application-level developers really didn't need to worry about what the underlying hardware looked like. It was already prepared for them. Now it's changing. You're seeing a very quick and dynamic shift happening in the underlying infra, the underlying compute, which basically means if you want to write applications for that type of infra, you really need application developers and programmers to understand what the underlying infra looks like and how to program for it.
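To make that heterogeneity point concrete, here is a minimal Python sketch of how an application might route an inference request to different classes of compute based on its constraints. The backend names and all figures are hypothetical illustrations, not d-Matrix data:

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    p99_latency_ms: float    # typical tail latency for one request
    joules_per_token: float  # energy per generated token
    dollars_per_mtok: float  # serving cost per million tokens

# Hypothetical backends with illustrative figures, not vendor data.
BACKENDS = [
    Backend("general_gpu",     120.0, 0.50, 0.80),
    Backend("inference_accel",  35.0, 0.12, 0.30),
    Backend("cpu_batch",       900.0, 0.90, 0.15),
]

def pick_backend(max_latency_ms: float, optimize_for: str) -> Backend:
    """Keep backends that meet the latency constraint, then minimize
    energy or cost depending on what the application cares about."""
    feasible = [b for b in BACKENDS if b.p99_latency_ms <= max_latency_ms]
    if not feasible:
        raise ValueError("no backend meets the latency constraint")
    key = {"energy": lambda b: b.joules_per_token,
           "cost":   lambda b: b.dollars_per_mtok}[optimize_for]
    return min(feasible, key=key)

# A real-time chat turn: tight latency budget, minimize energy.
print(pick_backend(50, "energy").name)    # -> inference_accel
# An overnight batch job: loose latency, minimize cost.
print(pick_backend(1000, "cost").name)    # -> cpu_batch
```

The point of the sketch is only that once several forms of compute coexist, the dispatch decision itself becomes part of the application, which is the software burden Sheth describes.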
John Furrier
>> So I want to get your thoughts on a concept we've been tracking around distributed computing, AI and networking. We were talking before we came on camera; I'm working on this big hyperconverged edge report for Mobile World Congress. Just two weeks ago at NVIDIA GTC in Washington, D.C., Jensen Huang kept amplifying this concept of extreme co-design. Now, that's his marketing term for "we co-design with our partners," and that's how systems people work. Okay, that's not the question; that just sets the table. If you look at distributed computing across the edge, AI factories and large-cluster supercomputers working together, there's a diversity of devices and a diversity of development kits and environments. The role of silicon will be super important, because you might have a wearable, there could be a Wi-Fi access point, or a converged 5G/6G-with-WLAN Wi-Fi device alongside a smart DGX box.
Sid Sheth
>> Yep, yep.
John Furrier
>> That requires... That little device has to talk; it's going to have some horsepower but not a lot, and it has to talk to the big factory or other servers and cloud or whatever it does. That requires the software to run across those diverse form factors, maybe different memory constraints. You guys have been doing a lot of work around memory, the memory wall. I think last time we talked, you talked about the memory wall.
Sid Sheth
>> Yeah.
John Furrier
>> It feels like it's only going to get worse, like, when you think about a seamless entity: I'm a user, I'm at the edge, I'm in a store, I'm on the road, I'm driving in my car, and I want to go back and get a model that an agent runs for me. It's got to do it really, really fast. It means that the silicon has to work in that system. What is your reaction to that? What's your vision? Because that's going to require really smart silicon, really smart software, really smart system architects, almost a co-design thinking.
Sid Sheth
>> Yeah, no, absolutely. And that goes back to my previous comments: we are entering a world where there are different forms of compute, it's heterogeneous, and they all need to coexist. Applications need to be developed for different types of underlying hardware, right. So at d-Matrix, we looked at the inferencing problem and quickly concluded that the core problem here is really about getting access to data very quickly. Because as you said, everyone wants to do fast inference, everyone wants to do real-time inference. You want to make quick decisions and act on them, right. And how do you do that when there is so much data sitting out there, distributed in so many different places? You need to bring that together quickly, and then make decisions and act on them fast. So the inferencing problem, which is where most of this stuff happens, is really a problem where you are trying to bring compute and memory as close together as possible in a very creative way. And that is really what the d-Matrix solution is all about: how do we put these two things together and break through that memory wall.
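As a rough worked example of the memory wall he's describing (illustrative numbers, not d-Matrix specifications): in large-language-model decode, each generated token streams essentially all of the model's weights through the compute, so the per-stream token rate is capped by memory bandwidth long before the math units run out:

```python
# Back-of-the-envelope roofline: why LLM decode hits the memory wall.
# All figures are illustrative assumptions, not d-Matrix specifications.

params = 70e9            # a 70B-parameter model
bytes_per_param = 2      # BF16/FP16 weights
weight_bytes = params * bytes_per_param       # ~140 GB streamed per token

mem_bw = 3.3e12          # ~3.3 TB/s, roughly HBM3-class bandwidth

# Batch-1 decode reads essentially every weight once per generated token,
# so memory bandwidth caps the per-stream token rate:
tokens_per_sec_mem = mem_bw / weight_bytes
print(f"memory-bound:  ~{tokens_per_sec_mem:.0f} tokens/s")     # ~24

# The math units are nowhere near the bottleneck:
flops_per_token = 2 * params                  # ~2 FLOPs per parameter
peak_flops = 1e15                             # ~1 PFLOP/s of matrix compute
tokens_per_sec_compute = peak_flops / flops_per_token
print(f"compute-bound: ~{tokens_per_sec_compute:.0f} tokens/s") # ~7143
```

With these assumed numbers the memory system, not the compute, limits throughput by two orders of magnitude, which is why moving memory closer to compute is the lever d-Matrix is pulling.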
John Furrier
>> Give an update on the business. What's the current momentum? Can you share some of the successes you've had since last time we talked? What's the forward progress look like?
Sid Sheth
>> Right. So I think last time when we spoke, which was six months ago, we had the product out in the market. And we are making some tremendous progress, because now the product is being piloted by multiple different customers. We are getting into the phase where we are not a pre-product company anymore. So the company is going through this transformation from a pre-product phase to a post-product phase, lots of pilots, and we are in the process of scaling the business across multiple customers and scaling the business across multiple products. Because most of these customers don't want to see just one product. They want to see, "Okay, this is great. You have a great product today, but how does this evolve?" So we are in the process of building out a roadmap, and yeah, super excited about scaling this across different opportunities.
John Furrier
>> What's the business driver for you right now? Just the build out of AI infrastructure? What's the main tailwind?
Sid Sheth
>> It is. It is absolutely that. So it is like everybody wants inference. Everybody has realized that this is where we make an ROI on the investments that we make. This is where we use AI. This is where we get access to all that AI promise that we all talk about. So we have so much customer interest right now. For the company today, it is really about scaling into all of that customer interest and managing that and making sure we don't upset anybody, right. So it's about carefully managing-
John Furrier
>> ....
Sid Sheth
>> effectively. That's right.
John Furrier
>> Keep them happy.
Sid Sheth
>> Keep them happy. Yeah.
John Furrier
>> What are you optimizing for? What are some of the challenges that you have that become opportunities? Is it supply? Is it the engineering? What are some of the things that you really focused on?
Sid Sheth
>> All of the above. All of the above: it's finding the right talent, scaling with the right talent, qualifying the right customer opportunities. We are not good at everything. We are a startup. We are certainly not a GPU company, where we say we are good at many different things. It's making sure that we qualify the right customer with the right opportunity, the right application that would run really well on our solution and make a difference to that customer, right. And then finding the right people to help shepherd that and deploy that, right. So a lot of this is about finding people, finding the right customers and right applications, and building the right product.
John Furrier
>> Yeah. Every company has a culture. Intel had Moore's Law, you know, double the performance, the cadence of Moore's Law. You heard that in the past. What's the culture at d-Matrix for your team? I mean, it's not the same, not directly like Intel, but what's... I'll say speed. How would you describe the culture?
Sid Sheth
>> I think we are a first-principles culture. So we are very particular about making sure that we are building something that really, really makes a difference to what the customer is trying to do. We are not building it for building's sake. It's like, let's go really understand the problem that we're trying to solve, go back to first principles, address it at the core and then build a product around it. That's what we have done with inference. It just so happens that the inferencing opportunity has exploded on us and it has become a multi-trillion-dollar opportunity, and now we have a solution that is foundationally built for that application.
John Furrier
>> How would you explain it to someone who's learning about the market, maybe someone on an AI committee or a competency team? You see this all the time. Where does d-Matrix fit in the map? Okay, we've got GPUs, you've got memory, you've got high-bandwidth memory. Where would you put you guys on that map?
Sid Sheth
>> I think the easiest way to explain it is we are an accelerator, right. So we are not a GPU. We often get called a GPU, but we are not a GPU. We had CPUs that were general purpose. Then we had GPUs, which were general-purpose acceleration: not as general purpose as a CPU, but still general purpose when it comes to all kinds of accelerated workloads. We are a dedicated accelerator for inference, right. So we certainly don't do things like HPC workloads and training. We don't do any of that. We are so focused on inference.
John Furrier
>> Inference only.
Sid Sheth
>> That's right.
John Furrier
>> And workloads would look like what?
Sid Sheth
>> Workloads could be any workload that needs to be inferred. Anything that needs to be inferred that looks like a transformer workload, which is most of the workloads these days. It doesn't have to be a transformer; we have optimized the software around transformer workloads, but the hardware that we have built can run pretty much any AI workload. So as long as it's a deep learning workload, we can run it-
John Furrier
>> Got it....
Sid Sheth
>> from an inference perspective.
John Furrier
>> And the secret sauce is what?
Sid Sheth
>> The secret sauce, again, back to that first-principles approach, is putting the compute and memory together. We have done it by building in-memory computing engines and chiplets, and the way we put this thing together, it's a recipe. It's a recipe that is just so unique and targeted for inference that we don't see anybody else really doing it that way.
John Furrier
>> Yeah. You mentioned the memory wall last time, and this comes up a lot in large-scale systems. HBM, they put it close to the GPU; they want to have that fast memory. I mean, everyone's talking about memory. How do you look at accelerated inference outside of the core large-scale system, when it's got to go across a network?
Sid Sheth
>> Yeah, I mean, the workloads are getting pretty large, so it's no longer the case that a workload just fits inside a single box or a single rack. You need to scale this out, and that's where networking is very, very important. And the d-Matrix team, we were all networking guys in our past lives, so we come from a very strong heritage of networking. We recently announced a product called JetStream, which is essentially a NIC card, another accelerator card for I/O. It sits right next to our Corsair card, which is a compute accelerator, and what this thing does is essentially scale up and scale out the compute. And it's all about low latency. Again, back to your original comment, John: everybody wants to do this fast and they want to do it in real time. So, okay, how do we build networking technology that strips out all the overhead that one needs for traditional networking applications, build it only for AI, and make it extremely low latency? That's what we have done with JetStream.
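To give a feel for why stripping out general-purpose networking overhead matters at inference scale, here is a toy latency budget in Python. The line items and microsecond figures are invented for illustration and are not JetStream measurements:

```python
# Illustrative per-hop latency budget for moving activations between
# accelerator cards. These figures are assumptions for the sketch only,
# not JetStream measurements.

traditional_stack_us = {
    "kernel TCP/IP processing":        10.0,
    "user/kernel buffer copies":        5.0,
    "interrupt and scheduler wakeups":  8.0,
    "NIC serialization + wire":         2.0,
}

lean_ai_transport_us = {
    "accelerator-to-NIC direct DMA":    1.0,
    "NIC serialization + wire":         2.0,
    "hardware completion polling":      0.5,
}

for name, budget in (("traditional stack", traditional_stack_us),
                     ("AI-only transport", lean_ai_transport_us)):
    print(f"{name}: {sum(budget.values()):5.1f} us per hop")
```

Under these assumed numbers, almost all of the per-hop time in a general-purpose stack is protocol and OS overhead rather than the wire itself, which is the overhead an AI-only transport can remove.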
John Furrier
>> Talk about the dynamics, 'cause people have... Dave and I talk about this on our CUBE pod all the time. People are hoarding GPUs: "Give me the GPUs." There are a lot of things you can do with compute where you don't need GPUs. So talk about how people should think about GPU usage versus accelerated compute, because there are use cases that actually will work great on other compute.
Sid Sheth
>> Absolutely. And look, if you're starting your AI journey today and you're just getting started, I think the GPU is a great place to start, right, because you still haven't figured out which specific applications are the ones you want to really accelerate, how you want to accelerate them, and what the constraints are. Start with a GPU. But as you get deeper into your AI journey, you will quickly realize that it's all about ROI. It's all about making a return on the investment you made. And that's where we come in. We can offload a lot of the tasks that may not be very well suited to a GPU and run them on an accelerator that we have built. Then you go back to that heterogeneous computing approach: the things that run really well on a GPU continue to run on the GPU, but there are certain things that run really well on an accelerator, where you can get a much better return on investment. Go run those on a dedicated accelerator like ours.
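One way to make that ROI framing concrete is a simple break-even calculation; the serving costs, migration cost and traffic volume below are hypothetical placeholders, not figures from the interview:

```python
# Hedged sketch of the offload ROI argument: when does moving a steady
# inference workload from a general-purpose GPU to a dedicated
# accelerator pay back? All numbers are hypothetical.

gpu_cost_per_mtok = 0.80      # $ per million tokens on the GPU fleet
accel_cost_per_mtok = 0.30    # $ per million tokens on the accelerator
migration_cost = 50_000.0     # one-time cost to port and validate

monthly_mtok = 20_000         # 20B tokens/month of inference traffic

monthly_savings = monthly_mtok * (gpu_cost_per_mtok - accel_cost_per_mtok)
breakeven_months = migration_cost / monthly_savings
print(f"saves ${monthly_savings:,.0f}/month; pays back in "
      f"{breakeven_months:.1f} months")   # $10,000/month, 5.0 months
```

The shape of the argument, not the specific numbers, is the point: once inference traffic is steady and large, even a modest per-token cost gap amortizes the switching cost quickly.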
John Furrier
>> Sid, great to have you on. Final question. Give us an update on the business. Are you feeling good about things? Where are you focused? How are you guys doing? Give an update on the state of the business.
Sid Sheth
>> Feeling very good, feeling very positive about where the business is at. I think just look at what's happening out there with the opportunity. We bet on this opportunity six years ago. Is there a reason not to be excited about it? I mean, every day we come out, we see the opportunity getting only bigger. So super excited. It's just about getting through a lot of these customer engagements. We have many more customers than we can handle right now, so it's just about kind of-
John Furrier
>> Good problem to have.
Sid Sheth
>> Great problem to have.
John Furrier
>> So you're hiring. What are you looking for in terms of hiring? Probably across the board-
Sid Sheth
>> Across the board. Across the board-
John Furrier
>> What does it take to work at d-Matrix? Is it like, on the engineering side, what specific skill sets do you look for?
Sid Sheth
>> If you come with a mindset where you're willing to learn, come with a mindset that you're willing to go back to first principles, come with a mindset where you want to solve some really, really tough problems, then d-Matrix is the right place.
John Furrier
>> How about the go-to-market piece? What's the make-up of that person?
Sid Sheth
>> The go-to-market piece, again, back to that first principle: understand what the customer is trying to solve. Again, we are very early in the inference computing journey, right. So it's very, very important to understand-
John Furrier
>> The quasi-product person: curiosity meets-
Sid Sheth
>> Curiosity....
John Furrier
>> project management and POCs 'cause you have a relationship motion.
Sid Sheth
>> Yes, yes. Understand what the customer wants at the application level, not just at the workload level. Work backwards from there and bring it to the solution that we are building.
John Furrier
>> Well, great to have you on. Thanks for sharing the commentary. We're going to promote you into theCUBE Collective Analyst program, the way you're bringing all the A-game. Appreciate it. Thanks for coming on again.
Sid Sheth
>> Thank you. Thank you so much, John. Thanks for having me.
John Furrier
>> Okay. We are here with d-Matrix, again, a leading company in silicon, memory and compute. Again, this is the future. AI infrastructure is in rapid build-out mode. This is going to open up an era of massive innovation up and down the stack. Agentic infrastructure, physical AI, all coming super fast, with d-Matrix and others making it happen. They're the trailblazers. They're the ones making the AI factories work, out to the edge. I'm John Furrier, host, doing my part to bring you the data. Thanks for watching.