In this interview from theCUBE + NYSE Wired: AI Factories – Data Centers of the Future, Ramin Hasani, chief executive officer and co-founder of Liquid AI, joins theCUBE’s Gemma Allen at the New York Stock Exchange to explain why he’s betting on smaller, more efficient foundation models instead of monolithic, cloud-only LLMs. Hasani shares how a decade of research led to Liquid Foundation Models, designed from first principles to dramatically cut power consumption while maintaining frontier-level quality, and to run seamlessly on CPUs, GPUs and NPUs in both data centers and on devices. He describes Liquid AI’s vision for a hybrid world where a software layer lets enterprises deploy generative AI wherever compute lives – inside energy-efficient data centers or at the edge in phones, laptops, cars and other OEM environments – aligning with the event’s focus on AI factories as a new backbone of digital infrastructure.
The conversation then zooms in on Liquid AI’s new partnership with Shopify and what it takes to power e-commerce search and recommendations with generative AI at massive scale, including a 20-millisecond end-to-end search experience that can support billions of transactions. Hasani contrasts Liquid’s latency and control with large LLM providers, and explores how smaller, domain-specific models enable private-cloud and on-device deployments that keep data local, reduce regulatory friction and unlock new agentic AI, offline and privacy-sensitive use cases. He also outlines Liquid AI’s roadmap across two core verticals – OEMs and e-commerce/financial services – as the 70-person company scales its go-to-market efforts to bring Liquid Foundation Models into day-to-day applications inside and outside the data center.
Ramin Hasani, Liquid AI
>> Welcome back to theCUBE. I'm Gemma Allen here at our studio at the New York Stock Exchange, connecting Wall Street to Silicon Valley. Joining me now is Ramin Hasani, CEO and co-founder of Liquid AI. Welcome, Ramin.
Ramin Hasani
>> Thank you so much for having me.
Gemma Allen
>> So it seems as though a lot of folks are betting on large LLMs. We hear so much about the LLM space, about the monolithic world ahead, right? Just like the monolithic tech world that came before. You are betting on something smaller, faster, more efficient and, in some ways, closer to home for users. Unpack that for me. What is Liquid AI, and what value does it deliver?
Ramin Hasani
>> Absolutely. At Liquid, we are building foundation models, as you mentioned, smaller versions of foundation models. The goal for us is to really, fundamentally, from first principles, build AI systems that are extremely efficient, and we care about where this intelligence goes. So, for example, we can run Liquid Foundation Models seamlessly on a CPU, on a GPU or on an NPU, a neural processing unit. These are much smaller computational blocks with much smaller footprints, and they power devices like a phone, a laptop or, let's say, a car. So wherever you have access to any of these formats of computers, you can bring generative AI to those kinds of worlds. Today, as you mentioned, large language models and the larger instances of these models are mostly sitting in the cloud. What we want to do is enable a hybrid world. We want to enable, let's say, the fastest implementation of a foundation model inside a data center, and also power that outside the data center. In a way, we are building a software layer for any form of computer, so developers can take these models, integrate them into their services and build applications on top of them, not just in data centers but also outside of data centers.
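To make the hardware-agnostic idea concrete, here is a minimal sketch, assuming a generic Hugging Face-style checkpoint. This is not Liquid AI's actual software layer, and the model name is a hypothetical placeholder; the point is only that one set of weights can run on whichever backend the machine exposes, with CPU as the fallback.

```python
# Illustrative sketch only (not Liquid AI's actual software layer): load one
# checkpoint and run it on whichever backend this machine exposes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "example-org/small-foundation-model"  # hypothetical checkpoint

# Pick the best available backend: CUDA GPU, Apple-silicon MPS, or plain CPU.
if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"
else:
    device = "cpu"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID).to(device)

# Same application code regardless of hardware; only `device` changes.
inputs = tokenizer("Recommend red basketball shoes", return_tensors="pt").to(device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that the sketch only covers the backends PyTorch exposes; the NPU targets Hasani mentions would need vendor-specific runtimes.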
Gemma Allen
>> Clearly, there is a huge business need and commercial appetite for this, right? Your company has been on quite the trajectory. You raised a whopping Series A earlier this year. I think you're two and a half years old, if I'm correct.
Ramin Hasani
>> Yes, correct.
Gemma Allen
>> Give me the OG story. Tell me how you came to be here.
Ramin Hasani
>> Yeah. Well, I mean, we've been thinking about these things for a long time. A decade of research went into fundamentally changing the landscape of foundation models. If you've heard about LLMs, foundation models and generative AI, you've probably heard that the unit of competition there is an architecture called the transformer. Through 10 years of research, we have been thinking about how we can systematically reduce power consumption, because one of the fundamental challenges that comes with large language models is that they consume a lot of energy, and the more data they process, the more that energy grows, exponentially. We wanted to take a completely different approach: design from scratch and find out which mathematical operators allow us to design very powerful and efficient AI systems while substantially reducing the energy cost of AI without sacrificing quality. That was the vision we had, and we managed to get to that vision by building Liquid Foundation Models. Today, Liquid Foundation Models can go into any kind of OEM, original equipment manufacturers, let's say consumer electronics or automotive, and power generative AI behavior for task-specific work inside those environments. They can also power very low-latency, task-specific applications of AI in data centers. Because of the efficiency gains we achieved with the fundamental breakthroughs we made, we can bring these systems to a substantially better latency and compute footprint while drastically reducing the cost of deploying foundation models. For real, we can now access generative AI for domain-specific applications at frontier level. You do not need a large-scale AI system for tasks that can be done by a smaller, specialized model, this time with the lowest possible latency and in a private deployment manner, because the models are going to be smaller. You can host them in a private cloud, you can host them on a device, you can host them on a laptop directly, and there is a substantial number of use cases, a completely new market that gets enabled, once you walk out of a data center. It's complementary to that mission of really building a hybrid world: deploying AI in the data center in an energy-efficient, sustainable way, and using basically whatever compute is available to us outside of data centers. So that's the vision we want to get to.
Gemma Allen
>> Well, based on this week's news cycle, one clear use case for Liquid AI is e-commerce. You've just signed a pretty big partnership with Shopify. Big deal for you, and I'm sure a very big deal for Shopify too, based on everything I've been reading and learning. Tell me about that. Tell me about the opportunity here. What drove you to collaborate like this, and how do you see this really adding value, especially for SMBs, right? Smaller business owners, smaller storefronts, who sometimes unfortunately get caught up in the wave of monolithic tech like we spoke about.
Ramin Hasani
>> Absolutely. Yeah. I mean, we wanted to deliver on the promise of the value of our technology: the most efficient, the fastest version of AI systems for certain applications. If you think about e-commerce, some of the most important elements are recommendations and search, so there's a lot of generative AI that can go inside an e-commerce platform, both from a merchant perspective and a buyer perspective. Shopify, as one of the pioneers, has identified those opportunities for generative AI. They have a strong team of machine learning people, and recently, with their CTO, Mikhail Parakhin, we have been contemplating how to bring these foundation models, in the most realistic way, into production at scale for e-commerce applications. We have been working with their teams for a few months now. This started early on, I think back in March, when we got started working with Shopify on serious applications of generative AI in commerce, and we got to very exciting places in the exploratory phase of our partnership. Soon, we observed that the Liquid Foundation Models we are building deliver a distinct advantage compared to anything else that is out there, let's say the open-source models. Because Shopify has a very good machine learning team, they are capable of downloading some of the permissively licensed open-source models that are out there and fine-tuning them for certain use cases. But when we compared the performance of a Liquid Foundation Model against anything else available in the open domain, we saw a substantial gain, not just on quality but on efficiency and on the speed of operations. As an example, we put a project in production. In this project, the user asks a question in an unstructured way on the Shop app, let's say, "Oh, I played basketball and I saw these red shoes that I really liked, and I want to buy this thing." The user wants to put that into a language model.
With today's search applications, that kind of query breaks; you cannot get a really nice answer out of such a system. But you can use generative AI, a text-based model, an LLM basically, that can parse this information and prepare it for the search algorithm, so it recommends the best results. We can complete a full search application in 20 milliseconds. That is unheard of; the speed of operations is really fast. So we are unlocking massive value. A human would not even perceive 20 milliseconds of latency, so, instantaneously, you can bring the best customer experience forward. That's one project that is actually going forward.
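The pattern Hasani describes, a small model turning a free-form shopper query into structured input for a conventional search backend, might look roughly like the following sketch. The prompt, the JSON schema and the `run_model` stub are illustrative assumptions, not Shopify's or Liquid AI's actual pipeline.

```python
# Hedged sketch: a small language model maps an unstructured shopper query to
# structured search filters that an existing search engine can consume.
import json

SCHEMA_PROMPT = (
    'Extract search filters from the query as JSON with keys '
    '"category", "color" and "attributes". Query: "{query}"'
)

def run_model(prompt: str) -> str:
    """Stand-in for a call to a small, low-latency language model."""
    # A production system would invoke an in-process or private-cloud model here.
    return '{"category": "shoes", "color": "red", "attributes": ["basketball"]}'

def parse_query(query: str) -> dict:
    """Use the model to turn free-form text into structured search filters."""
    return json.loads(run_model(SCHEMA_PROMPT.format(query=query)))

filters = parse_query("I played basketball and saw these red shoes I really liked")
print(filters)  # hand these filters to the existing search/recommendation engine
```

Presumably the 20-millisecond budget Hasani cites covers both this parsing step and the downstream retrieval, which is what makes a small, fast model attractive here.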
Gemma Allen
>> Help me for a second understand the buyer journey for Shopify, or the landscape broadly. To a lot of us using Shopify products every day, it seems as though it's already very fast, but I guess nothing is ever fast enough in this current moment we're living in. The odds are that a company like Shopify would either build your own, something proprietary; work with a large-scale LLM provider like an Anthropic, one of those, again, bigger but sometimes slower technology options; or partner with somebody like Liquid AI. Is there anything I'm missing in that analysis? Are there other kinds of players, other kinds of middleware players, that enter this realm at all?
Ramin Hasani
>> Yeah, great question. As I was telling you, the latency matter is extremely important. Imagine you want to put a certain application in production. We're not just talking about latency for a single client; we are talking about billions of transactions that happen simultaneously on the Shopify platform. How can you maintain the lowest latency for a large language model, one that has billions of parameters of its own, while giving parallel access to a large volume of clients? That becomes a much harder problem. So we are solving a unique problem here, to deliver a latency profile that is not achievable by anything else. You see? That's the unique approach we are taking and the unique value proposition Shopify gets from working with Liquid AI. You cannot get that from the giants, the Anthropics and OpenAIs of the world, because that latency and that scale are something you need to have full control over, and it is a complicated, non-trivial task to deliver at scale. Satya actually says that nothing is a commodity at scale, and I really like that quote, because it's very true. When you think about delivering a service to a large volume of people, even with the smallest, tiniest version of a foundation model, there is a lot of work that has to go into that platform to really be able to deliver that kind of service to that many requests. So that's the unique angle on this specific project.
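One generic serving pattern for that kind of parallel load is micro-batching: hold incoming requests for a few milliseconds, then push them through the model in a single forward pass. The sketch below, in plain asyncio, illustrates the idea; it is not Liquid AI's serving stack, and the constants are arbitrary.

```python
# Minimal micro-batching sketch: requests queue up for at most a few
# milliseconds, then run through the model together, so one forward pass
# serves many concurrent clients.
import asyncio

MAX_BATCH = 32     # bound the work done per model call
MAX_WAIT_MS = 5    # small wait to let a batch fill up

request_queue: asyncio.Queue = asyncio.Queue()

async def batch_worker(model_fn):
    """Drain the queue in batches and fan results back out to callers."""
    while True:
        batch = [await request_queue.get()]
        loop = asyncio.get_running_loop()
        deadline = loop.time() + MAX_WAIT_MS / 1000
        while len(batch) < MAX_BATCH:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(request_queue.get(), remaining))
            except asyncio.TimeoutError:
                break
        results = model_fn([query for query, _ in batch])  # one batched forward pass
        for (_, fut), result in zip(batch, results):
            fut.set_result(result)

async def handle_request(query: str):
    """Called once per client request; awaits its slot in a batch."""
    fut = asyncio.get_running_loop().create_future()
    await request_queue.put((query, fut))
    return await fut
```

The trade-off is the usual one: the few-millisecond wait is amortized across every request in the batch, raising throughput without pushing per-request latency past the budget.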
Gemma Allen
>> And it seems as though, broadly speaking, when you look at how transferable this opportunity is outside of Shopify, to any system or platform that uses recommender systems to grow business, which many, many do, so many opportunities remain for you and the team at Liquid AI. How are you approaching the partnership strategy? Are you thinking about food delivery, all of the ways in which we use ... We use them almost every minute of our day, right? So tell me, what else is ahead for you guys?
Ramin Hasani
>> Absolutely. As you know, recommendation is one of the core components of any online service, whether it's ads, search or products. Anything related to your interactions online, even on Netflix when you want to watch a movie, there are always recommendation systems in the back. So recommendation is one of those tremendous opportunities, and enabling these recommendations at scale is something we are working on with Shopify. So far, in the test environments, it's not just about the speed we deliver but also the quality of the models: We are actually surpassing the quality of what has been out there so far from a state-of-the-art perspective. That kind of ignited this amazing partnership with Shopify. Now, building on top of this, as I mentioned, recommendation can go into any web service. So the opportunities are tremendous, and we have already had massive inbound interest from businesses wanting to use similar kinds of ideas, because we are working at the state of the art on this topic. We try to deliver the highest, state-of-the-art quality with the lowest latency. The Shopify engagement and partnership, and the fact that our models are getting into production, are a good testimony for Liquid, showcasing how much value we can unlock for many, many different online businesses around us, from the big Fortune 500 companies to mid-sized companies that have an online service in place. So the opportunities are infinite.
Gemma Allen
>> Outside of speed, what other factors really feed into this? Is it based on jurisdiction, legal requirements, sovereign AI? What other factors would really make companies and specific industries think, "Okay, this smaller, faster, closer option, AI on the edge, if you like, is much better from a roadmap perspective?"
Ramin Hasani
>> Absolutely. So, once you have a small model, you can run these things locally. Sometimes you can actually run these models on device. In other engagements of ours, you can think about providing agentic AI behavior, let's say a system that can book your Uber, as small as that, or a knowledge worker that runs on your device and records all of your activities. As soon as you start talking about data collection and data processing for a user of an online service, regulatory elements and privacy elements become important, right? If the computation happens only on the cloud, you have to send the client's data to the cloud, and you need to comply with certain types of regulations. But if the computation happens on device, you can actually be compliant. Imagine you are providing a service to a client and it runs directly on the client's device, and data never leaves the premises of your user. That's a maximized amount of value, and you do not need to go through the regulatory cycles of getting licenses and access for providing a service to clients. Why? Because you are not requiring the client to upload a certain type of data to the cloud. So on-device AI is emerging. That's a whole other market: for private applications of AI, for offline applications of AI, for, let's say, regulatorily sensitive applications of AI in terms of data. This is the thing we can enable with the type of models we have.
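The privacy argument reduces to a simple property: the weights and the inference both live on the user's machine, so the prompt never crosses the network. A minimal sketch, assuming any small, locally runnable checkpoint (the model name below is hypothetical):

```python
# Minimal on-device inference sketch: the model loads from local disk and runs
# in-process, so the sensitive prompt is never sent to a remote API.
from transformers import pipeline

local_assistant = pipeline(
    "text-generation",
    model="example-org/small-on-device-model",  # hypothetical local checkpoint
)

# Sensitive input stays on the device: no API call, no upload.
note = "Book me a ride to the clinic at 8am and remind me about my prescription."
reply = local_assistant(note, max_new_tokens=64)[0]["generated_text"]
print(reply)
```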
Gemma Allen
>> And do you think the hyperscalers, I guess, are also catching on to this and coming for this space? Do you see more and more interest and competition in the whole AI-on-the-edge space broadly? How do you see this playing out?
Ramin Hasani
>> At this point, I feel like not with a massive focus, but I think they definitely have an eye on it. There are some companies that have been in this field of building devices, like phone manufacturers. On-device AI has always been top of mind for phone manufacturers and for AI PC manufacturers like AMD. This is the place where there needs to be a movement; somebody has to start igniting this motion of on-device AI. So we have started on this journey of building a software layer that allows us to comfortably get out of data centers and enable on-device AI. I feel like we are in a unique position to promote and walk on this path, and to build basically another generational company that can be complementary to that world of cloud, because the focus of the large companies at this moment is mostly on how to build larger data centers and infrastructure for the large, big AI, basically.
Gemma Allen
>> So tell me, the Shopify partnership is relatively new, very exciting, the ultimate proof of concept, if you like, as you say, and I'm sure the start of many to come. But what's ahead for you and the team? I mean, you seem to be a man that's all over the world; based on your LinkedIn, you're in multiple countries a week. Tell me, what are you guys working towards over the next 12 to 18 months?
Ramin Hasani
>> Absolutely. So there are two clusters of verticals we have in mind right now. One is the original equipment manufacturers, OEMs, and the other is e-commerce and financial services. We see the value of our technology in these two verticals as very prominent. Our focus is to deliver the best version of an on-device AI that can go on phones, on machines, on robots, on laptops, on satellites and in cars, so we have been focused on working with a lot of companies in that first cluster. In the other cluster, we have been focused on e-commerce, as you saw with Shopify as one example, and we are also working with some financial institutions on bringing the value of these extremely efficient foundation models to the market, because of the speed and quality of the models. So the focus is very much on robustifying our ... We're a small company at this point, a team of 70 people with high aspirations. Our talent is basically trying to robustify the process of going from the tech and innovations we produce at the state of the art, transferring them into products, and then taking those products to market. We have started building our go-to-market motion much stronger than before. It's a very good time; we have a lot of inbound interest from clients. So, as next steps, you're going to see more and more companies across the two clusters of verticals, OEMs and e-commerce and financial services, and you're going to hear more and more about how they're using Liquid Foundation Models in day-to-day, real-world applications of AI inside and outside of data centers.
Gemma Allen
>> Well, we'll sure be watching and listening. You say you're a small company, but certainly a mighty one. So we wish you all the best, you and the team. Stay in touch and hope to have you back here in the studio with us maybe in 2026 to catch up on what's happening.
Ramin Hasani
>> Yes, thank you so much for having me again.
Gemma Allen
>> Thanks for joining. I'm Gemma Allen here at our CUBE Studio at the New York Stock Exchange connecting Silicon Valley to Wall Street. Thanks so much for watching.