Exploring AI Innovations at the AI Agent Conference 2025
Sharon Zhou, founder and Chief Executive Officer of Lamini, joins theCUBE's John Furrier at the AI Agent Conference 2025. The conference bridges insights among technology innovators, offering a deep exploration of AI advancements. As a returning guest, Zhou shares expertise on steering AI developments within enterprises.
In this session, Zhou discusses Lamini's cutting-edge solutions for developers, emphasizing their innovative work in minimizing AI hallucinations through post-training...
John Furrier
>> Welcome back, everyone, to theCUBE Studios. I'm John Furrier, host of theCUBE, here for a special presentation of theCUBE and the NYSE Wired open community, connecting Wall Street and Silicon Valley with our studios here. Sharon Zhou is here, founder and CEO of Lamini, a startup we've covered. She was on theCUBE in 2023 at a Databricks event. Sharon, great to have you back on theCUBE in studio. Looking good, matching theCUBE colors.
Sharon Zhou
>> That's right.
John Furrier
>> Well done. Welcome back. Well, first of all, thanks for coming back.
Sharon Zhou
>> Yeah, of course.
John Furrier
>> The world has changed a lot since we last talked. Just look at what Databricks has done, and at open table formats. That whole data layer is under massive transformation because the demand is for more AI and more value from data. The models are out there. You're starting to see operating frameworks around distributed computing and ultimately Kubernetes and all the AI infrastructure. Everything is lining up.
Sharon Zhou
>> That's right.
John Furrier
>> Two years ago, we were just looking at people excited about retrieval-augmented generation, or RAG. Now it's like, okay, everyone's doing that and there's a lot more behind the curtain, so what's the big update for Lamini?
Sharon Zhou
>> Well, I think one really exciting thing is that we focus on developers. We're a developer platform to make these models not hallucinate, so not make up facts. We do so via what we call post-training techniques, making it possible for developers everywhere to steer these models on their own data via techniques like fine-tuning or reinforcement learning, similar to what DeepSeek has really popularized more recently. And this makes it so that these models don't make up facts on your, I don't know, 10-K financial statements, or within media, like names of people and what they do. Being able to do that across a huge number of facts is really, really important. And so we've been seeing that being able to steer these models to very high factual accuracy has enabled them to touch production.
John Furrier
>> When we talked a couple of years ago, what I was excited about in that interview was that RAG, retrieval-augmented generation, was not obvious, but search was an instant value proposition. Steering models at data is an interesting concept because if you look at all the IP inside these large enterprises, their IP is their data, where it originated, unstructured. SQL, we know, is great, but not everything is SQL; the value is in SQL and beyond. They're all trying to get value out of their data, and what DeepSeek taught us, and you mentioned them, is that they did some clever things with techniques to get around some of the limitations of the hardware. Granted, it's a different use case, but it points to a pattern, which is to point your AI models at these data sets, which is a little bit not obvious. The old way was to throw a bunch of Nvidia at it and crank it out, versus thinking it through much more. Am I getting that right? Is that where people's heads are at right now in terms of how to think about these models? It's not just throw one model at it. Maybe think about what my data is about. Obviously, there are privacy issues and not putting it into the public cloud. Is that the market right now? Is that the current situation?
Sharon Zhou
>> Well, to get better intelligence, so to make your model able to operate effectively on your domain, you effectively need two ingredients. One is compute, so Nvidia or AMD GPUs, and the other is data. These pre-trained models, whether it be DeepSeek or OpenAI's GPT models or Anthropic's Claude models, are trained on the internet, so they're trained on this very general corpus of data, but they're not trained on your private data inside an enterprise. And so if you want that model to very deeply understand and reason with that private data, you can additionally train it on your private data and get that data into the model. And this is especially helpful and doable with smaller models, from a cost perspective and from an iteration-time perspective. So being able to do that with open source models like DeepSeek has been a very attractive approach we've seen for many enterprises to get these smaller, open source models to be domain experts, effectively.
John Furrier
>> Is there a technique or model that you're seeing is most appropriate for certain use cases? I've heard Mistral's great for this thing, or maybe use Anthropic for this. Are there patterns to the differentiation of the models? I don't know how to describe it, but is there a model that fits certain things, certain data sets or on-premises AI factories, for example? I'm just figuring out, how do I know what to do with which model?
Sharon Zhou
>> Of course, that's always a very big challenge because which models are available changes every week. There are new models coming out constantly. The thing I often suggest is to build your system with a model-agnostic approach, so you're able to swap models very quickly, and to design and architect your system in a way that gives you that flexibility and experimentation capability. And so what we do is we suggest, hey, these are a few models that you can start off with, and then test them on your data. Because once you test a model on your data and you train it to actually understand your data, sometimes the underlying model doesn't matter as much, because the model starts to learn about your data much more specifically. And therefore, whether it's DeepSeek, Mistral or Llama underneath, when it's an open source model, it doesn't really change the result, as long as it's within the same generation.
John Furrier
>> And the data, the customer's data is the key.
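As an illustration of the model-agnostic approach Zhou describes above, here is a minimal Python sketch of swapping backends and testing each on your own data. Everything in it is an assumption made for the example: the `ModelRegistry` class, the stand-in backends, and the made-up question and answer pairs are hypothetical and are not Lamini's API or data. In practice each backend would wrap a call to a real model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a backend is any function from prompt to text.
ModelFn = Callable[[str], str]


@dataclass
class ModelRegistry:
    """Keeps application code independent of any single model backend."""
    backends: Dict[str, ModelFn]

    def evaluate(self, name: str, examples: List[Tuple[str, str]]) -> float:
        """Fraction of examples where the backend's answer matches exactly."""
        model = self.backends[name]
        correct = sum(
            1 for prompt, expected in examples
            if model(prompt).strip() == expected
        )
        return correct / len(examples)

    def best(self, examples: List[Tuple[str, str]]) -> Tuple[str, Dict[str, float]]:
        """Score every registered backend and return the top one with all scores."""
        scores = {name: self.evaluate(name, examples) for name in self.backends}
        return max(scores, key=scores.get), scores


# Stand-in backends with made-up answers; real ones would call a model.
def backend_a(prompt: str) -> str:
    return {"Q3 revenue?": "4.2M"}.get(prompt, "unknown")


def backend_b(prompt: str) -> str:
    return {"Q3 revenue?": "4.2M", "Fiscal year end?": "June 30"}.get(prompt, "unknown")


registry = ModelRegistry({"model_a": backend_a, "model_b": backend_b})

# Made-up evaluation set standing in for questions about private data.
examples = [("Q3 revenue?", "4.2M"), ("Fiscal year end?", "June 30")]
best_name, scores = registry.best(examples)
```

Because the application only depends on the `ModelFn` signature, adding a newly released model means registering one more function and rerunning the same evaluation.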
Sharon Zhou
>> Yeah.
John Furrier
>> Talk about what's going on with the company. How do you guys engage customers? What's the business model? Take us through a little bit of the mechanics of how you engage with the market.
Sharon Zhou
>> Yeah. So we sell a software license for developer teams inside enterprises, such as a Colgate, for example, to be able to tune their models, edit these models, usually open source models, to learn facts about their data. One specific use case that has been taking off is text-to-SQL, so a business intelligence agent. You can imagine that a few people in the company, whether it be 30 or 200 people, can write these SQL queries in an advanced way and really deeply understand the data, the queries, and the calculations underlying share-of-market analysis, for example. But to enable far more people, for example across sales, marketing, supply chain, and operations, to make data-driven decisions, you need more people to be able to query that highly structured data, for example in your Snowflake or Databricks warehouse, or lakehouse, excuse me. And so enabling more people to write SQL queries through natural language is one really high-value application that we've seen across the enterprise.
John Furrier
>> So at the AI Agent Summit that Simon and the community put together, a bunch of, I would say, insiders, smart people making things happen like yourselves, are having to deal with the trend, which is accelerated change, new models every day. What's going on? What's the latest? Vibe coding is hot. We were talking about that before-
Sharon Zhou
>> Vibe coding is hot.
John Furrier
>> ...we came on theCUBE. What's that about?
Sharon Zhou
>> So vibe coding's really exciting. It's enabling more people to write code through what we call vibes, or a slightly lazy approach to prompting, and getting the model to write that code for you and do the heavy lifting. I'm excited about this trend taking off into other areas. For example, even vibe training models. Extending it even further, maybe we're not just writing general code. We're actually editing the model itself, changing the model's training data and having the model actually change that on its own, but through human vibes, like human feedback essentially. So I'm really excited about that as a trend, essentially getting an agent to modify another agent and improve another agent over time. And something really exciting that I've started to see internally is that our designer, Nina, has been able to vibe-tune some of these models to get them to learn facts about the TV show Severance, and to make that very effective. And so I just find that really exciting, to see more people have access to this tooling and have the ability to steer this really magical technology.
John Furrier
>> And you think about vibe coding. To me, software always used to be a mechanism. It's static. It's not like an organism. We're humans, we vibe out, we have vibes. Software now vibing can bring in a whole other level of intelligence that's evolving like humans, like a species.
Sharon Zhou
>> Right.
John Furrier
>> So it's the difference between algae versus some species that swims in the ocean, so to speak. Software is evolving, so is vibe a gateway towards a more intelligent software layer? That could be software changing software in a good way. It's got to be programmed, but I think this might be the beginning of something that's going to be ongoing, because the messy middle between data and insights or value is going to be these agent layers. Multiple agents, people are already admitting and seeing sub-agents, master agents. There's all kinds of different configurations, mixture of experts, all these things are happening.
Sharon Zhou
>> I find agents really exciting. I was initially a little bit of a skeptic. Having been an AI researcher, an expert in this space, I have almost a model-centric view of the world. But what I realized was that agents are a more human-centric view of the model, of LLMs more broadly. Because an agent is like a virtual human who can call an LLM, call multiple LLMs, have multiple prompts to the same LLM, what have you, but take actions as if it were a person. And it almost extends the Turing Test, I think. It extends it to, okay, can this agent, this model or set of models or calls to the models, actually deliver economic value? And there's really exciting work such as SWE-Lancer, for example. It's a benchmark, I think from OpenAI or the like, essentially looking at, hey, can we actually get an agent to complete tasks worth up to a million dollars on freelance sites such as Upwork? And that shows, oh, this can actually deliver economic value and do work that we can attach some dollar value to. And that gets towards what I think Satya often asks: how valuable is generative AI really? How can we measure that in GDP growth? And so I think it's extending the Turing Test towards, okay, how much work productivity can we get out of these agents?
John Furrier
>> I mean, the economic impact is phenomenal. We were talking before we came on about mixture of experts, and you have a new term that you guys are talking about called mixture of memory experts.
Sharon Zhou
>> That's right.
John Furrier
>> Or MoME.
Sharon Zhou
>> Or MoME.
John Furrier
>> MoME models.
Sharon Zhou
>> It's pronounced mommy.
John Furrier
>> Yeah. I love mixture of experts because people can relate to that. A bunch of experts available at will.
Sharon Zhou
>> And a subteam of them responding at a time.
John Furrier
>> And we had folks on at the conference. You're going to hear from folks who are doing stuff in healthcare, phenomenal work, a little bit of a Matrix, Minority Report kind of vibe going on there, where the throughput on operations is so much better in healthcare. So you're starting to see the efficiency translate into economics, and I think that's going to be a big piece. But I think people in the enterprise are still trying to figure out, okay, I get it, but how do I get from point A to point B? Connect some first dots for me. And that is, how do I configure models? How should I view models? So if you had to create a digital twin of yourself in an enterprise, how would you be orchestrating the teams to think about models? What's that model-centric view, and what Kool-Aid would you be passing out inside the enterprise?
Sharon Zhou
>> Well, maybe I'll start with the biggest mistake I've been seeing, one that I think people can very much overcome. It takes a lot of experimentation and iteration to get AI to work effectively, and what I mean by that is people often think, "Oh, I've got this model, or GPT-4 is this model, one model. Maybe I'll get my one model on finance," for example. It's actually not just one model. GPT-4 was rumored to have taken 20,000 runs to get to that model. So it's not just about that one model. It's really figuring out, how do I build out an architecture, architect a system that can handle many models at a time? Because when I'm iterating, there are so many more things that are happening. There are so many more actual models going on. And so I would think about it from an iterative perspective and an experimentation perspective to find the right answer.
John Furrier
>> Sharon, I love the way you think about models, because I think we're going to start to see a world where it's not about having a bunch of models that's one holistic thing. What's your view on model integration? Because I love vibe coding, because that points to evolution, like species. Models can talk to each other. Are they going to cross-pollinate, maybe? Do you see a world where models have relationships with each other and reason about each other, where we don't have to worry about which model is what? There's almost an abstraction layer of orchestration, or an agent for the models? Is that something that is obvious, or is that me overthinking it?
Sharon Zhou
>> No, you're not overthinking it. I think there are two ways to think about that. One is I think the model itself doesn't need to be this monolith. If you take the mixture of experts idea, and we do this a bit with MoME, but if you take it to the extreme, then essentially, it's a bunch of little experts. There are billions of experts in the mixture, not just a few, and basically any subset of this giant model, this giant brain, is, you could argue, a different model, and that's all unified under this one big team, this one big mixture. And then the other lens through which I could respond to what you've posed here is a layer across, this agent layer across all these different models, really making it agnostic to work with these different models. And I think a lot of people have taken excellent approaches towards this, creating that agnostic layer. And certainly internally at these companies, people are creating those model-agnostic layers because they want to be able to play with all the possible models.
John Furrier
>> It's funny, we use words like connective tissue all the time in tech, like, "What's the connective tissue?" But in a way, what you're getting at is that you can have a zillion experts. The connective tissue is software. What is the connective tissue if this is the model? Is it runtime? It's an interesting question, because I'm trying to figure out, okay, if I want to deploy a model, I'd love to have a turnkey. Here's my stockpile of experts that I've verified. Now go run on something.
Sharon Zhou
>> Right, right, exactly. And I think that can happen within the model itself, and it can learn to dynamically choose for you what subteam of experts will help answer your question the best, under the constraint of what you have access to. So maybe certain experts learned from a data set that's private and you don't have access to it. Then you can't access those experts, right?
John Furrier
>> Well, Sharon, great to have you on theCUBE. Thanks for coming into the studio. Great to see you again, and again, congratulations on the success. The market's spun in your direction. I'm sure you're feeding on all the AI goodness coming in.
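Zhou's point about choosing a subteam of experts under the constraint of what you have access to can be illustrated with a toy Python sketch. This is not how MoME or any real mixture-of-experts model routes, since real routing is learned inside the network. The `Expert` class, the keyword-overlap score, and the data set names are all hypothetical stand-ins chosen to show the access filter followed by top-k selection.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Expert:
    name: str
    keywords: FrozenSet[str]  # toy stand-in for what the expert "knows"
    dataset: str              # data set this expert learned from


def route(question: str, experts: List[Expert], allowed: Set[str], k: int = 2) -> List[str]:
    """Return the names of the top-k accessible experts by keyword overlap."""
    words = set(question.lower().split())
    # Drop experts trained on data sets the caller may not access.
    accessible = [e for e in experts if e.dataset in allowed]
    # Rank by overlap (descending), breaking ties by name for determinism.
    ranked = sorted(accessible, key=lambda e: (-len(e.keywords & words), e.name))
    return [e.name for e in ranked[:k]]


# Made-up experts and data set names for the example.
experts = [
    Expert("finance", frozenset({"revenue", "margin"}), "public"),
    Expert("payroll", frozenset({"salary", "revenue"}), "hr_private"),
    Expert("media", frozenset({"show", "cast"}), "public"),
]

question = "what drove revenue and margin"
public_only = route(question, experts, allowed={"public"})
with_private = route(question, experts, allowed={"public", "hr_private"})
```

With only public access the payroll expert is never considered, even though it overlaps with the question. Granting access to `hr_private` lets it replace the weaker media expert in the selected subteam.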
Sharon Zhou
>> So good.
John Furrier
>> You're speaking at the AI Agent Conference in New York. What's the topic of your talk? What's your narrative? What's your talk about?
Sharon Zhou
>> Of course, it's about agents and what they do, what they are. I think a lot of people are confused about the definition of them. I really think it's this virtual-human-centered view where this agent has long-term memory and an understanding of things, and it can take actions and reflect on them, similar to a person. And yeah, I'm seeing how we use agents internally too, to help with creating training data for these models, so sharing our perspectives on that and stories about that is what I'm going to talk about.
John Furrier
>> Well, put a plug in for Lamini. What are you working on? Are you hiring? What's going on with the company? Give a plug.
Sharon Zhou
>> I think try MoME out, because MoME knows best.
John Furrier
>> MoME is the mixture of memory experts.
Sharon Zhou
>> That's right.
John Furrier
>> Is that right?
Sharon Zhou
>> That's right. And there might be grand MoME coming up with reinforcement learning.
John Furrier
>> So is grand MoME an experienced, slower model, or is it a-
Sharon Zhou
>> You're right, I did not consider that.
John Furrier
>> No, but no, experience and efficiency.
Sharon Zhou
>> You're right. You're right. Maybe thinking something-
John Furrier
>> With age, you get more efficient.
Sharon Zhou
>> That's right, so maybe more reasoning.
John Furrier
>> Grand meaning grander.
Sharon Zhou
>> You're right, you're right.
John Furrier
>> I'm experienced. A little slower, more efficient than this fast young model.
Sharon Zhou
>> Yes, to get to wisdom.
John Furrier
>> Thank you so much for coming in.
Sharon Zhou
>> Thanks so much.
John Furrier
>> Great to see you. Congratulations. Sharon Zhou is in the center of all the action. Again, starting in 2023 on theCUBE, we're tracking the trends. Just in two years, so much has happened. The next two will be phenomenal. Again, mixture of experts, mixture of memory experts. Again, this is all converging into one big model, all written in software. Obviously the AI infrastructure is booming, and of course, agents will come out of this in the thousands and millions. This is theCUBE bringing you all the technology coverage here. Thanks for watching.