theCUBE + NYSE Wired: Mixture of Experts Series | Ali Ansari, Micro1

Ali Ansari, Micro1

In this theCUBE + NYSE Wired: Mixture of Experts segment from the New York Stock Exchange, theCUBE’s John Furrier sits down with Raj Verma, CEO of SingleStore, to unpack how the intersection of technology and finance is shaping enterprise strategy. Verma shares why SingleStore is “on course” for the public markets, reflects on brand-building through the company’s partnership with golf Hall of Famer Padraig Harrington and connects that ethos to how SingleStore helps organizations fix struggling data “swings.” The discussion zeroes in on what’s next as Wall Street watches the AI infrastructure buildout: after chips and systems, the software and data layers set the pace for value creation. Verma outlines why enterprises must modernize “brown” data estates into “green” ones to safely bring corporate context, governance and compliance into LLM workflows via RAG – and why commoditized data-at-rest puts the advantage at the query layer that unifies data in motion with data at rest. He predicts agentic AI will gain reasoning capabilities in roughly 18 months, cites industry indicators like Google reporting ~25% of its software now built by AI and argues that high switching costs will give way to disruption as buyers reassess legacy vendors. The conversation closes with concrete momentum: ~33% YoY growth, ARR in the ~$135M range, gross dollar retention ~98%, cloud NDR ~130, ~50% of business now in the cloud, landing ~3 new customers per day, a path to cash-flow breakeven in the next two quarters and a teaser for AI-related announcements in the next two months. Listeners will find notable stats, real-world use cases and forward-looking views on how databases power reliable AI at enterprise scale.

Clips
Ali Ansari

CEO

Micro1

Evaluation, Not Intelligence: Micro1's Expert Data Engine for Quantifying AI Actions in High-Stakes Domains

Part-Time Experts, Two-Sided AI Markets: How Side-Gig Domain Talent Powers AI Labs

Government work: R&D partner building and evaluating AI agents for government use cases

Differentiation: rigorous expert vetting engine and human-first expert experience as competitive moat

Gemma Allen

>> Welcome back to theCUBE Studio here at the New York Stock Exchange. This is Mixture of Experts, part of our programming with NYSE Wired. And joining me now is Ali Ansari, CEO and founder of micro1. Welcome, Ali.

Ali Ansari

>> Thank you for having me.

Gemma Allen

>> So I had a look at your bio earlier and you have been a very busy man. You founded a company while at Berkeley, graduated in three years, have grown to 200 million ARR, are doing some side work at Stanford and still pursuing a master's degree. Maybe just to start, Ali, do you ever sleep?

Ali Ansari

>> I think last night is a good example of only a few hours of sleep. So it's very limited, but trying to fix it.

Gemma Allen

>> Well, tell me about the mission here. Tell me about micro1. Fill me in. What exactly do you guys do and what's the big bold goal?

Ali Ansari

>> Yeah, absolutely. So we're building a data engine that helps AI labs improve their foundational models through post-training data generated by human experts. So the way to think about this is labs and LLMs have trained on the internet and these large-scale datasets and they're sort of done training on those large-scale datasets. And now we're in the era of training on structured human judgment from doctors, lawyers, engineers, et cetera. So what we do at micro1 is we recruit experts and we set up data pipelines that allow us to improve foundational models in lots of different domains.

Gemma Allen

>> So I read a headline about you from earlier this year where you say that AI's biggest problem isn't intelligence, it's evaluation. And what you're essentially saying there is we have to have a human sense check in terms of what we're putting out into the ether in this world. Talk to me a little bit about on a day-to-day basis, how you are seeing the impacts of industry outcomes where evaluations aren't being correctly met. Where do you see the risks here?

Ali Ansari

>> Yeah, so there are many risks. Of course, where AI is working and starting to work well is sort of high-risk domains like legal, medical, even programming in terms of cybersecurity. And if there aren't very robust evaluations done on the LLMs to improve these domains, but also more importantly, once the agent is built on top of these LLMs, evaluating agents in a very specific enterprise context, if evaluation is not done, you're not able to actually measure what good looks like. This is the problem with probabilistic software, which is there's no such thing as doing kind of basic QA engineering. You need to define what is the action space of this probabilistic software that I'm building, in other words, agents. And then you need to measure that action space very closely and quantitatively through human experts that are sort of good at those actions in their day jobs.
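
A minimal sketch of the idea described in this answer: define an agent's action space, then measure each action quantitatively from expert judgments. The action names, the 0-to-1 rating scale and the `score_action_space` helper are illustrative assumptions, not micro1's actual method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical expert ratings: (action, score in [0, 1]) for each reviewed agent action.
ratings = [
    ("draft_contract_clause", 0.9),
    ("draft_contract_clause", 0.7),
    ("cite_precedent", 0.4),
    ("cite_precedent", 0.6),
    ("summarize_filing", 1.0),
]

# Hypothetical action space for a legal agent; the last action has no ratings yet.
action_space = [
    "draft_contract_clause",
    "cite_precedent",
    "summarize_filing",
    "redline_document",
]

def score_action_space(ratings, action_space):
    """Return the mean expert score per action; unrated actions map to None."""
    by_action = defaultdict(list)
    for action, score in ratings:
        by_action[action].append(score)
    return {a: (mean(by_action[a]) if by_action[a] else None) for a in action_space}

report = score_action_space(ratings, action_space)
for action, score in report.items():
    print(action, "unmeasured" if score is None else round(score, 2))
```

Listing unrated actions explicitly makes gaps in the evaluation visible instead of silently omitting them.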

Gemma Allen

>> Wow. So fill me in a little bit on the metrics here for this company. You hit 200 million ARR very, very fast, in 18 months. Talk to me a little bit about the growth stage you guys are at in funding. It seems like this is a very fast and high-impact story.

Ali Ansari

>> Yeah. So we started 2025 with roughly around 5 million a year in revenue. And then we ended the year in 2025 with just under 150 million a year. And now a few months within 2026, we have surpassed 300 million in annualized run rate.
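
The growth implied by these figures can be worked out directly. This is a rough sketch that treats "just under 150 million" as 150 and "surpassed 300 million" as 300, so the multiples are approximations rather than reported numbers.

```python
# Annualized revenue figures as stated in the interview, in millions of USD.
start_2025 = 5    # "roughly around 5 million"
end_2025 = 150    # "just under 150 million", rounded up here
early_2026 = 300  # "surpassed 300 million", used as a floor

def multiple(later, earlier):
    """Growth multiple between two revenue figures."""
    return later / earlier

growth_2025 = multiple(end_2025, start_2025)      # growth during 2025
growth_2026 = multiple(early_2026, end_2025)      # growth in early 2026
growth_total = multiple(early_2026, start_2025)   # growth since the start of 2025

print(growth_2025, growth_2026, growth_total)
```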

Gemma Allen

>> Wow. And in terms of the broad kind of two sides of this coin, right? In terms of recruiting people from industry, and then also having a commercial model into these LLM players, how has that been playing out? Give me a sense of a typical candidate for micro1.

Ali Ansari

>> Yeah. So there's two sides to our business. One is we have the labs and the researchers within the labs that we interface with, that we help with data that they need. And then on the other side, we interface with world-class experts across many different fields like medical, physics, biology, coding, and hundreds of other domains. And those experts are the ones that are helping actually train the models. A typical expert is someone that kind of puts in anywhere from 15 to 20 hours a week part-time on top of their day job. And they are actively working full-time somewhere and then they help as a side gig train AI models usually on the weekend and sometimes a few hours during the week as well.

Gemma Allen

>> Wow. Okay. So these are people who are really putting in these extra hours from the perspective of supplemental income. So talk a little bit about the world of LLMs right now, right? Do you, competitively speaking, I guess, see a world whereby they will start doing their own vetting per se? Is that something that you guys think about when you think about the competitive dynamics of the industry or do you always see a role whereby they will have some sort of objective layer?

Ali Ansari

>> Yeah. So in most cases, the companies that we work with, they do some of their own vetting. Every time we have experts, they sort of do a final layer of vetting before experts start with any given data pipeline. But the hypothesis that we have is there's likely going to be hundreds of foundational model companies with the current top ones being the reasoning layer and the more so general intelligence layer. But then lots of AI application companies that end up building their own models and also a lot of the smaller labs that end up building more niche areas, building in certain niche areas in terms of domain capabilities. And all of those companies will require data in some way and therefore experts to train their models.

Gemma Allen

>> So one of the products, I guess on your roadmap is what you call ego-centric human data. That's an interesting term in and of itself. Talk to me a little bit about what that actually means. What is the real-world application and practice for that?

Ali Ansari

>> Yeah. So about six months ago, we sort of took this bet at micro1, which is we believe that robots will also have to learn from humans. And if you sort of think about the main data structure that robots take in to improve, it's been what's called teleoperation, which is a human remote-controls a robot. The robot does some optimal actions as controlled by the human. Those optimal actions are recorded and then the robot tries to replicate those autonomously. The problem with this data is it's very expensive and it's hard to replicate all the things that humans do. So the bet that we took is that instead of having this kind of teleoperation dataset as the main way that robots learn, we believe that robots will actually just learn from humans in a very pure form. And so what that means is there are kind of initial pipelines that are being set up, which is called egocentric data, that is, humans are recording themselves in first person, doing a bunch of actions in the real world, which is household tasks and many other kinds of chores within the house. And then we use this data, which is many, many tens of thousands of hours of video with thousands of people that are working in this pipeline every single day, to feed into robotics models to improve their capabilities. And the way we think about it is that long term, the models that will be most useful for us are ones that can act in the real world. So it's not just about like a humanoid form factor, it's more so about AI models that can not only act in this kind of very tiny subset of the world, which is the digital world, which is the computer, but ones that can act way beyond the computer. And the way you do that is you have humans that are in these real-world environments train these models.

Gemma Allen

>> So technically, what are the key challenges here? We know context is a challenge in the digital space, right? Dexterity, I'm sure, is a challenge from the perspective of humanoids. What are the real problems that you're setting out to solve here? Break it down.

Ali Ansari

>> Yeah. So one of the main challenges is the diversity of data in the real world. If you think about all the things that are possible in this digitally constrained world, it's a very large kind of vector span, in that there are many things you can do in a computer. That's why there are so many different environments and datasets that we have to build for AI labs to improve their models in the LLM world. Now, if you remove that constraint, which is the sort of digital constraint, and you go into the real world, there is maybe a million times or maybe 10 million times more environments and sort of diversity of possible actions. So what becomes really difficult, the very difficult technical challenge, is how can you get a very representative dataset for these robots to build general reasoning and general capabilities to navigate and manipulate the world through this very large vector span of actions that exist. So the way that we're aiming to solve that is every dataset we create needs extreme diversity. So for example, if we're trying to get robots to be good at household tasks, we need to create many, many thousands of hours of videos of tons of different possible actions within the house, whether it's tidying up desks, whether it's doing some chores, whether it's fixing a bookshelf. And we need to do so with a very large combination of households that have different objects that they use, different backgrounds, different hands. So this sort of diversity is what really helps solve ultimately the robotics challenge.
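
One simple way to make the diversity requirement in this answer concrete is to check how many task and household combinations a set of clips covers. The metadata fields, the sample values and the `coverage` helper below are hypothetical, chosen only to illustrate the idea.

```python
from itertools import product

# Hypothetical clip metadata; real egocentric datasets would carry many more attributes.
clips = [
    {"task": "tidy_desk", "household": "h1"},
    {"task": "tidy_desk", "household": "h2"},
    {"task": "fix_bookshelf", "household": "h1"},
    {"task": "tidy_desk", "household": "h1"},  # repeat of an already seen combination
]

def coverage(clips, keys):
    """Return (fraction of combinations seen, sorted list of missing combinations)."""
    values = [sorted({clip[k] for clip in clips}) for k in keys]
    possible = set(product(*values))
    seen = {tuple(clip[k] for k in keys) for clip in clips}
    return len(seen) / len(possible), sorted(possible - seen)

ratio, missing = coverage(clips, ["task", "household"])
print(ratio, missing)
```

The missing combinations point at what to record next; adding attributes such as objects or backgrounds to `keys` grows the space multiplicatively, which is the scaling problem described above.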

Gemma Allen

>> Okay. And speaking of challenges, you've also launched a government division for your company, correct? And I think government is a very interesting space in the world of AI because it's controversial, but hugely consequential. Talk to me a little bit about what's happening in that space for micro1.

Ali Ansari

>> Yeah. So without talking about the specifics of any department, we are helping the government build AI agents. And we take a very hands-on kind of research and development partner role for the government, where we help them kind of scope out the agent they're building and build the agent for them. And then, most importantly, we evaluate it through our data engine to make sure that it's working reliably for the use cases that they have.

Gemma Allen

>> Okay. And from a competition perspective, I know there's one particular kind of 800-pound gorilla in this space, but talk to me a little bit about how things are playing out at an industry level, right? You guys are obviously building relationships with these labs across the board. You want to become a brand synonymous with training data. How do you continue to differentiate your market message in this space?

Ali Ansari

>> Yeah, I think there's sort of two core things that we focus heavily on. One is the data quality for any data pipeline. Any evaluation data that we're creating starts with the expert quality and the expert really matching any given job or any given data pipeline that we're assigning them to. So if we have a pipeline that requires expertise in M&A law, you cannot just assign a lawyer, you have to assign a lawyer that is excellent in M&A specifically. So what we do is we're building our vetting engine to be best in class and we have an interview system that can really converse with candidates and deeply assess them in whatever skill they need to have for any given job. And this allows us to match candidates perfectly with the jobs that we have, which is sort of the start of how you build really high-quality datasets. And the second thing is, in this space, one of the things that we think is sort of undervalued is deeply caring about the human experience, which is, again, the experts that are on these pipelines. So the approach we take is a very human-first approach where the experts that we have and their experience come number one, and we sort of build our product, our operations and overall processes around this fact that our experts matter more than anything. And this results in two things. One is we are building a job sector that is... We're building a new job sector, and so we're building a nice foundation for it where people are happy working in this job sector. And that results in micro1 being the place where experts are happiest to train AI models. And so this sort of becomes a nice moat for us long term for experts to decide to only work on micro1. And the second thing is that it results in higher-quality data because when someone is happy on the job, they have much more pure judgment and the sort of structured judgment that they distill into models becomes much higher quality.

Gemma Allen

>> Wow. Well, to finish out, I read also a headline that you are potentially, according to Forbes, one of the youngest self-made billionaires in tech, Ali. How do you respond to that commentary and how does it, I guess, shape how you think about your role as a CEO in the industry you're in right now?

Ali Ansari

>> Yeah. To be honest, I don't love these headlines and we're just building a great product. I think there are many ways to look at unrealized gains, but it's not something I focus on.

Gemma Allen

>> Good. Well, listen, great to chat to you. Fascinating product, fascinating time. Wish you guys all the best and hope to maybe have you here in person at the NYSE at some point.

Ali Ansari

>> Amazing. Thank you so much. I appreciate it.

Gemma Allen

>> I'm Gemma Allen. This is coming to you from theCUBE Studio here at the New York Stock Exchange. This is NYSE Wired in theCUBE Mixture of Experts. Thanks so much for watching.
