In this interview from AWS re:Invent 2025, Ketan Umare, chief executive officer and co-founder of Union.ai, joins theCUBE’s Dave Vellante to discuss the fundamental shift in software economics driven by generative AI. Umare explains why traditional CI/CD pipelines are insufficient for non-deterministic, stochastic AI models, drawing on his experience building massive ML platforms at Lyft. The conversation highlights the unique challenges of modern AI development, where research becomes an intrinsic part of the software lifecycle.
>> Hi everybody. Welcome back to Las Vegas. My name is Dave Vellante, and I'm here picking up for John Furrier. This is day four of our wall-to-wall live coverage of re:Invent 2025, Amazon's big customer show. I think this is re:Invent 13, I think we've been there 12 years straight, which is pretty amazing. Counting COVID, of course, we were virtual during COVID. But I'm excited to be back after being in the analyst program for a few days. Ketan Umare is here, he's the CEO and co-founder of Union.ai. Welcome. Thanks for coming in.
Ketan Umare
>> Hey, Dave. Thank you for having me. It's good to be here.
Dave Vellante
>> You're very welcome. Yeah, it's good.
Ketan Umare
>> Actually, it's a little quieter here, I like it.
Dave Vellante
>> It's great. This event just keeps getting bigger and bigger.
Ketan Umare
>> Bigger.
Dave Vellante
>> It's like the universe and the cloud, it just keeps expanding, everywhere you go. Kind of getting used to it. At first, you remember when re:Invent exploded, and oh my gosh, it was so crowded. Now, we just go with the flow. You see companies taking over restaurants.
Ketan Umare
>> It's crazy, yeah.
Dave Vellante
>> And I think everybody's settling in. This is the year-end push, even though we've got a couple of events. John's over at the semiconductor show in San Jose today. We've got a media week at our New York Stock Exchange studio next week, and so we're still going strong. But we're here to talk about re:Invent. Let me start with Union.ai. Why did you and your co-founders start the company, what was the founding vision?
Ketan Umare
>> That's a very interesting question, because we started in 2021 prior to all the AI hype. Sometimes, it feels it's important to be in the right time, probably a little earlier. The reason why that happened was... So a little bit of background on me. I've been an engineer for 20 years. I was at Amazon, prior to the first re:Invent. I worked across different industries, did a lot of infrastructure-layer software. But then, weirdly, in 2016, I was leading a team that was building machine learning models at Lyft, and we were building models for ETAs and traffic and all of that. And what turns out is that building models is not like building software, and this was a realization I had. And there were a few observations I somehow happened to stumble upon, and one of them was that most engineers in the company at Lyft were basically doing things to power these models, like getting data in shape, cleaning up, putting the right things in the right place so that we can get the models powered. But it was extremely hard to put that in production. And over the next three years, four years, we ended up building a full machine learning platform, I ended up leading it, and we had thousands of models in production. Through that period, what I realized is there was a shift happening, and the shift is that we've actually ... If you go prior to re:Invent, 2008, '09, '10, and you go to any company and say, "Hey, do you want CI/CD software?" They'd be like, "No, I have one thing that I deploy, it's a web app or it's a website." And now, the first thing every company starts with is CI/CD software. It changed because we understood that software is moving so fast, so agile, that you need a new way of deploying it. But it's still linear, every decision in a software product is made by a deterministic piece of code written by somebody. The moment you change that to an AI product, all these bets go out the window.
You need a different way of building these models, because these are non-deterministic, heuristic, or in some cases, stochastic processes that are deciding the outputs. So what ends up happening is research becomes a part of the software development process, and we are not used to doing research in a software community. You know, people who are used to doing research are people who are building drugs. And if you go and talk to the CEO, "What is the best way I can help you deliver an outcome for your company?" They don't say, "Make every drug experiment I do, take it to production." They say, "Reduce the cost of my drug experimentation." That's what happens when you do research. Not everything goes to production, because not everything has the same quality that needs to go to production. What you need to do is try hundreds of different ideas and take one of them to production the moment it lights up, and this was an observation, wow.
Dave Vellante
>> Okay. So the fundamental issue that you're talking about is the software economics are changing dramatically.
Ketan Umare
>> Correct.
Dave Vellante
>> Now, normally, I was thinking token costs, COGS from cloud, that's minor relative to token costs, but you're talking about, early on in the process, the NRE of software development is changing. But I want to go back to 2021, this was two years before AI hurtled around the world, and if I understand it correctly, the problem you were trying to solve was to help developers facilitate all that nasty work that they were doing. And then, you discovered a new problem, which is that this world going from linear and deterministic to a non-deterministic AI world, GenAI world. So how much of your original work, thinking, were you able to adapt? Did you have to just completely pivot, rip everything up and just start over?
Ketan Umare
>> No, it's amazing. So GenAI has two parts. There's the thing where you actually build the model, it's the same. The biggest labs use our products to actually deliver the same models, because they are machine learning models at the core. The second part of it is using the model. But when you use the model in production, you're still bringing a non-deterministic core into your product, you still have to research. Do you think you can just take a chatbot with a new version of prompt and just deploy it to production? Every time you change it, you have to test it, you have to make sure it's reliable, you have to put guardrails, you have to put evals and so on so that you know that those are going to work. So the problem has not changed, the core fundamental problem remains the same. But what we observed is one shift, and we adapted for that, and that shift was, in the past, all of this orchestration code, which is how do I connect, how do I get the data closer to my people? In the AI world, we have a new entity that's writing some of the code, and that is an LLM. I'm saying code as a hyperbole here. It's essentially the decisions are made dynamically by the AI agents themselves. And so, you need a fully new way of doing orchestration itself, which is dynamic, robust and on-the-fly, while bringing in the old primitives of infrastructure awareness and durability into the core. So we took that, adapted our product to be extremely agile and dynamic, and it works great.
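The dynamic orchestration Umare describes, where an agent rather than a static, predeclared DAG decides the next step at runtime, can be sketched in a few lines of plain Python. This is an illustrative toy, not Union.ai's actual API; the `decide_next` policy and the task names are hypothetical, and a real system would delegate the decision to a model and checkpoint state for durability.

```python
# Toy dynamic orchestrator: the next step is chosen at runtime by a
# policy function (standing in for an LLM/agent decision), rather than
# by a fixed pipeline declared ahead of time.

def clean(data):
    return [x for x in data if x is not None]

def enrich(data):
    return [x * 2 for x in data]

def publish(data):
    return {"published": data}

TASKS = {"clean": clean, "enrich": enrich, "publish": publish}

def decide_next(state, history):
    """Hypothetical stand-in for an agent's decision. A real system
    would ask a model; here we use simple rules for determinism."""
    for step in ("clean", "enrich", "publish"):
        if step not in history:
            return step
    return None  # done

def run(data):
    history = []
    state = data
    while (step := decide_next(state, history)) is not None:
        state = TASKS[step](state)  # a durable system would checkpoint here
        history.append(step)
    return state, history

result, steps = run([1, None, 3])
# steps == ["clean", "enrich", "publish"], result == {"published": [2, 6]}
```

The point of the sketch is the control flow: because the next task is computed at each iteration, the "DAG" only exists after the run, which is why the surrounding infrastructure has to supply durability and observability rather than relying on a static plan.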
Dave Vellante
>> Okay. So you made this observation that you have to do research upfront in the software development process, that's novel, that's new. And then, you said you have to test, test, test, and then once you've got it right, it's almost like a YOLO run. So how does that work? When do your customers decide? How do they decide? How do you help them decide when to actually deploy?
Ketan Umare
>> We don't help them decide, they actually know when it works. What we help them do is make their ideas... Oftentimes, I tell people that two ML engineers or AI engineers, when they're working on the same idea... It's different from software engineers collaborating on one project, they have one GitHub repo, where one person checks in a little bit of code, the other person checks in... They don't work like that. They try two different versions of the same thing at the same time. Competition is collaboration. And so, how do we facilitate that so that they feel that they own that entire compute universe and they can scale as much as they want and scale down? That's what we helped them solve, get that idea deployed into this AI development infrastructure, that's what we did.
Dave Vellante
>> And you can take the best of competitor number one and the best of competitor number two and blend them?
Ketan Umare
>> We can't blend them, but what will happen is... Yeah, they can, they can see the difference, because they can run these simulations. Lots of companies run simulations on the product, and they figure out, oh, this thing actually works better than that one, so let's take this one to production. And it's always leaderboarding, it's always leaderboarding.
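The "always leaderboarding" pattern Umare mentions, running competing variants against the same eval set, scoring them, and promoting the winner, can be sketched as follows. The variants and the scoring rule are made up for illustration; real evals would be far richer.

```python
# Minimal leaderboard: evaluate competing variants on a shared eval set
# and rank them so the top scorer can be promoted to production.

def variant_a(x):
    return x + 1

def variant_b(x):
    return x * 2

def evaluate(fn, eval_set):
    """Score = fraction of eval cases the variant gets right."""
    hits = sum(1 for x, expected in eval_set if fn(x) == expected)
    return hits / len(eval_set)

def leaderboard(variants, eval_set):
    scored = sorted(
        ((evaluate(fn, eval_set), name) for name, fn in variants.items()),
        reverse=True,
    )
    return scored  # best first; promote scored[0]

# Doubling is the "right" behavior in this toy eval set.
eval_set = [(1, 2), (2, 4), (3, 6)]
board = leaderboard({"a": variant_a, "b": variant_b}, eval_set)
# board[0] == (1.0, "b"): variant_b wins and would go to production
```

This is the competition-as-collaboration loop in miniature: both variants run against identical inputs, so the comparison is apples to apples, and the decision to ship is read off the board rather than argued from intuition.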
Dave Vellante
>> So then, do you tell the system, "Well, we like this aspect of competitor number one and this aspect of competitor number two, blend them," and that happens, or how does that work?
Ketan Umare
>> No, no, they don't blend, but you can adapt, because some of these things may need a fundamental shift in the architecture or the prompt or whatever, different things that you do.
Dave Vellante
>> And that's where human comes in the loop?
Ketan Umare
>> That's where the human comes in the loop. But you still need observability in what was happening, how did these decisions get made, how did it move, what performance... For example, let's take model training as an example. Model training happens in these things called as epochs. You go, you train in cycles, and there's a point at which the model actually improves, and then it becomes worse. You want to catch it at the point when it's improved.
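Catching the model "at the point when it's improved" is the classic early-stopping pattern: track a validation metric per epoch, keep the best checkpoint, and stop once the metric has failed to improve for a patience window. A framework-free sketch, with synthetic loss values standing in for a real training run:

```python
# Early stopping: remember the best epoch's "checkpoint" and stop once
# the validation loss has not improved for `patience` epochs in a row.

def train_with_early_stopping(val_losses, patience=2):
    best_loss = float("inf")
    best_epoch = None
    stale = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
            stale = 0            # a real trainer would save weights here
        else:
            stale += 1
            if stale >= patience:
                break            # model is getting worse; stop training
    return best_epoch, best_loss

# Synthetic per-epoch validation losses: improves, then starts to overfit.
losses = [0.9, 0.7, 0.5, 0.55, 0.6, 0.4]
best_epoch, best_loss = train_with_early_stopping(losses)
# best_epoch == 2, best_loss == 0.5: training halts after two bad epochs
```

Note the trade-off baked into `patience`: the run above stops before the late dip to 0.4 ever happens, which is exactly why per-epoch observability matters, as you want enough signal to set the window sensibly for each workload.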
Dave Vellante
>> And that's when you want to do your YOLO run.
Ketan Umare
>> Exactly, exactly.
Dave Vellante
>> Okay. There's a completely different workflow.
Ketan Umare
>> Completely different.
Dave Vellante
>> We've gone from linear to this non-linear-
Ketan Umare
>> Yep, experiment-driven....
Dave Vellante
>> of software development. Collaborative now becomes competition, right?
Ketan Umare
>> Co-opetition, maybe.
Dave Vellante
>> What did he or she do, and what can I learn from that, and so it's still collaboration.
Ketan Umare
>> You're still collaborating on the whiteboard, not in the code or software itself.
Dave Vellante
>> But you're working even longer hours. Your product, Flyte, F-L-Y-
Ketan Umare
>> Flyte is the open source product, yes.
Dave Vellante
>> is the open source product and you have a managed service around that? Explain that.
Ketan Umare
>> Yes. So because we started at Lyft, Flyte is an anagram of the word Lyft, and it was open sourced from Lyft. It's a Linux Foundation graduate product. Union takes the baseline Flyte, so you write the code the same way as in Flyte, and it's open, free. But the core engine and the way we scale it up and some other features that we do are available only in Union. And it's available in your cloud, we don't run it in our cloud, because we realized the number one thing you have today as a company is your data. You don't want to send your data to some random cloud and say, "Yeah, let's see what happens with it," you want to keep it in your own cloud. And so, we run it in your cloud or even on-prem.
Dave Vellante
>> Okay, so you can run it on-prem as well. So you do AWS, Google, Microsoft?
Ketan Umare
>> Yeah, every cloud.
Dave Vellante
>> OCI?
Ketan Umare
>> OCI, some of the new clouds.
Dave Vellante
>> On-prem as well?
Ketan Umare
>> On-prem. Only for some of the largest customers.
Dave Vellante
>> Well, what do you see happening for... Because we talk to some of those large customers, and they say, "We do a lot in the cloud, but we're going to do some stuff on-prem too. We're going to bring some of the intelligence to the data, we're going to build our own on-prem AI stack." The big companies, you talk to a JPMC, they can hire a couple of thousand AI engineers. Most enterprises can't. But what are you seeing in the enterprise? They're capable enough to build their own on-prem stack. So some big customers are using Union.ai as their AI software development workflow, so they've got a hybrid approach, right? They're of course using cloud.
Ketan Umare
>> Correct, correct.
Dave Vellante
>> How does that look? What does that look like?
Ketan Umare
>> So we have a very novel architecture depending on the setup. One of our setups is a multi-cloud setup. So we run the brain, the control plane of the system, in AWS, and we can connect to any cloud. That's where your data resides, data planes run, like your compute, your inferencing, your training cycles all run in your cloud, and we can seamlessly connect these multiple clouds into one common way so that the users actually... Just think about writing code. Literally, you write pure Python code. I've gotten to a point where I actually get Claude Code to write some of my examples that run on the system, and then, in one shot, it's able to run. It's literally YOLOing onto the platform, because it can build infrastructure as it needs dynamically, and then de-provision it once it's done, and that's the power of this AI development infrastructure that we're building.
Dave Vellante
>> The year you started your company, it was 2021?
Ketan Umare
>> Correct.
Dave Vellante
>> re:Invent actually was held that year, it was right before Omicron. And so, we went to re:Invent, we went to Omicron, and everything shut down again. But I remember that was the year, guys, you'll remember this, we came up with Supercloud. The idea of Supercloud was an abstraction layer across all the clouds, what multi-cloud should have been is what we called it. Of course, we got a lot of heat for that term, but it was fun. But the whole idea is abstracting the underlying complexities, make it simple. So my question is, as you think about that multi-cloud capability, you take care of the security and the governance, how far do you go there?
Ketan Umare
>> We use containerized Kubernetes underneath, so we take care of containerization, movement of data, security, because each agent or each pipeline or whatever you're building, the software process can run with its own set of permissions, every instance of it can run with its own set of permissions. We take care of the observability, so you get a full observability stack, so you see how things are running, where they're running, how data is moving. And then, finally, we take care of scheduling, because GPUs are very complicated resources underneath. They're not available everywhere. Some are available here, some are available there. And if you want to do crazy training runs, you need extremely interesting, intricate networking protocols. So we have to set up all of that. So we take care of that, the goo underneath. What ends up happening for the user is it looks like writing regular Python code that runs locally, and with the click of a button, containerizes, moves and boom, it's running, and we can get it running. And our goal for cold-start time is one second, from local to running.
Dave Vellante
>> So all the heavy lifting that developers would normally have to do around security and governance and all that, what'd you say, goo?
Ketan Umare
>> Yes. Because if you think about it, as you said, they work long hours, because they're working with such complicated beasts of products, you want to simplify their life. If you go and talk to an AI engineer and say, "By the way, to take your agent to production, now first write Terraform, and then go deploy, and then write a CI/CD pipeline." They'll go, "Come on, I'm not going to do all of this, because I don't even know if this thing works."
Dave Vellante
>> Right. No, I get it.
Ketan Umare
>> It's too complicated, it's too complicated.
Dave Vellante
>> At that point, the developer says, "I'm just not excited about doing this. I'm going to go home, maybe grab some sleep, I work better in the morning," or whatever. Whereas if they're able to focus... We've all been there, the adrenaline pumps, if you're able to focus on something, that gets you to a major step function further, you're going to get just pumped up, start doing more coffee, smoke some butts, whatever it is.
Ketan Umare
>> I'm not advising ill health things, but your personal-
Dave Vellante
>> That's the journalist in me, the journalists still smoke cigarettes. I don't know, do developers smoke cigarettes? Probably not.
Ketan Umare
>> Probably not.
Dave Vellante
>> Okay. So talk me through the value proposition. Obviously, there's a new type of software development workflow. So you come into your clients and say, "Hey, we understand this." I'm sure it's a spectrum, some understand it, some are just getting into it, so you can help connect there. But there's also the cost aspect of doing the research upfront.
Ketan Umare
>> Correct.
Dave Vellante
>> So take me through the value proposition, how do you sell this?
Ketan Umare
>> Great question. I think the important thing to understand is there's no way out of doing research, because this is a research-based product, it's non-deterministic. But you can reduce the cost by doing smart things. Let's say two people, as I said, are working in competition-collaboration, but 80% of their product is still the same. Let's say you're working on some data set that you've now found, maybe the S&P 500 for the last 10 years, I want to analyze it, I want to do something with it or I want to ask questions on it. That data set is the same. I may use different modalities to explore it, but it's the same dataset. The system automatically understands that, caches the results, and when you run it, it's like, "Oh yeah, that guy built this data set already, here you go, you can just reuse it." So that's one. Second, this was actually an observation we had when we were at Lyft, we were running so many experiments and so many models all the time, we never had enough compute. We never had enough compute. But that doesn't mean we could just go and get compute. We talked to the people, "Is it okay if you can wait if we can guarantee your things will run?" And they were all okay. So what we did is we actually bought a pool of compute, AWS has great reserved instances, chopped them up in different ways, and as people came in, they got their compute and the other one queued up behind. When the next one goes, that gets in and that's getting... And in some cases, they can't wait. So we can automatically partition the system so that there are areas that are always running for critical work. So that allows you to use a limited pool of resources, makes you feel as if you have infinite, but trades off time in terms of cost. And the final thing is the ROI.
We give you full compliance, because eventually, you want to check ROI, you want to see that all these experiments are amounting to something, and that's one of the biggest gaps today in the industry. And how do you do that? You do that by going and making sure that we've actually done the best practices, we have actually done evals and checked those evals. So we give you that compliance layer underneath, where you can go and audit it anytime and say, "Oh yeah, this thing actually did work better. Here's a leaderboard, here are the two things that we ran, and one of them is better." And these are the ways that people can reduce the cost.
Dave Vellante
>> All right. So you're four or five years in, give us the stats on the company. What can you share with us? Money raised, VCs, headcount, what can you share?
Ketan Umare
>> Yeah. We've remained a much smaller, focused company; team size, we are only 46 people in the company, so not very big, but we punch above our weight. We are a Series A company, backed by NEA and Nava Ventures, and we're very happy with our progress. This year alone, we have quadrupled our revenue. And we are not like a typical AI company where the revenue comes at the expense of a lot of COGS, because we actually run software that runs your software, so most of our revenue comes at a very small cost to us.
Dave Vellante
>> So you're still capital efficient.
Ketan Umare
>> Very capital efficient.
Dave Vellante
>> So I don't know if this is accurate, it says you did a seed round at 10 million.
Ketan Umare
>> Yeah, that was in 2021.
Dave Vellante
>> Okay. And then, Series A was 19 million.
Ketan Umare
>> Prior to ChatGPT, 2023.
Dave Vellante
>> May of '23. And you haven't raised since?
Ketan Umare
>> No, no.
Dave Vellante
>> Really? So very capital efficient.
Ketan Umare
>> Very capital efficient.
Dave Vellante
>> And presumably, you're extensively using AI. A whole narrative now is you can build a billion-dollar company with a lot less people, and that's exactly what you're doing.
Ketan Umare
>> We are seeing that more and more. Some of the infrastructure layer, you cannot really use AI for, because people rely on us. We don't want to tell them that, "By the way, most of the software was written by AI." I don't think people will trust that. But we do get efficiencies of open source. We have amazing open source partners, like LinkedIn, Spotify, Stripe, all of these guys use our product in open source. The feedback cycle, just learning, and the scale things that we learn from them, is amazing, and that automatically multiplies.
Dave Vellante
>> And testing too. Presumably, it's all the mundane tasks of...
Ketan Umare
>> Yeah. We might raise another round, but not looking right now.
Dave Vellante
>> Well, congratulations, really interesting story and best luck to you. Appreciate you coming on theCUBE.
Ketan Umare
>> No, thank you, appreciate it.
Dave Vellante
>> Yeah, you bet.
Ketan Umare
>> Thank you for having me.
Dave Vellante
>> All right. Keep it right there. This is Dave Vellante. We're live in Las Vegas here at The Venetian in our little side room, what are we, 3708? Come by and see us. I'm Dave Vellante, you're watching theCUBE. We'll be right back with our live coverage of re:Invent 2025 from Las Vegas. Keep it right there.