In this Crypto TrailBlazers segment from the New York Stock Exchange, Starburst CEO and co-founder Justin Borgman joins theCUBE’s Dave Vellante to unpack the company’s hybrid data strategy and its implications for the next wave of AI-driven enterprise innovation. Borgman revisits Starburst’s origins in the open source Trino project and explains how its hybrid deployment model is enabling large organizations to query data where it lives, whether on-premises, in the cloud, or across both. He details why demand for on-prem infrastructure is resurging, driven by privacy, compliance, cost efficiency and AI workloads, highlighting Starburst’s OEM partnership with Dell on the Dell Lakehouse appliance.
The conversation explores how financial services leaders are building air-gapped, hybrid AI architectures to maintain data sovereignty while accelerating analytics and AI adoption. Borgman discusses the convergence of AI and analytics, the rise of open table formats such as Iceberg and the governance challenges around metadata and access control. He offers candid views on the shifting competitive landscape as application vendors and data platforms move into each other’s domains, the role of agentic AI in the enterprise and why access to proprietary data will be the decisive advantage in future digital ecosystems.
Justin Borgman, Starburst
>> Hi, everybody. Welcome back to the New York Stock Exchange. We're here in the Buttonwood Podium. My name is Dave Vellante. We're overlooking the options exchange. Welcome to our mixture of experts series, John Furrier and I, we're really excited to have Justin Borgman here, both friend of theCUBE, longtime friend of theCUBE, I should say, still young man. Great to see you. CEO and Chairman of Starburst, and a co-founder? Yes?>> Yes.
Dave Vellante
>> Yes, okay. It doesn't have that in your title. Too many, like IBM, they got the titles that don't end. But I thought so, co-founder as well.>> Yep.
Dave Vellante
>> So, how you doing?>> Good, good. Thanks for having me.
Dave Vellante
>> Yeah, you bet. It was great to have you. And I want to go back to the founding premise of Starburst. I mean, we met when you first had Hadapt, and you started playing around, trying to eliminate connectors, and doing some really good work there, but learning a lot about customers that didn't want to move data, and the pain of that. And so, what was the founding premise of Starburst, and how has it evolved?>> Yeah, so when we founded Starburst, it was really initially the commercialization of an open source project today known as Trino. Back then, it was called Presto. It had come out of Facebook. My co-founders were the creators of that project. And what we recognized was an opportunity to query data where it lives, as well as data lakes and data lakehouses. But part of what made us unique from the start is, while a lot of companies at the time were cloud only, we were hybrid, and that allowed us to deploy on-prem, in the cloud, or any combination of the two. And that served us really well, particularly within large enterprises, that tend to have data in multiple form factors.
Dave Vellante
>> Yeah, you think about financial services. We were just sort of speaking off-camera, a lot of systems that, for whatever reason, latency, security, compliance, or just they don't want to migrate the data, it makes no sense, cost as well, and so they're hybrid. It's interesting, right, because when you talk to companies like Snowflake, you said, "Nope." And Frank Slootman has said on theCUBE, "We're not doing a halfway house. We are cloud only, always." And giving up a big piece of the TAM, that's a part of the TAM that you could do cloud.>> That's right.
Dave Vellante
>> But you're also going after that piece of the market, that some others that were born in the cloud aren't interested in. How have you seen that play out?>> I mean, interestingly, we see the on-prem market actually growing, and I think, if anything, there's renewed enthusiasm and interest in on-prem data centers because of the AI factor. And you touched on both privacy and security concerns, but also cost as another dimension. We have a partnership now with Dell, where Dell is delivering a product they call the Dell Lakehouse, which is actually Starburst inside. It's an OEM agreement, and that's selling directly into data centers and on-prem customers. I think that highlights the demand that exists there. That has been, in a lot of ways, neglected by that cloud-only approach that you were referencing.
Dave Vellante
>> So, we've talked on theCUBE a lot about organizations, particularly large banks, are building their own on-prem stacks. They have to essentially replicate what they can get, at least not the entire suite, not the whole ecosystem, but they have to have highly capable capabilities. Tooling, certainly the GPU piece of it, but data platforms is fundamental. So, what are you seeing in terms of... So, we talked about some of the drivers, cost, latency, et cetera. What are you seeing in terms of where are organizations on the maturity curve? I know you do a lot in financial services. Where are we in terms of on-prem and hybrid AI?>> Yeah, and you're right, we serve 8 of the 10 largest banks in the world today. Where are we on the maturity curve? I think that we're starting to see real projects come online. And I think, as it pertains to that on-prem or hybrid side of things, you need an architecture that can be air-gapped, that can be cut off from the outside world. And there are multiple motivations for that. In some cases, it is strictly compliance. You don't want to send very sensitive data to some external LLM to complete that job. So, you have your own air-gapped LLM, you've got Llama deployed on-prem, or whatever the case may be. And so, you're seeing a proliferation of those types of architectures that give customers control and ownership of their data and the environment surrounding it.
Dave Vellante
>> So, I feel like the banks, Justin, are like the penguin on the iceberg. They go first, the other banks start diving in. And then, maybe broader enterprises. I'm interested in your thoughts on the expertise levels that clearly exist within large banks, like JPMC. Actually, we saw JPMC up on the stage at Dell Tech World this year. I don't know if you guys... I know you guys were there. I don't know if you were personally... And as an audience member, I said, "Okay, that's JPMC." What about the rest of the enterprises? How are you making it easier for them?>> Yeah, I mean, that is part of our top priority, I would say, is how we take this technology and make it easy to consume. The Dell partnership is one vehicle that's an appliance. So, you just plug it in and it turns on, it's already pre-configured and integrated. But even with our software itself, deploying it on hardware of your own choosing, we try to make it as simple as possible. And I think anybody that has had experiences over the past decade or so, with earlier data lake technologies, like Hadoop, back when we first met, in the earlier innings of big data, those skill sets do translate to modern data lakes and modern data lakehouses. The difference is, instead of Hadoop, it's now object storage, but you have so many choices of object storage that you can deploy, that are essentially S3 compatible. And I think the real opportunity there is to create your on-prem stack in such a fashion that it looks exactly as your cloud stack would, that gives you that portability and transferability of workloads and projects.
Dave Vellante
>> So, we just heard the options bell ringing, so that's good. A nice little bonus here for our interview. So, when you talk about that on-prem stack, I guess my question would be, as we were sort of referring to before, it's not likely you're going to have the massive optionality that you might have inside of an AWS, but the key is the functionality, is it not? And so, what is it that customers are demanding in that sort of on-prem, hybrid AI?>> Yeah, so there are a few dimensions to it. One is cost, and that shouldn't be overlooked. That being able to make a CapEx investment in your hardware infrastructure, and then amortize that over time, has a lot of benefits for a customer, particularly when you're talking about large scale, at which all of these banks are thinking in a very large scale. So, that's one advantage. Another is, again, on the data privacy, data security, data sovereignty constraints. You have much more control over your environment, and how that perimeter is set up, how data is isolated, how resources are managed. And so, that's another advantage that we see. So, those are some of the things that we see driving customers to consider hybrid or on-prem deployment.
Dave Vellante
>> When the cloud hit, because we really hit the steep part of the S-curve in the middle of the Hadoop wave, and it caught some companies off guard, as you well know. At the time, though, one could argue and make the case that cloud was less expensive because there was so much margin in the on-prem piece. Has that flipped? You hear a lot of debates, "Oh, the IDC study shows that the TCO for on-prem..." It depends on the workload, I know, but am I correct that that gap has clearly closed?>> Yes, I think so. And you're absolutely right that it is workload dependent. Some workloads lend themselves well to elasticity, where you can get that efficiency out of shrinking your cluster, and scaling it up again on demand. And other workloads are going to be more continuous and running all the time. And in those cases, that's where you're going to get a lot of advantage out of those on-prem CapEx investments.
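Editor's illustration: the workload dependence described in this exchange can be sketched as a simple break-even calculation. This is a rough sketch under stated assumptions, not a TCO model from the interview. Every figure and name below (the CapEx amount, the hourly rate, `breakeven_utilization`) is hypothetical and chosen only to show the shape of the trade-off.

```python
HOURS_PER_YEAR = 8760


def onprem_annual_cost(capex: float, amortization_years: float, annual_opex: float) -> float:
    """Yearly on-prem cost: amortized hardware plus fixed running costs (hypothetical inputs)."""
    return capex / amortization_years + annual_opex


def cloud_annual_cost(utilization: float, rate_per_hour: float) -> float:
    """Yearly cloud cost when the cluster is only billed for the fraction of hours it runs."""
    return utilization * HOURS_PER_YEAR * rate_per_hour


def breakeven_utilization(capex: float, amortization_years: float,
                          annual_opex: float, rate_per_hour: float) -> float:
    """Fraction of the year a workload must run before on-prem becomes the cheaper option."""
    full_year_cloud = HOURS_PER_YEAR * rate_per_hour
    return onprem_annual_cost(capex, amortization_years, annual_opex) / full_year_cloud


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    u = breakeven_utilization(capex=300_000, amortization_years=5,
                              annual_opex=20_000, rate_per_hour=20.0)
    print(f"break-even utilization: {u:.1%}")
```

Under these assumed numbers, a bursty workload running a small fraction of the year stays cheaper on elastic cloud capacity, while a workload that runs nearly continuously favors the amortized on-prem investment.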
Dave Vellante
>> I'm interested in your thoughts on the convergence of AI and analytics. It seems that everybody says you can't have good AI without good data. It seems like analytics people have put a lot of effort. I mean, we've got a $50 billion BI business, just to get metrics and dimensions right. So, that's actually a tailwind to bring AI to that data, if the data is clean. What are you seeing there in terms of that convergence? How does Starburst play?>> Yeah, so I think you're absolutely right. AI is a data story, fundamentally, and your AI is only as good as the data that you have access to. And so, I think that the pieces of the analytics story that translate are access itself. Do you have access to the data that you need, that's going to make the AI smarter and give better answers, more accurate answers to the questions that you ask it? But also, the access control component, which is very inherent to analytics as well. You want to make sure that Dave sees the right data that he's allowed to see, and that's different data than I can see, based on our roles and responsibilities. And so, that fine-grained access control has to apply in the AI world as well, maybe even more so, because now you're talking about not just what is presented to the end user, but also what is potentially passing on to an LLM. So, model access management is a new frontier as well.
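Editor's illustration: the point that access control has to govern what is passed on to an LLM, not only what an end user sees, can be sketched as follows. This is a minimal sketch and not Starburst's implementation. The roles, the `POLICIES` table, and the `build_llm_context` helper are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Row:
    region: str
    customer: str
    balance: float


# Hypothetical role-based row filters: each role maps to a predicate over rows.
POLICIES: Dict[str, Callable[[Row], bool]] = {
    "analyst_emea": lambda row: row.region == "EMEA",
    "auditor": lambda row: True,
}

ROWS: List[Row] = [
    Row("EMEA", "Acme GmbH", 1200.0),
    Row("AMER", "Globex Inc", 5400.0),
]


def visible_rows(role: str, rows: List[Row]) -> List[Row]:
    """Return only the rows the role may see; unknown roles see nothing."""
    policy = POLICIES.get(role)
    if policy is None:
        return []
    return [row for row in rows if policy(row)]


def build_llm_context(role: str, rows: List[Row]) -> str:
    """Build prompt context from the filtered rows, so the model never receives denied data."""
    return "\n".join(f"{row.customer}: {row.balance:.2f}" for row in visible_rows(role, rows))


if __name__ == "__main__":
    print(build_llm_context("analyst_emea", ROWS))
```

The design choice worth noting is that filtering happens before the prompt is assembled, so the same policy applies to the end user and to the model.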
Dave Vellante
>> And I want to get to Agentic. We're 20 minutes in, and I haven't said Agentic yet, but so I know you're leveraging Trino, but I'd be remiss if I didn't ask you about open table formats. What are you seeing in terms of Iceberg adoption? How are organizations governing those open table formats? There seems to be a lot of confusion there. You see the tug of war between the Snowflakes and Databricks of the world. Adding to that confusion, you're kind of Switzerland in that whole equation. So, that's why I think it's a good question for you. What are you seeing?>> Yeah, so definitely, I would say adoption of Iceberg has only continued to accelerate. And I think, last summer, a little over a year ago, was a major inflection point, when all the data platforms adopted Iceberg, and that made it a de facto standard. And so, since then, you've had 12 months of customers just embracing it wholeheartedly. And pretty much every new project we see is Iceberg today at this point. So, I think that has solidified the format war. And then, to your point about access controls, yeah, now it becomes an interesting question of where are those access control policies going to be defined? Where is the metadata going to be stored? What catalog are you going to choose? And that's where I think customers, again, need to look at how they preserve optionality. In the same way that the open format of Iceberg provides optionality at the storage layer, you need to think about that, again, at both the access control and metadata/catalog layer as well.
Dave Vellante
>> Okay, so this is where it gets interesting. So, I'm going to run a scenario by you, a premise, a thesis that George Gilbert and I have been working on. We've been doing tons of work here in the last couple of years, but we observed a while ago that the locus of control is sort of moving up the stack, from the database layer into the data catalog and governance layers, but a lot of that's becoming open source. And so, the value is even going further up the stack. How do you think about that? First of all, is it a valid premise? How do you think about that? What is Starburst's role in that migration?>> Yeah, I agree from the standpoint of the areas where you're going to compete, and have competitive differentiation is starting to move up the stack. However, what I would say also is, I think a lot of these components become features, rather than products of their own. And what I mean by that is, I think access control is a feature of a data platform. I think catalogs are a feature of a data platform, rather than standalone products in perpetuity. And so, these platforms are essentially integrating these capabilities. And so, you have choices of integrated functionality, and then you have this natural choice of, to what degree do I want to disaggregate that integration to provide greater choice and optionality? And I think that's a challenge that every customer is wrestling with and making a choice. And I think, for some customers, they'll be highly disaggregated, and they'll choose different vendors or technologies for every layer of the stack, and others will choose that more integrated solution, depending on their needs.
Dave Vellante
>> Interesting. Okay, so there's a moat there, that integrated solution, but that moat is a thanks. Now, I guess in combination to Ali Ghodsi, you got to give him some credit for the way he said, "Hey, let's just bring the data to any compute, to any data." It was sort of an interesting gauntlet that he laid down, and I think the industry bought it, and OTFs have obviously taken off. And so, that brings me to Agentic. So, as you go up the stack, for those who want the less, maybe, I mean everybody wants integrated, but they also want to protect themselves from lock-in. But as you go up the stack, how do you see agents fitting? Because now you're into the domain of Salesforce, ServiceNow, Oracle, and SAP, and that's a new competitive domain for a lot of the data platforms. How do you think about that?>> I think this is one of the most exciting parts of the story right now, because you're right, there's a convergence of players that have never had to meet each other in the market before in any meaningful way. You have the application vendors, as you pointed out. I think ServiceNow, Salesforce, you could add SAP to that mix, that, for the first time, have to decide, are we data companies? Are we data platform companies? To what degree are we going to move further into the more data processing, data platform, part of the stack? And similarly, you have data platforms now building Agentic capabilities on top, that could potentially threaten a lot of what those application providers do. So, it's a totally new frontier. From my vantage point, what I would say, and this is maybe the most controversial, or not, thing that I could say, is that what I see at the application layer right now is that all of those companies that we mentioned, they're all afraid to overtly compete with companies like Databricks today. And so, you see a lot of partnerships being announced, and it's all, "We love each other, everything's great."
I think that the mistake that they may be making is that they may not want to compete with Databricks today, but Databricks is definitely going to compete with them. And the faster that they realize that, that this overlap is going to be a contentious territory, that only one will win, the faster they can adapt and change their own strategies and investments. I think, right now, they're ceding almost too much control to Databricks in that.
Dave Vellante
>> Yeah, certainly the cloud guys are, I mean, that Databricks does a lot of business in the cloud. I would love to dig into this with you a little bit, because if you juxtapose Snowflake and Databricks, everybody puts them together all the time, but they're actually quite different. Snowflake's got that integrated database, and that's really their moat, Databricks is blowing that apart. But both of them have, I like the way you phrased it, it's almost like the SaaS vendors are coming into their domain. But what if you flip that, and think about, actually, Databricks and Snowflake moving up? Because, right now, they've got BI vendors, like Hex and Sigma, and obviously Tableau participating, and they've got the metrics and dimensions there. They've got this sort of systems of intelligence layer emerging, which is going to interpret those metrics and dimensions in the business context, and actually feed up the agents. And so, my argument would be, in order to do that, you have to have sort of a new architecture, not things that databases understand, like strings. You've got to have things that humans can understand, because that's the new programming language. So, people, places, things, and I would say activities, like processes, where's the process knowledge, and the metadata associated with the business logic? That actually lives in the application domain. So, my point is, I'm flipping the... And I was trying to get you into that mindset to see what you think about that, that brings Snowflake and Databricks up into new domains where they don't necessarily have access to that data, they have to get it. You probably saw Celonis sued SAP. Maybe you did see that.>> Yeah.
Dave Vellante
>> Yeah, to get access, and it looks like there's a preliminary injunction there. But that, to your point, underscores the competitive tension that's going on there. So, what do you think about that bit flip that I just laid out?>> Yeah, no, I think this is what every application company needs to be thinking about, thinking about their strategy and what their play is here. From our vantage point, and obviously, we have a vested interest in saying this, we think it's whoever has access to the most data that has the greatest advantage in that battle, because all of these types of enterprise agent interactions are ultimately going to be dependent upon the context provided to it. And that context exists in data silos all throughout the enterprise. And it's really that proprietary data. If Salesforce is approaching this with their Agentforce offering, they really only have the data in their CRM. But that's a very limited view.
Dave Vellante
>> Informatica might change that, though.>> Exactly. And that's why I think they made that move, which is very interesting. So, now what are the other players going to do, right? And that's where I think this part of the stack is very exciting.
Dave Vellante
>> Well, and you're all about data access, so that's the value that you bring, and to the extent that you can make that data access facile.>> Exactly.
Dave Vellante
>> Starburst becomes more valuable, especially in a hybrid world. I mean, that to me is the premise behind your business.>> Yep.
Dave Vellante
>> So, you don't really care about the database wars, right? I mean, maybe notwithstanding Oracle. We'll get its fair share. So, your North Star is facile access, is it not?>> That's right. That's exactly right. And we think that's what sets us apart from the other players in the industry, fast, governed access to data that you need for analytics and AI.
Dave Vellante
>> I was talking to Tony Baer at one of these conferences, and he said Salesforce should buy a database company, a data platform. And, to your point, I said, "I don't know. They want to be partnering with those guys. I'm not sure that they want to go down market. I think they may be better off up market." But that, again, brings me back to Agentic. So, what's required to make Agentic work? So, George and I did a piece, a lot of this is George's thought leadership. We laid out the three eras of Agentic. One was the ChatGPT consumer era. The other one was the coding era, with Cursor and Anthropic. We'll see if ChatGPT-5 changes that. We're getting a lot of mixed reviews about GPT-5. I don't code, but people who tell me say it's okay. But anyway, and then the third era, enterprise agents, and that's really where you fit.>> That's right.
Dave Vellante
>> And we wrote a piece why JP Morgan Chase, or why Jamie Dimon, is Sam Altman's biggest competitor, because he's got all the proprietary data. A couple of weeks ago, we wrote a piece on how, and how is this 4D map, and basically open weights, which OpenAI just announced. We'll see how that plays out with maybe a less functional model. So, what do you think about that sort of digital representation of the enterprise and the necessity to have that? Again, from your standpoint, it's all about access. But organizations, do you agree, must have that instantiation of their data, that digital twin, if you will, of the enterprise, then it powers the agents? What do you think about that?>> A hundred percent. I completely agree with the premise, and I think that is what is going to help those companies compete and differentiate from their competitors. JPMC's ability to access their data and leverage that to power their Agentic interactions with their customers is going to distinguish them from other banks, for example. So, I think everybody is facing a pseudo-existential sort of reality, where they need to think about, "What am I doing to distinguish myself from everybody else who has access to the same technology?" And I think the key is going to be the data, like how well you have organized your data, the access that you're providing to that data, is ultimately going to drive your proprietary advantage.
Dave Vellante
>> And what's interesting is, when you read a lot of academic papers, and you certainly hear in the press, the talk is around the LLMs, GPT-4, GPT-5, and the algorithms, but often these papers forget about the data and the importance of that data. It's funny, I mean, when you look at the LLM business, the economics are brutal. Pharmaceutical industry has a decade of patent protection. LLMs have a few months of lead. I mean, even the latest, have you played with GPT-5 much?>> I have, yeah, I was on the train. I mean, I think it's very interesting, and it solves a class of problems. But again, to the subject of data, it's limited by what it has access to. It's essentially trained on the internet, and I think in the enterprise context, that needs to be enriched by their own proprietary enterprise data.
Dave Vellante
>> Yeah, so we totally agree. So, what is your agent play?>> So, our agent play is twofold. Number one, we have built our own agent. It's a conversational interface that is, behind the scenes, taking your natural language text, and converting that into SQL queries for execution. And it's actually pretty good. We're pretty impressed with how far it's come. It builds on a component of our product we call data products, which is a way of creating curated data sets that are governed, that have appropriate metadata, and that extra information needed to ensure that you don't have hallucinations and get accurate answers. And then, separately to that, we're also building the capability for others to build their own agents on top of our platform. And that's where we think there is a bright future for a very long time, where customers are going to ultimately start to build their own tailored agents for very specific custom tasks that need access to data. And we want to be that data access point for the agents that they build.
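Editor's illustration: one way to picture an agent whose generated SQL is confined to curated data sets is a guard that rejects any query touching a table outside an allowlist. This is only a sketch under assumptions, not a description of Starburst's agent. It uses Python's built-in `sqlite3` module as a stand-in engine, and the table names, the `CURATED` set, and the `run_generated_sql` helper are hypothetical.

```python
import sqlite3

# Hypothetical allowlist of curated "data product" tables the agent may read.
CURATED = {"customer_balances"}


def guard(action, arg1, arg2, db_name, source):
    """sqlite3 authorizer callback: deny column reads on any table outside the allowlist."""
    if action == sqlite3.SQLITE_READ and arg1 not in CURATED:
        return sqlite3.SQLITE_DENY
    return sqlite3.SQLITE_OK


def make_connection() -> sqlite3.Connection:
    """Create demo tables first, then install the guard so later queries are checked."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer_balances (customer TEXT, balance REAL)")
    conn.execute("CREATE TABLE raw_accounts (customer TEXT, internal_note TEXT)")
    conn.execute("INSERT INTO customer_balances VALUES ('Acme', 1200.0)")
    conn.execute("INSERT INTO raw_accounts VALUES ('Acme', 'not for agents')")
    conn.set_authorizer(guard)
    return conn


def run_generated_sql(conn: sqlite3.Connection, sql: str):
    """Run SQL produced by a (hypothetical) text-to-SQL step; return None if it is rejected."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.DatabaseError:
        return None


if __name__ == "__main__":
    conn = make_connection()
    print(run_generated_sql(conn, "SELECT customer, balance FROM customer_balances"))
    print(run_generated_sql(conn, "SELECT internal_note FROM raw_accounts"))
```

The natural-language-to-SQL step itself is left out on purpose; the sketch only shows the enforcement point that sits between the generated query and the data.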
Dave Vellante
>> Data access, obviously hybrid data, big bet on hybrid. What can you tell us about the company, the momentum, any metrics you can share? What could you tell us about the business?>> Yeah, we're off to a great start. We're about 450 employees globally. We serve hundreds of the largest customers in the world. We focus, in particular, on large enterprises, again, 8 of the 10 largest financial institutions. And broadly speaking, Fortune 2000 is our ideal customer profile.
Dave Vellante
>> Yeah, it's been such a pleasure watching your ascendancy with Starburst and growing a team, and so congratulations on your success. The best is yet to come, Justin.>> I appreciate it.
Dave Vellante
>> Thanks for coming back on theCUBE. It's great to see you again.>> Yeah, my pleasure.
Dave Vellante
>> Okay. And thank you for watching our program here. This is Dave Vellante. John Furrier is up next. Keep it right there at theCUBE plus NYSE Wired from the New York Stock Exchange. We'll be right back.