theCUBE + NYSE Wired: Mixture of Experts Series | Ori Goshen, AI21 Labs

In this video, Ori Goshen, Chief Executive Officer of AI21 Labs, joins theCUBE at its new studio set at the New York Stock Exchange. Goshen discusses AI advancements with hosts from theCUBE and analysts from theCUBE Research, covering developments at the intersection of Silicon Valley and Wall Street. Goshen leads AI21 Labs, which aims to create reliable AI systems, and shares insights from the recent HumanX conference and NVIDIA's GPU Technology Conference (GTC). The discussion covers the company's journey since its founding in 2017, the evolution of deep learning, the challenges specific to generative AI, and the role AI plays in modern enterprises. Major takeaways include Goshen’s views on AI democratization and on a new AI orchestration layer. Goshen notes that enterprise adoption of AI is still in its infancy, with less than 10% of projects reaching production, and emphasizes that reducing AI’s probabilistic behavior and increasing its reliability are essential for mission-critical applications.
Find more SiliconANGLE news and analysis: https://siliconangle.com/
Follow theCUBE's wall-to-wall event coverage: https://siliconangle.com/events/
Learn about the latest theCUBE events: https://www.thecube.net/

00:00 - Intro
00:06 - Journey Through Innovation: The Early Days of AI21
03:26 - Advancing AI: Integration and Evolution
05:55 - Navigating Trust in Pragmatic AI Systems
08:02 - AI Evolution: Orchestration and Origins
12:18 - Reviving Trust in Artificial Intelligence Systems
16:18 - Announcing Maestro: AI Planning and Orchestration
18:58 - AI21's Vision: Navigating Challenges and Innovations in AI Efficiency and Storage

#AI21Labs #theCUBE #NYSE #HumanXConference #NVIDIA #ArtificialIntelligence #MachineLearning #GenerativeAI #DeepLearning #EnterpriseAI #AIOrchestration #TechInnovations #DataTechnology

Ori Goshen

CEO

AI21 Labs

In this conversation from theCUBE + NYSE Wired's Mixture of Experts series, theCUBE’s John Furrier sits down with Ori Goshen, CEO and cofounder of AI21, to unpack where enterprise AI is headed as Silicon Valley innovation meets Wall Street realities on the NYSE trading floor. Goshen traces AI21’s origins (“AI for the 21st century”), notes investors including NVIDIA and Google, and explains why only ~5–6% of Gen AI projects reach production today. From the momentum of NVIDIA GTC (Blackwell, Vera Rubin) to the growing two-sided ecosystem, the discussion centers ...

>> Hello, I'm John Furrier here at theCUBE, at our new studio set at the New York Stock Exchange. We're here breaking down all the action connecting Silicon Valley and Wall Street with theCUBE. Silicon Valley and Palo Alto here in the New York Stock Exchange. Ori Goshen is here, the CEO of AI21, CUBE alumni. Just saw him at the HumanX conference. Ori, great to see you. Thanks for coming in and being our first non-CUBE Pod interview here in our new set.

Ori Goshen

>> Oh, that's exciting. Thank you for having me. Yeah, great.

>> And we've talked many times on theCUBE and we saw each other at HumanX. First of all, congratulations to you and your company. Great job seeing the success you and your co-founders have put together. Obviously, you got in early on the front end of this wave, continuing to see the traction at GTC, NVIDIA's conference, last week. I mean, it felt like an Amazon Web Services event in 2013, 2014, when everyone kind of figured it out, like cloud's here, the ecosystem is there, the ecosystem's thriving. And you're starting to see NVIDIA's got that two-sided marketplace. It's got to do supply chain and the billions in CapEx. But now on the other side you have this massive ecosystem. Partners, joint development partners, more AI infusing into the system. This is going to, I think, open up the whole data layer with the performance of more Blackwells, Vera Rubin, the chips and the systems that were announced. The systems are coming and they're here hitting the table. What are you guys looking at right now for your company? Where are you at in this movement on the wave? How do you describe where you are right now and what you guys are offering?

Ori Goshen

>> Right, so first it's always great to look at where we are coming from, the origin story of the company, because in many ways it's really related to what we do right now and what we're focusing on. My co-founders and I started the company back in 2017, early days for deep learning and pre-LLM days. And our mission was to create trustworthy artificial intelligence systems. And in many ways that mission is more relevant today than it was back then when we started. And what we're seeing right now is just incredible adoption on the consumer side, on the individual level. I mean, my mom uses generative AI tools, and I think people are realizing how much potential there is there. But on the enterprise side it's a very different story. I mean, just look at the numbers. Some of the reports say only 5 or 6% of generative AI projects actually make it to production. And we believe it's not just that it's a new technology wave and there is a learning curve and so forth. There are actually some fundamental things about the technology, and there's more to be built to get it to enterprise grade and to the reliability level that's required to support all these mission-critical applications. That's where we started the company, and that's what we're focusing on, the problem we're trying to address right now.

>> And the production workloads, it's easy to hit the low-hanging fruit, we saw call centers, anyone with data will get it. Now you've got to see the security discussion kicking in. And the one thing I did walk away with from the NVIDIA side of the equation is that AI processing power is Jensen's really big discussion, around reasoning, right? You're talking about how hard that is to do. And as you start getting into that layer, it's going to bring the next layer above the hardware, these systems, the smart software. And you guys have always been building these services. The word democratization has been used so many times. I'll just say it, it's a word people use, but in the true sense democratization means making it easy. So, making it easy and simple to understand. If you're deploying, can you talk about what's going on now for models integrating with each other? Because you're starting to see people come to the realization that it's a data problem. Data will talk to each other, the AIs will be doing the work, and the AIs are getting better every day at software. Agents are only going to scale that to levels we've never seen before. What's going on in that layer of AI where, "Okay, I got the horsepower, I got to get smarter." What are some of the core things that you see that need to be in the sequence of events of innovation? Is it model styling the models? Is it better software, smarter AIs, is it the format? What is your take on this? How do you guys see this?

Ori Goshen

>> Yeah, I mean, I think the stack is forming. It's been quite liquid in the first few years of this emerging industry, but you're seeing a lot of progress on the hardware side. And you mentioned NVIDIA and Jensen's announcements, which were really impressive, and we've been seeing in the last few years a lot of progress on the model side. The models are more capable, now we have reasoning models that can address more complicated tasks. And I think we all aspire to that vision of having agentic systems in the enterprise, where we can offload work to them and they can automate a lot of the workflows and a lot of the processes inside the organizations, and in many ways also help us achieve things that were impossible. I can see small organizations that can actually build massive operations that weren't possible before, but there's a gap, because these models are probabilistic by nature. And with all their power, they're still super probabilistic. I mean, sometimes they're brilliant, but other times they're as dumb as a nail. You need to find a way to compensate for that. And what we believe is an amazing opportunity is actually a new layer in the stack that's responsible for planning and orchestrating models and tools within the enterprise, and basically supporting these very complex tasks that enterprises are trying to address. And if you have this very reliable layer that reduces hallucinations, that increases the accuracy, that gives more control and visibility into the process that the enterprises are trying to solve, that actually takes us from potential to production.

>> And the thing that came up in the past three events, and again, NVIDIA is fresh on the mind, no matter what industry I went to covering the events, the word trust comes up a lot, practical, practical AI, domain expertise. Brian Baumann's running a series called MOE, Mixture of Experts, pun intended. We brought that up. And a lot of chain of thought going on the podcast on the MOE part, which you're now part of. This is where AI has to get better, because it's not just security, it's just AI, the hallucination, the drift. These are challenges on the software side. So, making it practical becomes big. And so I have to ask you, when you look at these regulated industries, they have a high bar in enterprises. The systems are brittle. Sometimes manual hooks are put together. I mean, you can't just go in there and tinker around, you got to build on top of it. I want to know more about this orchestration layer you're building. Does that abstract on top of it, and what does that do for, say, people trying to deploy practical AI, or AI that they can get ROI out of?

Ori Goshen

>> Yeah, no, exactly, I think it's the right question. The regime currently, the way developers are approaching building these agentic workflows, is they either use these models, these reasoning models or large language models, as the controllers over the process. We call this prompt and pray. You prompt the system and you hope for the best. And after putting this in a demo and realizing that this is pretty brittle, people switch to what's called static chaining. They hard-code the process, they engineer it. And they make a lot of engineering design choices, like what's the best way to retrieve information? What's the best way to use APIs within the organization? What's the best way to prompt the models, which models to prompt? There are a lot of design choices that developers make right now, and we think there needs to be a system that automates that.

>> And no one likes the word static, by the way. Static is like monolithic. No one likes those words anymore, I mean, static, chaining, hard coding.

Ori Goshen

>> But that's what happens in practice. That's what we see developers do. We think there's a planning and orchestration layer that automates, that abstracts a lot of the complexities for these developers. It optimizes for accuracy. It's a new contract with AI. I mean, developers would specify the tasks that they're trying to solve, not only by prompting, but also by explicitly stating what the requirements are, what guarantees they expect from the system. And then you have a whole new layer that's responsible for automating and making sure those requirements are satisfied. That's a new paradigm.
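
Editor's illustration: the "contract" described above, where a task is stated together with explicit requirements that a separate layer checks, can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not Maestro's API. Every name here (`Requirement`, `TaskSpec`, `run_with_requirements`, `fake_model`) is hypothetical, and the model call is replaced by a stub.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Requirement:
    """A named, machine-checkable condition on the output (hypothetical)."""
    name: str
    check: Callable[[str], bool]


@dataclass
class TaskSpec:
    """A task stated as an instruction plus explicit requirements."""
    instruction: str
    requirements: List[Requirement]


def run_with_requirements(
    spec: TaskSpec,
    generate: Callable[[str, List[str]], str],
    max_attempts: int = 3,
) -> Tuple[Optional[str], int]:
    """Call `generate` until every requirement passes or attempts run out.

    `generate` receives the instruction and the names of the requirements
    that failed on the previous attempt, so it can try to repair them.
    """
    failed: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = generate(spec.instruction, failed)
        failed = [r.name for r in spec.requirements if not r.check(output)]
        if not failed:
            return output, attempt
    return None, max_attempts


def fake_model(instruction: str, failed: List[str]) -> str:
    """Stand-in for a real model call (assumption): improves after feedback."""
    if "mentions_total" in failed:
        return "Summary: revenue rose. Total: 42"
    return "Summary: revenue rose."


spec = TaskSpec(
    instruction="Summarize the quarter and state the total.",
    requirements=[
        Requirement("mentions_total", lambda s: "Total:" in s),
        Requirement("under_200_chars", lambda s: len(s) < 200),
    ],
)

result, attempts = run_with_requirements(spec, fake_model)
print(result, attempts)
```

The point of the sketch is the separation of concerns: the developer declares what must hold, and a loop outside the model decides whether an output is acceptable, instead of trusting a single prompt.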

>> We were talking before we came on camera here about the AI21 name, used to be AI Labs, then AI21 Labs, now it's AI21. And I made the joke, "Give me 21 to drink the AI." And you said, "No, no, it means 21st century." Explain the origin of the name, how that came about, and then some of the investors you guys have?

Ori Goshen

>> When we started the company, we called it AI21, AI for the 21st century, and the idea was to take AI beyond pure deep learning. And we all know deep learning, which is the basis for all these large language models and reasoning models, is great, and it's a necessary condition, but it's not sufficient. And the original mission was, how can we robustify, how can we build a technology that is both very capable but also trustworthy? And we thought that's actually AI for the 21st century.

>> Not to come back to NVIDIA's GTC again, but one thing I love about Jensen is that, and they're an investor too, right?

Ori Goshen

>> Yes.

>> NVIDIA put money in?

Ori Goshen

>> Yeah.

>> And Google, two big AI companies. Jensen's the only CEO in the industry that he uses the word computer science multiple times in his keynote. He constantly references computer science. He's hardcore computer science. We know him, so it's great to see that. Actually, the original tagline for SiliconANGLE where computer science and social science meet from 2009 when we started SiliconANGLE, so we stayed true to our North Star. But talk about computer science. You mentioned deep learning, because what I like about why Jensen says that is because there's a resurgence in computer science, not just coding, not just software engineering. When I went to college, it was called software engineering, computer science degrees. They worked on hard stuff like operating systems, compilers, heavy stuff when I got my degree. But now computer science seems like there's a whole renaissance, and you can see it in his nerdiness on this keynote and what NVIDIA is doing for products. What's the, in your view, what is the core fun and key areas right now in computer science? Because everything matches the same rhythm that what Jensen's saying, "IT wasn't built for AI." And so things that were pre-existing or incumbent systems or software or databases don't work well in AI. You kind of got to be flipped upside down and be more foundational, less technology. What is your take on the computer science as AIs come in as a foundational element? There's science involved, there's engineering. What's your view on this whole computer science piece of it?

Ori Goshen

>> For sure. I think we've had a lot of predictive models. If you look at data science in the last 10 years or so, there are a lot of very powerful predictive models that are actually being used today in production. I think now there's a way to interact with those internal tools in a much more ... I mean, the opportunity is to be able to make them much more accessible and interactable. Like if you can describe in simple language what you want, and behind the scenes there will be very complex tasks that will do all these predictions for you and give you a lot of analysis power. That's huge. And it applies, if you think about it, to almost any scientific domain. But what I'm really excited about, and I think Jensen also mentioned this in his keynote, is the industrial revolution. I mean, companies, IT departments are going to look very different in the future. We're going to have these AI builders, AI engineers in the enterprise that will build not one or 10, there will be hundreds or thousands of these AI applications. And those AI applications will become almost the operating system of any company. So, it's a very exciting vision and future that's ahead of us. But also we need to make sure these systems are trustworthy, that's almost like a necessary condition.

>> Yeah, and I say we're entering the age of the Lego engineer. I mean, think about it. Anyone can get the right Legos. You can do anything. I want to talk about what's going on with your business right now. Obviously, you're years into it, the wave's hitting, great reality. On the hype side too, starting to see some rationalization around that, practical AI booming, and then AI scaling is coming fast. Every company is using it, not just machine learning, but generative AI in the process. What's new going on with you guys? At HumanX you talked about Maestro, which I think is the ... not observability, but the pipelining and the management of the tools. Take us through what's going on with the new stuff there. What are you guys currently working on that's cool?

Ori Goshen

>> Yeah, at HumanX we announced Maestro. It's our AI planning and orchestration system. And it's really a new way for companies to build these agentic systems that they can trust. Maestro can use multiple models, it's model agnostic. It can learn the tools, the specific tools of the enterprise, like which APIs, which databases, and it learns the specific enterprise environment. And then given a new task that a builder is trying to develop, it actually optimizes what is the best plan, and what is the best path to realize that plan. And that needs a lot of deep technology and a lot of algorithmic innovation. We've stayed true to our DNA and to our North Star. We're a deep tech company, so we want to build deep, differentiated, hard technology, and we are targeting the enterprise, trying to solve their most challenging problems. And so, we built this product with enterprise developers in mind, to provide them with the tools they need to take those amazing agentic ideas to production.

>> Give an example of that in the use case you're looking at for that, give me a slice of life. What's that like using Maestro?

Ori Goshen

>> Yeah. So, think about an organization that needs to create financial reports at scale, like hundreds of financial reports per day. Very nuanced, very detailed. The cost of making a mistake is high. And so, you need a mechanism to validate, to make sure you invest the appropriate amount of compute, to make sure the results are actually accurate. And to give visibility: you want to have a full understanding, execution graphs that show what happened behind the scenes. These industries cannot work with black boxes. So, we're helping these financial institutions control the accuracy and the cost of applying these agentic systems in the end.
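
Editor's illustration: the combination described above (validate each result, spend compute only up to a budget, and keep a record of what happened) might look roughly like the sketch below. It is an assumption-laden toy, not how Maestro works. The names `Step`, `Trace`, and `run_within_budget`, the cost units, and the consistency check are all hypothetical, and a flat list of steps stands in for a full execution graph.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Report = Dict[str, float]


@dataclass
class Step:
    """One recorded attempt: which strategy ran, its cost, and the verdict."""
    name: str
    cost: float
    passed: bool


@dataclass
class Trace:
    """Ordered record of attempts, so the run is not a black box."""
    steps: List[Step] = field(default_factory=list)

    @property
    def total_cost(self) -> float:
        return sum(s.cost for s in self.steps)


def run_within_budget(
    candidates: List[Tuple[str, float, Callable[[], Report]]],
    validate: Callable[[Report], bool],
    budget: float,
) -> Tuple[Optional[Report], Trace]:
    """Try strategies in order (cheapest first) until one validates.

    Stops early if the next strategy would exceed the compute budget.
    """
    trace = Trace()
    for name, cost, produce in candidates:
        if trace.total_cost + cost > budget:
            break
        output = produce()
        ok = validate(output)
        trace.steps.append(Step(name, cost, ok))
        if ok:
            return output, trace
    return None, trace


def is_consistent(report: Report) -> bool:
    """Toy validator (assumption): profit must equal revenue minus costs."""
    return report["profit"] == report["revenue"] - report["costs"]


# Stubbed strategies standing in for cheaper and more expensive model runs.
candidates = [
    ("cheap_draft", 1.0, lambda: {"revenue": 100.0, "costs": 60.0, "profit": 50.0}),
    ("careful_draft", 4.0, lambda: {"revenue": 100.0, "costs": 60.0, "profit": 40.0}),
]

report, trace = run_within_budget(candidates, is_consistent, budget=6.0)
for step in trace.steps:
    print(step.name, step.cost, "ok" if step.passed else "rejected")
```

Here the cheap attempt is rejected by the validator and the more expensive one is accepted, and the trace shows both decisions along with the compute spent on each.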

>> Talk about your customer base right now. Who's using the services? What's the software being deployed on? Take us through some of the customer action.

Ori Goshen

>> We work with Fortune 100 companies. We work with retail, finance, energy-

>> Enterprise AI.

Ori Goshen

>> Enterprise, large enterprise organizations that want to adopt AI. I mean, we work with some of the largest organizations in the world. One example I can share, we work with Fnac. It's one of the largest European retail companies, and we help them do amazing things for their internal processes and workflows in their enterprise.

>> What are some of your goals for this year on your plate? What are you trying to accomplish?

Ori Goshen

>> My goal is, I mean, partnering, not numbers. We're not optimizing for a huge customer base, but we want to partner with a few organizations and help them go from, let's say, tens of use cases in production to thousands. That would be a huge success for these companies, and for our product, actually showing the value and how fast they can adopt this technology at scale.

>> What do you think the blocker is right now from the production workloads getting in? Is it just evolution, timing, just where it is in the industry that the clients are getting ready, or have their readiness, as they say, AI readiness kind of checked off? Is it security, resilience? What are some of the things that you're seeing that might be slowing everything going into production, or just that the approval processes, and machine learning has been in production for years, but generative AI is moving fast?

Ori Goshen

>> Yeah, so what we see is a lot of experimentation, massive experimentation. And there are many considerations, as you mentioned. But there are two I'd like to highlight. The first is efficiency. It's still very computationally expensive, unlike traditional software, so costs are a factor. And when you move to this agentic era, where there are more complex tasks, then latency, the time it actually takes to fulfill a task, is also a very important consideration. The second one, and probably the most sensitive one, is accuracy. There are use cases where 85% accuracy is just not good enough.

>> If someone asked you, what's AI21's secret sauce, what would you say?

Ori Goshen

>> We build very efficient foundation models. We build them on an alternative architecture. I mean, everybody's using transformer architecture. We use a hybrid state-space model and transformer architecture, which makes our models really performant, especially for long sequences. So, we can process long documents, long books very fast and reliably, because it's optimized for these more enterprise types of use cases. We have that foundation layer that we use as part of our stack. And the second piece is our planning and orchestration layer, which is really a new approach for builders to build those agentic systems. And we have a lot of algorithmic innovation applied in this layer to get to the level of reliability that's expected.

>> I know you got to go, but you mentioned performance is key, because first token out matters on inference. Training's a little bit different, that changes storage. Real quick, how is storage going to change? You saw a lot of new HBM, you got capacity tiers with SSDs coming fast. Storage is kind of flipping on the script there too. You heard that recently at GTC as well.

Ori Goshen

>> Right. And as these systems in enterprise need to retrieve more information, because the model is the model and it's inferencing, but it's based on data that you retrieve into, so you need fast access. And I'm sure we'll see a lot of innovation on that front as well.

>> Great to have you on theCUBE. Thanks for coming into our new studio.

Ori Goshen

>> Yeah, thank you. Thank you for having me. That's really amazing.

>> Great to see you. Congratulations, you were the first one in. First CUBE interview here on our new set, our dedicated studio. Got two sets here. I'm John Furrier of theCUBE, thanks for watching.

Ori Goshen

>> Thank you.