theCUBE + NYSE Wired: AI Factories - Data Centers of the Future | Autumn Moulder, Cohere

Autumn Moulder

VP of Engineering

Cohere

In this segment from theCUBE + NYSE Wired’s AI Factories – Data Centers of the Future series, theCUBE’s John Furrier sits down with Autumn Moulder, VP of Engineering at Cohere, to unpack how AI factories are reshaping enterprise infrastructure and the software stacks that run on them. Moulder explains Cohere’s enterprise-first approach across security, privacy and efficiency – meeting organizations where they are with the right-size models and applications, including Cohere’s North product. She shares how enterprises balance general-purpose LLMs with speciali...

>> Hello, I'm John Furrier with theCUBE. We are here at theCUBE's NYSE Studios. Of course, we have our CUBE studio in Palo Alto, connecting Wall Street and Silicon Valley, tech and money. As AI and AI factories change the game, we're starting to see large-scale enterprise adoption and thinking about how to leverage the models to create those AI-enabled apps as well as modernize existing applications. Of course, in this series, the models themselves are super important. We got a great guest, Autumn Moulder, here, Vice President of Engineering at Cohere, very strong in the enterprise. Autumn, great to see you again. Welcome back to theCUBE. We just had you on at Dell Tech World. Appreciate you coming in.

Autumn Moulder

>> Thank you, John. Much appreciated and excited to be here with you guys again.

>> We love Cohere. Aidan Gomez has been on early days, years ago when we first started focusing on the innovators in AI. A lot has changed. You have the race for the AGI crowd, which is no holds barred, go hard, things get fixed after the fact. And then you have the enterprise side, which is more pragmatic. You guys have been on both sides, leaning more on the enterprise. I'd love to get your thoughts on how you guys explain that, because the shiny new toy is the fastest, latest, greatest model that does everything, but then it has more hallucinations, and the enterprise doesn't tolerate that. So explain Cohere's strategy as one of the premier models out there.

Autumn Moulder

>> Yeah, I'm happy to chat about that because we very much have focused on that enterprise segment. We see this as a place where we can drive a lot of value and really help people navigate this transition. There's a recent MIT study that came out about how enterprise partnerships, where you partner with an external entity versus roll it yourself, have like twice the success rate when you're looking to adopt AI. And so we've really taken that to heart coming from our foundations of building models. You mentioned Aidan joining and chatting with you guys early on and we've really evolved and said, "What does an enterprise need to adopt this fundamentally transformational technology?" And so we've evolved from, we build the foundation model to let's bring in some applications, the North products that we've recently developed, to make it easier to adopt AI within the enterprise, to partnering with these companies to say, "Hey, we can help you build a custom model. We can help you build a custom agent. What do you need to get over that hump and really deliver ROI for your business?" And that's just our focus. Whatever needs to be done, we do it.

>> Yeah. And talk about the generative AI landscape because you have to be on that flywheel, get the latest and greatest as innovation's happening. At the same time on the enterprise side, there's almost zero tolerance for hallucinations or they want better reasoning at different SLAs. So there's a whole nother set of, there's two worlds. There's the free, fast, and loose, and then there's the enterprise who are doing actually application-specific things. Talk about the nuances because you can have the big models that chase each other and then as the specialty and more of the specialism comes in, talk about the distinction and where's that line between large language models and then specialty models that have a long tail. Some call them small language models. You guys aren't small language, you guys are one of the LLMs, you guys are the large language models. You guys are like in the head and then the neck of the power law but also work down when needed. Talk about this, because it's a super important, very nuanced dynamic.

Autumn Moulder

>> Yeah. And you're absolutely right. It is very nuanced because everyone's kind of at a different stage of their adoption journey, and so being able to help people just move the needle and get farther along, you definitely need to have both the ability to engage with a large language model and kind of the latest capabilities and understand where the industry is there, but also meet people where they are. We focus very exclusively on some of the challenges that we see are unique to large enterprises, so things like security, making sure we can bring our technology into the enterprise so you don't have to worry about sending your data externally. That's one thing. We focus very much on efficiency, which ties into this idea of large language models, small language models, what's the right-size model, and making sure that we can deliver the point solution, whether it's a small language model for a highly efficient workflow, or let's use the general model so that you can reach the most number of users with the same solution. We bring both of those to the table and so we see this as an efficiency story and we also see it as a privacy story, making sure you can hit both of those pieces.

>> Autumn, at Dell Technologies World, where you were on theCUBE, it was touched upon as well as other conversations, Dell has adopted the AI factory nomenclature coined by NVIDIA originally two years ago at their GTC event. Jensen Huang called them AI factories. And really it's that physical AI, the blending of first party things like physical objects, whether it's a truck, self-driving car, or robot or whatever with digital. So that's happening. And so these AI factories, or large-scale systems, in the enterprise are looking like data centers from yesterday turning into basically a computer. Okay, that's kind of happened or happening. That concept is not foreign to the enterprise. What is now on the table in our conversations is what runs on the AI factory. So could you share how you see these AI factories, which is basically infrastructure. You need software to run on them. It's not like the old days, "Hey, buy a server and load Linux on it." Remember those days? I mean, I'm going a bit over the top to make a point. It's not that easy. A new stack is emerging in these large systems. Network fabrics are different, storage fabrics are different, memory, the role of memory, all these things to make Cohere run better and/or interactions with other models and data movement, it's not predictable. So the classic compute storage networking paradigm has shifted. What's your reaction to that? What's your observation? Can you share any knowledge on this perspective? Because this is the top conversation. I want these factories, of course, there's demand.

Autumn Moulder

>> Yeah, it's a topic that's near and dear to my heart. So I've been in the industry for 20 years before cloud was a thing. We were sticking PCs under our desktops and saying, "All right, I'm going to build a service and just run it there and I hope, please don't unplug that." So we've come a long way since then. But you're right, in many ways we're reconfiguring data centers. Companies that have massive on-prem installations that they've been running for a long time and they have a cloud footprint, but a large set of their data is still sitting in these on-prem data centers. 48% of the installs that we've done on our North product have been in these on-prem environments because we're very much seeing these factories, that concept that Dell has done such a good job of packaging and saying, "Hey, let's bring the hardware, let's bring the latest set of technology to you." And maybe you don't want to just be reliant on third-party ecosystems. You need to bring it in both to your in-house system as well as have the ability to flex and use some of the latest tech outside of your ecosystem, your on-prem data center.

>> One thing I'd like to chat with you about, obviously we both had a little throwback there from old school hardware days, but software and developers are engaging with you guys. If you look at the ecosystem, even Dell, you look at Dell's ecosystem, it's always been kind of like a hardware ecosystem, just storage works here, SSD, all these things that they do, but now with AI factory, there's a crossover to software and not just software with chips and semiconductor information like networking technology and protocols and interconnects, there's a huge startup ecosystem that's innovating with AI, AI native applications that want to run on the factory. So now you have a tech stack that needs to be inclusive of, say, a startup. It could be three people in a garage. They're like, "Hey, she's got a great product, but she needs to get distribution with say, Dell, which is in all the banks, they're in all the customers, but yet they got to change the configuration to solve these big life sciences problems." Or whatever they're working on. So you have now a paradigm of, okay, the ecosystem crosses over. You guys interface with a lot of those startups. Could you share your perspective of one, state of the market? And two, a lot of these startups are getting pilots, but they're now in line for production. So there's a little bit of a clogging of POCs because that tech stack and the factory's not yet, this is kind of our opinion, you can disagree or we can talk about it, but this seems to be because a lot of these startups have one-year contracts or multi-year contracts by ARR standards, annual recurring revenue, but they won't get renewed after year one, so they're technically not recurring. So if they're not in production, they're not really ARR contracts. So this is a huge industry opportunity, but also challenge. A software stack on the factory solves it. You guys are in the middle of it. You see both sides. What's your opinion? Could you share any commentary on this dynamic? Do you agree?

Autumn Moulder

>> No, it is another good question because I do think it's very much in flux. You do have a lot of these startups. You have great teams that are early in their careers or have focused really well and can solve a problem and they see an opportunity in a particular vertical or a space where they've got a lot of experience. So you get these teams that are coming up with some really cool tech, but it is dependent on a lot of third-party systems, a lot of SaaS solutions. It's not something that you can package necessarily and run easily in an enterprise context. So certainly from a state of the market perspective, we're seeing a lot of clients who are saying, "Oh, I saw this cool thing. I don't know how to scale it. I don't know how to take it to production. I can't ."

>> Or security, the resilience bar is too high, and enterprises have a high resilience bar, cyber.

Autumn Moulder

>> It's very much something that we've noticed a lot of companies don't quite get to that threshold, so they don't get to the point that they can share the resilience story. They can share the here's your certifications and here's the way you standardized your development life cycle. And so you know the software that you're delivering into an enterprise is.

>> All right, so I want to bring that up because you mentioned security earlier, and again, I want to get into the engineering differentiation that you have. Could you share how you guys are helping those startups? Because again, that's one of your key engineering points actually, the security piece.

Autumn Moulder

>> The way that we think about helping those startups, it's very much, "Hey, bring this whole stack to you." It's something that if you leverage the stack that we're using and you build on top of that, you kind of get, built in, the ability to deploy into an enterprise. I mean you called out, the software stack needs to come on top of the hardware stack, and a lot of those components, it's just too much for a small team to build all of it out on their own. And so helping out, working with those teams, making sure either our software works with theirs from an open standard perspective or they can just leverage some of the components that we're building and help their business grow. Those are really key and important things.

>> Let's talk about customer use cases, architecture. We're getting some data from this series from practitioners. I'd love to get your reaction to this. This seems to be a pattern emerging. AI native companies that are building software kind of want the models to kind of do their thing, right? There's a model for whatever kind of vibe you want. You want the big, fast, and loose? Great. You guys have settled in on positioning. There seems to be a trend to abstract away the app layer and fuse AI into that software for that business benefit, take advantage of the models and then abstract away the infrastructure piece and let that model flywheel sort itself out, or have connectors. Am I oversimplifying it? Do you agree? Is that a best practice?

Autumn Moulder

>> No, I think people are seeing those different components. There's nuance in it, as always, right? You can get a vertically integrated solution and frankly, it's surprising often how much you need the software layer and the model to interplay to really get that 10x improvement.

>> So you're saying the approach on the architecture is use-case specific. There are use cases for vertically integrated, where either performance or workflows are well-defined, or for more decoupled or abstracted away.

Autumn Moulder

>> Yes, very much. It's something that you can see that companies go into production, and they need that vertically integrated solution because once you've got something that works, you need everything to kind of line up.

>> Share some examples of customers who have gone down the specialized LLMs versus say the general-purpose ones, where it worked, what are the best use cases, what's some of the data you have, and then when does the general-purpose one work better or not work better? However you want to look at it.

Autumn Moulder

>> Yeah, no. I'll pull one that's pretty common. We've seen a pattern repeatedly where within a process that you're trying to automate, you need to do two things. You generally need something like a triage agent and something like a routing agent. And this can generalize to a lot of different processes, but those triage agents and those routing agents tend to be very specific to whatever process you're trying to automate. Those, when you think about building out that triage agent, building out that routing agent, that's where you can start to get some real quality or efficiency lift by having these specialized models or a scenario where you don't want to be swapping models repeatedly, you have something that works. That's a scenario where I see commonalities of people get something that works and they just want it to run. Where I see the model swapping work really, really well is when you're still in that you need that creative or generative system, but you're not sure what you want to do. You want that general model.

>> Yeah. Dave and I don't like to pat ourselves on the back, actually we do a lot, but in 2020 we published a power law of models. We did a big research note on this, and it was kind of contrarian at the time. We basically said that there's going to be a power law and specialty models would emerge. We got most of the labeling right on the XY axis. But the idea was, yeah, the big models will do their thing at the head and then you'll have long tail. It kind of played out. And you mentioned some of those things around the different approaches, routing, context around how to manage, integrate with models. So what is the future of specialized LLMs? Because I think it's clear, as you just pointed out, that you're going to have models that have different characteristics for certain things and you want to go to them. Some might not even need GPUs, some might want to do compute. So now you have an interplay between resource and output and latency. So you have now kind of new factors. So that to me would say that the specialty models will continue to evolve faster. What is the future of the specialized LLMs?

Autumn Moulder

>> Yeah. As with anything in the space, it's hard to make a prediction. Kudos to you guys for staking a position a few years ago. From my perspective, we're going to continue to see specialty models that come into play when you need that edge of accuracy or quality. However, the common, the general models will continue to get better so that maybe today you can get to a 70% accuracy or a 60% quality with the general model. Next year, maybe you can get to 80% or 75%. And so that baseline for how far you can get with the general model will continue to improve, but you're always going to be at the edge of quality, accuracy, improvements to get the best or the world-class performance there. I think you'll always see you have to have .

>> It's a classic trade-off. Autumn, thank you so much. Again, we love your company. We love what you guys are doing. Again, you're part of the pioneers in the language model. Say hello to Aidan for us. Got a great group over there. Got a long game, you're staying on track with the mission, not a lot of zigzagging, unlike other companies. So congratulations and thanks for participating in our AI factory series.

Autumn Moulder

>> Awesome. Thank you so much. Appreciate it.

>> I'm John Furrier with theCUBE here at the NYSE CUBE Studios, the NYSE Wired community. Of course, we're bringing all the action, all the data here of our specialized model with theCUBE data. I'm John Furrier, thanks for watching.