This segment examines observability for enterprise AI factories. Paul Appleby of Virtana, a company with close to two decades in observability and infrastructure, explains how organizations build and operate enterprise-scale artificial intelligence factories. Appleby discusses system-level observability, agentic AI for discovery and causality, and use cases across finance, telco and healthcare. He emphasizes the need to treat AI factories as integrated systems rather than discrete components and to reduce mean time to resolution (MTTR) while improving graphics processing unit (GPU) utilization and energy efficiency. Appleby reports that at least 25 percent of AI jobs fail, underscoring demand for a new class of observability that leverages AI agents for rapid causality, remediation recommendations and operational governance.
The conversation is hosted by Gemma Allen of theCUBE and informed by theCUBE Research. Topics include observability strategies, AIOps and AI infrastructure for enterprise deployment, monitoring techniques for GPU environments and the role of agentic systems in fault discovery and automated remediation. The discussion provides practical insights into deployment, performance optimization, governance and regulatory considerations for enterprise AI factories.
Paul Appleby, Virtana
In this interview from theCUBE + NYSE Wired: AI Factories - Data Centers of the Future at the New York Stock Exchange, Paul Appleby, chief executive officer and president of Virtana, joins theCUBE + NYSE Wired's Gemma Allen to discuss why the industrialization of AI demands an entirely new class of infrastructure observability. Appleby explains that AI factories are far more than clusters of GPUs: they are complex systems with interdependent layers of compute, networking and storage that require holistic, end-to-end monitoring. He highlights a stark reality: ...
Gemma Allen
>> Welcome back to theCUBE Studio, coming to you here from the New York Stock Exchange. This is one of our NYSE Wired programs, AI Factories. And joining me now is Paul Appleby, CEO and President of Virtana. Welcome, Paul.
Paul Appleby
>> Hey, thank you so much for having me. It's great to be back here in New York again.
Gemma Allen
>> Thanks for being on. You said off camera there, "I'm going to give you the quick skinny on observability and the whole space Virtana is operating in," which is changing pretty rapidly, right?
Paul Appleby
>> Yeah.
Gemma Allen
>> This space is a growing conversation for sure. So talk me through Virtana as a company, the market you're in and the competitive dynamics.
Paul Appleby
>> Yeah. A lot of companies love to talk about their technology in terms of their features and functions. I love to talk about what business impact our company has and what business impact observability has. So let's talk about that for a second. And this is a great example. I mean, we're sitting here above the floor of the New York Stock Exchange and all of the systems that keep the New York Stock Exchange alive are absolutely critical for the economy and critical for trading, et cetera, et cetera. And the reality is behind those systems is an incredibly complex set of infrastructure: servers, networks, databases. And observability is a class of software that monitors all of those services and makes sure that they remain available and performant.
Gemma Allen
>> Wow. So it seems as though the world is changing pretty rapidly, right? And there is a hugely complex tech stack, like you said, behind many of what seem to us like one-touch services on the front end. The backend is obviously very complicated. And I guess we're now in a world where AI can do a lot of discoverability for us.
Paul Appleby
>> Yes.
Gemma Allen
>> And solve some of the problems that have existed of old where something breaks and you don't really know where the fix is at or where the challenge or problem is originating from, right? Talk me through a little bit of some examples on a day-to-day basis of your sales guys going out to different clients and talking about the opportunity here and how it's shifting.
Paul Appleby
>> Absolutely. If you think about pretty much any business, we use the example of the New York Stock Exchange, but we could talk about a bank. We could talk about a telco, an airline, a healthcare company. Every one of those organizations has digitized their services. We consume them every day. We do transactions on our phones, whether we're buying things, booking flights or moving money around in a bank account. And that incredibly complex infrastructure that delivers up those services is fundamental to ensuring the continuity and resilience of those businesses. You overlay AI on that. And what we're seeing, and you hear Jensen Huang and Michael Dell talking about the advent of the AI factory, this shift from experimentation to actually the industrialization of AI, where these banks, these telcos, these healthcare companies are embracing AI and building out enterprise-scale AI infrastructure to deliver these services using AI to power those services, whether that's doing demand planning for energy grids or whether that's actually helping healthcare companies accelerate drug discovery or create hugely powerful targeted medicines for certain treatment protocols. These AI factories are being scaled out all over the world. Now, the challenge, of course, with that is they come with their own set of complexities and their own sets of challenges. We think AI factories are just a bunch of GPUs. We hear GPUs have entered the parlance and we now talk about GPUs almost on a daily basis, but the reality is an AI factory is a hugely complex system, if you like, that the GPUs are just a part of. So to ensure that not only those AI factories continue to process these really critical jobs in a timely manner and that they don't fail, but to also ensure that they're used efficiently, because this is an incredibly expensive technology, you need a new class of observability.
And that's really what Virtana does. Virtana doesn't just look at discrete components. Virtana looks at the entire system and we need to think of the AI factory as a system or even the technology that supports the New York Stock Exchange as a system. And what you need to do is observe that entire system and then use AI to discover all of the components, to discover the dependencies between those components. And then if there's some sort of degradation in service or failure of a job, really quickly identify what caused it, using AI to get to causality and using AI to recommend remediation strategies.
Gemma Allen
>> AI factories, obviously, it's a term we use a lot. We have a series where we talk to a lot of folks in this space, but the concept to many is still relatively new, right? And you're right in that some folks still struggle to fully understand what it means. I mean, I'll be frank, I did six months ago. My visualization was going somewhere which probably wasn't sensical now in March of 2026, but your company has been around for close to two decades, right?
Paul Appleby
>> Yeah.
Gemma Allen
>> So you've been on quite the journey from the perspective of observability 10 years ago and what it means now.
Paul Appleby
>> Yes.
Gemma Allen
>> Would you say that there is a market shift happening, that this is kind of a moment of reckoning in terms of folks understanding who even owns the AI factory network, who is responsible for the overall performance? Is that well understood, do you think?
Paul Appleby
>> What I'd say is that we just did some research. It was a really interesting piece of research which we'll be publishing shortly and there is a big disconnect between what executives believe about their enterprise readiness, their readiness as a bank or a telco to embrace AI and adopt it and what the IT organization actually thinks. So there is a big disconnect just even at that level of, are we ready as an organization? Do we have the governance and controls to take AI from being a science experiment? Which it really has been. Most companies have been experimenting in the cloud. They've been trying to work out what role does AI play in my business. How does that create value for our customers and our shareholders? And now, they're going into, "Wow, we're going to deploy AI at scale."
So the advent of these AI factories, which is just really the scale up of AI, it's a great way of describing it, but it is also a really powerful way. I like the way Jensen and Michael have approached that because the whole concept of a factory implies a system, a system that has inputs and outputs and a process for manufacturing and driving those outcomes. And that is really the kind of key to AI factories because it's not just building the factory, it's then putting the controls and governance in place to ensure that the factory continues to operate, that it does that efficiently, and that it's producing the products that you're supposed to be creating. So that's really the key. And it is a fundamental shift. And I think we're at an inflection point because if you look at where we're at today, and there's been quite a bit of data coming out on this, at least 25% of AI jobs fail. Now, if we're going to be relying on AI as part of our core business, whether it's financial services or telco or healthcare or whatever, having 25% of jobs fail is just not acceptable. So that's really, again, where this new class of observability has to come into play. Legacy observability is really about looking at discrete components, the network, the application itself, the infrastructure, and essentially monitoring those components. You need to really think about it as a system and look holistically at the system and capture telemetry right across the system using AI agents to then identify bottlenecks or areas of risk and exposure or even outages and then identify remediation strategies. So it is a really big fundamental shift, this shift from legacy monitoring that really was what observability was, to this kind of dynamic platform that observes the whole service from an end-to-end perspective and leverages the power of AI to drive towards this kind of future of autonomous operations and self-healing infrastructure.
Gemma Allen
>> Let's talk about the incremental wins of this information and data collection. Right?
Paul Appleby
>> Yeah.
Gemma Allen
>> So let's say you are a large bank, you have an outage. Five years ago, you had to run scripts in particular parts of your stack, figure out what was wrong, you decide who's at fault, you ring up the team, they get service engineers in. It's a shit show for the most part, right?
Paul Appleby
>> Yeah.
Gemma Allen
>> Now, you're in a position where you have a lot of data very, very quickly, right? It's incremental. You're also able to kind of spot patterns in that data, who's operating at max efficiency, who's not. So I imagine it can also lead to some sort of like further SaaSpocalypse as we've now been referring to it from the perspective of understanding the high performing and low performing value add of the overall stack. Are you seeing that shift? Do you think people are starting to kind of wake up to where the optimization is won and lost?
Paul Appleby
>> Wow, multiple layers to that question. So I'm going to see if I can parse them out a piece at a time. In the first instance, you're absolutely spot on. If you go back a decade and there was some service disruption and we're really monitoring these discrete elements and you mentioned the shit show, that was really the reality. You ended up with these different functional groups trying to prove their innocence. So it wasn't really about how do I make sure this service is back up, operational and running efficiently. It was more, how do I prove it wasn't the storage team or how do I prove it wasn't the network team. So there was this phrase that was coined of mean time to innocence, MTTI. When the reality is that what matters is mean time to resolution. How long does it take if there is a disruption? How long does it take to fix it? And really, if we think about this modern construct of observability, it's really about radically reducing MTTR, mean time to resolution. And that's what really matters. If this is an airline booking system, if this is an internet banking application, if it's a trading application, mean time to resolution is everything. Then you layer on top of that something else, which is really, really important because there is a duality in modern AI. CIOs all over the planet in G2000 companies will tell you this. They're being asked to deliver these modern services, leveraging the power of AI, using this complex infrastructure, and they're being asked to do more and more and more of this, but at the same time, they're being asked to do that more efficiently. How do we lower our cost of operation? How do we become more efficient? So that second point that you touched on there, which is, if you're capturing all this telemetry, can you use that to identify areas of wastage?
And this is especially the case where we think about the AI factory, because if you think about the cost of these GPUs, if you've got GPUs sitting idle or GPUs that are only being partially utilized, you're wasting a huge amount of investment. So not only monitoring the availability and performance of the service, but actually the operational efficiency of the AI factory not only helps you maximize the value of your GPU investments, but it also helps you lower electricity usage, water usage, heating, HVAC costs and all the rest of it. So there's also an environmental impact there as well, which is really important. So that duality is a really important thing to identify.
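As a rough illustration of the two measures discussed above, the sketch below computes mean time to resolution from a list of incidents and estimates the spend tied up in idle GPU capacity. It is a minimal sketch under stated assumptions, not Virtana's method: the `Incident` type, the function names, and all figures are hypothetical, and utilization is assumed to arrive as simple fractions between 0 and 1.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical incident record: when a disruption was detected and fixed."""
    detected_at: datetime
    resolved_at: datetime


def mean_time_to_resolution(incidents):
    """Return the average of (resolved_at - detected_at) across incidents."""
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((i.resolved_at - i.detected_at for i in incidents), timedelta())
    return total / len(incidents)


def idle_gpu_cost(utilization_samples, gpu_hours, cost_per_gpu_hour):
    """Estimate spend attributable to unused GPU capacity.

    utilization_samples: fractions in [0, 1] (assumed sampling format).
    gpu_hours and cost_per_gpu_hour are illustrative inputs, not real prices.
    """
    if not utilization_samples:
        raise ValueError("need at least one utilization sample")
    average_utilization = sum(utilization_samples) / len(utilization_samples)
    return (1.0 - average_utilization) * gpu_hours * cost_per_gpu_hour


if __name__ == "__main__":
    start = datetime(2026, 3, 1, 9, 0)
    incidents = [
        Incident(start, start + timedelta(minutes=30)),
        Incident(start, start + timedelta(minutes=90)),
    ]
    print("MTTR:", mean_time_to_resolution(incidents))
    print("Idle GPU cost:", idle_gpu_cost([0.5, 0.25, 0.75], 100, 2.0))
```

Tracking both numbers side by side reflects the duality described in the conversation: one measures how quickly service is restored, the other how much of an expensive investment sits unused.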
Gemma Allen
>> So this, I don't want to call it a pivot, but this, I guess, rapid shift to the world of AI factories and the observability enablement that that provides, I guess it's a learning curve too, right? I'm sure for everybody, you guys as providers and enablers and our spheres in your own right, but also people who are learning from the data. Talk to me a little bit about how you think 6 to 12 months out. How do you, from the perspective of Virtana, plan ahead? It's hard to predict exactly just how fast this tech is moving, but also how fast the actual gains will be met.
Paul Appleby
>> Yeah. I mean, if we had a crystal ball, we would have made a lot of different decisions. But what I can say is by listening to the market, by spending time with customers and spending time with the analysts, you're really able to glean where people are headed. And I'll give you a great example. Going back 12, 18 months ago, most enterprises were really in the experimentation phase. And the general thesis there was it's going to take a few years for us to see this industrialization occur. Here we are 12, 18 months later, and I saw Jeff Clarke speaking here on theCUBE not so long ago about how quickly that business is growing for Dell. And Dell is a great example of a company that invested ahead of the curve of building out infrastructure specifically for the AI factory. And they're seeing that in the marketplace. We saw the same thing. We were out there talking to the world's biggest banks, telcos, healthcare companies, airlines, insurance companies, and they were already talking about shifting from experimentation into scale out about 12, 18 months ago. So this is one of these interesting things in this world of AI when we're leveraging the power of agents, capturing huge amounts of data and telemetry. The thing to remain aware of is the importance of connecting with humans. And it's one of the things I love doing. I travel around the world. I've just come back from a three-week trip through Asia and through Europe meeting with customers and learning about what are they doing, what are they experiencing, what's working, what's not working, and how do we bring that back into our roadmap. So the one thing I'd say is the human element matters as well. And Michael Dell talks about that. They're really close to their customers and they're hearing from their customers all the time what's working, what's not working. And Michael and Jeff are pivoting the organization based on that. We're endeavoring to do the same thing.
Gemma Allen
>> I mean, they were an interesting paradox as well from the perspective of their services businesses, right? Not just Dell, but across the board, because that services side ... I just saw this morning that some of the gain in US economics over the last two months or so is actually service driven, but then there's a whole backlash happening against the service side of these kind of large tech outlets, right? So it's an interesting time to understand what the next 12 months will look like from the perspective of defensibility, but I don't think we're going to solve that here.
Paul Appleby
>> It comes back to what we were chatting about beforehand. We were talking less about the tech and more about the business outcomes. So I think if companies are focused on what is going to deliver the best outcome for their customers, what is going to ensure that the technology is delivered in a timely manner with the right controls and governance in place to ensure efficient usage of those investments and great business outcomes, I think that the dynamics between what component represents service and not is less important and the outcome is what really helps.
Gemma Allen
>> No, I agree. Okay. So talk to me about Virtana on two fronts. First, you mentioned off camera that the deployment process is very quick for Virtana. Talk me through that. You decide you want to come in here today, you're invited and you want to roll out Virtana on there's often slur. How quickly does this happen?
Paul Appleby
>> Yeah, it's an interesting thing having worked with some big legacy providers before. I remember big legacy deployments, I won't name companies, but some of these things would take years. And I'm talking about, dating myself, I'm talking about in the 80s and 90s. Those sorts of timeframes are just meaningless today. And really, if people have a business problem that they need addressed, they need it now. So one of the great benefits of Virtana and the reason why many of the Global 2000 use Virtana as their observability solution is that because of the power of the metrics we collect, as I shared, there's about 20,000 metrics that we collect in sub-second time, and the power of our AI agents, we're able to come in, do that discovery, do the mapping and correlation, and actually start monitoring and providing intelligence and action within a few hours, and that's why we partner with companies like Dell. They in fact use our platform in some of their critical incident response because we're able to do that really dynamic discovery and mapping and correlation. So we're really a technology that's built for today. We're really built for that environment where you can come in and get value from the technology really quickly and deliver business outcomes that ensure that critical services like New York Stock Exchange or any one of the other services we spoke about remain available, performant, and efficient.
Gemma Allen
>> So last question, the scale of what you are monitoring and observing is changing rapidly, right? These APIs, everything is shifting so quickly, especially in the world of AI where these frontier models are releasing updates almost feels like monthly, right? If not, it's just so noisy, but also so rapid. How do you stay abreast on your end of those changes from like an R&D and build perspective?
Paul Appleby
>> Yeah. First of all, we work with a lot of the world's major technology providers. In fact, many of them are actually our customers. So we work hand in hand with them and we work inside their labs and we partner with them inside their labs so that we're trying to stay abreast of all of these changes as they're coming along. In addition to that, we use the power of AI. I mean, we adopted AI and ML in the platform over a decade ago, and it's been a real core of our capabilities and the reason why we're able to get to causality far faster than anybody else. But we also were very early in adopting agentic AI. So using the power of those agents, we're able to kind of stay abreast of changes in a technology environment, which is fluid and dynamic.
Gemma Allen
>> Wow. Well, Paul, wonderful to have you here on theCUBE and to welcome you to our new studio. Thanks so much for being on the show.
Paul Appleby
>> Oh, thank you. I really enjoyed the conversation and like you, I think things are going to keep changing and keep accelerating and I'll look forward to continuing the conversation.
Gemma Allen
>> I'm Gemma Allen coming to you from theCUBE Studio here at the New York Stock Exchange. This is AI Factories, one of our NYSE Wired segments. Thanks so much for watching.