In this KubeCon + CloudNativeCon North America 2025 segment, Alois Reitbauer from Dynatrace joins theCUBE’s Rob Strechay to dig into what it really takes to move AI applications into production on Kubernetes. Reitbauer explains how “AI-native engineering” and “AI-native operations” are emerging as teams A/B test different models, wrestle with token costs and capacity constraints, and design smarter guardrails for sensitive data such as medical records. He introduces the idea of an “agent scorecard” to track whether AI agents are actually delivering business value – from customer onboarding outcomes and response times to how often guardrails are hit and where fine-tuning is still needed.
The conversation also explores why AI will quickly become table stakes rather than a lasting competitive advantage, pushing organizations to master AI-native apps and agentic systems faster than past technology waves. Reitbauer shares how SREs need AI “buddies” to improve decision quality, not just efficiency, and why explainability and production feedback loops are critical for debugging AI behavior in highly dynamic, snowflake-like transactions. Looking ahead, he expects more real-world stories of auto-remediation, preventative operations and optimization at scale, as teams move from visionary slideware to concrete lessons learned from running AI and agentic workloads in production.
Alois Reitbauer, Dynatrace | KubeCon + CloudNativeCon NA 2025
In this episode of “Inside the Digital Business with Dynatrace,” recorded live at KubeCon + CloudNativeCon EU 2025 in London, we kick off the series with Dynatrace Chief Technology Strategist Alois Reitbauer as he talks with theCUBE Research’s Rob Strechay. As part of this ongoing exploration into how businesses are transforming amid digital complexity, Alois unpacks why observability is no longer optional — it’s foundational.
From the rise of agentic AI systems and platform engineering to real-time compliance and the evolving role of developers, this conversation cuts to the core of what it takes to thrive in today’s dynamic digital landscape. Whether you're navigating the shift to AI-native apps or grappling with regulatory pressure such as DORA, this episode offers clear, candid insights from the frontlines of innovation.
Inside the Digital Business with Dynatrace is a series that explores how leaders are turning complexity into competitive advantage—enhancing resilience, agility and trust in a world defined by constant change.
Transforming Business Value: AI, Kubernetes, and Observability Insights from KubeCon 2025 on Maturing AI Applications and Engineering
Understanding AI's Impact on Business Outcomes: The Essential Investment for Competitive Advantage in Today's Market Landscape
AI agents can assist SREs by enhancing decision-making and optimizing systems.
Enhancing AI Development Through a Collaborative Feedback Loop: Accountability and Real-World Insights in Production Data Sharing
Alois Reitbauer
Chief Technology Strategist, Dynatrace
Rob Strechay
Dir./Principal Analyst & Host, theCUBE Research
Rob Strechay
>> Hello and welcome back to KubeCon, CloudNativeCon North America, still in North America, and it's 2025. We're still in Atlanta. We're finishing things up strong here on day three, where we're bringing things to a conclusion. I'm excited because I get to talk to Alois Reitbauer...
Alois Reitbauer
>> Yes.
Rob Strechay
>> ...from Dynatrace. Again, you're out talking to customers all the time in your role there, and you're really looking at what's coming next, especially with observability and how that really addresses business value back to customers. I love how you guys really make those connections back. What have you been hearing this week? The big thing that has been talked about all week long has been AI and inference, and how Kubernetes is really going to be the platform for that going forward. How have you seen those conversations going this week?
Alois Reitbauer
>> So our conversations are actually twofold. It's a lot of AI-related conversations right now. Number one is really helping customers move their AI applications into production. As we know, a lot of them are not in production today or don't even make it to that level of maturity. I think that is still going on.
Rob Strechay
>> Right.
Alois Reitbauer
>> We also see a change in how people are building applications. In the past it was basically OpenAI. You used OpenAI, and then it started to switch to the other models. Now we see people experimenting way more, like A/B testing models, and the practice of, I would say, AI-native engineering is evolving. At the same time, we move towards these AI-native operations and SRE practices: what do we deal with in production?
What we see there is obviously still the conversation about performance, about problems people are having, and obviously token usage in two areas. There's cost, because AI is expensive if you run it, but also capacity: where can you actually run it? This is also still a scarce resource. But the main shift in the conversation is: can you tell me that this actually has value, that this actually works for people? Are they happy with the results that they get? Are the applications really paying off, or do we still need to fine-tune those applications based on the output that they're providing?
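Editor's sketch: the A/B testing of models and the two-sided token question (cost and capacity) described above can be illustrated with a few lines of Python. This is not from the interview or from any Dynatrace product. The names `VariantStats` and `assign_variant`, the per-1,000-token price, and the boolean "user satisfied" signal are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class VariantStats:
    """Running totals for one model variant in a production A/B test."""
    requests: int = 0
    tokens: int = 0
    positive_feedback: int = 0

    def record(self, tokens_used: int, user_satisfied: bool) -> None:
        # One call per completed request served by this variant.
        self.requests += 1
        self.tokens += tokens_used
        if user_satisfied:
            self.positive_feedback += 1

    def cost(self, price_per_1k_tokens: float) -> float:
        # Hypothetical flat pricing; real pricing models differ per provider.
        return self.tokens / 1000 * price_per_1k_tokens

    def satisfaction_rate(self) -> float:
        return self.positive_feedback / self.requests if self.requests else 0.0


def assign_variant(user_id: str, variants: list[str]) -> str:
    """Map a user to a variant with a stable hash, so repeat visits match."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]
```

Comparing `satisfaction_rate()` and `cost()` across variants puts the "does it pay off" question next to the token bill, rather than looking at either one in isolation.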
Rob Strechay
>> Yeah, we were talking before we went live here, and I liked how you talked about, hey, you've got to find product-market fit for your AI first. For those of us who've been at startups, that's always the challenge. And I think that's why, when I looked at the MIT study where it's like, oh, 5% is in production, that didn't bother me, because I think that's people trying to find product-market fit for their AI agents. But for that 5%, the stuff that does make it to production, there's a lot of guardrails and governance and visibility that people need to have into that. What are you hearing on that? Because you guys really play deep in that space.
Alois Reitbauer
>> Guardrails are key, and guardrails started to emerge very early on. I think we will also see a lot of innovation still happening on guardrails. Sometimes they're a bit too static, yet we will see more dynamic analysis happening on top of that data. Just to give you one example, a guardrail can catch PII data like your medical health records. You could say, okay, the guardrail would flag this information, but it's fine if you see your own medical data, as intended. If your doctor sees your medical data, that's also fine. If it's the assistant at the doctor that you're going to, it might still be fine for them to see it. But if another doctor sees it, it is not. So depending on which type of sensitive information we deal with, I think we still need to figure out: how do we track this? How do we understand this behavior? And I think right now we are still pretty much statically tracking as we work through systems.
Now, really thinking to the next step about agentic, we have to track against goals, and I think that's where business observability comes in. You're delegating a task. You're not looking over the AI's shoulder saying, "Hey, you have done this, this, this and this, and you shouldn't have done this, this is not good." With moving towards agentic, you would give it a goal, like a business goal: "I want you to onboard this customer onto a platform, help them do something with them." How many customers get onboarded? So I have this high-level concept that I'm using, almost like an agent scorecard, that will say: okay, how good are you at achieving your business goals? How often do you run into guardrails? How much did it cost? What was the performance that you were using? What was the response time that you were getting? The response time to the end user, not so much in the technical sense, but assume you're in a chat and you need something. If this takes you 25 minutes, you don't seem to get any value out of it.
And I think we are starting to establish this practice. Especially now moving to agentic, I think we don't have those best practices yet. Even with the simpler RAG-based applications, we are just getting there, establishing how we measure the value. But I definitely see the shift from "do they technically work" to "do they provide value to the end users of those applications?"
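Editor's sketch: the "agent scorecard" is described here only as a high-level concept, so the following Python is one possible reading of it, not Reitbauer's or Dynatrace's implementation. The class names `TaskRun` and `AgentScorecard`, the field names, and the choice of simple averages are assumptions. The four questions it answers (goal achievement, guardrail hits, cost, end-user response time) come from the turn above.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass(frozen=True)
class TaskRun:
    """One delegated task, e.g. a single customer-onboarding attempt."""
    goal_achieved: bool       # did the agent reach the business goal?
    guardrail_hits: int       # how many times a guardrail was triggered
    cost_usd: float           # total spend for the run (assumed unit)
    response_seconds: float   # wait time as experienced by the end user


@dataclass
class AgentScorecard:
    agent_name: str
    runs: list[TaskRun] = field(default_factory=list)

    def record(self, run: TaskRun) -> None:
        self.runs.append(run)

    def summary(self) -> dict[str, float]:
        """Aggregate the recorded runs into scorecard metrics."""
        if not self.runs:
            return {}
        total = len(self.runs)
        return {
            "goal_success_rate": sum(r.goal_achieved for r in self.runs) / total,
            "guardrail_hit_rate": sum(r.guardrail_hits > 0 for r in self.runs) / total,
            "avg_cost_usd": mean(r.cost_usd for r in self.runs),
            "avg_response_seconds": mean(r.response_seconds for r in self.runs),
        }
```

A team could record one `TaskRun` per delegated goal and review `summary()` per agent. A run that takes 25 minutes would show up directly in `avg_response_seconds`, even if every technical call along the way succeeded.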
Rob Strechay
>> Right. I've had the pleasure of talking to a bunch of the Dynatrace customers about a number of things. One of the things that has always really impressed me, and it's exactly what you're talking about, is how you connect the business outcomes and things like customer satisfaction, or CSAT, back to how the application, or in this case the agents, are actually performing. Are you seeing that customers are starting to lean in and try to make those connections so that they can understand whether they are getting value out of that?
Alois Reitbauer
>> I think they have to, because you have to hedge your bets. In the beginning, when everybody started to work on AI, there was this big FOMO: we now need to take budget from here to there to make AI. But not only that, you took budget away from another project, and that project had associated business goals. Now you need to meet that same business goal. Also, I recently did a talk, and one of the questions I had on the screen, to the audience, was: do you think AI is going to be a competitive advantage in your industry? And what do you guess people said?
Rob Strechay
>> I'm going to assume that almost 100% said yes.
Alois Reitbauer
>> Yes. And my answer is no, it isn't. It's going to commoditize; it's going to be the standard of how we build things. So you have to get really good at it really fast. You now have to really build up the expertise and the practice in a field that is evolving super fast. Like with the model example: people are now testing different models, and you have to test them in production because that's where you're getting the data from. You need to work with better data sets. I think that's what companies have to realize. What you assume is a competitive edge by building something will be commoditized very quickly at the speed at which we're working. So it's not a nice-to-have to be good at building AI-native applications; it's going to be a must-have if you want to succeed in the industry that you're in. And that's why we have to learn and fail very quickly, I think quicker than with any other technology that we have been using in the past.
Rob Strechay
>> Yeah, I liken it to the premise that everything that's old is new again. We've gone through these cycles: we went from three-tier applications, then we went to distributed services, and now we're going towards, hey, these services can actually build themselves, with these agents that are potentially codifying agent skills and building them, and you've got to make sure that they don't go rogue. To me, it would seem that if you don't have visibility into what's going on in those black boxes that you've now built, you could get yourself into some very difficult times, either from not understanding why it's acting the way it is or because it goes off and does something it's not supposed to do.
Alois Reitbauer
>> And I think the engineering practice is also pretty young. I think one of the very positive trends was that AI models now actually share a train of thought that helps you understand how the model arrived where it arrived, pretty much. I think that helps us a lot on the debugging side. But debugging AI in agentic applications is kind of different. Why did it behave like this one time and differently the other? You need to look at individual transactions. And the more we move into more dynamic systems, like going more into this agentic world, the more the individual transactions will be different. So detecting anomalies will get harder. Every transaction is almost a snowflake. So how do you tell what is very similar from what is really significantly different? I mean, all of those things can be built. But that's what I think people have to learn: how to work with the data and, obviously, how to properly adjust to that type of system behavior. So I think we are really learning as we are rolling it out.
The other trend that I see is, again, what we see very often in technology, and you mentioned platform engineering before. I see something similar happening now when we talk about observability and using agentic AI in observability: everybody is now trying to custom-build platforms, just as we had at the beginning of platform engineering. "Now I'm going to build my own IDP, I'm going to build my own IDP." And people realized that while they were stitching stuff together, it worked quite fine, but it turned out to be a lot of work, a lot of engineering time went into it, and there are actually products out there that can do it. I see the same actually happening right now in the whole agentic application space, especially where that's not core to your business.
Rob Strechay
>> Yeah, I agree. I think I was talking to somebody, and in fact they were talking about, did you build your own IDP or did you roll out one of the ones out there? And everybody at BackstageCon was talking about all of that type of stuff. And I think there are pieces in making it reusable, because a lot of the toil falls on SREs and what they're trying to... It's almost unknowable how they have to have these skill sets that are so broad these days. Are you seeing that in the customers and prospects you're talking to, that they're looking for things that can help them just understand the landscape, bring it back, and tie it back to the business in a way that makes a lot of sense?
Alois Reitbauer
>> Yeah. I wouldn't necessarily say that the SREs... well, that's maybe the wrong statement, that they care too much about the business, or that the business is the number one thing they think about in the morning. They do think about the business of their company, but they need to keep systems up and running. We keep adding more and more complexity, and they have to run them. One conversation that we have a lot when we talk about agentic applications or AI-native applications is: we want to monitor them the same way as the rest, because we can't add more complexity at certain layers, and we also need AI to assist us.
That's where, for me, I'm not so much in the camp of, okay, AI is improving efficiency. I think nobody who's building an AI application or is using AI should think about efficiency first. I'm more about: do you improve the quality of the work output? If you can make an SRE more efficient by using AI, that is way more helpful. Say you're trying to trigger, for example, a remediation action on the system. Yes, AI can help you trigger that action. It can prepare it for you. You're only reviewing it and executing it. That is one step. That's what I hear a lot of people talking about. But where I think we eventually need to get to is AI challenging you and telling you, "Hey, the last time somebody tried this on the system, these were actually the side effects. Let me check why this might happen again today." So your overall decision process, not for the immediate mitigation but for the remediation, might actually get longer, but the decisions that you take and how you modify and tune the systems are actually getting better.
And that's what I don't see a lot of people looking at. It's: okay, we get more efficient, we can do more with less. But there's very little conversation about whether AI can actually do something better by more or less surfacing information from these very large systems, more efficiently, to the people who have to take decisions. I think that's way more the conversation that we should have. How can we help people take more informed decisions, while still optimizing the ones that we can automate? But the other part, I think, is very much neglected. In the same way, and I think that's an interesting shift right now, everybody was talking about development: help people be more efficient writing code with AI. Now we see the shift: "Hey, actually 80% of our time is spent on running and maintaining them. If we get even half the advantage in there, that has an order of four magnitude more impact."
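Editor's sketch: the idea of an assistant that challenges a proposed remediation ("the last time somebody tried this on the system, these were actually the side effects") can be reduced to a lookup against past actions. The Python below is a deliberately small illustration under stated assumptions. The `PastAction` record, the exact-match rule on action and target, and the example values are hypothetical, and a real system would draw on much richer observability data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PastAction:
    """A previously executed remediation and what was observed afterwards."""
    action: str
    target: str
    side_effects: tuple[str, ...]


def review_remediation(action: str, target: str,
                       history: list[PastAction]) -> list[str]:
    """Return side effects seen when this action last ran on this target."""
    warnings: list[str] = []
    for past in history:
        if past.action == action and past.target == target:
            warnings.extend(past.side_effects)
    return warnings


# Hypothetical history, for illustration only.
history = [
    PastAction("restart", "checkout-service", ("cache cold start", "brief 5xx spike")),
    PastAction("scale-up", "checkout-service", ()),
]
warnings = review_remediation("restart", "checkout-service", history)
```

An empty result does not mean the action is safe, only that nothing was recorded. The point is to put known side effects in front of the person before they approve the action, which can lengthen the decision but improve it.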
Rob Strechay
>> Right. Yeah. And I think that's exactly it. I also think there's a huge, not just skills gap, but lack of skilled people in that area of running the stuff. And it's a place that, honestly, most companies don't want to invest more money in. So if you can have an AI agent, or as I keep calling them, an AI buddy, to help those SREs really achieve more efficiency, that to me is really key. I mean, you guys have had AI in your product forever. When you look at the learnings you have internally from going through this process and then applying them out to customers, how do you see the things that customers are coming to you and asking about? It's like, yes, we understand that because we went through this process as we built our own software, and we're actually doing this as well, and this is why we think if you go down this path you can be successful.
Alois Reitbauer
>> Yeah, I think that's where we help them. Step number one is you obviously have to get a lot of the load off people so they can focus on other tasks. Our focus was really: okay, there is a task, it is very well understood, it can be easily automated, so there's no reason for somebody to do it. But from day one, we started to have explainability in there, so that somebody can ask: does this decision actually make sense? Can I trust a decision that was made, and do I actually understand it? So it's not just "it's just creation, it happened to be like this." Because you also want to have this feedback loop so that people learn how decisions are taken.
Also, I think the output quality and the context you can provide become even more vital right now as we start to connect AI systems. In our case, we take our predictive and causal AI, and now the generative AI or AI agents that can take or propose actions. Suddenly that context becomes so important, because depending on how much context you provide, the quality of this action gets better. And that's where we see that they're interested. They want to go eventually to a zero-incident policy; that's what you hear from a lot of them, AI taking care of more.
I think once they're done with this... We will see it in three phases. First, it's remediation. If something's burning, you have to put out the fire. Step number zero: fix what's broken. Step two is going to preventative. We see more people going to preventative operations: ensure that the system wouldn't even fail, because you have predictive AI, you understand how a system might potentially fail and what might fail there, and you're taking it there. And then you move into optimization. I think it will up-level the entire industry, but so many people are stuck putting out fires on a day-by-day basis that they never get to this next level. It's almost like, I hate the word, but like a maturity curve to move from A to B to C, and that's where we start. We free up their time so that they can start to focus and even get strategic. If part of your infrastructure is down right now, or part of your applications are down right now, you're not thinking, how could I optimize my system to run 20% more effectively? And you never get there. And then we increase the pace, we increase the complexity, and that's where we need to help them.
Rob Strechay
>> I totally agree. And again, it's always fun to talk with you about these things because we actually see it very much alike, but you're talking to these customers all the time. We're talking to organizations. Last question here. When we're together, either in Amsterdam or a year from now in Salt Lake City, what do you think we're going to be talking about?
Alois Reitbauer
>> I think the topics won't change necessarily. We will see more people being in production. I think there will be, again, a shift throughout this year. It's going to be interesting what it's going to be. We will also talk, I think, a bit more about how we get production feedback to developers in AI. That's a big topic. It's really the first technology that's so heavily focused on production data. It's so hard to test. It's almost the shift left, but give them the data from the right. How can we do this? I think that's something we'll be talking about.
And I hope we'll see the maturity in the industry, on the coding side but also the operations side, where people don't talk about the vision and what can be built, but where we will start to hear from the first people: okay, this is what we built. This worked really well. We tried this. We are not yet there. So there will be some, okay, we had this great vision; it didn't work. So some more reality check for people. I think it's also about the maturity, how the industry will go, what can actually be done. People are thinking more long term. We see people, even on our side, going more into this auto-remediation. We have some who have it in production for moderately simple use cases. We have very few customers who have very advanced use cases, but we will learn as we go how we want to get there. We will see more companies adopting it. We should have more conversations about what people have learned versus how they anticipate going in that direction.
Rob Strechay
>> Yeah. I can't wait to be talking about this again with you. Alois, you're awesome to talk to about this stuff.
Alois Reitbauer
>> Thank you.
Rob Strechay
>> Because I think it's moving so fast, and I think this value is key. And I love that you brought up the explainability, which has been absolutely not around in this conference this entire week, and it's hugely important. So thank you again for coming on board.
Alois Reitbauer
>> Thank you for having me.
Rob Strechay
>> Yep. And thank you for watching KubeCon, CloudNativeCon North America 2025 from Atlanta. We're done here, but you can go back and watch all the episodes on theCUBE.net and on YouTube. Stay tuned for more from theCUBE, the leader in tech analysis and news.