In this interview during theCUBE's coverage of AWS re:Invent, Eswar Bala, director of container engineering at AWS, joins theCUBE’s John Furrier to discuss the strategic shift toward "container native" computing. Bala explains that engineering teams currently spend nearly 70% of their time on infrastructure management rather than business applications, a dynamic AWS aims to reverse through new application-centric capabilities. The discussion breaks down major announcements designed to minimize operational overhead, including Amazon ECS Express Mode for one-click production deployments and the new Amazon EKS Capabilities suite, which features managed Argo CD and AWS Controllers for Kubernetes (ACK) to streamline GitOps and resource orchestration.
The conversation also explores the massive surge in AI and agentic workloads running on containers, with Bala revealing that the number of GPU instances managed by Kubernetes on AWS has doubled compared to last year. He details how AWS is optimizing infrastructure for this AI boom through EKS Auto Mode, which automates GPU provisioning and right-sizing to maximize utilization and performance. Bala also highlights the introduction of 100,000-node ultra clusters for foundation model builders and the integration of Amazon Q into ECS and EKS, enabling developers to troubleshoot complex operational events in minutes rather than days.
Empowering Developers: Leveraging Container-Native Concepts and AWS Services for Rapid Production Deployment and Reduced Infrastructure Management
Leveraging AI and Cloud-Native Technologies: The Rise of Kubernetes and Containerization for GPU-Intensive Workloads
>> Hello, I'm John Furrier with theCUBE here in Seattle, Washington for a preview of AWS re:Invent, getting all the latest updates from Amazon Web Services. Eswar Bala is here. He is the Director of Container Engineering at AWS. Eswar, thanks for coming on theCUBE. Good to see you. You've got the hottest area; containers are the cloud-native keys to the kingdom.
Eswar Bala
>> Right.
John Furrier
>> You got the AI world connecting in with cloud native. What does that mean in your mind? How do you see the cloud native world, which has been leveling up and now bringing all the greatness of the cloud? KubeCon just happened, theCUBE was there, you're seeing a lot more innovation.
Eswar Bala
>> Yeah.
John Furrier
>> Kubernetes is getting boring, that's a good thing. Just like Linux is boring, it's very reliable, everyone uses it.
Eswar Bala
>> Right.
John Furrier
>> Containers still are super important in the development process.
Eswar Bala
>> Right. John, first, thanks for having me and thanks for visiting us here in Seattle. Let's talk about container native first, right? We heard cloud native. What we believe is that container native is an application-centric view of what we want to do for developers. Container native is all about enabling developers to take an idea to production with minimal effort. If you think about where effort gets spent today, it's not just the business applications, it's also setting up the infrastructure: scaling, security, compliance, all of these things. What if we handled all of that in our container services, allowing developers to focus on their applications? That's the view we are taking here at AWS. And if you talk to engineering teams, they tell us that 70% of their time is actually spent on infrastructure management. Only 30% is spent on the business applications. That's the problem we want to go tackle. We've made several strides into that story, and there's a reason why 65% of net-new AWS customers come to Amazon ECS. And on EKS, sophisticated customers run tens of millions of clusters annually, right? So what are we doing there? First, I want to highlight what's happening in ECS. The first one is Amazon ECS Express Mode. With a single click, a customer can take a container and get to a production application. We manage infrastructure, we manage domains, we manage networking. We also launched Amazon ECS Managed Instances, which is all about giving infrastructure flexibility to customers while providing a Fargate-like experience: we take on operations for patching, scaling and security, all of that is handled by us. And most importantly, we are launching something for Kubernetes customers that we are calling Amazon EKS Capabilities.
And there are three features I want to highlight. The first one makes it easy for customers to go from code committed into Git to deploying into Kubernetes clusters: we are announcing managed Argo CD from the community. The second thing we are doing is letting customers manage their AWS resources using Kubernetes APIs. That's ACK, AWS Controllers for Kubernetes, and we are managing it for customers. And finally, we are announcing managed kro, the Kube Resource Orchestrator, allowing customers to compose really complex resources from primitives, and we are going to manage that for customers as well. With that, customers can define what their web application looks like, templatize it and standardize it across their orgs, and now every developer in the org can take it and run it, right? They don't have to worry about what a web service means for their organization.
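To make the ACK pattern Bala describes concrete, here is a minimal sketch of declaring an AWS resource through the Kubernetes API. The resource and bucket names are illustrative, and it assumes the ACK S3 controller is installed in the cluster:

```yaml
# Declare an S3 bucket as a Kubernetes object; the ACK S3 controller
# reconciles it into a real AWS resource.
apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: demo-app-assets          # illustrative Kubernetes object name
spec:
  name: demo-app-assets-bucket   # actual S3 bucket name (must be globally unique)
```

Applying a manifest like this with `kubectl apply` lets the same GitOps pipeline that deploys the application (for example, the managed Argo CD) also provision its AWS dependencies.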
John Furrier
>> So unification between Kubernetes and AWS.
Eswar Bala
>> That's precisely right.
John Furrier
>> On the news, what's available? Can you talk about the announcements? Will there be previews? What's GA? Is it GA?
Eswar Bala
>> Yeah, so the EKS Capabilities we are launching in December at re:Invent, and they're going to be GA at launch. Everything else I mentioned, like ECS Express Mode, is launching today, and that's GA already.
John Furrier
>> And in terms of the Kubernetes ecosystem, do you see that growing? And how do you see that evolving? Because that is a key, has been a key growth vector for containers, and then orchestrating across?
Eswar Bala
>> Correct. And we are clearly seeing a shift: customers using containers for AI in the Kubernetes ecosystem.
John Furrier
>> In which way?
Eswar Bala
>> For example, customers run sophisticated agentic workloads on Kubernetes. They run multimodal inference on Kubernetes, and they also run GPU-intensive batch workloads on Kubernetes. All of these put together drive a really strong agentic AI story for our customers. And to put this growth into perspective, there are 2x more weekly GPU instances managed by Kubernetes on AWS compared to last year. That is the staggering amount of growth we are seeing on AWS. I want to highlight three key areas we are investing in in this space. The first one is infrastructure automation for AI workloads, and that's where EKS Auto Mode comes into the picture. EKS Auto Mode handles GPU provisioning, picking the right instance for your workloads, right-sizing the workload for the instance, and making sure that scaling happens automatically and really fast. That's our first story. The second one is large-scale operations. For example, we announced 100,000-node EKS ultra clusters for Kubernetes customers. We see foundation model builders using that feature, and we see lots of inference workloads running on these hundred-thousand-node clusters. And most importantly, we are focused on operational intelligence, so we are announcing integration with Amazon Q in both ECS and EKS. Using Q, developers can troubleshoot their operational events in a matter of hours or even minutes, instead of the days they would otherwise spend going through runbooks, collecting information and synthesizing it together.
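The GPU scheduling Bala describes starts with workloads declaring what they need so the cluster can provision to match. A minimal sketch of a Kubernetes pod requesting a GPU follows; the pod name and container image are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-worker               # illustrative name
spec:
  containers:
    - name: model-server
      image: registry.example.com/llm-server:latest   # illustrative image
      resources:
        limits:
          nvidia.com/gpu: 1   # schedule onto a node with one available GPU
```

Under EKS Auto Mode, a resource request like this is what drives the automatic selection and provisioning of an appropriate GPU instance type; without it, the pod would land on general-purpose compute.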
John Furrier
>> They've got the AI to help them there and Q help them there.
Eswar Bala
>> That's exactly right.
John Furrier
>> Talk about the things you create that are the undifferentiated heavy lifting, because with containers there's a lot involved that you've got to manage: observability, data and Q.
Eswar Bala
>> Right.
John Furrier
>> What is the big lift with the container native approach? What does that do for the infrastructure DevOps engineer or developer?
Eswar Bala
>> Correct. The first thing we want to do is the infrastructure automation piece. That's a place we've focused on very heavily over the last five years. Think about the launches we've done on the ECS side. Fargate was our first foray into that, where customers don't have to worry about managing infrastructure. We've translated that into EKS Auto Mode, and we've translated that into ECS Managed Instances as well. If you think about the AI side of the equation, we also announced hosted MCP server support. Customers who prefer to use their own tooling to troubleshoot issues can now use the MCP servers to get cluster context data and make really good, intelligent decisions. Because these are hosted, customers don't have to download anything, they get automatic updates because it's running on our side, and they get all the improvements we are evolving into these MCP servers.
John Furrier
>> The role of containers, I mean, obviously DevOps and infrastructure automation, the consequences of not getting it right are significant. When you're looking at compute, you mentioned GPUs. What are some of the things you see that are super important to get right? Because you're dealing with resources now, and tokens are involved in a lot of these workloads.
Eswar Bala
>> Right.
John Furrier
>> GPU underutilization has been a big topic.
Eswar Bala
>> That's right.
John Furrier
>> At Supercomputing.
Eswar Bala
>> That's right.
John Furrier
>> Recently. And so people want to maximize their throughput. They want to maximize their budget. How do containers fit in there because most people think, oh, containers, it's for developers and workloads, but now you've got consequences because you want performance.
Eswar Bala
>> Containers are a very natural fit. If you think about the problems in that space, it comes down to: how do I pick the right instance type or GPU type for my workload, and how do I right-size it? Number two, how do I make sure I can scale in and out really fast, given the dynamic nature of the system? And most importantly, how do I drive up utilization, as you're saying? The way we see customers approaching this problem is they try to amortize the GPU fleet across different kinds of workloads. It could be training and inference running in the same environment, for example, or inference and batch workloads running in the same environment, for which the system needs to be really dynamic in moving workloads around and fitting them where it makes sense. That's exactly what Karpenter and EKS Auto Mode are designed for, right? And we are seeing several GPU AI workloads running on EKS Auto Mode today because of that particular aspect.
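The amortization strategy Bala outlines, keeping a shared GPU fleet busy by packing mixed workloads onto it, maps to how Karpenter node pools are configured. A hedged sketch follows; the pool name and node class are illustrative, and the field names assume the Karpenter v1 API with a default EC2NodeClass already defined:

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: gpu-pool                       # illustrative name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                  # assumes a default EC2NodeClass exists
      requirements:
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["g", "p"]           # GPU instance families only
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand", "spot"]  # allow spot to stretch the budget
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # repack workloads to drive utilization
```

The consolidation policy is what lets the system move training, inference and batch pods around to fill underutilized GPU nodes, which is the utilization story discussed above.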
John Furrier
>> Just a personal question I want to ask you, because it's fun to watch cloud native. From day one, theCUBE was there when the CNCF was formed, with KubeCon before it was even part of the Linux Foundation, watching it mature. The role of containers has changed. Share your point of view on how mission-critical containers are. You just highlighted a few examples where the consequences of getting it wrong are quantifiable, and they're not good. We're talking about compute resources, we're talking about GPUs, we're talking about real performance.
Eswar Bala
>> Right.
John Furrier
>> How far along have containers come? Share your thoughts on the importance of the role of containers in this big-picture AI world.
Eswar Bala
>> Right. Let's talk about the community first, right? I think containers have been driven largely by the cloud-native community at the end of the day. Yes, we have ECS, and we have driven our own unique path for our AWS customers. But if you look at the community angle, lots of the orchestration capabilities we got right. We had an application-centric view, and that naturally translates into AI workloads as well. We didn't land here with all these features overnight; it took us over 10 years to evolve the system. When Kubernetes launched, it had nothing but support for web services, stateless services. It went through an evolution to stateful services. It went through an evolution to effectively use GPUs, with dynamic resource allocation in Kubernetes. It has taken all of those learnings, implemented in the community, to be here at the right place, where many foundation model builders now use Kubernetes to run their workloads.
John Furrier
>> I would say the AI workloads would fail if the Kubernetes ecosystem weren't as mature as it became through that open infrastructure wave.
Eswar Bala
>> Yeah, it definitely is a big enabler for the AI.
John Furrier
>> What's next? What has to happen next? Okay, this is happening. Container-native, seeing the AI stacks emerge around the models.
Eswar Bala
>> Right.
John Furrier
>> Model purpose-built, not purpose-built. Model-driven approaches, but still the infrastructure is expanding.
Eswar Bala
>> Correct.
John Furrier
>> What has to happen next?
Eswar Bala
>> I think the next step, obviously, is: we have all of these models, and we are expecting many customers to write applications that leverage these models, right? Now think about what you need in this world, where it's not GPU applications, it's still general-purpose compute applications, but they're using the models. Orchestrating that easily, spinning up the model and connecting it into the application, is one aspect. The second aspect is the agentic application model: you're going to have lots of agents working in tandem to achieve a result at the end of the day, and you want to run these agents in really sandboxed, isolated environments, right? There's a question of whether containers are the right boundary for those workloads, or whether we have to come up with a better isolation model with strict virtualization built in, so that we can protect against cross-tenant access.
John Furrier
>> And if you look at some of the things that are coming out of re:Invent, some of the trends, security runtime, the agents have a runtime. I mean, everything rhymes.
Eswar Bala
>> That's right.
John Furrier
>> It's coming together.
Eswar Bala
>> That's right.
John Furrier
>> Remember, generative AI is just, the generative process is a runtime.
Eswar Bala
>> Yep.
John Furrier
>> So it all comes together. Well, great to have you on. Appreciate it. We'll see you at re:Invent. Congratulations on the news. Obviously, we're going to air this after the news is out, but thanks for coming on. Appreciate it.
Eswar Bala
>> Thank you so much, John.
John Furrier
>> All right, containers continue to provide value for developers, and now infrastructure automation lines up perfectly with the AI infrastructure. I'm John Furrier with theCUBE, your host. Thanks for watching.