Jago Macleod, Gari Singh, Google & Kate Holterhoff, RedMonk

In this KubeCon + CloudNativeCon North America segment, theCUBE’s Savannah Peterson sits down with Jago Macleod and Gari Singh from Google and analyst Kate Holterhoff from RedMonk for a fast-paced look at how GKE is scaling to meet AI demand. Singh explains how Google doubled a reference cluster from 65,000 to 130,000 nodes in a year for massive AI training jobs that can require 130,000 GPUs, and what it really takes for the control plane to schedule, start and communicate across clusters of that size. Macleod details how Google moved internal control-plane state from etcd to Spanner for massive scale, and how new Kubernetes capabilities like Dynamic Resource Allocation, in-place pod resizing, Vertical Pod Autoscaling and improved cluster autoscaling are helping customers run AI on Kubernetes and manage Kubernetes with AI. The conversation also explores how hardware limits and efficiency are reshaping cloud-native design, from power and cooling innovations seen at Supercomputing to squeezing more capacity into every data center. Holterhoff shares how Kubernetes, AI conformance efforts and projects like OpenTelemetry (OTel) are coming together to support AI agents and complex workflows with strong community backing and observability. Looking ahead, Macleod points to a future of millions of accelerators on Kubernetes clusters and better “graceful degradation” as systems hit scale ceilings, while Singh envisions true platform agents that can auto-size and reshape pods so developers simply deploy and let the platform optimize.