Sunghyun Park, Rebellions
Sunghyun Park of Rebellions.ai joins theCUBE Research and NYSE Wired to examine how purpose-built, memory-centric silicon enables always-on, real-time artificial intelligence inference in next-generation data centers. Park outlines Rebellions' approach to optimizing inference workloads for energy and cost efficiency, describes the company's rack-level system design and networking, and highlights commercial deployments with Korean telecommunication operators that validate the architecture. Production deployments with SK Telecom and Korea Telecom, Park explains, demonstrate high API throughput and improved operational efficiency. Park argues that the market is shifting from training to inference, where performance per watt and per dollar matter most, and emphasizes that memory-centric architectures, photonics-enabled networking and rack-level co-design are critical to lowering total cost of ownership (TCO). Lower TCO, in turn, enables diverse deployments in cloud, edge and on-device environments. Rebellions reports production collaborations with SK Hynix and Samsung Foundry on silicon and manufacturing roadmaps, and Park describes how rack-scale co-design and networking choices reduce power consumption and operating expense. The discussion also covers open-source software stacks such as PyTorch for inference optimization, as well as practical factors to consider when deploying AI infrastructure at scale.