Exploring Advanced Technologies in AI with Azeez Bhavnagarwala of Metis Microsystems
Azeez Bhavnagarwala, Chief Executive Officer of Metis Microsystems, joins theCUBE's John Furrier during NYSE Wired Robotics & AI Media Week to explore innovative circuit technologies aimed at addressing today's semiconductor challenges. This session delves into critical advancements at the intersection of AI and energy efficiency.
In this video, Bhavnagarwala shares expertise on CMOS (Complementary Metal-Oxide-Semiconductor) memory technology and its pivotal...
>> Welcome back, everyone. I'm John Furrier, host of theCUBE at our New York Stock Exchange CUBE Studio on the East Coast. It's our access point, our subnet in New York. Of course, we've got our Palo Alto access point connecting tech and Wall Street, bringing two communities together with NYSE Wired and an open network of experts, a mixture of experts. We're here for the robotics week coverage and AI leaders with Azeez Bhavnagarwala, CEO of Metis Microsystems. The fundamental, primary technology in all of this is CMOS memory. We all know what it is if you're in technology. Azeez, as CEO you've got all the intellectual property, you've got the keys to the kingdom. I mean, you're in everything.
Azeez Bhavnagarwala
>> Yep. Yeah. So, Metis Microsystems develops circuit technologies that we believe can be a significant solution to what is widely identified as the grand challenge of semiconductors today: energy efficiency. As you probably know, AI workloads are getting bigger, AI models are getting bigger and more complex, and they are demanding a 10 to 12X increase in computations per year, which translates into larger GPU clusters. In fact, a million-GPU cluster is coming up very soon, which will consume well over a gigawatt. So, that's the energy-efficiency scaling that is required to minimize total energy for compute in the data center. And that's the circuit technology we are developing to enable CMOS memories, which are the bottleneck, to deliver much higher energy efficiencies for the entire compute system. That's really what we do. And->> And the company, just so people know, you don't make the CMOS. You have the intellectual property, but you work on software, because you're enabling others to build it into their systems, right? Is that right?
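The gigawatt figure for a million-GPU cluster checks out with back-of-the-envelope arithmetic. The per-GPU draw and overhead factor below are illustrative assumptions, not numbers from the interview:

```python
# Rough cluster-power estimate. The 1.2 kW per-GPU figure and the 1.5x
# overhead (cooling, networking, power delivery) are assumed illustrative
# values; this is order-of-magnitude arithmetic only.
def cluster_power_watts(num_gpus, watts_per_gpu=1200.0, overhead=1.5):
    """Total facility power for a GPU cluster, in watts."""
    return num_gpus * watts_per_gpu * overhead

p = cluster_power_watts(1_000_000)
print(f"{p / 1e9:.1f} GW")  # a million-GPU cluster lands well over a gigawatt
```

Even with more conservative per-GPU assumptions, a million accelerators sits comfortably above the gigawatt mark cited above.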
Azeez Bhavnagarwala
>> Yep. So, our business model: we are a small company, so we are flexible. It's much easier to develop the circuit technologies and then build compilers that we can license to customers. But there are customers that would prefer to own the design itself, so we would just license the patents to them. And then there are customers who would rather have us build the memories for a CMOS platform, and we can do that too.>> So I've got to ask you, because first of all, I want to get that out there: you have IP, but other people use the technology, and CMOS is in everything. We see cameras here; it's in devices, sensors, anything with circuitry, CMOS is somewhat involved. If you had a PC, you'd boot up and CMOS was always the last hope for things. People remember those days, but it is fundamental technology. It's been a major driver as semiconductors have become a big part of the AI infrastructure. The biggest topic this week, obviously, is robotics and AI. But if you look at the AI industry with agents, the AI infrastructure layer below it is under massive acceleration. You mentioned NVIDIA; gigawatts of power are required to power it. But now you're starting to see the chips get configured in a way that people can innovate around, not just having a monster GPU. So, you start to see a lot of engineering around the semiconductor side for these large systems or clusters. It's not just yesterday's server; it's servers, plural, with fabrics coming together. How do you accelerate that mission? What are some of the conversations and technologies people should know about that are super important?
Azeez Bhavnagarwala
>> Right. So, one of the biggest challenges for very high-performance accelerators, GPUs in the data center, is heat removal. That is a consequence of the industry being unable to scale operating voltages since voltage scaling ended about 20 years ago. Power density goes up as the square of the scaling factor, which puts a hard limit on how fast you can clock processor chips. So, processor chips in the last 20 years have not increased their clock frequency much; they're pretty much in the three to five gigahertz range, so power density is a big limitation. When we improve the energy efficiency of components that are large contributors to the total power dissipation of a chip, we are actually enabling the chips to run at higher clock frequencies and at lower power densities. That is really the key for Metis Microsystems: enabling much higher energy efficiency as well as higher performance so these AI models->> So, energy at scale.
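The "square of the scaling factor" point can be sketched with the textbook first-order dynamic-power model (the CV²f form below is a standard assumption I'm supplying, not a figure from the interview). Shrink linear dimensions by a factor s: capacitance per gate falls ~1/s, achievable clock rises ~s, and gates per unit area rise ~s². With supply voltage held fixed, power density grows as s²:

```python
# First-order dynamic power density: P/A ~ (C per gate) * V^2 * f * (gates per area).
# Textbook scaling assumptions, for illustration only.
def power_density_ratio(s, voltage_ratio=1.0):
    cap_per_gate = 1.0 / s        # capacitance shrinks with dimensions
    clock = s                     # historical device speedup
    gates_per_area = s ** 2       # density gain from the shrink
    return cap_per_gate * (voltage_ratio ** 2) * clock * gates_per_area

print(power_density_ratio(2.0))           # voltage fixed: density grows as s^2
print(power_density_ratio(2.0, 1 / 2.0))  # ideal voltage scaling: density flat
```

With voltage scaling stalled, each generation compounds the heat-removal problem, which is why clocks have been stuck in the 3-5 GHz range he mentions.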
Azeez Bhavnagarwala
>> Energy at scale, yep.>> Those are two of the keys. I mean, that's all everyone's talking about.
Azeez Bhavnagarwala
>> Yep.>> So, what specifically goes on there? What's the big secret sauce?
Azeez Bhavnagarwala
>> So, if you look at NVIDIA's GPU publications from several years ago, 70% of the energy of the average multiply-accumulate operation is consumed by just the register files that store the private context of each concurrently executing thread. And if you look at edge AI accelerators, again in numbers published by NVIDIA, 47% of the total chip power is consumed by CMOS memories. So if we build more energy-efficient CMOS memories, we can substantially impact total system power dissipation and energy efficiency. That's really where Metis Microsystems' IP comes in. We are claiming, and our test chips are in progress, that we can double the performance at a minimum, and we can also reduce the total energy consumption by as much as 80%, which is very hard. If you talk to chip designers, it's very hard to get the active energy down, and very hard to improve the speed, given the limitations of the technology, especially variability and leakage, and many of the challenges that have come up in recent years.>> There's been so much innovation, just photonics on chip. You're starting to see a lot more silicon advancements. But in everything I've seen in the past two years, whether it's from Broadcom or NVIDIA or other people, the memory bottlenecks are always the big discussions. I was talking to somebody, and I still don't think SSDs are a capacity tier on the chip yet. You have high-bandwidth memory as a big strategic part. You've got TPUs, GPUs. So, people are testing how to configure the memory around the processors. How does CMOS change that?
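The two fractions quoted above combine Amdahl-style: if CMOS memories draw 47% of total chip power and their energy drops by 80%, the whole chip saves roughly 37.6%. The combination is my arithmetic on his quoted figures, not a number from the interview:

```python
# Amdahl-style savings arithmetic on the fractions quoted above:
# memories are ~47% of edge-accelerator chip power, and the claimed
# memory-energy reduction is ~80%.
def total_power_reduction(memory_fraction, memory_savings):
    """Fraction of total chip power removed by improving only the memories."""
    return memory_fraction * memory_savings

r = total_power_reduction(0.47, 0.80)
print(f"{r:.1%} of total chip power")
```

The same arithmetic also shows the ceiling: improving memories alone can never save more than the memory fraction itself, which is why he frames memories as the bottleneck for the whole system.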
Azeez Bhavnagarwala
>> Yeah, so one very good example is Cerebras. So Cerebras builds this wafer-sized, sorry, mousepad-sized chip.>> Cerebras?
Azeez Bhavnagarwala
>> Cerebras. Yeah.>> Yeah, okay. We know those guys.
Azeez Bhavnagarwala
>> And that's entirely SRAM, entirely CMOS memory; they don't even have any DRAM in their system. 40-plus gigabytes of just SRAM. CMOS SRAM enables much higher bandwidths on chip and much lower latencies. As a result, no surprise, Cerebras is the fastest AI inferencing hardware there is today. So these foundational IP components, CMOS memories, are critical, and really are the bottleneck to getting to higher performance as well as higher energy efficiency.>> You like Cerebras?
Azeez Bhavnagarwala
>> Oh, it's the best thing that has happened for AI hardware. It's the fastest inferencing engine. It's->> Yeah, when new things happen, you always have naysayers. I love Cerebras; I think they're really innovative. I've interviewed Andrew Feldman, Julie Choi, the team over there. If you look at the naysayers, Intel has their own problems, but some former Intel people were like, "They'll never get the yields." They defied the skeptics. And I bring this up because it works, because we're seeing new architectures using existing things like CMOS. I was talking to an engineer at Broadcom about the role of copper. Now it's like, use copper everywhere. Now Ethernet is changing how it's deployed. So, all these new ways to handle it can change old conventional wisdom.
Azeez Bhavnagarwala
>> Yep, yep. And we have a lot of circuits in today's chips which, believe it or not, are relics from a long time ago. We are still using circuits that were developed in the era of pagers, over 30 years ago, and many of these are actually the bottlenecks within CMOS memories as well. So yeah, the need for innovation is really there across the board.>> So, what's the latest version of CMOS? 2.0? 3.0?
Azeez Bhavnagarwala
>> No, three nanometer is in production; customers are building on three nanometer. Two nanometer is also, I think, in risk production, depending on which fab you talk to. TSMC and Intel, I think they both have two nanometer, and Samsung too. Two nanometer has backside power delivery, which makes a lot more tracks available for signals at the higher levels of the chip. So it's all exciting stuff, but the challenges are there too in these advanced technologies, specifically with heat removal, because the footprint of these circuits is smaller with stacked gate devices, and it's really a challenge to remove heat.>> And that's why you mentioned energy and scale earlier. You mentioned SRAM when we talked about Cerebras. CMOS working with, say, SRAM and photonics, for instance, these are heterogeneous but related technologies. You see them all working together?
Azeez Bhavnagarwala
>> Yeah. So, optical links between server boards and between chips on a board have already been used for a long time. Now we have these 3D heterogeneously integrated packages where you can integrate very different chips across different functions and different costs, and deliver a much lower-cost system with very high performance too. So, yeah.>> Azeez, great to have you on here, on the mixture-of-experts series, part of the robotics and AI leaders series. It's been great. What are you most excited about right now? Obviously, the AI wave is bringing back a resurgence. I won't say semiconductors ever really went anywhere, but I'd call it a double resurgence of semiconductor, infrastructure, engineering. There's some really cool stuff going on right now. What are you excited about?
Azeez Bhavnagarwala
>> So obviously, we're all excited about the opportunities these make available for people in technology and at the system level. However, there's a very significant concern. The world produces only about 5 × 10^20 joules per year, and that amount of energy does not increase by much more than half a percentage point annually. But the energy consumed by AI hardware is increasing exponentially, and the expectation is that if we do nothing, we will not have enough energy available for every use. So a lot of these AI clusters are going to private energy sources such as small nuclear reactors. And that's a great solution, because nuclear reactors can be engineered to be very safe. But the real problem with the nuclear reactor solution is that as we scale the number of reactors, because this is an exponentially growing demand for energy from data centers and from the edge as well, the mean time to catastrophic failure will drop. And that makes it a->> Dangerous....
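The mismatch he describes is the usual story of exponential demand against near-flat supply. The sketch below uses his ~5 × 10^20 J/yr supply and ~0.5%/yr supply growth; the starting AI share and the 2x-per-year demand growth are assumptions for illustration only:

```python
# Compounding supply vs. demand. Supply figures follow the remarks above;
# the initial AI demand (1% of world supply) and 2x/yr demand growth are
# illustrative assumptions, not interview figures.
def years_until_demand_exceeds_supply(supply=5e20, supply_growth=0.005,
                                      ai_demand=5e18, demand_growth=2.0):
    years = 0
    while ai_demand < supply and years < 200:
        supply *= 1 + supply_growth      # near-flat world energy production
        ai_demand *= demand_growth       # exponential AI energy demand
        years += 1
    return years

print(years_until_demand_exceeds_supply())
```

Under these assumptions the curves cross within a decade, which is the point of his argument: the only sustainable lever is making the compute itself more energy efficient.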
Azeez Bhavnagarwala
>> very risky solution. So in my opinion, the best solution for AI hardware systems to meet the energy demands of AI models is to develop energy-efficient compute solutions, which are sustainable, where we can scale the energy consumption as quickly as the demands for additional computations increase.>> That is a huge issue, because there's limited energy. Solar's out there. That's the sun, but still you've got to build... That's not forever. I mean, that's short term at best. But it's going to come down to the engineering and the materials science, computer science. Awesome. Azeez, thanks for coming on theCUBE. Appreciate it.
Azeez Bhavnagarwala
>> And you're welcome.>> Great commentary, getting into CMOS and what it means to the real world. What it means is that energy efficiency and value on the engineering side are going to be super important as the demand for AI comes. You're going to see a lot more innovations from heterogeneous components, but also, again, deep hardcore technology. I'm John Furrier from theCUBE. Thanks for watching.