Dave Vellante and John Furrier discuss TensorWave, a GPU cloud provider specializing in AI infrastructure. The company partners with AMD for optimized AI workloads, positioning itself against larger cloud providers like AWS, Google, and Microsoft through a tailored approach that emphasizes performance and cost-effectiveness. TensorWave rents AI compute infrastructure with customization for a better user experience, and is currently seeking financial backing for expansion and global growth while serving startups, enterprises, and hyperscale customers.
>> Hi everyone, welcome back to Media Week, NYSE Wired and theCUBE's exclusive coverage of cyber and AI innovators. My name is Dave Vellante and John Furrier is my co-host. This week we've got three days of wall-to-wall coverage. Super excited to have Darrick Horton, who's the CEO of TensorWave. Welcome to theCUBE NYSE. What do you think of this? Pretty nice, huh?
Darrick Horton
>> Yes. It's great to be here.
Dave Vellante
>> Yeah. Well, thank you. Tell us about TensorWave. You guys are at the forefront of AI. You're partnering with AMD, everybody's NVIDIA crazy, but you're partnering with AMD. I know you'll do some stuff with NVIDIA as well, but tell us about the company.
Darrick Horton
>> Yes, so TensorWave is a GPU cloud provider focused on building infrastructure for AI, AI workloads like training and inference, for companies of all sizes. So we focus on building that infrastructure, optimizing it, and making sure that the end customers that are utilizing it have the best possible experience. And we went with AMD for a bunch of reasons that we can get into here if you'd like.
Dave Vellante
>> Yeah, I'm interested. But before we get there, how is it that a company like yours can compete with these giant hyperscalers? And you get this question all the time, AWS, Google, Microsoft, they're insurmountable. How do you compete with them?
Darrick Horton
>> So the fun thing is in this industry, we actually don't consider the hyperscalers to be competition. And that's because we operate in very different markets. So the hyperscalers, they do everything, every type of compute, every type of cloud, every service. What that means is they're not really the best at anything. They're a good place to go if you need a bunch of little things. But if you need the best, most optimized experience, you don't go to the hyperscalers for that. And also because of their business model, they're very expensive. So they cater to companies that have to use them for compliance reasons or for other reasons. And so from a price perspective, they're not competitive. And from a performance perspective, they're also not competitive.
Dave Vellante
>> This is ironic, and here's why. In the early days, the cloud was ostensibly less expensive or better than traditional IT. Now traditional IT has gotten much better. You are best of breed and you're more cost-effective. The question is why?
Darrick Horton
>> It's because of focus. That's all it boils down to. The hyperscalers do everything, so they're not good at anything. We do one thing and we do that one thing better than anyone else. It's because of the focus. We're able to put all of our resources into deploying one type of compute and optimizing the whole stack, so that the end customer gets the best performance, best cost, and best experience.
Dave Vellante
>> Give me a 101 on why AI workloads are different than general-purpose workloads.
Darrick Horton
>> Sure. So AI workloads are very compute-intensive, and there's a special type of math happening behind the scenes for almost all AI workloads, known as matrix multiplication. And it just so happens that GPUs, the things we use for graphics, are very good at matrix multiplication. And so early on, AI workloads started using GPUs to go faster, and then GPUs were tweaked into these AI purpose-built accelerators that are really good at just that thing. And so that thing is really what's going on in these AI workloads, and in these data centers that are built for AI, everything is optimized for just that thing. Whereas in a traditional data center and cloud, you have a lot more things going on. You have CPUs and you have storage, and you have those things in AI as well, but to differing degrees. AI is really about the GPU compute for matrix multiplication.
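To make Horton's point concrete, here is a minimal sketch in Python (NumPy chosen purely for illustration) of the matrix multiplication at the heart of a single neural-network layer; a large model chains thousands of these per token, and this is exactly the operation GPUs and AI accelerators are built to run fast.

```python
# Minimal sketch: the matrix multiplication at the core of AI workloads.
# A single neural-network layer is essentially "activations @ weights";
# GPUs and purpose-built accelerators exist to run this one operation fast.
import numpy as np

batch, d_in, d_out = 32, 4096, 4096                    # illustrative sizes
x = np.random.randn(batch, d_in).astype(np.float32)    # input activations
W = np.random.randn(d_in, d_out).astype(np.float32)    # one layer's weights

y = x @ W        # one matmul; a large model chains thousands of these per token
print(y.shape)   # (32, 4096)
```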
Dave Vellante
>> It's interesting, we had an analyst, he's our analyst emeritus now, his name is David Floyer. Before the term accelerated computing came out, he called it matrix computing for the reasons that you just mentioned. I've also had people tell me that you need GPUs to do inferencing-
Darrick Horton
>> That's right....
Dave Vellante
>> because it's pretty simple math, matrix math, but there's a lot of it.
Darrick Horton
>> That's right.
Dave Vellante
>> But others have said, oh no, I can use x86 to do inference. So there's some debate there. Where do you weigh in?
Darrick Horton
>> You can use x86 for inference. So on the inference side, things are a little more flexible. GPUs are still going to give you the best performance for most workloads, but CPUs exist, edge devices exist, FPGA-based devices exist, and ASIC-based, purpose-built devices also exist. And there's a spectrum here, there's trade-offs at every level on the inference side in terms of cost, efficiency, power consumption, and just raw performance. And so for most workloads in the inference space, GPUs are still king. But if you need to get the best performance inside of a drone, you can't put a GPU in a drone. And so you might select a different type of device.
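As a rough illustration of the trade-off Horton describes, the sketch below (layer sizes and iteration count are arbitrary assumptions) times inference-style matmuls on a plain CPU with NumPy; it runs fine, just nowhere near accelerator speed.

```python
# Sketch: inference-style math on a plain x86 CPU. It works; a GPU is just faster.
import time
import numpy as np

x = np.random.randn(1, 4096).astype(np.float32)        # a single-token activation
W = np.random.randn(4096, 4096).astype(np.float32)     # one layer's weights

start = time.perf_counter()
for _ in range(100):                 # ~100 layer-sized matmuls, a toy forward pass
    x = np.tanh(x @ W)               # matmul plus a cheap nonlinearity
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"CPU forward pass: {elapsed_ms:.1f} ms")  # fine for light inference workloads
```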
Dave Vellante
>> Okay. So your business model essentially is renting AI compute to your clients.
Darrick Horton
>> That's right.
Dave Vellante
>> Tell us a little bit about the business model and a little bit about who the customers are.
Darrick Horton
>> Absolutely. So fundamentally at the core, our business model is infrastructure as a service. So we build the infrastructure, optimize the physical layer, build the data center, stock it full of GPUs, and then lease that out to companies. But how we do that depends a little bit on the workload and the specific requirements of the company. We are a bit more hands-on, white glove, bespoke than a lot of our GPU cloud competitors, because we like to ensure that the customers have the best possible experience. And so that means we might tweak the model a little bit for certain customers, but generally we're deploying this infrastructure, we're leasing it out to them, and then we'll have varying levels of abstraction on top of that. A very sophisticated customer with their own software might just want raw access to the hardware, and we can facilitate that, but other customers might want some abstraction on top of that, maybe a software layer or a platform or an API that makes it easier for them to consume those resources in an efficient manner. So they're getting the best bang for their buck and having a good experience utilizing the hardware effectively.
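The varying levels of abstraction Horton mentions could look something like the following sketch; the tier names and the provisioning function are entirely hypothetical, not TensorWave's actual product or API.

```python
# Hypothetical sketch of the abstraction tiers described: the same GPU capacity
# exposed at different levels depending on customer sophistication.
# Tier names and the function below are illustrative, not TensorWave's actual API.
from dataclasses import dataclass

@dataclass
class Lease:
    gpus: int
    tier: str            # "bare_metal" | "managed_cluster" | "inference_api"

def provision(lease: Lease) -> str:
    if lease.tier == "bare_metal":                     # sophisticated customers
        return f"raw access to {lease.gpus} GPUs"
    if lease.tier == "managed_cluster":                # platform layer on top
        return f"scheduler endpoint over {lease.gpus} GPUs"
    return f"metered API backed by {lease.gpus} GPUs"  # highest abstraction

print(provision(Lease(gpus=512, tier="bare_metal")))
```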
Dave Vellante
>> So that abstraction is additional value-add that you provide-
Darrick Horton
>> Exactly....
Dave Vellante
>> and is part of the business model, obviously. Why AMD?
Darrick Horton
>> Why AMD? That's an excellent question. It really goes down to the fundamentals of why we started this company. If you look at a bit over a year ago when the company was founded, we noticed a problem in the AI space, and that was that there was a monopoly with NVIDIA. Every AI end user was using NVIDIA, and they had to use NVIDIA. Regardless of whether NVIDIA was what they wanted to use, they had to use NVIDIA, it was the only solution in town. And so the end customer, they could buy from NVIDIA, or they could buy from NVIDIA Cloud one or NVIDIA Cloud two, and that's the extent of their choice. And that struck us as an opportunity. It's a problem that needs to be solved, it's unsustainable, it has to be addressed at some point. And so we set out to figure out, can we address this? Can we bring a viable alternative to the market? And critically, in order for a viable alternative to be successful in this space, it has to check a lot of different boxes. It has to be cost-effective, it has to be available, it has to be scalable, and it has to be easy to use. It has to work. And that's probably the most critical of all. There've been a number of projects and hardware solutions over the years that on paper have excellent performance, but if they're not easy to use, you're not going to get market traction. So when we did the research, is there any solution out there that checks those boxes? The thing that was closest was AMD. And it so happens that our team has a long history of working with AMD. And so we knew all the right people to contact and to work with at AMD to make this happen. And so we reached out and went at it. And really, AMD is very well aligned with us on all of the goals, in terms of bringing a viable alternative to market that gives people choice when they're looking for compute for their AI workloads. They're very committed to open source, open frameworks, and allowing everybody to have access, instead of NVIDIA's model, which is to build a walled garden and not let anybody in. AMD is the polar opposite of that. So we're very well aligned in terms of vision and values.
Dave Vellante
>> So you're building data centers?
Darrick Horton
>> That's right.
Dave Vellante
>> Okay. And it's a very capital intensive business.
Darrick Horton
>> It is.
Dave Vellante
>> So help us understand how you're raising capital, how much money you've raised, and how you're deploying it.
Darrick Horton
>> Yes. So we did our seed round earlier this year. The seed round was $44 million.
Dave Vellante
>> Nice seed.
Darrick Horton
>> Yeah. That was all on SAFEs. So I think we might've set a record for the most money raised on a SAFE, but that was really just to get us going. And we built out our first couple of data centers with that and built out the team. And right now we're going for the Series A, which is mid-nine figures. But in this industry, generally what happens is you do a priced round and then alongside that, you get a large amount of collateralized debt financing, because you have so many physical resources and you have contracts backing those physical resources. So it's a very debt-friendly business. And so that's really where the majority of the capital comes from, these large-scale collateralized debt financings.
Dave Vellante
>> So your Series A will price the round, is that right?
Darrick Horton
>> That's right.
Dave Vellante
>> The SAFEs convert, and the SAFE investors will get that price?
Darrick Horton
>> That's right.
Dave Vellante
>> Yeah. That's awesome.
Darrick Horton
>> We're excited.
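For readers unfamiliar with the mechanics in this exchange, here is a hypothetical sketch of how a SAFE converts at a priced round; the share price and discount below are illustrative assumptions, not TensorWave's actual terms.

```python
# Hypothetical SAFE conversion at a priced round. The $44M figure is from the
# interview; the share price and discount are illustrative assumptions only.
safe_investment = 44_000_000       # seed raised on SAFEs
series_a_price = 10.00             # hypothetical Series A price per share, USD
discount = 0.20                    # a common SAFE discount term; pure assumption

conversion_price = series_a_price * (1 - discount)
shares = safe_investment / conversion_price
print(f"{shares:,.0f} shares at ${conversion_price:.2f}")  # 5,500,000 shares at $8.00
```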
Dave Vellante
>> Very exciting. How do you see this market shaping up? So who are your customers today? Are they doing training? Are they doing inference? Are they doing a combination? How do you see these markets shaping up? I'm particularly interested in enterprise AI here at the New York Stock Exchange.
Darrick Horton
>> Yeah, enterprise is the big one, but there are several sectors here and I can break them down. So it's really three groups. The smallest group is the startups. These are small contracts, companies that maybe raised their pre-seed round. They have a little bit of money, they need some initial compute, but those contracts are very small. We do engage with those companies, but it's really more of a marketing effort than anything. And then in the middle, you have enterprise, and these are mid-market companies that have budgets, but they might not have the top-tier AI talent. And then on the far end of the spectrum, you have hyperscale. And these are very, very large contracts, however, there's only a small handful of customers that fit into this category. Big, big AI houses, the most popular names, and the hyperscale clouds themselves fall into that category. And that's building large clusters of hundreds of thousands of GPUs. So you have these three categories, very broadly speaking, and within those categories, you have inference, you have fine-tuning, and you have pre-training. Those are the three big workloads. And so you have three big workloads spread across these three different market segments.
Dave Vellante
>> So you said inference, fine-tuning?
Darrick Horton
>> And training, like pre-training.
Dave Vellante
>> Yeah, it's pre-training. Right, right, right. Okay. The training inference-
Darrick Horton
>> Fundamental training....
Dave Vellante
>> training, fine-tuning and then inference, right? Okay.
Darrick Horton
>> Exactly. Now, training and fine-tuning have started to bleed into each other. So pre-training is the foundational step, and then after that you do smaller-scale training, fine-tuning, and then eventually inference. So bigger customers tend to have all of those: they need pre-training, they need training, and they need inference. But as you move down towards smaller customers, pre-training tends to get filtered out, and then at some point, fine-tuning gets filtered out, generally speaking. And that's because pre-training and fine-tuning are incredibly expensive, whereas with inference, you can get in for a much lower sum. Pre-training anything competitive today requires tens of thousands or hundreds of thousands of GPUs. And so you're talking hundreds of millions or several billion in CapEx, and even if you're only renting those for six months, it's still a significant sum. And so not that many companies can afford to do pre-training, whereas a lot of companies can afford to do fine-tuning, and every company can afford to do inference. You also have another driving factor, which is what customers will need in the long run. There's a lot of pre-training happening now, and it has been happening over the past couple of years. But the more we pre-train these models, especially open-source models, the more everyone has access to them. And they might want to fine-tune them a little bit, but really what they want to do is run them. And so inference is really the workload that scales, inference scales infinitely. Every single company on this planet needs inference, whether they know it or not, they all need inference, whereas a smaller fraction of them need pre-training. So while we do engage in all of these workloads, we see inference being the thing that really scales.
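Horton's CapEx point is easy to ballpark. In the sketch below, the per-GPU purchase price and hourly rental rate are illustrative assumptions; only the GPU count and the six-month window come from his remarks.

```python
# Back-of-the-envelope on why pre-training is so expensive. GPU count and the
# six-month window are from the interview; prices are illustrative assumptions.
gpus = 100_000                     # "tens of thousands or hundreds of thousands"
price_per_gpu = 25_000             # hypothetical accelerator price, USD
capex = gpus * price_per_gpu
print(f"Buy:  ${capex / 1e9:.1f}B")      # Buy:  $2.5B -> "several billion in CapEx"

rate_per_gpu_hour = 2.00           # hypothetical rental rate, USD per GPU-hour
hours = 6 * 30 * 24                # six months of round-the-clock rental
rental = gpus * rate_per_gpu_hour * hours
print(f"Rent: ${rental / 1e6:.0f}M")     # Rent: $864M -> still a significant sum
```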
Dave Vellante
>> I want to ask you about energy and building data centers. So you need power.
Darrick Horton
>> That's right.
Dave Vellante
>> How are you handling that problem? Are you building your own data centers? Are you partnering with co-los? How are you solving that problem?
Darrick Horton
>> It's a little bit of all of the above. We started with co-los. It's more cost-effective initially, but it's clear for any cloud company that you have to build your own data centers as you grow. And so that's always been on our roadmap, and we are transitioning into that as we speak. So today we have co-located data centers all over the US, but going into next year, we're looking at a few things: really building out globally and building out very large-scale facilities in the US. So today our facilities are tens of megawatts, 50 megawatts ballpark individually, but we have under contract sites that can do over a gigawatt. And so that's really where things are going, especially with these larger hyperscale deals. Every single hyperscaler out there is trying to get their hands on gigawatt-class data centers, which is an insane shift, because just three years ago a 15 megawatt data center was state of the art, and now we're talking about data centers 20 times that being not big enough, not dense enough. And so we love to see it. It's a lot of fun keeping up with this pace. But we do have over a gigawatt now under contract, and that's something that our competitors cannot say.
Dave Vellante
>> We were at Supercomputing in Atlanta. I don't know if you were there.
Darrick Horton
>> I was there, yeah.
Dave Vellante
>> And so I was struck by how many liquid cooling companies there are in the industry now. We had some on theCUBE, and we actually had a hose manufacturer on theCUBE, they were called Omni Services. We were like, "What do you guys do?" They make hoses to deliver liquid cooling. And they're talking about connection integrity. So what's your take on cooling, liquid cooling, direct liquid cooling, hybrid cooling? I'd love to get your thoughts on two-phase if you're-
Darrick Horton
>> I would love to talk about it.
Dave Vellante
>> Let's get into it. Great.
Darrick Horton
>> Well, liquid cooling is necessary. It's happening. We use it almost exclusively now. The current generation of GPUs is the last generation that it will be possible to air cool. This generation, most people are doing liquid cooling. You can still technically get away with air cooling, but next generation, not possible, completely infeasible, because the power density of the chips is getting so high. And so AI has forced a lot of really good innovations in this space, one of them being liquid cooling, others being on the energy side and the density side. But liquid cooling is really the key enabler for getting these higher density chips. And the reason behind that is when you're building these clusters of GPUs, especially for training workloads, the physical space that they take up is actually really important. It's ideal if you can keep everything as close as possible, primarily for networking reasons. And so that means you have to be dense. And so a rack used to be 10 kW, and then maybe 20 kW per rack if you're really pushing it. That was state of the art just a few years ago with air cooling. And today, with today's tech, you can easily do 120 up to 200 kW per rack. So we're talking 10x from just a few years ago. And then the trajectory over the next couple of years is to go towards one-megawatt racks, a single rack that's 1,000 kW. And for reference, a general house might use 10 kW. And so you have 100 homes' worth of power in one rack.
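The density numbers Horton quotes are worth working through. A quick sanity check, using only figures from the conversation:

```python
# Rack power density, using only the figures quoted in the conversation.
old_rack_kw = 10           # state of the art a few years ago (maybe 20 if pushed)
today_rack_kw = 200        # "easily do 120 up to 200 kW per rack"
future_rack_kw = 1_000     # the one-megawatt racks on the near-term trajectory
house_kw = 10              # rough draw of a typical home

print(today_rack_kw / old_rack_kw)   # 20.0 -> the ~10-20x jump in a few years
print(future_rack_kw / house_kw)     # 100.0 -> "100 homes worth of power in one rack"
```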
Dave Vellante
>> One rack, wow.
Darrick Horton
>> That's what we're talking about here. But liquid cooling is absolutely required. You can't even get anywhere close to those densities on air, because air is actually a pretty bad conductor of heat. So you've got to get a little more creative. And then technologies like immersion, which have other more attractive heat transfer properties, dual-phase or single-phase immersion, are also being looked at a little more intensely now, although direct-to-chip liquid cooling is what's clearly winning at the moment.
Dave Vellante
>> And in fact, I was at an AI lab that Dell has, and they're actually using warm water and a combination. It's hybrid. They're able to tune the fans. Although at Supercompute we had a panel, and one of the brains behind two-phase was really pushing dual-phase, saying it's more efficient. Do you have thoughts on that?
Darrick Horton
>> Some people are pushing two-phase, combining the two, two-phase and direct-to-chip, which is really interesting technology, I think that has promise. It's essentially combining the benefits of the evaporative cooling that you get with two-phase with the relative simplicity of a direct-to-chip design, versus an immersion two-phase design, which is incredibly complex and comes with-
Dave Vellante
>> Which the GPU manufacturers are not warranting-
Darrick Horton
>> Exactly....
Dave Vellante
>> the immersion today in the world. All right. All right. We got the bell.
Darrick Horton
>> We got the bell.
Dave Vellante
>> It's a special treat when they ring the bell and we're live on the program. So it's like you win a prize.
Darrick Horton
>> Yeah.
Dave Vellante
>> There it is. All right. Fantastic. So this happens twice a day, and then of course the options exchange, they close in another half hour, they'll ring the bell as well. That's good.
Darrick Horton
>> Very loud.
Dave Vellante
>> That's good. It's like a special moment at the New York Stock Exchange. Well, Darrick, phenomenal having you on. What's next for you guys? What should we be looking for?
Darrick Horton
>> Our series A and scaling out globally and across the US with these larger scale data centers. That's really what we're focused on. So 2025 will be a big year for us as we scale into the hundreds of thousands of GPUs.
Dave Vellante
>> Yeah, well it looks like, well, supposedly Elon's going to try to get there in the first quarter.
Darrick Horton
>> That's right.
Dave Vellante
>> We'll see. We're going to test the scaling laws to see if they hold.
Darrick Horton
>> That's right. We're very excited to see.
Dave Vellante
>> I hope they do.
Darrick Horton
>> Yeah, likewise.
Dave Vellante
>> Darrick, thanks so much. Really appreciate your time.
Darrick Horton
>> Thanks for having me.
Dave Vellante
>> And thank you for watching this episode. We're here for three days of wall-to-wall coverage, NYSE Wired and theCUBE community. This is our Cyber and AI Innovators coverage. Keep it right there. Dave Vellante for John Furrier, we'll be right back after this short break.