Senior Director of Product Management, Unstructured Data Solutions, Dell Technologies
Geeta discusses the challenges of AI data management, highlighting the importance of proper data handling. She mentions Dell's AI Factory and Data Lakehouse as solutions for customers. Different user segments are identified, including advanced users, CSPs, and early adopters exploring AI possibilities in various industries. Geeta emphasizes the opportunity for organizations to develop a strategic approach to data management in the AI era. Geeta and Dave stress the importance of selecting high-value data for AI applications. They discuss Dell's AI Factory as a...
>> Good afternoon HPC fans and welcome back to Atlanta, Georgia. We're midway through day two of the three days of coverage we've got here for you on theCUBE. My name's Savannah Peterson with my left-hand man, Dave Vellante.
Dave Vellante
>> Hey, I'm lefty, you know.
Savannah Peterson
>> I know you are lefty. I was born lefty actually, fun fact, and they tied my hand behind my back and turned me into a righty. So I do everything with my left hand except for write.
Dave Vellante
>> So are you quasi-ambidextrous, that's cool.
Savannah Peterson
>> Quasi, yeah. Yeah. Cool or just clumsy depending on the situation.
Dave Vellante
>> Just wacky.
Savannah Peterson
>> Yeah. Definitely wacky. I think we know that. Our next guest, however, is definitely not wacky. She was so great earlier this morning, we had to have Geeta back on. Geeta, thanks again for joining us this afternoon.
Geeta Vaghela
>> Thank you. I'm having so much fun with you guys. Thank you for having me.
Savannah Peterson
>> Well, exactly. It seems like we're now fan-girling on you, so it's great to have you back on. I'm really excited we can dive a little deeper into the storage story and into data management. I want to start really high level just in case folks are just tapping into this conversation. AI creates a really interesting data management problem.
Geeta Vaghela
>> It does.
Savannah Peterson
>> Can you break that down for us?
Geeta Vaghela
>> Yeah, I think it's sometimes the unseen problem because I think there's been all of this just massive focus on the compute. Can I get the GPUs? Can I get the compute? And there's been a lot of focus in that, and that's all goodness. That sort of was a requirement we needed to make sure we got through that. But there's this sort of afterthought which comes in, which is like, okay, now that I've got all this compute power, what do I feed it and am I actually going to use it in practice? And I think that's where we're seeing a lot of the data conversations pop up. So this event has been amazing, my first time at SuperComputing, but I have learned so much. Yeah, normally on my flight back, I can download all my thoughts. This one I need a couple of flights just to download my thoughts on what I've learned. Because the conversations are so broad and so deep and they vary so much. Everybody's got a slightly different position that they're coming from, whether it's academia or enterprise or folks that have been doing HPC for a long time or newbies and the government-
Savannah Peterson
>> Government's here too.
Geeta Vaghela
>> 100%. Right?
Savannah Peterson
>> Yeah.
Geeta Vaghela
>> There's all these different nuances and I think just having been in data and data management for a long time, all sorts of questions go off in my mind, which are, okay, what are your priorities and what are your hard data requirements? There's compliance and governance and all these things that many organizations, the government, have to comply to. And I think we're just now starting to scratch the surface on what does that mean? And how do you quickly get yourself ready so that you've got the data, these GPUs that are either on their way to you or have been delivered. So I think it's a really, really interesting time where people are asking hard questions about storage and data management. I think we're going back and we're reevaluating is it a new thing? Is it an evolution? Am I going to do this for a while and then go back? Should my investment be all in? Should I run it as a POC? Should I do it in the cloud? I mean, there's all of those questions and they're big questions when it comes to data because it means you're moving it around doing all sorts of things. So fantastic time. I don't know that any of us have the answer. But I can say I feel like being within Dell, we have access to so many people to talk to yourselves and we're all learning and it's good. We'll collectively figure this out.
Dave Vellante
>> I think about the classic data management problems. You've got to ingest it into a data lake or a database. And then you've got to analyze it. You've got to clean it. You've got to engineer it. You've got to data science it. You've got to bring in metadata. You've got to govern it. I mean, it's all these-
Savannah Peterson
>> There's a lot of handholding....
Dave Vellante
>> linear layers, and it's a real difficult process. And then once something comes out of it, you got to do it all over again because it's out of date.
Geeta Vaghela
>> Yeah. Exactly.
Dave Vellante
>> So how is AI sort of changing that? What are the problems that you're talking to customers about today and how is it different?
Geeta Vaghela
>> I think the problem is, as you said it, Dave, it's marrying all of that together. There's a lot of technology out there that does a piece of it really well. But if you kind of start to think about how do I do this in a repeatable fashion that I can stand behind, order, authenticate, all those things, it suddenly becomes a mix of tools that have to somehow work seamlessly really well together. Your metadata comes from one engine. You've got structured, unstructured, semi-structured data. They all have different sorts of paradigm shifts. So I think the biggest thing that I'm seeing is how do we use a selection of toolsets and bring them together such that we're providing a full workflow, kind of solves all of those problems, let me store it, let me ingest it, let me understand it, let me make some use out of it. But also do it in a way that the sum total gives a repeatable model that many of these companies can stand behind because otherwise we get into hallucinations and challenges around is this real? Is my data trustworthy? Is it clean? Everything that you're saying.
Dave Vellante
>> So, sorry. If I may?
Savannah Peterson
>> Yeah.
Dave Vellante
>> Is your approach then to have an opinionated, curated, capabilities solution for your customers?
Geeta Vaghela
>> Yeah, I mean, we've had the Dell AI Factory. We've launched over the last year additions to our portfolio within Dell. We've been doing storage a long time, we've evolved those capabilities. But also the Dell Data Lakehouse, which is the ability to do, what you highlighted, how do I pull out the metadata, get federated engines, run some compute against it to get that analytics out of it. So it really is a series of options within the portfolio. So our customers can choose based on where they are in their journey. They don't have to lift and shift and throw away what they've got. But, really just use a framework with the AI Factory and then use the pieces they need as they need them. And what Dell does is we validate those solutions, we've got a point of view on where the strengths are, what to consider. And for those that are early in the journey, there's professional service opportunities just to kind of, what questions should I even ask? Where should I start to even know where I should plug myself into this environment? So that's where we are.
Savannah Peterson
>> I think you just brought up a really good point. Dave and I were at Dell Tech World when they launched the AI Factory earlier this year. Really cool to see the evolution there. I mean, the factory is actually here on the floor, which is very cool. Lots of different components to that. I can imagine you're talking to a variety of different customers who are probably, and I don't mean to project, but I would imagine a little bit overwhelmed because you've got all these different sources of data, all these different opportunities and all these different toolkits. How do you start them down that decision tree so that they can have a shorter time to value and see that ROI quickly?
Geeta Vaghela
>> Yeah, it very much is that we've sort of developed a service offering that really allows us to get with our customers, or potential, and ask some quite rudimentary questions sometimes. And there's many light bulb moments like, oh, I hadn't considered that. And just like the thing we said, okay, how do you manage your metadata? What are your requirements? Are you obliged to certain GDPR? Or, how do you manage your PI? I mean, just some of those kinds of questions, which gets a lot of folks just thinking about, oh yeah, I've been doing it this way, but over time as I've evolved and maybe I did some POCs in the cloud and I went and did this over here in dev test. I've broken some of my paradigms, how do I build out this engine? So I think the biggest part is starting with those questions, getting a baseline. And then I think the fact that Dell's approach has been a very open one: plug into the open ecosystem. So if you're already doing Parquet... It's your choice. We're not really trying to mandate saying it's got to be our proprietary environment because someone's starting from somewhere and that somewhere comes with some history and there's been investments made. So I think that's the biggest thing that I'm seeing that Dell's got a starting point. As you get through some of those early questions, we'll usher our customers into, okay, here's some of the areas where you might want to dig a little bit further. Let us either introduce you to a strategic partner because we don't have it all and Dell's built a lot of strategic partnerships, or we think we have a product that helps you with that, would you like to POC? And then we're kind of going down the path of, we run labs in our own environments that a customer can log into, run a POC, get a feel for how it might work for them or on their own, run off with a POC and see how that works.
Dave Vellante
>> What do you find is working? Because what you described, it sounds great, but it's complicated.
Geeta Vaghela
>> Yeah.
Savannah Peterson
>> Choose your own adventure, but a super intellectual adventure with a great partner, obviously. But, yeah.
Geeta Vaghela
>> Yeah, yeah, yeah.
Dave Vellante
>> And so if I think about, okay, I've got data in the cloud, I've got metadata, I've got technical metadata, business metadata, and I've got to figure out now Iceberg, open Iceberg tables and how am I going to govern that stuff? And it's just this mishmash of stuff. Okay, great, you're going to have a curated stack. So how are people applying it? Are they taking small bites, trying to figure out ROI? And then what's that roadmap look like? Can it lead to sort of a bigger, more impactful outcome?
Geeta Vaghela
>> Yeah, yeah. I'm seeing sort of three segments. There's the really advanced users who have thought this through, they've got a very clear objective, and they're coming at it saying, I've understood what I'm trying to achieve out of this, I know what my end game is. And therefore I'm looking for particular questions to be answered, help me with that and I'm good to go. There's a set of, I'll call it, the CSPs and the enterprises, quite different markets when we think about AI, generative AI. The CSPs are still very focused on the higher performance. They're kind of building the LLMs, doing more of the training versus the inference. So some of those separations allow us to minimize the number of adventures you might want in your world because now you can kind of categorize, well, I'm really not getting into training. I'm not getting into these thousands of GPU counts. Help me understand an environment that roughly is of this size and this is how it's going to look for me. And then there's the really, really early conversations which is, I put in an order for my GPUs and haven't yet defined what success is for my business. I want to go try what this might do for me and understand where my roadblocks might be because I haven't yet looked at my data and I don't know what I've got. I haven't looked under the covers. So I'm sort of seeing a level of maturity that often infers where these customers might drop into this sort of pick your own adventure. And then also some have very clear this is success for me, others are still trying to figure out what the technology can do for them.
Dave Vellante
>> The third example is fire, ready, aim.
Savannah Peterson
>> Yeah. Right. Well, exactly.
Dave Vellante
>> Got me some GPUs.
Savannah Peterson
>> Yeah. And now how am I going to utilize them? We don't know. I'm curious if you're noticing any... Dell gets to work with so many different companies across verticals and industries. Are you seeing any trends that are vertical or industry specific in terms of who's ready to roll versus who's still learning?
Geeta Vaghela
>> I mean there's a couple of verticals that just do directly like media and entertainment, they're getting quite big into this. We're seeing a lot with, when we met earlier, kind of life sciences, companies in the medical region. Certainly there's that space. Seeing some work around the autonomous driving. So I mean there's a lot of the verticals are getting into it. My takeaway is that the horizontal of what AI is is pretty similar with the requirements that each of them are coming up with. Kind of comes down to a capacity, performance, resilience, metadata, insights. The sort of requirements tend to be a little bit more generic, at least for now, we may see them refine in the future. So my approach has been, okay, many of these verticals are going about it for their own outcome, but they have very similar challenges. And so the problems to solve are quite similar in nature.
Savannah Peterson
>> Interesting. I mean, we are all in it together and lots of collaboration. I love your candor when you say that we're all still figuring it out. I mean, the AI Factory has only been announced for about, what, the last six, seven months? Yeah, something like that. What do you see the evolution there being? Do you think we're going to continue to see... I mean there's a lot of tools out here right now. And one of the things I think that I'm always very impressed with Dell in terms of this offering is you're decreasing the cognitive load for our customers, folks looking for solutions and making that easier. Do you think the market is going to narrow down to a certain set of tools or do you think we're going to continue to see this broad swath?
Geeta Vaghela
>> I think we'll pendulum swing for a little bit. I always feel like when there's a new technology, everyone's trying to figure it out and we kind of go all in and we swing the other way and say everything must be. And I think the happy medium is somewhere in between because ultimately all of this has a result and a cost and it's like what is that worth to you? We talked earlier about is it going to replace humans? I mean, in my mind it's helping humans. It's going to make us all more productive, more efficient. And so I think it comes down to a question of what's it worth for you? What's the right optimizations you need to make to run this in production? Because running it in test dev is different before you kind of scale it out and make it all in. So I expect a little bit of thrashing around and then it'll settle somewhere in between. Like we've seen, I feel like, with every other technology. Cloud was another example of that in my mind.
Dave Vellante
>> When I think about the spectrum of maturity model that you laid out, I mean the cynic in me says data management has failed us. We built a $50 billion industry to do BI because we couldn't get the data out of the database, so we created this other layer. And so my question is, I forget what it was called, Gartner years ago had this thing, make it up, like old IT, new IT. Put a brick around old IT, and then go after the new IT and it's all cloud and it's all wonderful. How do you see this playing out? You've mentioned some of the big LLM vendors doing training, they have a clear indication of what they want. I almost feel like the CIO wants to say, this would never happen, but throw it all away and start over.
Savannah Peterson
>> I can't wait to see where you're going David.
Dave Vellante
>> Throw it all away and start over because this doesn't work. So either put a brick wall around it and service the old sort of business that way, but we have to reinvent the way in which we do data management and really make it AI native. How do you think about that possibility? I mean, we never throw away stuff in IT. I get that.
Savannah Peterson
>> Legacy exists there for a reason, yeah.
Dave Vellante
>> But I feel like that legacy is so broken even though it sort of worked and it's definitely advanced us, but it's never given us the promise of 360 customer view and simplicity. And now AI's making that promise-
Savannah Peterson
>> I very much agree with you.
Dave Vellante
>> And it seems like, wow, this actually may happen with data for the first time in our lives. So do you see organizations having that conversation, start with a clean sheet of paper? Or are they saying, no, we have to leverage those existing assets and figure out how to not make them an albatross?
Geeta Vaghela
>> The interesting thing that I found, Dave, is that a lot of customers weren't thinking about data. We talk about data management like everyone's been doing it for a long time, and I don't know that they have. I think people have been storing data for a long time and they have been doing what they felt was necessary. But data management takes a lens beyond infrastructure into data. What is it and where are you using it and why does it matter? And I don't know that everyone's been doing that. I've run across customers who haven't. So I think there is a section of the market, I have a hard time quantifying that, but there's a section of the market who are just getting into data management. So they are starting with a clean sheet of paper.
Dave Vellante
>> It's probably pretty big actually. It probably is a pretty big segment now that you mention it.
Geeta Vaghela
>> Right.
Savannah Peterson
>> Especially as we're taking data out of silos to combine it into larger systems and leverage supercomputing to architect the solutions of the future. A, we have way more data than ever before, B, and it's only going to continue to increase in velocity. But yeah, I think that's actually a good point is there is the opportunity to say, whoa, we didn't actually have a huge strategy for this at a high level, we've just been gathering it and hopefully keeping it secure in the meantime.
Geeta Vaghela
>> Right. Yeah, yeah.
Dave Vellante
>> Well, I mean I love Amazon, but Amazon would say, well, AI's all about the data and the data in the cloud is really good data, but it's a mess. Like I was saying, you got technical metadata, operational data, business metadata, it's just all over the place. You got 15 different data stores. And so, to your point, they're doing stuff with it at the very primitive level, but it's not solved to the point where you can say, okay, AI, here you go. We're ready to play.
Geeta Vaghela
>> Exactly. Exactly.
Dave Vellante
>> And that's really where people should start. I wouldn't start buying GPUs. I'd start with what data do we have? How is it adding business value? How can we drive differentiation in our business? Is our data actually ready? Is it governed? Is it secure? Start there, spend a year figuring that out. And then by then GPUs will be bigger, faster, better.
Geeta Vaghela
>> And that's exactly it. I mean, you know what I was sort of saying, there's sort of this questionnaire, almost a rudimentary questionnaire. That's what it's starting with. It's saying, what are you doing today? What are you trying to achieve? Where are you? Where does your data live today? And a lot of times it's just all data's not equal and I think that's sort of what in my mind is we've gone through this pendulum of I've got to store data, now I've got to make it super cost-efficient because I'm going to drive down my cost and my IT budget is constrained. Now it's about data value, but have I made the investments to know what data is valuable because all of it's probably not? And there's some investment or, I don't know, cleanup required in that space. So I think it's going through that journey. And ultimately in my mind, there's been this data set concept that's been resonating with me, which is not all your data, it's a set of data that's actually your pertinent information. That's what you want to feed the GPU, and that's your high value data because that's what you're actually getting the best results out of. So I think we're in that journey. But I agree with you. I think there's a lot of tools out there. I think there's many people starting from scratch because they've not thought about this in the same way before. But I also think it's going to be a great opportunity within the market to really refine all the tools that are out there and say, okay, let's stack them up and say against this criteria, which ones really meet it? And so I think you'll see a shift in the market around the most appropriate data management tools bubbling up and probably starting to become more mainstream and some of the others potentially just moving out. But I think it's a great opportunity for us to leverage this and really refine what does data... I mean data management, you ask three people, you get three different definitions.
Savannah Peterson
>> Absolutely. You just brought up such a good point I want to sit on for a second, and I think this is kind of distilling a little bit or settling down. But the quest for GPUs, everyone's excited about compute. The reality is not everyone needs to use every data point they have and train everything on these huge models. There's actually a lot of benefit, to your point, of curating and looking at what's the cleanest data we have, the best data we have, and what can actually inform decision making or research or innovation versus just saying, well, we've got all this stuff, we might as well just throw it in there like making a smoothie and put it all in the blender. It's not necessarily going to taste good or achieve results if it's the entire refrigerator.
Geeta Vaghela
>> Right.
Dave Vellante
>> That was the Hadoop mistake.
Savannah Peterson
>> Exactly.
Dave Vellante
>> that again.
Savannah Peterson
>> Totally. So I think that's one of the things that's so cool about what Dell's doing. We're, Dave and I, in particular very closely following the AI Factory is saying, okay, what are the components you need? What's going to actually work and what's actually going to tell you valuable information versus just spit something out? Which I think it's an interesting tension here where to your point, it's like do you just put a brick box around it or do you kind of go in there and selectively pluck out exactly what you need to make the magic happen?
Geeta Vaghela
>> Yeah, yeah.
Dave Vellante
>> I think you got to cherry-pick. I do.
Savannah Peterson
>> Yeah, I agree.
Dave Vellante
>> I don't think you can-
Savannah Peterson
>> It's too much....
Dave Vellante
>> take the old and say, okay, now become AI-
Savannah Peterson
>> And especially with all the new data coming in, there's no way to do that.
Dave Vellante
>> I mean, there certainly are ways to inject intelligence into that awful data pipeline to make it simpler. Data engineers, we say, oh, they complain. And they're right. They spend 80, 90% of their time wrangling data. AI can help with that. You know what's interesting to me, Geeta, is that you're having these conversations with customers. Dell five years ago wasn't having these conversations. You're now a strategic partner in many ways because you've got this massive portfolio, you've got this huge distribution channel.
Geeta Vaghela
>> That's right.
Dave Vellante
>> And you've gone from selling boxes to having these strategic discussions. You're still making money selling boxes, I get it.
Geeta Vaghela
>> Absolutely.
Dave Vellante
>> But you're having really interesting discussions across the ecosystem and with customers.
Geeta Vaghela
>> Yeah. And I think they're starting to get through the whole organization. So you mentioned the data scientists. Before we talked to the infrastructure folks, and now it's the data scientists or it's the DevOps engineers and they've got very different asks. Their perspective of life, their value, all of what they do is a completely different level of a conversation and the expectations on even the infrastructure because everything's kind of a block diagram, it builds on what's underneath it. I think it does really look to us to challenge ourselves on are we building the right staff? Are we serving the right persona within these environments to make AI real within their environments? Because it's no longer just the infrastructure people, it's a series of people that need to come together within an organization to make it real. So yeah, it's very exciting.
Savannah Peterson
>> We're all about making it real here on theCUBE, it's one of our favorite conversations to have. We were wondering if 2024 was going to be that year, a little bit. I think we're going to see a lot more in 2025. A lot in the inference world. I have one final question for you. If there's someone watching today looking at you, a brilliant woman here on this stage who's thinking about getting into our arena, who maybe some of the students like we were talking about, the researchers or you're a mathematician by trade, which is really spectacular. What would be your advice to that eager mind or maybe that minority mind who's thinking, oh gosh, this sounds so cool, but I'm just a little nervous and I don't know where to get started?
Geeta Vaghela
>> I think just jump in. I feel like curiosity is the best skill that anybody can have. Be curious, ask the questions. Don't be shy. I think the tech industry, I feel like it's evolving so much and there's so much that I'm a big believer in young women in STEM. I support multiple groups. I mentor. Dell has a set of offerings, these forums where we can help young women, Girls Who Game is one that I'm very passionate about. So I think there's so many opportunities, take advantage of them because if you don't get yourself out there you won't know. And we're learning together, so it's not like anybody's behind. But I think the curiosity, ask the questions, get out there. I think we've got so many curious minds, brilliant people out there in the world. Let's get everyone to use.
Savannah Peterson
>> Oh, beautifully stated.
Dave Vellante
>> A lot of brilliant people here I'll tell you.
Savannah Peterson
>> A lot of brilliant people. Just because we're all crunching numbers and thinking about complex things doesn't mean we're not a welcoming group of individuals.
Geeta Vaghela
>> Absolutely.
Savannah Peterson
>> I mean, we hugged after our last panel.
Geeta Vaghela
>> Yes, we did. We did.
Savannah Peterson
>> I think that matters. I think that's kind of the behind the scenes you don't always see here is you don't have to be perfect. In fact, the hyperscalers don't have it perfected yet either. We're all learning together. Geeta, thank you so much for spending so much of your day with us today. I feel like you're spoiling us with your attention and we're very grateful for that.
Geeta Vaghela
>> I appreciate it. Thank you.
Savannah Peterson
>> Dave, always a fantastic discourse.
Savannah Peterson
>> And thank all of you for tuning in wherever you might be. We're here in Atlanta, Georgia, midway through day two of SuperComputing 2024. My name's Savannah Peterson. You're watching theCUBE, the leading source for enterprise tech news.