Dave Vellante and Sanjeev Mohan interviewed Shayde Christian, the chief data and analytics officer at Cloudera. Shayde discussed his role in delivering services to the company's executive staff and overseeing AI-related services internally. He highlighted Cloudera's evolution beyond being a Hadoop company and emphasized the importance of reliable and responsible data for AI initiatives. The conversation touched on the challenges of implementing gen AI effectively and the value of bringing AI to the data rather than moving data for energy requirements. Shayde ...
>> Hi, everybody. Welcome back to CDOIQ from
Cambridge, Massachusetts. My name is Dave Vellante.
I'm here with Sanjeev Mohan. My cohost Paul Gillin
will be in house tomorrow. This is our eighth CDOIQ, the 18th event. We've done well over 100
interviews. We've got 150 guests. We're really excited to
have Shayde Christian, who's the chief data and
analytics officer at Cloudera. Shayde, thanks for coming
on with me and Sanjeev.
Shayde Christian
>> I'm privileged to be here. Am I your eighth or 18th
interview of the day? >> I think seventh maybe.
- Seventh. We're getting there.
Shayde Christian
>> We've got a ways to go now
to match the body of work >> that we've created here. Chief data and analytics officer. I'm waiting for AI to get into that title. We've been talking all day about those CDO and the AI roles coming together, but talk a little bit about your role, the scope of that role.
Shayde Christian
>> Yeah. I'm in charge of data and analytics internally,
so I deliver services to the executive staff
in their organizations. I'm also in charge of AI as it relates to that service delivery internally. We're a technological company, and so we have different AI
factions within our product development headed by Dipto, our CPO. >> Ah, okay. You're a practitioner?
Shayde Christian
>> In part, and then because
I utilize our products and services and my team utilizes them, there's an external facing
function here to evangelize and talk to other CDOs
about how to do this. >> I'm sure the sales team wants to drag you all over the world.
Shayde Christian
>> Marketing and everyone. >> Best practice. theCUBE actually started in Cloudera's offices. Did you know that? >> Oh, no way. Where?
- Oh, my God. >> Right next to the old
Fry's. You remember that?
Shayde Christian
>> Yeah.
- Electronics. >> It was next to Box. >> John Furrier put out a tweet saying- >> In Palo Alto. >> Yeah. In Palo Alto. >> Saying, "Hey, I'm trying to find a place
to cohabitate." I said, "Well, why don't
you hang out at Cloudera?" We did a quid pro quo. We said, "We'll create a cube
studio inside of Cloudera in exchange for the space." We did. That's how theCUBE got started, and then we had one on
the East Coast, too.
Shayde Christian
>> Am I part of the exchange?
You're still paying this off? >> No. You guys kicked us out
when you needed more space, but point is back then Cloudera started the big data movement. You were associated at a time with Hadoop. You hired Doug Cutting, who
was the founder of Hadoop. But you've evolved
dramatically since that. Obviously, merged with Hortonworks. What's the update on
Cloudera? What are you up to?
Shayde Christian
>> Yeah. We're no longer a Hadoop company. That's one block in our
huge pyramid of innovation. We've had a lot of innovation
since, as Sanjeev knows, where analytics at the edge, multi-cloud. We have accelerators for AI that let you bootstrap these
foundational models very quickly, but none of it really matters. Those are just blocks in our stone. It really matters what we're delivering in terms of business value. To trust your AI, you
have to trust your data. Your data has to be reliable,
has to be responsible, and these are Cloudera's
superpowers from the time theCUBE began until now. >> Sanjeev, you put a link
into the notes here on this Goldman report, which essentially,
I'll just read the title. You can search. It's a report that's kind of negative on AI generally,
but it's gen AI specific. Gen AI, too much spend, too
little benefit. I don't know. It's maybe a little controversial. Maybe it gets some good
eyeballs, but I'm not surprised. We just wrote as well,
it's still elusive, that ROI. Takes time. It's not
some magic technology. But what are you seeing
with customers in terms of how they're applying gen AI,
and what are you advising them?
Shayde Christian
>> AI in general, our customers have been
employing AI in Cloudera for some time and are
achieving massive amounts of business value. This was a little bit
more specific on gen AI, and I understand the point. If you need to develop a
landscape of Nvidia architecture to build your own LLMs to do A, B, and C, then you're probably going to overspend and underdeliver. >> Big denominator.
Shayde Christian
>> But there are so many ways to employ gen AI cost-effectively
and for good use cases. Just even a small one. I was
part of watching a group. One developer for two months, 99.99% automation of contract summarization and
managing contract clauses, and you do that by bringing
the models to the data. You bootstrap them where you
can train them internally, and that group trained them on CPUs. There are very cost-effective
ways to do this. Foundational models are
becoming commodities, just like your compute and storage. At any phase of these kind
of revolutions, of course, you're going to overspend
and underdeliver. But like any technology, this
is going to come together. >> I also want to make a point. You're bringing AI to data, so having data under your
management is really important. One thing that Cloudera does
not talk enough about is that Cloudera, actually
even today, has one of the highest amounts of
data under its management, which is 25 exabytes. It's actually far more than
many of the big companies that we talk about on the show.
Shayde Christian
>> That's 25% of the world's
data. That's enormous. Yeah. We're not good at being- >> Wait, Cloudera software
touches that 25%?
Shayde Christian
>> Our platforms host 25% of the world's data. Massive. >> Irrespective of physical
location, obviously. >> Correct.
- That's interesting. >> As analysts, we hear a lot
the repetition of things like,
"We're going to bring AI to the data."
Shayde Christian
>> Correct. >> We've heard Jensen say that
at a variety of conferences. We've said it. I've said it. Howie Xu. I don't know if you know
Howie. He's an AI guy, former security companies,
Palo Alto and Zscaler. He came out with a contrarian view. He said the energy requirement
is going to be so high.
Shayde Christian
>> That is true.
- You're actually going to have to bring the data into where the energy is, and that's going to be in the cloud. Now, that's contrary.
It's very contrarian. We produce the gen AI power law. We talk about it all the time. It's a power law, but it basically says,
like a lot of power laws, you're going to have a long
tail across the enterprise. The vertical axis is model size, the horizontal axis is domain specificity. You're going to have a
lot of on-prem activity. You guys are multi-cloud. You
don't care whether it's in the cloud or on-prem, but what
are you seeing in terms of domain specific AI generally, but specifically gen AI
in terms of being adopted where the data lives? Are people moving data
into where there's energy? >> Energy. Yeah.
- Because people hate to move data. >> Can I say something? I completely agree.
Shayde Christian
>> The energy requirements
of AI are astronomical, and we know that, but then
if you read The Economist, according to The Economist,
the energy problem of the entire world is
soon to be resolved, and that is through solar. >> That's what Elon says.
Shayde Christian
>> Actually, what's happening
is China, in order to get out of the economic malaise they're
in, they've oversupplied. They have overabundant
factories that they've built, so they're literally dumping solar cells, which are very efficient these
days, all over the world. That's their way to get- >> Ray Kurzweil says the
same thing, by the way. Exponential growth. He says, "If you do the
exponential curve," he goes, "This problem will be solved." >> Correct. >> That's interesting. Anyway. >> Given that you have a broad observation space
across all that data and all those customers, are you seeing the domain specific AI pop up and produce results?
Shayde Christian
>> For us, what we look at is true hybrid. Hybrid to me is something
where you have data and apps in multiple locations. True hybrid, I want to differentiate. It's somewhere where you have
a harmony in your multiplicity of environments where the
environments are working cohesively together to allow you to democratize data effectively
and to generate insights. We're the only true hybrid
company in my opinion. The tenets of true hybrid
really are about interoperable, portable data services. I want to build a workload one time and I want to run it anywhere
in the world on any platform, and it works instead of
having to rewrite it in AWS and Azure on premises. Open data lakehouses are
a big tenet of true hybrid. Having a central command
and control for security and governance is very important. To me, a practitioner having
the full data lifecycle of tools available to you in integrated
fashion is another tenet of true hybrid. Yes, I agree with you. We're already doing a
great job at Cloudera of bringing the models to the data, but you've got to take the
data to the models as well. We're looking at that true hybrid harmony of multiplicative environments working together as a way to accomplish it. >> How do you harmonize that? Is it an abstraction layer that you apply across those different estates and different models, or how do you make that opaque to the customer? I mean
that in a good way.
Shayde Christian
>> Common form factor.
You're on one form factor. It doesn't matter what the form factor is. You have common code is one thing. Another thing is you don't
have to rewrite software. We all had a lot of pain when
we migrated to the cloud, because we had on-premise software, and now we had to acclimate to
our AWS store or Azure store. We're doing it at the level
of software and form factor. Common code for form factor and same code. You write it once, you run it anywhere, and there's no amount of rewriting. Optimization is always
going to have to happen. If you're going to go to the
cloud and save on compute and storage, you got to
optimize your workloads. >> One of the ways that
Cloudera has done it, from what I've seen, is that they
very quickly got into S3-compatible storage,
and now with Iceberg. Actually, Cloudera adopted Iceberg way before it was even a thing. That's how they're standardizing, how they're writing applications. Then after that, what
they do is they take these applications, they dockerize
it into a container, and then they can run it on prem so you don't have to rewrite it. That's how you get this hybrid benefit.
Shayde Christian
>> Love it. - What's your take
on the whole trend toward open table formats? Everybody wants now to be able
to control their own data, bring any engine. May the best engine win is the mantra now. How do you think about that? How are customers thinking about that? How do you think about
it as a practitioner? How do you think about
governing that open data? Sounds like a free for all. How do you balance the need to reduce lock-in, control
your own data with it's got to be governed and I got
to generate business value?
Shayde Christian
>> We're highly against vendor lock-in. That's why we're almost
purely open source. Our SDX components, governance and security, have some wrappers
that we provide as well, but our offerings don't even
lock you into us as a vendor. Open is still very big with us. >> What about the
governance aspect of that? How will customers solve
the governance problem while at the same time preserving that openness? How do
you think about that?
Shayde Christian
>> I don't think I can sugarcoat
this. I've been surprised. I've been attending a lot of
conferences since 2021, and before gen AI exploded, I'm still being asked the
same questions from stage and panels, which is what is
your enterprise data strategy, and how do you do data governance well? I think what's wonderful
about our platforms is that we provide common
data and governance. Same way to do it on any form factor, but it's just the starting point. Governance is really, really hard work and we can make that uniform, but nobody can really make that simple. >> How should we think
about where you add value? I'm talking about Cloudera
as a vendor now, not so much, Shayde, as a practitioner, but I'm interested in that as well. Oracle would say it's in the DBMS. They're not relinquishing that value. We've talked about, and I
don't know if you buy this, but it seems to me that the DBMS in terms of the source of
control, it seems to be moving to the data catalog. But that's not where the value is. It's all being open sourced. >> Correct.
- Where is the value equation now?
Is it the ability to
harmonize the different data that's out there? Is it the ability to work multiple tool chains at the same time? How do you see that?
Shayde Christian
>> Data is everywhere. How
do you break down the silos? How do you get it all
connected in together with a common set of tools
to govern, secure, manage, process, collect, and analyze it? To me, that is the common
element, getting commonality and harmony with one
place to do everything, and you can take it anywhere
in the world you want to go with it, including this discussion
we've had about bringing models and data to each other. I still think this true hybrid, and part of that true hybrid really is modern data architecture. It's about data fabric and data mesh. I just still think that's it. I hate to give you the
same answer twice, Dave. I feel you deserve a second. >> It's all good. >> Dave, I have a point of view on this. If you look at what's
going on in our industry with these rapid changes that are going on with new standards and
new ways of doing things, there are really very
few moats that are left. Technology is not a moat anymore, because if you're working on Iceberg and catalogs are open source,
then what is your moat? There are only a few moats. One moat is capital and skills, because OpenAI and all these companies. >> Cloud vendors have the balance sheets. >> Yeah. Cloud vendors have the
money. But the biggest moat
from the technology sector, in which
Cloudera does really well, and so do hyperscalers,
is an integrated stack. The companies that have
an integrated stack that brings together structured
and unstructured data. Also, batch data, streaming data with any compute engine on top. That compute engine could be
Spark, it could be Pandas, it could be SQL, DuckDB,
or it could be an LLM and you could do RAG
and vector embeddings. Then on top of that, you build your data products or AI products. That to me is the moat. >> Interesting. It's one
thing to have the tools, because I would argue
that AWS has the tools, but that stack is not integrated. It's just their tools in the toolbox that you have to then integrate.
Shayde Christian
>> Correct.
- If you're going to be multi-cloud,
Shayde Christian
>> they certainly don't integrate with Azure, and that's what I meant about this harmony and the true hybrid nature of it. You want one place to
do everything no matter where your data is. What you're saying is exactly true, but once I get to that
level, I want to be able to do it once and do it everywhere. To me, that's the secret.
Shayde Christian
>> You have unified
governance irrespective of where the data resides
and what format it's in. That unification and integration, in fact,
the pendulum has shifted. It used to be modern data
stack where we had hundreds and thousands of products
that we were dealing with. The pendulum has now shifted
to a more centralized way, because what businesses
are saying is that, "We don't have the time or the budget to stitch
together different things. We want the vendor to
provide us a data fabric or some prebuilt,
pre-integrated components."
Shayde Christian
>> We rewrote our applications
to go on the cloud
and then we learned it
was more expensive there. Now, we're repatriating. If you're not on that common data platform with common tools, that's so
much energy wasted and redoing. I want to do it one time.
I want to do it one time. >> The other bromide
that we hear is you got to get your data house in
order before you can do AI. It makes sense, but how
much of your data house? If you have to get your
entire data estate in order, you'll never finish painting
the Golden Gate Bridge. How do you think about
that as a practitioner and as an advisor to your customers? Can they get part of
their data house in order to solve a specific business problem, or do they actually have
to at least have a strategy to get the entire data estate
in order before doing it? >> I like how you ended that sentence.
Shayde Christian
>> I like the strategy forward, but yes, it all depends on the use case. There's an insurer who looks at training AI models to recognize from satellite
images how devastated an insured home is after
a natural disaster. Partially devastated, somewhat
devastated, fully devastated. The AI models offer that to an individual, say, "Do you agree? " An ACH payment is cut the next day to repair the cost of
that home or a new home. For that, you need external data. You don't need to
aggregate all of your data. But there's a biopharma
company who's trying to significantly cut down
the duration of time it takes to get lifesaving cures
into the marketplace. In that super complex, hugely valuable use case, you need a lot of data to be aggregated. It really depends on what
you're trying to accomplish, but there's a lot of low hanging fruit. >> All right. We got to
leave it there, Shayde. Thanks so much. theCUBE. >> Appreciate you. Thank you.
- It's good to have you. >> Thank you so much for coming. >> Thank you. All right. Keep
it right there, everybody. >> We'll be right back. Dave
Vellante, Sanjeev Mohan.
You're watching theCUBE's coverage of CDOIQ from Cambridge, Massachusetts. The 18th annual CDOIQ. Be right back.