Can Hadoop get past enterprise-grade roadblocks?
by Marlene Den Bleyker | Apr 14, 2016
The keynote for day two of Hadoop Summit in Dublin, Ireland, exhibited a high level of technical information with speakers from Yahoo, BMC Software and Hewlett-Packard Enterprise (HPE). Each of the speakers spoke about their company’s work and contributions to the Hadoop ecosystem.
The first speaker was Sumeet Singh, senior director of products for cloud and Big Data platforms at Yahoo!, Inc. He spoke about Yahoo's many years of using the Hadoop platform and how the company has come to rely on it to push the boundaries of its capabilities.
“There is a framework we call CaffeOnSpark, a phenomenal framework to advance deep learning on existing Hadoop or Spark clusters,” Singh said. He explained that CaffeOnSpark turns existing Hadoop and Spark clusters into a very powerful platform for deep learning without the need to set up a separate cluster or move data back and forth between clusters.
The platform provides server-to-server direct communication that speeds up learning and offers the ability to fully distribute the learning without scalability issues. CaffeOnSpark also supports incremental learning on top of saved models. The open-source project was released under an Apache license last month.
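CaffeOnSpark itself runs on real Caffe and Spark deployments, but the incremental-learning idea Singh described, resuming training from a saved model rather than starting from scratch, can be sketched with a toy gradient-descent model in plain Python (all names, data and the model here are illustrative, not part of CaffeOnSpark):

```python
import json
import os
import tempfile

def train(w, data, lr=0.01, epochs=50):
    # One-parameter least squares via gradient descent: y ~ w * x.
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

# Initial training run on one batch of data (true slope is 2.0).
data = [(x, 2.0 * x) for x in range(1, 6)]
w = train(0.0, data)

# "Save the model" so a later job can pick it up.
path = os.path.join(tempfile.mkdtemp(), "model.json")
with open(path, "w") as f:
    json.dump({"w": w}, f)

# Later: incremental learning resumes from the saved weights on new data,
# rather than retraining from zero.
with open(path) as f:
    w_resumed = json.load(f)["w"]
new_data = [(x, 2.0 * x) for x in range(6, 11)]
w_final = train(w_resumed, new_data)
```

The saved-model file stands in for the checkpoints a real cluster would keep in HDFS; the resume step is the essence of what "incremental learning on top of saved models" buys you.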
Singh ran through a number of open-source projects that included batch compute, queries, Apache HBase and other open-source initiatives in which Yahoo! participates, and he also talked about the advances that the company has made with these projects.
Moving forward, Singh sees four areas of opportunity in the Hadoop ecosystem: large-scale machine learning, deep learning, a quest for speed and lower latency, and more efficient cluster operations.
Herb Cunitz, president at Hortonworks, Inc., next conducted a use-case panel, which shared the experiences of enterprise users, including how each customer is gaining value from the Hadoop platform. The projects discussed included a connected data platform collecting data from smart meters, machine learning and sophisticated analytics.
Helping the enterprise achieve the highest level of automation
Joe Goldberg, solutions marketing consultant at BMC Software, Inc., was on hand to talk about what he called the “backroom stuff,” the behind-the-scenes activities that enable customers to get more from their data. He noted that some of the more popular use cases for Hadoop and Big Data are things like Extract, Transform and Load (ETL) and enterprise data warehouse modernization.
Goldberg talked about a platform approach to managing batch. “One of the most important characteristics — as Big Data and Hadoop applications are moving toward the enterprise — is that you have the ability to manage all of your batch processing in a consistent way,” he said. “A single way to visualize and manage across that diversity.”
What he hears from customers moving Hadoop and Big Data applications into an enterprise context is that a lot of complex traditional technology already exists there, and they want to manage it all together and see how all of that processing fits together.
Goldberg said it is necessary to abstract and elevate how you manage batch processing so that you don’t look at the individual technologies but you can look at it from a business perspective. “You still need that deep technical detail and you need to be able to drill down and see all that information, but you want to stay at a high level from a management perspective,” he said. He discussed the need for a platform to be adaptable and extendable.
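Goldberg's point, a single management layer over diverse batch technologies that stays at the business level but allows drill-down, can be sketched as a small abstraction (the class names and job types below are hypothetical, not BMC's actual product API):

```python
from abc import ABC, abstractmethod

class BatchJob(ABC):
    """Common interface over heterogeneous batch technologies."""

    def __init__(self, name):
        self.name = name
        self.state = "pending"

    @abstractmethod
    def run(self):
        ...

class HadoopJob(BatchJob):
    def run(self):
        # A real implementation would submit a MapReduce/YARN job here.
        self.state = "succeeded"

class EtlJob(BatchJob):
    def run(self):
        # A real implementation would drive a traditional ETL tool here.
        self.state = "succeeded"

class BatchManager:
    """One place to run and visualize every batch workload."""

    def __init__(self, jobs):
        self.jobs = jobs

    def run_all(self):
        for job in self.jobs:
            job.run()

    def dashboard(self):
        # Business-level view: job names and states, not
        # per-technology detail (which stays inside each job class).
        return {job.name: job.state for job in self.jobs}

mgr = BatchManager([HadoopJob("nightly-aggregation"), EtlJob("warehouse-load")])
mgr.run_all()
```

The manager never cares whether a job is Hadoop or traditional ETL; that is the "abstract and elevate" move, with the deep technical detail still reachable inside each job class.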
Steve Sarsfield, product marketing manager at HPE, took to the stage and said, “Enterprise-grade Hadoop has enterprise-grade problems.” He went on to discuss the three types of data that Hewlett Packard looks at: business data, machine data (IoT) and human data (facial recognition data), noting that the vision of Hadoop put all this data into the data lake. The problem, according to Sarsfield, is that the data remains in silos. He stated that there is a need to break away from doing things in different clusters and move away from silos.
Sarsfield laid out four issues for enterprise-grade Hadoop: first, mature analytics capabilities are hard to come by; second, specialized skills are required, and software needs to be easy for end users; third, architectural limitations appear when running complex workloads; and finally, there are security challenges.
Technology from Hewlett Packard’s acquisition of Voltage Security provides security for data at rest on Hadoop servers, data moving through a company’s Hadoop network, and data in use.
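Voltage is best known for format-preserving encryption (FPE), which protects sensitive fields while keeping the shape of the data, so existing Hadoop schemas and pipelines keep working. As a rough illustration of the idea only (a toy Feistel construction, not Voltage's algorithm and not cryptographically vetted), a 6-digit value can be encrypted into another 6-digit value:

```python
import hashlib

def _round_fn(key, rnd, half):
    # Pseudorandom round function derived from key, round number and half value.
    digest = hashlib.sha256(f"{key}:{rnd}:{half}".encode()).hexdigest()
    return int(digest, 16) % 1000

def fpe_encrypt(key, plaintext, rounds=4):
    """Encrypt a 6-digit string into another 6-digit string (toy Feistel FPE)."""
    assert len(plaintext) == 6 and plaintext.isdigit()
    left, right = int(plaintext[:3]), int(plaintext[3:])
    for rnd in range(rounds):
        # Feistel step with modular addition over the 3-digit domain.
        left, right = right, (left + _round_fn(key, rnd, right)) % 1000
    return f"{left:03d}{right:03d}"

def fpe_decrypt(key, ciphertext, rounds=4):
    """Invert the Feistel rounds to recover the original 6-digit string."""
    assert len(ciphertext) == 6 and ciphertext.isdigit()
    left, right = int(ciphertext[:3]), int(ciphertext[3:])
    for rnd in reversed(range(rounds)):
        left, right = (right - _round_fn(key, rnd, left)) % 1000, left
    return f"{left:03d}{right:03d}"
```

Because ciphertext keeps the plaintext's format, encrypted fields can flow through Hadoop jobs, networks and downstream tools unchanged, which is the appeal for data at rest, in motion and in use.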
@theCUBE
#HS16Dublin