John Furrier speaks with Omer Trajman, Customer Solutions at Cloudera, at the Intel Developer Forum 2012.
At IDF 2012, John Furrier interviews Cloudera's Omer Trajman in the developer zone. The two shed some light on the evolution of storage and computing, Cloudera, Hadoop, data centers, and Intel's role in carrying our data forward. Furrier opens the interview by touching on fourth-generation processors, their effect on open source, and how that affects Cloudera.
Trajman responds by elaborating on the connection between the demand for improved processors and Cloudera's role with Intel. As more types and varieties of data are created, greater pressure is placed on the classic data center model. This drives the desire for industry-standard hardware and industry-standard components. Although there will continue to be variations, such as spinning disk or flash, which will take over as the next generation, we are now seeing more types of storage-heavy or compute-heavy applications.
Intel is moving from being a component player to being more of a data center player. Furrier asks Trajman whether, and where, he sees an architectural change that affects flash and solid state given Intel's change of focus. Flash and solid state are still moving forward, says Trajman. Disk is still in there to some extent, but the move is away from separating storage and computing and towards using standardized components to bring them together. By putting the intelligent software on the servers, you need a lot less separate storage.
Furrier then mentions the first Hadoop World, where Abhi Mehta coined the term "data factories." He asks how that idea has evolved and how Trajman sees people organizing their data today. "Instead of data flowing freely throughout the enterprise... it's getting centralized on joint storage and compute architectures," Trajman responds. Data is being put in the same place where you compute, and that becomes your data hub, data factory, or data reservoir, where you have pristine data.
The interview then switches gears, and the two discuss the difference between Hadoop and Mongo in data analytics. Trajman describes Mongo as very useful, a very important specialized engine. It solves a lot of interesting problems, primarily in the document and application serving space, and is typically seen on the front end serving an application. "Hadoop's philosophy is that there isn't any single engine that will solve all problems." When you generate a lot of data, you need to compute on that data and get the results back into your application; that part would happen on a Hadoop system.
Furrier then asks for an update on Cloudera and its growth since its inception, as well as its work on Apache in the open source arena. Trajman is happy to elaborate, stating, "We are growing fast; we're close to around 300 employees, which for a four-year company is pretty wild growth." A lot of the investment has been on the open source side, he says. Over one third of the company is focused solely on building great open source software that people can use, and the company spends a lot of its time contributing to the Apache open source community. CDH 4 now includes Apache Hadoop high availability.
Lastly, Trajman talks about what separates Cloudera from peer companies such as MapR, EMC, Greenplum, and Hortonworks. Furrier also asks how relevant HBase, part of the "holy trinity" of Hadoop, remains despite criticism of its scalability and versatility. It's very critical, replies Trajman. HBase is modeled on what Google created when it built Bigtable, which today powers a lot of Google's applications and infrastructure. HBase is now powering equivalent things outside of Google, for example, messages on Facebook. HBase touches and affects a lot more people, in different ways, than they realize. It is great for real-time "atomic" access to data. Unlike classic relational systems, HBase is focused much more on discrete access.
Omer Trajman - Intel Developer Forum 2012 - theCUBE