The past year has seen relentless inquiry into Big Data, its implications and its potential. But do we really know what it is?
Big Data is a collection of data sets (each itself a collection of data, usually in tabular form) so large and complex that it becomes difficult to process using on-hand database management tools. Users often face the challenge of capturing, curating, storing, searching, sharing, analyzing, and visualizing these huge amounts of data.
You’d think that, as a person whose life or work doesn’t revolve around information technology, you’d have no use for Big Data, and that it is best left to the experts. But contrary to popular belief, Big Data can actually be used by “mere mortals” to help them run their business or keep track of their web traffic.
At the Strata Conference + Hadoop World 2012, Microsoft made big news with the debut of HDInsight Server for Windows, its Hadoop offering based on Hortonworks’ HDP distribution of the open source framework. The company also showed off its new Hadoop prowess with a nifty and easy way of using Big Data, and SiliconANGLE Founder John Furrier and Wikibon Co-founder Dave Vellante were able to see the demo firsthand from Microsoft’s Mike Flasko at theCube.
Flasko gave an overview of what the demo was all about. Basically, it showed how to aggregate various data sets in the cloud and then fuse them together in Excel.
“The demo that we did was what I consider to be kind of the wheelhouse of Hadoop,” Flasko said. “We took a large amount of logged files, processed them in the cloud using Hadoop, get some aggregation on them, get some shaping on them etc., brought them down to 20,000 – 30,000 records, and then we pulled them into Excel so we can do a little more ad hoc analysis, as we wanted to, kind of a self-service analysis.
“And the whole idea was that we had logged files tracking our online business and in our demo it was an online bike shop and what we were doing was we wanted to see the traffic patterns, we wanted to then mash that up with data we had in our enterprise databases, it’s about some of our sales record and promotions we were running on those days. And then finally, once we’ve done that, enrich the data a little bit more with demographic information so we can understand how our traffic was going versus the marketing impact we wanted to have versus the demographic of the user that was seeing our site. And the idea was to show how easy you can do that, getting up to speed with the cluster in the cloud and just using Excel.”
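To make the shape of that workflow concrete, here is a minimal sketch in plain Python. It is not the Hadoop and Excel setup Flasko demonstrated. It only mirrors the three steps he describes: aggregate raw log lines into a small summary, mash that up with sales and promotion records, then enrich it with demographic information. The log format, the field names, and every value below are made up for illustration.

```python
from collections import Counter

# Hypothetical web-log lines in an assumed "date region page" format.
LOG_LINES = [
    "2012-10-23 WA /bikes/road",
    "2012-10-23 WA /bikes/mountain",
    "2012-10-23 NY /bikes/road",
    "2012-10-24 WA /bikes/road",
    "2012-10-24 NY /checkout",
]

# Hypothetical enterprise data, keyed by date (values are invented).
SALES = {"2012-10-23": 1200.0, "2012-10-24": 800.0}
PROMOTIONS = {"2012-10-23": "fall-sale"}

# Hypothetical demographic data, keyed by region (values are invented).
MEDIAN_AGE = {"WA": 37, "NY": 38}


def aggregate_hits(lines):
    """Step 1: reduce raw log lines to hit counts per (date, region)."""
    counts = Counter()
    for line in lines:
        date, region, _page = line.split()
        counts[(date, region)] += 1
    return counts


def mash_up(hits, sales, promotions, median_age):
    """Steps 2 and 3: join the counts with sales, promotions and demographics."""
    rows = []
    for (date, region), n in sorted(hits.items()):
        rows.append({
            "date": date,
            "region": region,
            "hits": n,
            "sales": sales.get(date),            # None if no sales record
            "promotion": promotions.get(date),   # None if no promotion ran
            "median_age": median_age.get(region),
        })
    return rows


rows = mash_up(aggregate_hits(LOG_LINES), SALES, PROMOTIONS, MEDIAN_AGE)
for row in rows:
    print(row)
```

In the demo as described, the first step is what ran on the Hadoop cluster in the cloud, and the joins happened after the reduced 20,000 to 30,000 records were pulled into Excel. The sketch collapses both into one script only to show how the pieces fit together.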
Mike Flasko | Strata-Hadoop World 2012