Andy Palmer, Co-founder & CEO at TAMR, joins theCUBE hosts Dave Vellante (@dvellante) and Paul Gillin (@pgillin) live from MIT CDOIQ in Cambridge, MA
#theCUBE #MITCDOIQ @SiliconANGLE theCUBE
https://siliconangle.com/2019/08/09/real-big-data-problem-machine-learning-can-fix-mitcdoiq-startupoftheweek/
The real big-data problem and why only machine learning can fix it
Why do so many companies still struggle to build a smooth-running pipeline from data to insights? They invest in heavily hyped machine-learning algorithms to analyze data and make business predictions.
But then, inevitably, they realize that algorithms aren’t magic: If they’re fed junk data, their insights won’t be stellar. So they employ data scientists who spend 90% of their time washing and folding in a data-cleaning laundromat, leaving just 10% of their time to do the job for which they were hired.
What’s also flawed about this process is that companies only get excited about machine learning for end-of-the-line algorithms. They should apply machine learning just as liberally in the early cleansing stages instead of relying on people to grapple with gargantuan data sets, according to Andy Palmer, co-founder and chief executive officer of Tamr Inc., which helps organizations use machine learning to unify their data silos.
Lots of companies have spent large amounts of money on systems for big data collection. Their emphasis on data quantity over quality is readily apparent. “Anybody that’s worked at one of these big companies can tell you that the data that they get from most of their internal systems sucks, plain and simple,” Palmer said.
Palmer and Michael Stonebraker (pictured), co-founder and chief technology officer of Tamr, spoke with Dave Vellante and Paul Gillin, co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, which covered the recent MIT CDOIQ Symposium in Cambridge, Massachusetts. They discussed machine learning in big-data cleansing and why Tamr, not surprisingly, believes startups offer better, more scalable big-data solutions than legacy companies do (see the full interviews with transcripts here and here).
This week, theCUBE spotlights Tamr Inc. in its Startup of the Week feature.
Big data? Big whoop
Palmer and Stonebraker have been trying to deflate the big-data hype bubble for years. All the way back in 2007, they predicted that the Apache Hadoop big-data framework wasn’t going to deliver the results so many expected of it.
“Mike actually was really aggressive in saying that it was going to be a disaster,” Palmer said.
It’s not that large data sets are bad. They’re obviously necessary for training analytics models and artificial intelligence. What has left so many companies disillusioned is the notion that, as long as data is big, the rest of the analytics or AI pieces will fall into place.
Organizations now realize that data quality is not negligible. They also know that a data scientist shouldn’t have to spend 80% to 90% or more of his or her time cleansing and wrangling data. There has to be a better, faster way to get data ready for use in analytics and AI.
The answer is to start looking at machine learning as a highly practical tool for doing these bulky, unglamorous tasks, according to Palmer. Many vendors use machine learning to make the marketing of their software for prediction, recommendation engines and the like more appealing. Tamr is using it for the least glamorous thing there is: cleansing and organizing big data before anyone analyzes, predicts, markets or sells anything with it.