01. Holden Karau, IBM, Visits #theCUBE! (00:21)
02. Give Us An Update On Spark. (00:43)
03. Do The Hardcore Spark Developers Have To Mainstream It. (01:48)
04. There's A Lot Of Integration What Are Your Thoughts On That. (03:22)
05. Is Spark A Comparable Investment To Linux. (04:32)
06. Give Me An Example Of The Magnitude Of Spark. (06:11)
07. Can You Give Us Examples Of Products That Are Moving To Spark. (07:24)
08. Who Is Policing The Algorithms. (08:26)
09. Where Are We In Machine Learning Put On The Process Of The Design And Run Time. (11:03)
10. Do We See Big Packet Apps Emerging For This Class Of Apps. (15:32)
11. What Is Your Take On The Status Of Machine Learning. (17:32)
12. Do You Have Another Book On The Horizon. (19:24)
--- ---
Machine learning on machine learning software: It’s closer than you think | #BigDataSV
by Amber Johnson | Mar 31, 2016
As the tech world pivots on game-changing applications, data scientists rise to the occasion. Such is the case with Holden Karau, principal software engineer of Big Data at IBM and coauthor of Learning Spark.
When asked about the current renovations within Spark, Karau said she sees this time as an “opportunity to get rid of dead weight” by streamlining certain processes. As an example, she cited getting functional and relational queries to talk to each other within Spark.
Two areas of expansion are sequencing and machine learning. Karau noted another “massive expansion” in getting other applications to run on top of Spark. She made the remarks during an interview with John Furrier (@furrier) and George Gilbert (@ggilbert41), cohosts of theCUBE from the SiliconANGLE Media team, at the BigDataSV 2016 event in San Jose, California, where theCUBE is celebrating #BigDataWeek, including news and events from the #StrataHadoop conference.
The three self-described tech geeks discussed the advances with Spark since the bandwagon effect has kicked in. Karau predicted that machine learning on machine learning software will arrive sooner than Gilbert’s conservative five-year estimate. While she didn’t give a specific time frame, Karau stated emphatically that it is “closer than five years.”
How data science is changing software dynamics
Karau conferred with Furrier and Gilbert about several aspects of data science and how it is changing software dynamics. One side project in particular stood out: Karau is working on a Spark validator that will help with “policing quality” of the algorithms within pipeline models. Pipeline models make it challenging to work at large scale while still working with Big Data interactively. When asked about getting data science to work on data science, Karau said the tech was “there-ish.”
In addition, Karau is working with her coauthor, Rachel Warren, on a new book called High Performance Spark. Karau spoke candidly about the frustrations of working with Spark pipelines, asking, “How do I save this damn thing?” However, when it comes to Spark, Karau literally wrote the book.
@theCUBE
#BigDataSV #StrataHadoop
Holden Karau, IBM - #BigDataSV 2016 - #theCUBE