01. Fernando Perez. Lawrence Berkeley Laboratory, Visits theCUBE. (00:21)
02. Perez's Background and Role at Berkeley. (00:59)
03. How the Tools for Data Scientists Have Evolved. (03:46)
04. The Spirit of Collaboration in Open Source. (10:35)
05. Using the Proper Tools for Extracting Data. (14:57)
#theCUBE #SparkSummit #Spark #IBM #SiliconANGLE
--- ---
One scientist’s perspective on Spark | #SparkInsight
by Amber Johnson | Jun 19, 2015
Fernando Perez is a scientist at Lawrence Berkeley National Laboratory and a founding investigator of the Berkeley Institute for Data Science. Trained as a particle physicist, Perez worked on the IPython project, which led to the Jupyter Project, a tool used in the Spark ecosystem.
“The Jupyter environment is precisely about building an environment where you can build code and narrative together,” Perez told Jeff Frick and George Gilbert of theCUBE at IBM Spark Summit 2015. Spark users can work in Jupyter to run code alongside data and narrative, live.
“Now in the last few years, the folks at the AMPLab have built PySpark, which is the Python layer on top, [which] allows you to call Spark with a Python API … and then once you have run all your large-scale analytics in Spark, then you can import all of these Python libraries that these physical scientists have been writing for the last 10, 15 years … and use those … with the interactive facilities we have been building,” Perez said.
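The workflow Perez describes, running the large-scale step in Spark and then handing the small result to ordinary Python libraries, might look roughly like the sketch below. It is an illustration under stated assumptions, not code from the interview: the function names, the `detector`/`energy` columns, and the local Spark session are all hypothetical, the standard-library `statistics` module stands in for the scientific libraries (NumPy, SciPy and similar) that Perez alludes to, and `SparkSession` is the entry point in current PySpark releases rather than the API available in 2015.

```python
import statistics


def aggregate_with_spark(rows):
    """Large-scale step: average a value per group with PySpark.

    `rows` is a list of (detector, energy) tuples; both column names are
    hypothetical. Requires the pyspark package, imported lazily so the
    local step below can be used without it.
    """
    from pyspark.sql import SparkSession, functions as F

    # A local session for illustration; a real cluster would be configured differently.
    spark = SparkSession.builder.master("local[*]").appName("sketch").getOrCreate()
    try:
        df = spark.createDataFrame(rows, ["detector", "energy"])
        result = (
            df.groupBy("detector")
            .agg(F.avg("energy").alias("mean_energy"))
            .collect()
        )
        # The aggregated result is small, so it can be brought back as a plain dict.
        return {row["detector"]: row["mean_energy"] for row in result}
    finally:
        spark.stop()


def summarize_locally(means):
    """Local step: analyze the small aggregated result with ordinary Python.

    `statistics` is a stand-in here for the scientific Python stack.
    """
    values = list(means.values())
    return {
        "overall_mean": statistics.fmean(values),
        "spread": statistics.pstdev(values),
    }
```

With pyspark installed, `summarize_locally(aggregate_with_spark(rows))` chains the two steps; inside a Jupyter notebook the same calls can sit next to the narrative that explains them.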
Contributions led to the current Spark program innovations
When asked how he started working with Python, Perez replied, “I realized that I was probably spending more time switching between coding languages rather than doing any work.” Then, while in graduate school, he learned about Python.
“We were all able to interact very quickly” with data, Perez said, and in the early 2000s multiple laboratories and institutions began contributing in Python. This trend of contributions led to the current innovation of the Spark program.
Around 2002, Perez “realized there was a value to seeing these things as open source projects, and many of us realized that we should actually work and try to get these things funded.”
Today, many government organizations are funding such academic projects. Perez commented that DARPA partially funded Spark. “I think what Spark has brought to the game is an additional layer of enterprise-level analytics,” he said. “It’s not so much for everyday numerical computing workloads that many people in the physical sciences were using … Spark made a real killing in that space.”
Watch the full interview below, and be sure to check out more of SiliconANGLE and theCUBE’s coverage of IBM Spark Summit 2015.
@theCUBE
#SparkInsight