Name: Peter Wang | Strata Data Conference 2013
Uploaded: 2013-02-28T17:37:00.000Z
Duration: 15 min 26 s

Peter Wang, Co-Founder and President of Continuum Analytics, joined SiliconAngle's John Furrier and Wikibon's Dave Vellante inside theCUBE at Strata Conference 2013.

The days where you can view data as a static thing are over. Kaput. No mas. Wang was kind enough to join Furrier and Vellante on theCUBE last week at Strata to discuss predictive analytics, Big Data, scientific computing, and the moving of more and more analytical code to where the data is. Continuum Analytics, the premier provider of Python-based data analytics solutions and services, is a player we would bet big on in the Big Data space (full video below).

As Wang sees it, there has been a fundamental disruption in the storage and ETL end of the Big Data (business analytics) space. It is a push upmarket that has caused a push back downmarket as all of the players jostle for position. "The Big Data wave that's coming is exceeding the disciplines for doing business analytics that most companies are used to," Wang says. Transformation (read: metadata) is turning Data Warehousing on its head.

Announced just prior to Strata 2013 was Continuum's latest version of Anaconda, its premium collection of libraries for Python that includes NumbaPro, IOPro, and wiseRF all in one package. Anaconda enables big data management, analysis, and cross-platform visualization for business intelligence, scientific analysis, engineering, machine learning, and more.

Here is a brief part of that press release:

Available on Windows, Mac OS X and Linux, Anaconda includes more than 80 of the most popular numerical and scientific Python libraries used by scientists, engineers and data analysts, with a single integrated and flexible installer. It also allows for the mixing and matching of different versions of Python (including Python 3.3 on a 64-bit Linux installation), NumPy, SciPy, etc., and the ability to easily switch between environments.
Improvements to the latest version of Anaconda include:

- The ability to build your own packages using conda
- New versions of wiseRF Pine and NumbaPro
- New, faster data adapters for Mongo database in IOPro
- New versions of currently included packages, notably cython v0.17.4, pandas v0.10.1, llvm 3.2
- New packages: cubes, ply, pyparsing, mpi4py (OSX), googlecl, gdata, biopython

Wang believes, and I would agree, that data at this point is a first-class concern. "Data has hit mass now. When you have enough data, you can't just willy-nilly move it around," he says. "You have to think about where did it come from, how am I going to view it, how do I want to transform it into those most useful views and do it in a way that doesn't incur more data movement." The dilemma is a peculiar one: with data movement as a first-class concern, how do you best analyze in-memory?

Fluidity has found its way to Big Data. Strike that. We've found that there is fluidity in Big Data, in all data. "The days you can view data as a static thing are over," said Wang. He mentioned a quote he once heard, that there is no such thing as raw data. By definition, that is and always will be true: there is a sensor somewhere that collected the data in the first place.

So what does that mean? How well do the worlds of Data Warehousing and Data Analyzing need to merge? Is proprietary the new 'last-year' and open source the new 'black'? Transformation, Co-Transformations ... what is the first big step in Big Data? I'd love to hear your thoughts in the comments.

But mark this day in your calendar: the mobile revolution has centralized all of the data around our activities. Men lie, women lie, numbers don't. Good luck to all of the men and women tackling the numbers.
