Name: Daniel Abadi - Hadoop World 2011 - theCUBE
Uploaded: 2012-05-01T18:42:00.000Z
Duration: 2 min 41 s

Daniel Abadi - Hadoop World 2011 - theCUBE

“I started studying data systems as an undergrad because I found it interesting,” says Daniel Abadi, Yale University assistant professor and co-founder and chief scientist at startup Hadapt. “Clearly data is becoming the center of the world, and how you store and process that data is fascinating and vital. If you control that, you control the company.” He began to realize the opportunities outside academia only when, as a grad student at MIT, he developed three data systems that were all used by researchers or industry before they were commercialized. “Until then I didn’t realize it would be a multi-billion dollar industry,” he told Wikibon Co-Founder and Chief Analyst David Vellante and SiliconAngle Founder and CEO John Furrier in an interview on the SiliconAngle.tv Cube webcast, live from Hadoop World in New York City on Nov. 8.
Today at Yale he is part of a group that runs a large data systems lab whose core group includes four PhD candidates and several undergraduates. They have several advanced projects in progress focused on answering important questions in data management and processing. One of those is attempting to develop methods of scaling transaction processing across thousands of parallel machines, with a particular focus on NoSQL databases. Today transaction processing does not scale well beyond a small number of nodes, and given the amount of data and the complexity of the processing often involved, this is an issue with important business implications. “We are trying to fix that problem.”
Another project is looking at creating a graph database system. “Hadoop is great for handling unstructured data, but often data ends up in graphs, and we are trying to figure out systems for managing and processing graph data.” At Hadoop World he gave a presentation on sub-graph pattern matching that comes from that work. For instance, he says, the topology of Twitter connections is a graph with several million nodes. Companies want to find patterns in those graphs.
“Suppose you want to find out who tweets to both President Obama and Lady Gaga. Or suppose you want to find out who both are following. This is a sub-graph pattern-matching query.” Those kinds of queries are easy to answer for one node but quickly become extremely complex across many nodes. Today no system exists that can do that analysis on a network as big as Twitter or Facebook, and that is the problem this project is trying to solve.
Another of the lab’s projects, which is further along, is attempting to combine SQL and NoSQL data analysis in a single system. Hadoop is very effective for analyzing unstructured data, but it is not efficient at handling traditional structured queries. Today companies get around that by running two systems in parallel: a traditional RDBMS for structured data (for instance, financial analysis of a business system) and a NoSQL system for the unstructured data. This, however, is inefficient, and it will become increasingly difficult as the amount of data and the complexity of the data models and queries grow. Dr. Abadi’s lab has been working for several years to design a management system that combines the two data engines in a single architecture. “Why have two when you can have one?” he asks.
A paper he published on the early work on this problem drew a great deal of notice in the industry and attracted venture capitalists, who urged him to form a company to commercialize the technology. “I wanted to start a company some day, but I didn’t expect to do it so soon,” he says. “But given the pressure, I took the jump.” The result is Hadapt, which was launched in 2009 with capital from Launch Capital and Connecticut Innovations. It just received $9.5 million in second-round funding from Norwest. “The goal now is to build the product,” he says.
His lab is in the process of gaining a fourth patent on technology from this project. Yale University owns all four patents and is participating in Hadapt on that basis. Dr. Abadi says his role as chief scientist is to effect the technology transfer from the Yale lab to the company and to contribute to the high-level design points for the product. “I am learning a lot with this; a lot of the process is new to me,” he says. “For now it’s fun.”
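To make the sub-graph pattern-matching query described above concrete, here is a minimal single-machine sketch in Python. It is only an illustration, not the lab's method: the edge list, the account names and the helper functions `build_sources` and `connected_to_both` are all hypothetical. Each edge is a pair (source, target), meaning the source account tweets to (or follows) the target account, and the pattern being matched is a "V": one account with an edge to each of two given targets.

```python
from collections import defaultdict


def build_sources(edges):
    """Index the graph: map each target account to the set of accounts pointing at it."""
    sources = defaultdict(set)
    for source, target in edges:
        sources[target].add(source)
    return sources


def connected_to_both(edges, first, second):
    """Return the accounts that have an edge to both `first` and `second`."""
    sources = build_sources(edges)
    # Intersecting the two neighbor sets matches the two-edge "V" pattern.
    return sources[first] & sources[second]


# Hypothetical toy data: (source, target) means "source tweets to target".
edges = [
    ("alice", "obama"),
    ("alice", "gaga"),
    ("bob", "obama"),
    ("carol", "gaga"),
    ("dave", "obama"),
    ("dave", "gaga"),
]

result = sorted(connected_to_both(edges, "obama", "gaga"))
print(result)  # ['alice', 'dave']
```

On one machine the query reduces to a set intersection. The difficulty the article points to arises when the graph is partitioned across many machines, so that the two neighbor sets, or the edges of a larger pattern, live on different nodes.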
