Simon Crosby, CTO at SWIM.AI, joins Stu Miniman for theCUBE on Cloud 2021.
#theCUBE #CUBEOnCloud
https://siliconangle.com/2021/01/21/facing-down-the-data-onslaught-with-stateful-architecture-cubeoncloud/
Facing down the data onslaught with stateful architecture
SPECIAL COVERAGE: THECUBE ON CLOUD BY BETSY AMY-VOGT
The age of big data taught us that there is a timeline to data management: Store the data, analyze it, then model predictions.
Unfortunately, that time-consuming process just doesn’t cut it with the staggering amounts of data currently being generated. Petabytes of data flow from edge devices daily, and the amount is growing rapidly thanks to the demand for ever more connections.
“The data onslaught is very real,” said Simon Crosby (pictured), chief technology officer at Swim.ai Inc. “Companies are facing more and more real-time data from products, from their infrastructure, from their partners. They need to make decisions rapidly, and the problem is that traditional ways of processing that data are too slow.”
The solution is to adopt a process of data analysis on the fly, according to Crosby. “You need to analyze [data] as you receive it and react immediately to be able to generate reasonable insights or predictions that can drive commerce and decisions in the real world,” he said.
Crosby spoke with Stu Miniman, host of theCUBE, SiliconANGLE Media’s livestreaming studio, during theCUBE on Cloud event. They discussed the future of data analysis and how architectures are evolving for real-time, in-memory processing.
Getting faster at the edge
The data onslaught bombarding organizations is mostly thanks to the proliferation of new products with built-in CPUs, otherwise known as edge devices. According to Crosby, “the right way to think about edge is where can you reasonably process the data. Edge as a place doesn’t make as much sense as edge as an opportunity to decrypt and analyze data in the clear.” The edge, for Crosby, is often the cloud.
Cloud computing has taken advantage of two major abstractions: “REST, which is [stateless] computing, and databases,” Crosby said. “REST means any old server can do the job for me. Then the database is just an API call away.”
There’s just one problem: With CPU speeds clocked in gigahertz and network latency measured in milliseconds, connecting to a data store means a (relatively) interminable wait. “You’re going a million times slower than your CPU,” Crosby said. “That’s terrible. It’s absolutely tragic.”
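The "million times" figure can be checked with back-of-envelope arithmetic. The sketch below assumes a 1 GHz clock and a 1 millisecond round trip to the data store; both numbers are illustrative picks within the "gigahertz" and "milliseconds" ranges mentioned above, not measurements from the interview.

```python
# Illustrative assumptions, not figures from the interview:
CLOCK_HZ = 1_000_000_000      # a 1 GHz CPU completes about one cycle per nanosecond
ROUND_TRIP_SECONDS = 0.001    # one network round trip to a data store, 1 ms


def cycles_spent_waiting(clock_hz: float, round_trip_s: float) -> float:
    """CPU cycles that elapse while a single store round trip is in flight."""
    return clock_hz * round_trip_s


cycles = cycles_spent_waiting(CLOCK_HZ, ROUND_TRIP_SECONDS)
print(f"{cycles:,.0f} cycles elapse per round trip")
```

Under these assumptions, each remote read or write costs on the order of a million clock cycles. A faster clock or a slower network only widens the gap.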
Dumping cloud for an in-memory model with stateful computation solves that. Instead of going back and forth to an external store to save or retrieve data, compute is done as data arrives. “You get a million times speed up,” he explained. “You also end up with this tremendous cost reduction because you don’t end up with as many instances having to compute.”
Let data build the model
A real-life example comes from traffic light data in Palo Alto, California. The city generates about 4 terabytes of data a day from just a few hundred lights. Although that can theoretically be handled with a serverless compute service, such as Amazon Web Services Inc.’s Lambda, “the problem is that the end-to-end per event latency is about 100 milliseconds,” Crosby said.
And with upwards of 30,000 events a second, “that’s just too much.” Solving the problem with stateless architecture would be “extraordinarily expensive,” Crosby said, estimating costs of “more than $5,000 a month.”
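To see why those two numbers compound, Little's law (in-flight work = arrival rate × time in system) gives a rough estimate of how many invocations would be running at once. The sketch below uses only the rate and latency quoted above and assumes the rate is sustained. The article does not say whether 30,000 events a second is a peak or an average, and no pricing is modeled here.

```python
EVENTS_PER_SECOND = 30_000   # "upwards of 30,000 events a second"
LATENCY_MS = 100             # "about 100 milliseconds" end to end per event
SECONDS_PER_DAY = 86_400


def in_flight(arrival_rate_per_s: int, latency_ms: int) -> float:
    """Little's law: average number of events being processed at any instant."""
    return arrival_rate_per_s * latency_ms / 1000


concurrent = in_flight(EVENTS_PER_SECOND, LATENCY_MS)
events_per_day = EVENTS_PER_SECOND * SECONDS_PER_DAY

print(f"{concurrent:,.0f} invocations in flight on average")
print(f"{events_per_day:,} events per day")
```

Under the sustained-rate assumption, a stateless design would need roughly 3,000 function instances busy at every moment and would handle about 2.6 billion invocations a day.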
Beyond the Palo Alto scenario, the volumes of raw data generated are “staggering,” according to Crosby. A similar traffic monitoring system in Las Vegas generates about 60 terabytes a day, and just one mobile provider can deal with real-time data from hundreds of millions of mobile devices.
“There is simply no way you can ever store that and analyze it later,” he said. “So cloud is fabulous for things that need to scale wide, but a stateful model is required for dealing with things which update you rapidly or regularly about their changes in state.”
One obstacle in the proliferation of edge computing is the lack of skilled data scientists and engineers to train the algorithms and deploy them at the edge. To eliminate this, Crosby offers an alternative worldview where uncomplicated algorithms are deployed at scale to stateful representatives.
...
“Then we let data build the model by essentially creating these little concurrent objects for each thing, and they will then link to each other and solve the problem,” he said. “If you adopt a stateful computing architecture … you get to go a million times faster. The applications always have an answer. They analyze, learn and predict on the fly, and they go a million times faster. They use 10% of the infrastructure of a store-then-analyze approach. And it’s the way of the future.”
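A minimal sketch of that idea, assuming one in-memory object per traffic light that updates a running statistic as each event arrives and can consult the lights it is linked to. The names `LightAgent`, `route` and `neighborhood_mean` are hypothetical and chosen for illustration; this is not Swim.ai's API, and a real system would run the objects concurrently rather than in a single loop.

```python
from dataclasses import dataclass, field


@dataclass
class LightAgent:
    """Hypothetical stateful stand-in for one traffic light, held in memory."""
    light_id: str
    count: int = 0
    mean_wait: float = 0.0
    # Excluded from repr/eq so that mutually linked agents do not recurse.
    neighbors: list = field(default_factory=list, repr=False, compare=False)

    def link(self, other: "LightAgent") -> None:
        """Record another agent whose state this one may consult."""
        self.neighbors.append(other)

    def on_event(self, wait_seconds: float) -> None:
        """Fold one observation into the running mean; no external store involved."""
        self.count += 1
        self.mean_wait += (wait_seconds - self.mean_wait) / self.count

    def neighborhood_mean(self) -> float:
        """An always-available answer: mean wait across this light and its links."""
        seen = [a for a in (self, *self.neighbors) if a.count > 0]
        if not seen:
            return 0.0
        return sum(a.mean_wait for a in seen) / len(seen)


agents = {}


def route(light_id: str, wait_seconds: float) -> LightAgent:
    """Create the agent on first sight of a light, then deliver the event to it."""
    agent = agents.get(light_id)
    if agent is None:
        agent = agents[light_id] = LightAgent(light_id)
    agent.on_event(wait_seconds)
    return agent


a = route("light-a", 10.0)
route("light-a", 20.0)
b = route("light-b", 30.0)
a.link(b)
print(a.mean_wait, a.neighborhood_mean())
```

The data itself creates the objects (the first event for an unseen light instantiates its agent), and each update touches only memory, so there is no per-event database round trip.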