Dave Rensin (@drensin), Director, CRE & Network Capacity, Google, sits down with John Furrier and Jeff Frick at Google Cloud Next 2018 at the Moscone Center in San Francisco, CA.
https://siliconangle.com/2018/07/30/lessons-from-googles-internal-sre-methods-for-cloud-efficiencies-googlecloudnext18/ #theCUBE #GCP #GoogleCloud #SiliconANGLE
Lessons from Google’s internal SRE methods for cloud efficiencies
As a new generation of corporations navigates the efficiencies of cloud computing, it faces a new challenge: running a business in a brand-new environment without the benefit of tried-and-true methods.
“The industry has done a really fabulous job of telling people how to get to cloud, but we’re awful about telling them how to live there,” said Dave Rensin (pictured), director of customer reliability engineering and network capacity at Google Cloud.
Rensin spoke with John Furrier (@furrier) and Jeff Frick (@JeffFrick), co-hosts of theCUBE, SiliconANGLE Media’s mobile livestreaming studio, during the recently concluded Google Cloud Next event in San Francisco. They discussed Google’s site reliability engineering and how the concept is being turned outward to help businesses operate successfully in the cloud. (* Disclosure below.)
Parsing work for machines and human judgment
In 2004, Google LLC had just gone public, and internal calculations showed that in 10 years the company would need a million systems operators just for its popular search function. In its unorthodox way, Google reimagined its production systems by applying software engineering skills to operations problems and named the method Site Reliability Engineering, or SRE.
“The basic philosophy is simple, give to the machines all the things machines can do, and keep for the humans all the things that require human judgment. That’s how we get to a place where like, 2,500 SREs run all of Google,” Rensin said.
A primary principle of SRE is to forget about aiming for perfection. “Any system involving people is going to have errors. So any goal you have that assumes perfection, 100 percent uptime, 100 percent customer satisfaction, zero error, that kind of thing, is a lie,” Rensin said, going on to explain that there is a “magic line,” known as the service level objective (SLO), marking the boundary between satisfied and unsatisfied customers. Operate below the SLO line and customers are angry; operate above it and resources are being wasted on incremental improvements that customers don’t notice.
“The difference between perfection, 100 percent, and the line you need [the SLO], which is very business-specific, we say treat as a budget,” Rensin said. This “error budget” represents time and money that can be spent on innovation.
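The arithmetic behind an error budget can be illustrated with a minimal Python sketch. It is not Google's tooling: the function names, the 30-day window, and the choice to measure the budget in minutes of downtime are all assumptions made for illustration. The only idea taken from the interview is that the budget is the gap between 100 percent and the SLO.

```python
def error_budget_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Minutes of unavailability an SLO permits over a window.

    Hypothetical helper: the budget is the gap between 100 percent and
    the SLO, applied to the total minutes in the window (assumed 30 days).
    """
    if not 0 < slo_percent <= 100:
        raise ValueError("slo_percent must be in the range (0, 100]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Budget left after observed downtime; negative means the SLO was missed."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes


if __name__ == "__main__":
    # A 99.9 percent SLO over 30 days leaves roughly 43 minutes of budget.
    print(round(error_budget_minutes(99.9), 1))
    # After 10 minutes of downtime, roughly 33 minutes remain to spend.
    print(round(budget_remaining(99.9, 10), 1))
```

Under these assumptions, a 100 percent target yields a budget of zero, which matches Rensin's point that a goal of perfection leaves no room to spend on change.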
As director of customer reliability engineering, Rensin takes Google’s internal SRE methodology and turns it outwards to work with businesses of all sizes. Google has published a book on SRE, with an accompanying workbook to help guide companies through implementing SRE in their own operations.
“Our goal is that every firm from five to 50,000 can follow these principles. And they can. We know they can do it, and it’s not as hard as they think,” Rensin concluded.
Watch the complete video interview below, and be sure to check out more of SiliconANGLE’s and theCUBE’s coverage of the Google Cloud Next event. (* Disclosure: Google Cloud sponsored this segment of theCUBE. Neither Google nor other sponsors have editorial control over content on theCUBE or SiliconANGLE.)
For more information:
https://www.thecube.net/google-cloud-next-18
SiliconANGLE BLOG Posts:
https://siliconangle.com/ @Google Cloud Platform @Google @SiliconANGLE theCUBE