Book description
Although service-level objectives (SLOs) continue to grow in importance, there’s a distinct lack of information about how to implement them. Practical advice that does exist usually assumes that your team already has the infrastructure, tooling, and culture in place. In this book, recognized SLO expert Alex Hidalgo explains how to build an SLO culture from the ground up.
Ideal as a primer and daily reference for anyone creating both the culture and tooling necessary for SLO-based approaches to reliability, this guide provides detailed analysis of advanced SLO and service-level indicator (SLI) techniques. Armed with mathematical models and statistical knowledge to help you get the most out of an SLO-based approach, you’ll learn how to build systems capable of measuring meaningful SLIs with buy-in across all departments of your organization.
- Define SLIs that meaningfully measure the reliability of a service from a user’s perspective
- Choose appropriate SLO targets, including how to perform statistical and probabilistic analysis
- Use error budgets to help your team have better discussions and make better data-driven decisions
- Build supportive tooling and resources required for an SLO-based approach
- Use SLO data to present meaningful reports to leadership and your users
Table of contents
- Foreword
- Preface
- I. SLO Development
- 1. The Reliability Stack
- 2. How to Think About Reliability
- 3. Developing Meaningful Service Level Indicators
- 4. Choosing Good Service Level Objectives
- 5. How to Use Error Budgets
- II. SLO Implementation
- 6. Getting Buy-In
- 7. Measuring SLIs and SLOs
- 8. SLO Monitoring and Alerting
- 9. Probability and Statistics for SLIs and SLOs
- 10. Architecting for Reliability
  - Example System: Image-Serving Service
  - Architectural Considerations: Hardware
  - Architectural Considerations: Monolith or Microservices
  - Architectural Considerations: Anticipating Failure Modes
  - Architectural Considerations: Three Types of Requests
  - Systems and Building Blocks
  - Quantitative Analysis of Systems
  - Instrumentation! The System Also Needs Instrumentation!
  - Architectural Considerations: Hardware, Revisited
  - SLOs as a Result of System SLIs
  - The Importance of Identifying and Understanding Dependencies
  - Summary
- 11. Data Reliability
- 12. A Worked Example
- III. SLO Culture
- 13. Building an SLO Culture
- 14. SLO Evolution
- 15. Discoverable and Understandable SLOs
- 16. SLO Advocacy
- 17. Reliability Reporting
- A. SLO Definition Template
- B. Proofs for Chapter 9
- Index
Product information
- Title: Implementing Service Level Objectives
- Author(s): Alex Hidalgo
- Release date: August 2020
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781492076766