Chapter 2. SRE Mindset
It starts with curiosity.
How does a system work? How does it fail?
For SRE, the primary question is not, “How is it supposed to work?” but rather, “How does it really work? How does it really work in production?”
Here’s a little example scenario: your frontend talks to a database. But what happens when it can’t? What happens when multiple instances that shouldn’t be running talk to that database at the same time? What if the database responds 20%…34%…60% slower than it did when the code was (presumably) tested? How does the code know it is talking to the right database? What are the implicit dependencies? I could fill this entire chapter with nothing but questions like these because understanding how a system really works is an exercise in intense curiosity.
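One of these questions, what happens when the database responds slower than it did in testing, can be sketched as a tiny experiment. The following is only a minimal illustration, not a real test harness: `FakeDatabase`, `frontend_outcome`, `probe`, the 80 ms baseline, and the 120 ms timeout are all hypothetical names and numbers chosen for the example, and latency is computed rather than measured.

```python
from dataclasses import dataclass


@dataclass
class FakeDatabase:
    """Hypothetical stand-in for a database; no real I/O happens here."""
    baseline_ms: float      # assumed latency observed when the code was tested
    slowdown: float = 0.0   # 0.20 means "20% slower than baseline"

    def query_latency_ms(self) -> float:
        # Scale the tested latency by the slowdown factor.
        return self.baseline_ms * (1.0 + self.slowdown)


def frontend_outcome(db: FakeDatabase, timeout_ms: float) -> str:
    """Report what a frontend with a fixed timeout would see (assumed behavior)."""
    return "ok" if db.query_latency_ms() <= timeout_ms else "timeout"


def probe(baseline_ms: float, timeout_ms: float, slowdowns: list[float]) -> dict[float, str]:
    """Ask the same question at each slowdown level and collect the answers."""
    return {
        s: frontend_outcome(FakeDatabase(baseline_ms, s), timeout_ms)
        for s in slowdowns
    }


if __name__ == "__main__":
    # Slowdown levels taken from the scenario above: 20%, 34%, 60%.
    for slowdown, outcome in probe(80.0, 120.0, [0.20, 0.34, 0.60]).items():
        print(f"{slowdown:.0%} slower: {outcome}")
```

With these made-up numbers, the frontend survives the 20% and 34% slowdowns but times out at 60%. The point is not the numbers; it is that the answer only shows up once you ask the question of the system as it actually runs.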
In this chapter, I am going to explore a fundamental question on which much of this book turns: What is the SRE mindset? What are the qualities that define it, how does it differ from other mindsets, and how do we begin to think in this direction?
This is an easy question to ask but a hard one to answer, so I reached out to a sizable group of the smartest SREs I know to get their take on the subject (and on the SRE culture topic we will discuss in Chapter 3). I’ve included their answers alongside my own and attributed them as appropriate, so you will find this chapter cites other people more than usual (to give credit where credit is due).
The SRE mindset is going to be foundational ...