Chapter 3. Debugging with Observability

As mentioned at the beginning of Chapter 2, observability signals can be roughly broken down into two categories, based on the value they bring: availability and debuggability. Aggregated application metrics provide the best availability signal. In this chapter, we will discuss the other two main signals, distributed tracing and logs, which primarily serve debuggability.

We’ll show one approach to correlating metrics and traces using only open source tooling. Some commercial vendors also work to provide this unified experience. As in Chapter 2, the purpose of showing a specific approach is to set a baseline for the minimum level of sophistication you should expect from your observability stack when it is fully assembled.

Lastly, distributed tracing instrumentation, given that it needs to propagate context across a microservice hierarchy, can be an efficient place to govern behavior deeper in a system. We’ll discuss a hypothetical failure injection testing feature as an example of the possibilities.
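
To make the idea concrete, here is a minimal Java sketch of how a failure injection flag could ride along with propagated trace context. It is an illustration under stated assumptions, not the API of any tracing library: the `TraceContext` class, the `fit.fail-service` baggage key, and the three service methods are all hypothetical, and the network hops between services are simulated with plain method calls.

```java
import java.util.HashMap;
import java.util.Map;

public class FailureInjectionSketch {
    // Hypothetical baggage key naming the service that should fail.
    // This key is an assumption for illustration, not a standard.
    static final String FAIL_KEY = "fit.fail-service";

    /** Context that travels with a request, much as trace baggage would. */
    static final class TraceContext {
        final String traceId;
        final Map<String, String> baggage;

        TraceContext(String traceId, Map<String, String> baggage) {
            this.traceId = traceId;
            this.baggage = new HashMap<>(baggage);
        }
    }

    /** Each service checks the propagated context before doing real work. */
    static void maybeFail(String serviceName, TraceContext ctx) {
        if (serviceName.equals(ctx.baggage.get(FAIL_KEY))) {
            throw new IllegalStateException(
                "Injected failure in " + serviceName + " for trace " + ctx.traceId);
        }
    }

    // Deepest service in the (simulated) call hierarchy.
    static String inventory(TraceContext ctx) {
        maybeFail("inventory", ctx);
        return "in-stock";
    }

    // Middle service: passes the same context downstream.
    static String orders(TraceContext ctx) {
        maybeFail("orders", ctx);
        return "order(" + inventory(ctx) + ")";
    }

    // Edge service: falls back when anything downstream fails.
    static String gateway(TraceContext ctx) {
        maybeFail("gateway", ctx);
        try {
            return orders(ctx);
        } catch (IllegalStateException e) {
            return "fallback";
        }
    }

    public static void main(String[] args) {
        TraceContext normal = new TraceContext("t1", new HashMap<>());

        Map<String, String> baggage = new HashMap<>();
        baggage.put(FAIL_KEY, "inventory");
        TraceContext faulty = new TraceContext("t2", baggage);

        System.out.println(gateway(normal));
        System.out.println(gateway(faulty));
    }
}
```

Running `main` prints `order(in-stock)` for the normal request and `fallback` for the request whose context targets `inventory`. The point of the sketch is that the edge of the system only sets a value in the propagated context, and a service several hops away reacts to it without any change to the intermediate services.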

The Three Pillars of Observability…or Is It Two?

As discussed in Distributed Systems Observability by Cindy Sridharan (O’Reilly), three different types of telemetry form the “three pillars of observability”: logs, distributed traces, and metrics. This three-pillars classification is so common that it’s difficult to pinpoint its origin.

While logs, distributed traces, and metrics are three distinct forms of telemetry with unique characteristics, ...
