Chapter 2. Why Do You Need Metrics?

You might have noticed that we’re focusing specifically on metrics here rather than logs or traces. Why not logs first? Why metrics?

Metrics as a Starting Point

Sridharan defines a metric as “a numeric representation of data measured over intervals of time,” adding, “Metrics can harness the power of mathematical modeling and prediction to derive knowledge of the behavior of a system over intervals of time in the present and future.”1 Figure 2-1 shows an example of measuring HTTP requests as a metric.

Figure 2-1. An example metric
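
To make the definition concrete, the following minimal sketch turns raw HTTP request timestamps into a metric: a count of requests per fixed interval of time. The timestamps, the function name `requests_per_interval`, and the 10-second interval are assumptions chosen for illustration; they do not come from the figure or from any particular monitoring tool.

```python
from collections import Counter

# Hypothetical request arrival times, in seconds since an arbitrary start.
request_times = [0.4, 1.2, 3.9, 10.1, 12.5, 19.9, 20.0, 31.7]


def requests_per_interval(timestamps, interval_seconds=10):
    """Count how many requests fall into each fixed-width time interval."""
    # Map each timestamp to the index of the interval that contains it.
    counts = Counter(int(t // interval_seconds) for t in timestamps)
    if not counts:
        return []
    last_bucket = max(counts)
    # Emit one number per interval, including intervals with no requests.
    return [counts.get(i, 0) for i in range(last_bucket + 1)]


series = requests_per_interval(request_times)
print(series)  # one request count per 10-second interval
```

Each element of `series` is a single number describing one interval, which is what makes a metric compact to store and easy to aggregate or compare over time.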

The Case for Metrics

If solving your problem requires a deep dive, you might need all three signals: metrics, logs, and traces. Logs tell you what happened during a specific period of time. Traces let you follow a request from beginning to end.

However, when you are starting your investigation, you need a bird’s-eye view. Starting with metrics is logical because it lets you move from the broadest view down to the narrowest. Metrics also deliver that perspective through what we call a low-latency impact analysis: an efficient view of the system’s current state. What’s more, metrics are easy to implement and use, and they let you aggregate data quickly and compare it over time. Let’s look at each of these factors in turn.

Metrics Provide an Efficient Snapshot of the System

First, metrics allow you to understand the ...
