5 Effective Alerting with Prometheus

Thus far, we’ve looked primarily at how to get data into Prometheus through scrape jobs, how to discover scrape targets, and how to query data manually. But no monitoring system is truly useful if you need to constantly check whether everything is okay; we need something running in the background that evaluates the state of our systems and alerts us when they’re not working correctly. In this chapter, we’ll look at how Prometheus achieves that through a combination of its rule subsystem and the separate Alertmanager component.

We’ll cover the following main topics:

Alertmanager configuration and routing
Alertmanager templating
Highly available (HA) alerting
Making robust alerts
Unit-testing alerting rules

Let’s get started! ...

Mastering Prometheus by William Hegedus
