Chapter 11. Finding Your Way with Monitoring

Operating a distributed system without monitoring data is a bit like trying to get out of a forest without a GPS navigator device or compass.

RabbitMQ monitoring documentation [1]

As an avid outdoors person and survivor of cloud data pipelines with scant monitoring, I heartily agree with the opening quotation. The ability to see what’s going on in your data pipelines is as essential as a good map in the woods: it helps you keep track of where the system has been and where it’s headed, and it can help you get back on track when things careen off the beaten path.

In this chapter, you’ll see how to create and interpret the maps that good monitoring provides. Being able to inspect pipeline operation gives you insight into opportunities for performance improvement, scaling, and cost optimization. It can also be handy for communicating.

To provide some motivation, the chapter opens with my experience working without a map. I’ll share the challenges our team faced and what kind of monitoring could have improved pipeline performance and reliability and reduced costs.

The rest of the chapter follows a similar blueprint, where you’ll get specific advice on monitoring, metrics, and alerting across different levels of pipeline observability. Starting with the system level to provide a high-level view, the chapter continues with sections that dig into more granular areas of monitoring: resource utilization, pipeline performance, and query costs.

A map provides you ...
