Chapter 7. Observing Infrastructure

We build our computer systems the way we build our cities: over time, without a plan, on top of ruins.

Ellen Ullman

Despite the many advances in cloud computing, serverless, and other technologies that promise to shield programmers from having to care about where and how their programs run, we are still stymied by a basic fact: software has to run on hardware. What has changed, though, is how we interact with hardware. Rather than relying on bare syscalls, we rely heavily on increasingly sophisticated APIs and other abstractions of the underlying infrastructure that powers our software.

Infrastructure isn’t limited to physical hardware either. Planet-scale cloud computing platforms offer managed services for everything from key management to caches to text-message gateways. New AI- and ML-powered services seem to crop up weekly, and new orchestration and deployment methods promise more speed and flexibility in where and how code runs.

Infrastructure is a key part of any software system, and understanding your infrastructure resources is a key part of observability. In this chapter, we’ll cover infrastructure observability with OpenTelemetry and discuss how to understand and model this part of your systems.

What Is Infrastructure Observability?

Just about every developer or operator has done some infrastructure monitoring, such as watching a system’s CPU utilization, memory usage, or free disk space, or even a remote host’s uptime. Monitoring ...
