Chapter 2. Cloud Native Challenges in the Real World

Observability in a cloud native world is difficult: it is impossible to gather data from a single output source and correctly infer from it an accurate view of a cloud native service. Observability data now has to be correlated and processed in myriad ways before a single assumption can be proven true or false.

In the O’Reilly survey “Filling the Observability Gap,” respondents reported three main challenges: a lack of observability data, high costs related to tools and training, and difficulty coordinating the teams trying to solve system and network problems.1

This chapter delves into real-world scenarios highlighting the performance, cost, and reliability issues associated with observability data. We will explore case studies from companies that illustrate practical approaches and possible solutions to these challenges. Finally, we will try to come up with a reusable solution.

While the overarching challenges of cloud native observability are clear, one of the most immediate impacts is seen in system performance. Let’s explore how uncontrolled data growth can significantly strain our systems.

Impact of Uncontrolled Data Growth on System Performance

A key factor contributing to this uncontrolled data growth is automatic instrumentation. Consider the example of NGINX: installing the NGINX ingress controller in a cluster is straightforward, and ...
