Chapter 10. Business Dashboards for Data Pipelines

Valliappa (Lak) Lakshmanan

Show them the data; they’ll tell you when it’s wrong.

When you build data pipelines to ingest data, how often are you not quite sure whether you are processing the data correctly? Are the outliers you are clipping really a result of malfunctioning equipment? Is the timestamp really in Coordinated Universal Time (UTC)? Is a certain field populated only if the customer accepts the order?

If you are diligent, you will ask a stakeholder these questions at the time you are building the pipeline. But what about the questions you didn’t know you had to ask? What if the answer changes next month?

One of the best ways to get many eyes, especially eyes that belong to domain experts, continually on your data pipeline is to build a visual representation of the data flowing through it. By this, I don’t mean the engineering bits: the amount of data flowing through, the number of errors, or the number of connections. You should build a visual representation of the business data flowing through. The following is an example of such a dashboard.

Using real-time dashboards to get more eyes on your data pipeline

Build a dashboard showing aspects of your data that your stakeholders find meaningful. For example, show the number of times a particular piece of equipment malfunctioned ...
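As a rough illustration of the kind of business-level aggregate such a dashboard might chart, here is a minimal sketch that counts malfunction events per piece of equipment per day. The record layout (`equipment_id`, `timestamp`, `status`), the `MALFUNCTION` status value, and the sample rows are all assumptions made up for this example; a real pipeline would read from its own tables and use its own field names. Timestamps are normalized to UTC before bucketing, so a domain expert looking at the chart has a chance to notice if events land on the wrong day.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical sample records. The field names and values are assumptions
# for illustration only, not taken from any real pipeline.
RECORDS = [
    {"equipment_id": "pump-1", "timestamp": "2024-03-01T08:15:00+00:00", "status": "MALFUNCTION"},
    {"equipment_id": "pump-1", "timestamp": "2024-03-01T17:40:00+00:00", "status": "MALFUNCTION"},
    {"equipment_id": "pump-1", "timestamp": "2024-03-01T18:00:00+00:00", "status": "OK"},
    # Local-time offset: this event falls on 2024-03-02 once converted to UTC.
    {"equipment_id": "valve-7", "timestamp": "2024-03-01T22:30:00-05:00", "status": "MALFUNCTION"},
]


def malfunctions_per_day(records):
    """Count malfunction events keyed by (UTC date, equipment_id)."""
    counts = Counter()
    for record in records:
        if record["status"] != "MALFUNCTION":
            continue
        # Parse the ISO 8601 timestamp and convert it to UTC before bucketing by day.
        ts = datetime.fromisoformat(record["timestamp"]).astimezone(timezone.utc)
        counts[(ts.date().isoformat(), record["equipment_id"])] += 1
    return counts


# Print one row per (day, equipment) pair, ready to feed a chart or table.
for (day, equipment_id), count in sorted(malfunctions_per_day(RECORDS).items()):
    print(day, equipment_id, count)
```

The point of the sketch is the shape of the output: rows a stakeholder can read at a glance, instead of throughput or error counters that only the pipeline's engineers care about.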

