Chapter 2. Streaming Data Mesh Introduction

In Chapter 1, we introduced the four pillars of a data mesh architecture. Now we apply that knowledge to a streaming data mesh. Simply put, a streaming data mesh is a data mesh, with all four pillars satisfied, that is implemented as streams. In other words, once data is ingested from the source, it never comes to rest in a data store before reaching the consuming domain. Data products stay in a stream until their retention period expires.

Keeping data products in a stream requires every self-service tool and service available to the data mesh to work with streams. Consider a simple ETL process. The component that extracts data from the source must set that data in motion as a stream. Next, the engine that transforms the data must transform it while it is still in the stream. Lastly, the component that publishes the data product must support integrations that let consumers easily stream the data product into their own domain, while following the federated computational data governance for streaming data products. Table 2-1 lists the four data mesh pillars and explains what each means in a streaming setting.
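To make the ETL walk-through concrete, here is a minimal sketch of the three stages in Python. It uses plain generators as a stand-in for a real streaming platform, which is an assumption made purely for illustration. The names `Event`, `extract`, `transform`, `publish`, and `run_pipeline`, along with the sample rows, are hypothetical and do not come from any particular product. The point is only that each record flows from source to consumer without being staged in an intermediate store.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """A single record in motion (hypothetical shape for illustration)."""
    key: str
    value: dict


def extract(source_rows: Iterable[dict]) -> Iterator[Event]:
    """Set source data in motion: emit one event per row as it is read."""
    for row in source_rows:
        yield Event(key=str(row["id"]), value=row)


def transform(events: Iterable[Event]) -> Iterator[Event]:
    """Transform each event in flight, without staging it in a data store."""
    for event in events:
        cleaned = {**event.value, "name": event.value["name"].strip().title()}
        yield Event(key=event.key, value=cleaned)


def publish(events: Iterable[Event]) -> Iterator[Event]:
    """Expose the data product as a stream that consumers can iterate over."""
    for event in events:
        yield event


def run_pipeline(source_rows: Iterable[dict]) -> Iterator[Event]:
    """Chain the stages; nothing runs until a consumer pulls events."""
    return publish(transform(extract(source_rows)))


if __name__ == "__main__":
    rows = [
        {"id": 1, "name": "  ada lovelace "},
        {"id": 2, "name": "alan TURING"},
    ]
    # The consuming domain reads the data product one event at a time.
    for event in run_pipeline(rows):
        print(event.key, event.value["name"])
```

Because generators are lazy, each row passes through all three stages before the next row is read. This loosely mirrors data staying in motion. A production streaming data mesh would replace these functions with connectors, a stream processor, and governed topics or streams.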

Table 2-1. The data mesh pillars in the context of a streaming data mesh
Streaming data mesh pillar | Description
Data ownership (domain ownership) | The domain sets its data products in motion as streams.
Data as a product (data product) | Domains are responsible for transforming data into discoverable ...
