Chapter 9. Integrating Event-Driven Data into Data at Rest

Event-driven data products provide exceptional flexibility for consumers, but they may not be suitable for every use case. Existing systems and dependencies play a big role in any architecture, and shifting to a data mesh depends on supporting existing use cases while simultaneously promoting incremental change. Many systems, processing jobs, and computations rely heavily on data at rest, particularly those in the analytics domain.

In this chapter, we’ll focus on integrating event-driven data into data at rest. We’ll look at the Medallion architecture and the role it plays in modern data analytics workflows. We’ll explore strategies and trade-offs for determining when to convert data from a flow of events into a batch of files at rest. Finally, we’ll take a look at a real-world example to tie theory to practice. Let’s get into it.

Analytics and the Medallion Architecture

Change works best by first meeting your users where they are. Batch-based data analytics pipelines and workflows are extremely common in most industries, and many organizations have invested heavily in batch-based data engineering, data science, data analytics, and reporting workflows. “Data Products Are Multimodal” introduced the idea of multimodal data products, but until now we’ve been working primarily in event streams. While they’re often the best choice for driving both operational and real-time analytical use cases, we still need to integrate ...
