Book description
Unlocking the value of modern data is critical for data-driven companies. This report provides a concise, practical guide to building a data architecture that efficiently delivers big, complex, and streaming data to both internal users and customers.
Authors Ori Rafael, Roy Hasson, and Rick Bilodeau from Upsolver examine how modern data pipelines can improve business outcomes. Tech leaders and data engineers will explore the role these pipelines play in the data architecture and learn how to intelligently consider tradeoffs between different data architecture patterns and data pipeline development approaches.
You will:
- Examine how recent changes in data, data management systems, and data consumption patterns have made data pipelines challenging to engineer
- Learn how three data architecture patterns (event sourcing, stateful streaming, and declarative data pipelines) can help you upgrade your practices to address modern data
- Compare five approaches for building modern data pipelines: pure data replication, ELT over a data warehouse, Apache Spark over data lakes, declarative pipelines over data lakes, and declarative data lake staging to a data warehouse
Table of contents
- Introduction
- 1. The Modern Data Landscape and Its Impact on Data Engineering
- 2. Emerging Architecture Patterns
- 3. Modern Data Pipeline Alternatives
- Criteria for Evaluating Approaches to Data Pipelines
- Option 1: Pure Data Replication
- Option 2: ELT over a Data Warehouse
- Option 3: Apache Spark (Hadoop) over Data Lakes
- Option 4: Declarative Pipelines over Data Lakes
- Option 5: Declarative Data Lake Staging to a Data Warehouse or Other Analytics Systems
- Choosing the Best Approach for You
- Conclusion
- About the Authors
Product information
- Title: Unlock Complex and Streaming Data with Declarative Data Pipelines
- Author(s): Ori Rafael, Roy Hasson, Rick Bilodeau
- Release date: July 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098135829