Book description
Data fabric, data lakehouse, and data mesh have recently appeared as viable alternatives to the modern data warehouse. These new architectures have solid benefits, but they're also surrounded by a lot of hyperbole and confusion. This practical book provides a guided tour of these architectures to help data professionals understand the pros and cons of each. James Serra, big data and data warehousing solution architect at Microsoft, examines common data architecture concepts, including how data warehouses have had to evolve to work with data lake features. You'll learn what data lakehouses can help you achieve, as well as how to distinguish data mesh hype from reality. Best of all, you'll be able to determine the most appropriate data architecture for your needs.
With this book, you'll:
- Gain a working understanding of several data architectures
- Learn the strengths and weaknesses of each approach
- Distinguish data architecture theory from reality
- Pick the best architecture for your use case
- Understand the differences between data warehouses and data lakes
- Learn common data architecture concepts to help you build better solutions
- Explore the historical evolution and characteristics of data architectures
- Learn essentials of running an architecture design session, team organization, and project success factors
Free from product discussions, this book will serve as a timeless resource for years to come.
Table of contents
- Foreword
- Preface
- I. Foundation
- 1. Big Data
- 2. Types of Data Architectures
- 3. The Architecture Design Session
- II. Common Data Architecture Concepts
- 4. The Relational Data Warehouse
- 5. Data Lake
- 6. Data Storage Solutions and Processes
- 7. Approaches to Design
- 8. Approaches to Data Modeling
- 9. Approaches to Data Ingestion
- III. Data Architectures
- 10. The Modern Data Warehouse
- 11. Data Fabric
- 12. Data Lakehouse
- 13. Data Mesh Foundation
- 14. Should You Adopt Data Mesh? Myths, Concerns, and the Future
- Myths
- Myth: Using Data Mesh Is a Silver Bullet That Solves All Data Challenges Quickly
- Myth: A Data Mesh Will Replace Your Data Lake and Data Warehouse
- Myth: Data Warehouse Projects Are All Failing, and a Data Mesh Will Solve That Problem
- Myth: Building a Data Mesh Means Decentralizing Absolutely Everything
- Myth: You Can Use Data Virtualization to Create a Data Mesh
- Concerns
- Organizational Assessment: Should You Adopt a Data Mesh?
- Recommendations for Implementing a Successful Data Mesh
- The Future of Data Mesh
- Zooming Out: Understanding Data Architectures and Their Applications
- Summary
- IV. People, Processes, and Technology
- 15. People and Processes
- Team Organization: Roles and Responsibilities
- Why Projects Fail: Pitfalls and Prevention
- Pitfall: Allowing Executives to Think That BI Is “Easy”
- Pitfall: Using the Wrong Technologies
- Pitfall: Gathering Too Many Business Requirements
- Pitfall: Gathering Too Few Business Requirements
- Pitfall: Presenting Reports Without Validating Their Contents First
- Pitfall: Hiring an Inexperienced Consulting Company
- Pitfall: Hiring a Consulting Company That Outsources Development to Offshore Workers
- Pitfall: Passing Project Ownership Off to Consultants
- Pitfall: Neglecting the Need to Transfer Knowledge Back into the Organization
- Pitfall: Slashing the Budget Midway Through the Project
- Pitfall: Starting with an End Date and Working Backward
- Pitfall: Structuring the Data Warehouse to Reflect the Source Data Rather Than the Business’s Needs
- Pitfall: Presenting End Users with a Solution with Slow Response Times or Other Performance Issues
- Pitfall: Overdesigning (or Underdesigning) Your Data Architecture
- Pitfall: Poor Communication Between IT and the Business Domains
- Tips for Success
- Summary
- 16. Technologies
- Index
- About the Author
Product information
- Title: Deciphering Data Architectures
- Author(s): James Serra
- Release date: February 2024
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098150761