Book description
The Data Vault was invented by Dan Linstedt at the U.S. Department of Defense, and the standard has been successfully applied to data warehousing projects at organizations of different sizes, from small companies to large corporations. Due to its simplified design, which is adapted from nature, the Data Vault 2.0 standard helps prevent typical data warehousing failures.
"Building a Scalable Data Warehouse" covers everything one needs to know to create a scalable data warehouse end to end, including a presentation of the Data Vault modeling technique, which provides the foundations to create a technical data warehouse layer. The book discusses how to build the data warehouse incrementally using the agile Data Vault 2.0 methodology. In addition, readers will learn how to create the input layer (the stage layer) and the presentation layer (data mart) of the Data Vault 2.0 architecture, including implementation best practices. Drawing upon years of practical experience and using numerous examples and an easy-to-understand framework, Dan Linstedt and Michael Olschimke discuss:
- How to load each layer using SQL Server Integration Services (SSIS), including automation of the Data Vault loading processes.
- Important data warehouse technologies and practices.
- Data Quality Services (DQS) and Master Data Services (MDS) in the context of the Data Vault architecture.
- Provides a complete introduction to data warehousing, applications, and the business context so readers can get up and running fast
- Explains theoretical concepts and provides hands-on instruction on how to build and implement a data warehouse
- Demystifies data vault modeling with beginning, intermediate, and advanced techniques
- Discusses the advantages of the data vault approach over other techniques, including the latest updates to Data Vault 2.0 and multiple improvements to Data Vault 1.0
Table of contents
- Cover
- Title page
- Table of Contents
- Copyright
- Authors Biography
- Foreword
- Preface
- Acknowledgments
- Chapter 1: Introduction to Data Warehousing
- Chapter 2: Scalable Data Warehouse Architecture
- Chapter 3: The Data Vault 2.0 Methodology
- Chapter 4: Data Vault 2.0 Modeling
- Chapter 5: Intermediate Data Vault Modeling
- Chapter 6: Advanced Data Vault Modeling
- Chapter 7: Dimensional Modeling
- Chapter 8: Physical Data Warehouse Design
- Chapter 9: Master Data Management
  - Abstract
  - 9.1. Definitions
  - 9.2. Master Data Management Goals
  - 9.3. Drivers for Managing Master Data
  - 9.4. Operational vs. Analytical Master Data Management
  - 9.5. Master Data Management as an Enabler for Managed Self-Service BI
  - 9.6. Master Data Management as an Enabler for Total Quality Management
  - 9.7. Creating a Model
  - 9.8. Importing a Model
  - 9.9. Integrating MDS with the Data Vault and Operational Systems
- Chapter 10: Metadata Management
- Chapter 11: Data Extraction
  - Abstract
  - 11.1. Purpose of Staging Area
  - 11.2. Hashing in the Data Warehouse
  - 11.3. Purpose of the Load Date
  - 11.4. Purpose of the Record Source
  - 11.5. Types of Data Sources
  - 11.6. Sourcing Flat Files
  - 11.7. Sourcing Historical Data
  - 11.8. Sourcing the Sample Airline Data
  - 11.9. Sourcing Denormalized Data Sources
  - 11.10. Sourcing Master Data from MDS
- Chapter 12: Loading the Data Vault
- Chapter 13: Implementing Data Quality
  - Abstract
  - 13.1. Business Expectations Regarding Data Quality
  - 13.2. The Costs of Low Data Quality
  - 13.3. The Value of Bad Data
  - 13.4. Data Quality in the Architecture
  - 13.5. Correcting Errors in the Data Warehouse
  - 13.6. Transform, Enhance and Calculate Derived Data
  - 13.7. Standardization of Data
  - 13.8. Correct and Complete Data
  - 13.9. Match and Consolidate Data
  - 13.10. Creating Dimensions from Same-As Links
- Chapter 14: Loading the Dimensional Information Mart
  - Abstract
  - 14.1. Using the Business Vault as an Intermediate to the Information Mart
  - 14.2. Materializing the Information Mart
  - 14.3. Leveraging PIT and Bridge Tables for Virtualization
  - 14.4. Implementing Temporal Dimensions
  - 14.5. Implementing Data Quality Using PIT Tables
  - 14.6. Dealing with Reference Data
  - 14.7. About Hash Keys in the Information Mart
- Chapter 15: Multidimensional Database
- Subject Index
Product information
- Title: Building a Scalable Data Warehouse with Data Vault 2.0
- Author(s): Dan Linstedt, Michael Olschimke
- Release date: September 2015
- Publisher(s): Morgan Kaufmann
- ISBN: 9780128026489