Book description
The world's businesses ingest a combined 2.5 quintillion bytes of data every day. But how much of this vast amount of data--used to build products, power AI systems, and drive business decisions--is poor quality or just plain bad? This practical book shows you how to ensure that the data your organization relies on contains only high-quality records.
Most data engineers, data analysts, and data scientists genuinely care about data quality, but they often don't have the time, resources, or understanding to create a data quality monitoring solution that succeeds at scale. In this book, Jeremy Stanley and Paige Schwartz from Anomalo explain how you can use automated data quality monitoring to cover all your tables efficiently, proactively alert on every category of issue, and resolve problems immediately.
This book will help you:
- Learn why data quality is a business imperative
- Understand and assess unsupervised learning models for detecting data issues
- Implement notifications that reduce alert fatigue and let you triage and resolve issues quickly
- Integrate automated data quality monitoring with data catalogs, orchestration layers, and BI and ML systems
- Understand the limits of automated data quality monitoring and how to overcome them
- Learn how to deploy and manage your monitoring solution at scale
- Maintain automated data quality monitoring for the long term
Table of contents
- Foreword
- Preface
- 1. The Data Quality Imperative
- 2. Data Quality Monitoring Strategies and the Role of Automation
- 3. Assessing the Business Impact of Automated Data Quality Monitoring
- 4. Automating Data Quality Monitoring with Machine Learning
- 5. Building a Model That Works on Real-World Data
- 6. Implementing Notifications While Avoiding Alert Fatigue
- 7. Integrating Monitoring with Data Tools and Systems
- 8. Operating Your Solution at Scale
- Appendix. Types of Data Quality Issues
- Index
- About the Authors
Product information
- Title: Automating Data Quality Monitoring
- Author(s): Jeremy Stanley, Paige Schwartz
- Release date: January 2024
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098145934