CHAPTER 4
Causes for Poor Data Quality

INTRODUCTION

In the first three chapters we covered the define phase of the DARS model. Specifically, we looked at the definition of data quality, the business case for data quality, the different types of data, the 12 key dimensions of data quality, and more. The next three chapters will look at the second phase of the DARS model: the analyze phase. This chapter analyzes the key reasons why data decays, depreciates, or degrades.

Data decay refers to the gradual reduction or loss in utility of data for business. In general, there are two types of data decay: physical and logical.

  1. Physical data decay is data loss from the storage medium. Examples include server crashes, hard disk corruption, data records getting purged without a trace, and more. Physical data decay is instantaneous and often out of one's control. The most common solution is regular backup of the database, or recovery of the system in an alternative or secondary data center.
  2. Logical data decay, the “silent killer of data,” is commonly caused by compromises on the different data quality dimensions. It reduces the utility of data for business activities and is the main reason for poor data quality in business.
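
To make logical data decay more concrete, here is a minimal Python sketch. It assumes, purely for illustration, that completeness and timeliness are two of the quality dimensions being checked. The records, field names, the function `logical_decay_flags`, and the 365-day threshold are all hypothetical and do not come from the DARS model.

```python
from datetime import date

# Hypothetical customer records; fields and values are illustrative only.
records = [
    {"id": 1, "email": "a@example.com", "last_verified": date(2024, 1, 10)},
    {"id": 2, "email": None, "last_verified": date(2024, 1, 12)},
    {"id": 3, "email": "c@example.com", "last_verified": date(2019, 6, 1)},
]


def logical_decay_flags(record, today, max_age_days=365):
    """Return the (assumed) quality dimensions that this record fails."""
    flags = []
    # Completeness: a required field is missing or empty.
    if not record["email"]:
        flags.append("completeness")
    # Timeliness: the record has not been verified within the allowed window.
    if (today - record["last_verified"]).days > max_age_days:
        flags.append("timeliness")
    return flags


today = date(2024, 6, 1)  # fixed date so the result is reproducible
report = {r["id"]: logical_decay_flags(r, today) for r in records}
print(report)
```

All three records are still physically present in storage, yet two of them have lost some of their utility. This is why a backup, which only restores what was stored, does not help with this kind of decay.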

While physical data decay can be addressed relatively easily by periodically refreshing the data, that is, rewriting it from backups, alleviating logical data decay is very complex and time-consuming. ...
