Chapter 1: Data Management – Introduction and Concepts

A vast amount of data is being generated by people, organizations, devices, and software applications, and the volume of data being generated is growing rapidly. The numbers vary significantly, depending on the source, but it is estimated that approximately 60% to 80% of data gathered by organizations is dark data. Essentially, data is being collected, processed, and stored for a long time by organizations for compliance reasons, but the data is not used for any other purposes, such as analytics or direct monetization. In most cases, storing and securing this data can be more expensive than the value extracted.

In today’s digital economy, organizations are striving to be data-driven by ...

Get Serverless ETL and Analytics with AWS Glue now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.