Book description
The concept of a data lake is less than 10 years old, but data lakes are already widely implemented in large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting a variety of sophisticated user needs. However, defining and building a data lake remains a challenge, as no consensus has yet been reached. Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata – supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes. A variety of case studies are also presented, providing the reader with practical examples of data lake management.
Table of contents
- Cover
- Preface
- 1 Introduction to Data Lakes: Definitions and Discussions
- 2 Architecture of Data Lakes
- 3 Exploiting Software Product Lines and Formal Concept Analysis for the Design of Data Lake Architectures
- 4 Metadata in Data Lake Ecosystems
- 4.1. Definitions and concepts
- 4.2. Classification of metadata by NISO
- 4.3. Other categories of metadata
- 4.4. Sources of metadata
- 4.5. Metadata classification
- 4.6. Why metadata are needed
- 4.7. Business value of metadata
- 4.8. Metadata architecture
- 4.9. Metadata management
- 4.10. Metadata and data lakes
- 4.11. Metadata management in data lakes
- 4.12. Metadata and master data management
- 4.13. Conclusion
- 5 A Use Case of Data Lake Metadata Management
- 6 Master Data and Reference Data in Data Lake Ecosystems
- 7 Linked Data Principles for Data Lakes
- 8 Fog Computing
- 8.1. Introduction
- 8.2. A little bit of context
- 8.3. Every machine talks
- 8.4. The volume paradox
- 8.5. The fog, a shift in paradigm
- 8.6. Constraint environment challenges
- 8.7. Calculations and local drift
- 8.8. Quality is everything
- 8.9. Fog computing versus cloud computing and edge computing
- 8.10. Concluding remarks: fog computing and data lake
- 9 The Gravity Principle in Data Lakes
- Glossary
- References
- List of Authors
- Index
- End User License Agreement
Product information
- Title: Data Lakes
- Author(s):
- Release date: June 2020
- Publisher(s): Wiley-ISTE
- ISBN: 9781786305855