What Is a Data Lake?

Book description

A revolution is occurring in how data is collected, stored, processed, governed, managed, and provided to decision makers. The data lake is a popular approach that harnesses the power of big data and marries it with the agility of self-service. With this report, IT executives and data architects will focus on the technical aspects of building a data lake for their organization.

Alex Gorelik from Facebook explains the requirements for building a successful data lake that business users can easily access whenever they have a need. You'll learn the phases of data lake maturity, common mistakes that lead to data swamps, and the importance of aligning data with your company's business strategy and gaining executive sponsorship.

You'll explore:

  • The ingredients of modern data lakes, such as the use of different ingestion methods for different data formats, and the importance of the three Vs: volume, variety, and velocity
  • Building blocks of successful data lakes, including data ingestion, integration, persistence, data governance, and business intelligence and self-service analytics
  • State-of-the-art data lake architectures offered by Amazon Web Services, Microsoft Azure, and Google Cloud

Product information

  • Title: What Is a Data Lake?
  • Author(s): Alex Gorelik
  • Release date: November 2020
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781492088882