Chapter 6. Data (and AI) Governance and Security in Lakehouse Architecture

A data platform revolves around three key pillars: people, process, and technology. In the previous chapters, we discussed various technologies for implementing a lakehouse. This chapter focuses on the people and process aspects of lakehouse implementations and will help you understand how lakehouse architecture implements unified governance and security processes across all of your data and ML/AI assets.

You need sound governance and security processes to enable those who work on lakehouse data to collaborate, exchange data securely, and maintain trust in the data. These governance and security processes lay the foundation of a robust data ecosystem.

I’ll first introduce you to the key governance concepts and explain why governance is required and how it improves overall data management. We will focus on data quality, auditing, lineage, sharing, and the compliance requirements you should consider while designing the data governance strategy for your platform. We will then discuss why data security is essential for modern data platforms and how to secure your data when it is at rest, in transit, and in use by consumers. We will also explore the various options for identifying and protecting the sensitive data within your platform.

The last section of the chapter will guide you through your responsibilities in terms of governance and security, based on your role in the lakehouse implementation ...
