Part II. Common Data Architecture Concepts

Before diving into data architectures, it’s important to understand the concepts that can be used within them. I find that many of these concepts cause a lot of confusion, which I hope to clear up. The upcoming chapters therefore discuss over 20 such concepts. At the very least, these chapters will serve as a refresher for those who may not have used these concepts in a while. I don’t claim that my definitions are universally agreed upon, but these chapters should help get everyone on the same page and make architectures easier to discuss.

I have included the relational data warehouse and the data lake under concepts instead of architectures. At one time, when they were the only products used in a solution, they could have been considered data architectures. Now, however, they are almost always combined with other products to form the solution. For example, many years ago there were relational data warehouse products that bundled relational storage, compute, ETL (extract, transform, and load) software, and reporting software: basically everything you needed, from one vendor. Nowadays, you stitch together multiple products, possibly from multiple vendors, each with a particular focus (e.g., ETL), to complete your data architecture.
