Chapter 8. Queries, Modeling, and Transformation

Up to this point, the stages of the data engineering lifecycle have primarily been about passing data from one place to another or storing it. In this chapter, you’ll learn how to make data useful. By understanding queries, modeling, and transformations (see Figure 8-1), you’ll have the tools to turn raw data ingredients into something consumable by downstream stakeholders.

Figure 8-1. Transformations allow us to create value from data

First, we’ll discuss queries and the significant patterns underlying them. Second, we’ll look at the major data modeling patterns you can use to introduce business logic into your data. Third, we’ll cover transformations, which take the logic of your data models and the results of queries and make them easier for downstream consumers to use. Finally, we’ll cover whom you’ll work with and the undercurrents as they relate to this chapter.

A variety of techniques can be used to query, model, and transform data in SQL and NoSQL databases. This section focuses on queries made to an OLAP system, such as a data warehouse or data lake. Although many languages exist for querying, for the sake of convenience and familiarity, throughout most of this chapter, we’ll focus heavily on SQL, the most popular and universal query language. Most of the concepts for OLAP databases and SQL will translate ...
