Chapter 9. Approaches to Data Ingestion

As the volume and variety of data keep growing, handling and making sense of it has become important for businesses of all types. This chapter is all about the first step in handling data: getting it into the system in the first place.

Here, we’ll unravel some of the key ways data can be brought into different systems. I’ll kick things off by explaining two common methods, known as ETL and ELT, in a way that is easy to understand. I’ll also introduce you to a new idea called reverse ETL and explain how it flips the traditional methods on their head.

Since not all data needs are the same, we’ll explore different techniques like batch and real-time processing. This will help you figure out what might work best based on how much data you’re dealing with and how quickly you need it. Finally, we’ll talk about the importance of data governance—ensuring your data is accurate, consistent, and accessible.

This chapter aims to simplify these complex ideas and show you how they all connect to the bigger picture of data handling. Whether you’re a data whiz or a complete novice, I’m excited to guide you through this fascinating world of data ingestion. Welcome aboard!

ETL Versus ELT

For many years, extract-transform-load (ETL) was the most common method for transferring data from a source system to a relational data warehouse. But recently, extract-load-transform (ELT) has become popular, especially with data lakes.
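To make the difference in ordering concrete, here is a minimal sketch in Python. It assumes an in-memory SQLite database standing in for the target system; the sample rows, table names, and function names are all hypothetical and chosen only for illustration. In the ETL function the cleanup happens in application code before anything is loaded, while in the ELT function the raw rows are loaded first and the cleanup runs as SQL inside the target.

```python
import sqlite3

# Hypothetical source rows: (customer name, order amount as text).
SOURCE_ROWS = [(" alice ", "10.50"), ("BOB", "3.25")]


def run_etl(conn):
    """ETL: transform in application code, then load only the cleaned rows."""
    cleaned = [
        (name.strip().title(), float(amount)) for name, amount in SOURCE_ROWS
    ]
    conn.execute("CREATE TABLE orders_etl (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders_etl VALUES (?, ?)", cleaned)
    return conn.execute(
        "SELECT customer, amount FROM orders_etl ORDER BY customer"
    ).fetchall()


def run_elt(conn):
    """ELT: load the raw rows as-is, then transform inside the target with SQL."""
    conn.execute("CREATE TABLE orders_raw (customer TEXT, amount TEXT)")
    conn.executemany("INSERT INTO orders_raw VALUES (?, ?)", SOURCE_ROWS)
    # The transformation step runs in the database, after loading.
    conn.execute(
        "CREATE TABLE orders_elt AS "
        "SELECT UPPER(SUBSTR(TRIM(customer), 1, 1)) "
        "       || LOWER(SUBSTR(TRIM(customer), 2)) AS customer, "
        "       CAST(amount AS REAL) AS amount "
        "FROM orders_raw"
    )
    return conn.execute(
        "SELECT customer, amount FROM orders_elt ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(run_etl(conn))
    print(run_elt(conn))
```

Both functions end with the same cleaned rows. The difference is that the ELT version also keeps the untransformed data in `orders_raw`, so a different transformation can be run on it later without going back to the source.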

The ETL process involves ...
