Chapter 3. Data Ingestion

This chapter discusses the first stage of a data analysis project: getting the data into your systems so you can work with it. In this book, that data will always include GA4 but will not be restricted to it.

The ability to ingest data from many sources is powerful because it lets you merge complementary systems, and the insights you glean from the combined data are typically stronger than those from any single source. To help get you there, the next section details how to pull data from these different systems.

Breaking Down Data Silos

The more data sources you have, the more complicated a project gets. The reasons are partly technical, such as finding common join keys, and partly political, because you involve more stakeholders who each control data held separately within the organization. Data held this way is often described as sitting in data silos: an organization may have a lot of good data, but it is unconnected and spread across different systems, so it is hard to make use of. The politics of merging data can usually be solved only by involving the stakeholders as soon as possible, ideally when you are creating the business case to use that data in the first place.

This can feel like an impossible mountain to climb when you first start. A good way to take the first step is to make sure you’re not asking for more data than you actually need. In some cases, aggregated data is plenty for you to get started, rather than the initial dream of merging every individual ...
