Chapter 3. Setting Up Your Data Models and Ingesting Data

Now that you have set up your Amazon Redshift data warehouse, let’s consider a data management strategy.

In this chapter, we will discuss a few options for your data management strategy and help you choose between them in “Data Lake First Versus Data Warehouse First Strategy”. Next, we’ll go into “Defining Your Data Model” and use the “Student Information Learning Analytics Dataset” to illustrate how to create tables and “Load Batch Data into Amazon Redshift” using a sample of this data in Amazon S3. Because speed to insights is critical to maintaining your competitive edge, we’ll also show you how to “Load Real-Time and Near Real-Time Data”. Lastly, we’ll cover how you can “Optimize Your Data Structures”.

Data Lake First Versus Data Warehouse First Strategy

In today’s digital age, organizations are constantly collecting and generating large amounts of data. This data can come from various sources such as user interactions, sensor readings, and social media activity. Managing this data effectively is crucial for organizations to gain insights and make informed business decisions. A key challenge is choosing the appropriate data management strategy. Two popular strategies are the Data Lake first strategy and the Data Warehouse first strategy. When you are considering your cloud-based data management strategy, whether you are migrating an on-premises ...

