Book description
Over 90 recipes to help you orchestrate modern ETL/ELT workflows and perform analytics using Azure services more easily
Key Features
- Build highly efficient ETL pipelines using Microsoft Azure Data services
- Create and execute real-time processing solutions using Azure Databricks, Azure Stream Analytics, and Azure Data Explorer
- Design and execute batch processing solutions using Azure Data Factory
Book Description
Data engineering is one of the faster-growing job areas, as data engineers are the ones who ensure that data is extracted, provisioned, and of the highest quality for data analysis. This book uses various Azure services to implement and maintain infrastructure to extract data from multiple sources, and then transform and load it for data analysis.
It takes you through different techniques for performing big data engineering using Microsoft Azure Data services. It begins by showing you how Azure Blob storage can be used for storing large amounts of unstructured data and how to use it for orchestrating a data workflow. You'll then work with different Cosmos DB APIs and Azure SQL Database. Moving on, you'll discover how to provision an Azure Synapse database and find out how to ingest and analyze data in Azure Synapse. As you advance, you'll cover the design and implementation of batch processing solutions using Azure Data Factory, and understand how to manage, maintain, and secure Azure Data Factory pipelines. You'll also design and implement batch processing solutions using Azure Databricks and then manage and secure Azure Databricks clusters and jobs. In the concluding chapters, you'll learn how to process streaming data using Azure Stream Analytics and Data Explorer.
By the end of this Azure book, you'll have gained the knowledge you need to be able to orchestrate batch and real-time ETL workflows in Microsoft Azure.
What you will learn
- Use Azure Blob storage for storing large amounts of unstructured data
- Perform CRUD operations on the Cosmos Table API
- Implement elastic pools and business continuity with Azure SQL Database
- Ingest and analyze data using Azure Synapse Analytics
- Develop Data Factory data flows to extract data from multiple sources
- Manage, maintain, and secure Azure Data Factory pipelines
- Process streaming data using Azure Stream Analytics and Data Explorer
Who this book is for
This book is for data engineers, database administrators, database developers, and extract, transform, load (ETL) developers looking to build expertise in Azure data engineering using a recipe-based approach. Technical architects and database architects with experience in designing data or ETL applications, either on-premises or with any other cloud vendor, who want to learn Azure data engineering concepts will also find this book useful. Prior knowledge of Azure fundamentals and data engineering concepts is needed.
Table of contents
- Azure Data Engineering Cookbook
- Contributors
- About the author
- About the reviewers
- Preface
- Chapter 1: Working with Azure Blob Storage
- Technical requirements
- Provisioning an Azure storage account using the Azure portal
- Provisioning an Azure storage account using PowerShell
- Creating containers and uploading files to Azure Blob storage using PowerShell
- Managing blobs in Azure Storage using PowerShell
- Managing an Azure blob snapshot in Azure Storage using PowerShell
- Configuring blob life cycle management for blob objects using the Azure portal
- Configuring a firewall for an Azure storage account using the Azure portal
- Configuring virtual networks for an Azure storage account using the Azure portal
- Configuring a firewall for an Azure storage account using PowerShell
- Configuring virtual networks for an Azure storage account using PowerShell
- Creating an alert to monitor an Azure storage account
- Securing an Azure storage account with SAS using PowerShell
- Chapter 2: Working with Relational Databases in Azure
- Provisioning and connecting to an Azure SQL database using PowerShell
- Provisioning and connecting to an Azure PostgreSQL database using the Azure CLI
- Provisioning and connecting to an Azure MySQL database using the Azure CLI
- Implementing active geo-replication for an Azure SQL database using PowerShell
- Implementing an auto-failover group for an Azure SQL database using PowerShell
- Implementing vertical scaling for an Azure SQL database using PowerShell
- Implementing an Azure SQL database elastic pool using PowerShell
- Monitoring an Azure SQL database using the Azure portal
- Chapter 3: Analyzing Data with Azure Synapse Analytics
- Technical requirements
- Provisioning and connecting to an Azure Synapse SQL pool using PowerShell
- Pausing or resuming a Synapse SQL pool using PowerShell
- Scaling an Azure Synapse SQL pool instance using PowerShell
- Loading data into a SQL pool using PolyBase with T-SQL
- Loading data into a SQL pool using the COPY INTO statement
- Implementing workload management in an Azure Synapse SQL pool
- Optimizing queries using materialized views in Azure Synapse Analytics
- Chapter 4: Control Flow Activities in Azure Data Factory
- Chapter 5: Control Flow Transformation and the Copy Data Activity in Azure Data Factory
- Technical requirements
- Implementing HDInsight Hive and Pig activities
- Implementing an Azure Functions activity
- Implementing a Data Lake Analytics U-SQL activity
- Copying data from Azure Data Lake Gen2 to an Azure Synapse SQL pool using the copy activity
- Copying data from Azure Data Lake Gen2 to Azure Cosmos DB using the copy activity
- Chapter 6: Data Flows in Azure Data Factory
- Chapter 7: Azure Data Factory Integration Runtime
- Chapter 8: Deploying Azure Data Factory Pipelines
- Chapter 9: Batch and Streaming Data Processing with Azure Databricks
- Other Books You May Enjoy
Product information
- Title: Azure Data Engineering Cookbook
- Author(s):
- Release date: April 2021
- Publisher(s): Packt Publishing
- ISBN: 9781800206557