Mastering Azure Machine Learning - Second Edition

Book description

Supercharge and automate your deployments to Azure Machine Learning clusters and Azure Kubernetes Service using Azure Machine Learning services

Key Features

  • Implement end-to-end machine learning pipelines on Azure
  • Train deep learning models using Azure compute infrastructure
  • Deploy machine learning models using MLOps

Book Description

Azure Machine Learning is a cloud service for accelerating and managing the machine learning (ML) project life cycle that ML professionals, data scientists, and engineers can use in their day-to-day workflows. This book covers the end-to-end ML process using Microsoft Azure Machine Learning, including data preparation, performing and logging ML training runs, designing training and deployment pipelines, and managing these pipelines via MLOps.

The first section shows you how to set up an Azure Machine Learning workspace, ingest and version datasets, and preprocess, label, and enrich these datasets for training. In the next two sections, you'll discover how to enrich and train ML models for embedding, classification, and regression. You'll explore advanced NLP techniques, traditional ML models such as boosted trees, modern deep neural networks, recommendation systems, reinforcement learning, and complex distributed ML training techniques, all using Azure Machine Learning.

The last section will teach you how to deploy the trained models as a batch pipeline or real-time scoring service using Docker, Azure Machine Learning clusters, Azure Kubernetes Service, and alternative deployment targets.

By the end of this book, you'll be able to combine all the steps you've learned by building an MLOps pipeline.

What you will learn

  • Understand the end-to-end ML pipeline
  • Get to grips with the Azure Machine Learning workspace
  • Ingest, analyze, and preprocess datasets for ML using the Azure cloud
  • Train models efficiently with traditional and modern ML techniques using Azure ML
  • Deploy ML models for batch and real-time scoring
  • Understand model interoperability with ONNX
  • Deploy ML models to FPGAs and Azure IoT Edge
  • Build an automated MLOps pipeline using Azure DevOps

Who this book is for

This book is for machine learning engineers, data scientists, and machine learning developers who want to use the Microsoft Azure cloud to manage their datasets and machine learning experiments and build an enterprise-grade ML architecture using MLOps. This book will also help anyone interested in machine learning to explore important steps of the ML process and use Azure Machine Learning to support them, along with building powerful ML cloud applications. A basic understanding of Python and knowledge of machine learning are recommended.

Table of contents

  1. Mastering Azure Machine Learning
  2. Second Edition
  3. Contributors
  4. About the authors
  5. About the reviewers
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
  7. Section 1: Introduction to Azure Machine Learning
  8. Chapter 1: Understanding the End-to-End Machine Learning Process
    1. Grasping the idea behind ML
      1. Problems and scenarios requiring ML
      2. The history of ML
      3. Understanding the inner workings of ML through the example of ANNs
    2. Understanding the mathematical basis for statistical analysis and ML modeling
      1. The case for statistics in ML
      2. Basics of statistics
      3. Understanding bias
      4. Classifying ML algorithms
      5. Analyzing errors and the quality of results of model training
    3. Discovering the end-to-end ML process
      1. Excavating data and sources
      2. Preparing and cleaning data
      3. Defining labels and engineering features
      4. Training models
      5. Deploying models
      6. Developing and operating enterprise-grade ML solutions
    4. Summary
  9. Chapter 2: Choosing the Right Machine Learning Service in Azure
    1. Choosing an Azure service for ML
      1. Navigating the Azure AI landscape
      2. Consuming a managed AI service
      3. Building a custom AI service
      4. What is the Azure Machine Learning service?
    2. Managed ML services
      1. Azure Cognitive Services
      2. Custom Cognitive Services
      3. Azure Applied AI Services
    3. Custom ML services
      1. Azure Machine Learning Studio (classic)
      2. Azure Machine Learning designer
      3. Azure Automated Machine Learning
      4. Azure Machine Learning workspace
    4. Custom compute services for ML
      1. Azure Databricks
      2. Azure Batch
      3. Data Science Virtual Machines
    5. Summary
  10. Chapter 3: Preparing the Azure Machine Learning Workspace
    1. Technical requirements
    2. Deploying an Azure Machine Learning workspace
      1. Understanding the available tooling for Azure deployments
      2. Deploying the workspace
    3. Exploring the Azure Machine Learning service
      1. Analyzing the deployed services
      2. Understanding the workspace interior
      3. Surveying Azure Machine Learning Studio
    4. Running ML experiments with Azure Machine Learning
      1. Setting up a local environment
      2. Enhancing a simple experiment
      3. Logging metrics and tracking results
      4. Scheduling the script execution
      5. Running experiments on a cloud compute
    5. Summary
  11. Section 2: Data Ingestion, Preparation, Feature Engineering, and Pipelining
  12. Chapter 4: Ingesting Data and Managing Datasets
    1. Technical requirements
    2. Choosing data storage solutions for Azure Machine Learning
      1. Organizing data in Azure Machine Learning
      2. Understanding the default storage accounts of Azure Machine Learning
      3. Exploring options for storing training data in Azure
    3. Creating a datastore and ingesting data
      1. Creating Blob Storage and connecting it with the Azure Machine Learning workspace
      2. Ingesting data into Azure
    4. Using datasets in Azure Machine Learning
      1. Tracking datasets in Azure Machine Learning
      2. Accessing data during training
      3. Using external datasets with open datasets
    5. Summary
  13. Chapter 5: Performing Data Analysis and Visualization
    1. Technical requirements
    2. Understanding data exploration techniques
      1. Exploring and analyzing tabular datasets
      2. Exploring and analyzing file datasets
    3. Performing data analysis on a tabular dataset
      1. Initial exploration and cleansing of the Melbourne Housing dataset
      2. Running statistical analysis on the dataset
      3. Finding and handling missing values
      4. Calculating correlations and feature importance
      5. Tracking figures from exploration in Azure Machine Learning
    4. Understanding dimensional reduction techniques
      1. Unsupervised dimensional reduction using PCA
      2. Supervised dimensional reduction using LDA
      3. Non-linear dimensional reduction using t-SNE
      4. Generalizing t-SNE with UMAP
    5. Summary
  14. Chapter 6: Feature Engineering and Labeling
    1. Technical requirements
    2. Understanding and applying feature engineering
      1. Classifying feature engineering techniques
      2. Discovering feature transformation and extraction methods
      3. Testing feature engineering techniques on a tabular dataset
    3. Handling data labeling
      1. Analyzing scenarios that require labels
      2. Performing data labeling for image classification using the Azure Machine Learning labeling service
    4. Summary
  15. Chapter 7: Advanced Feature Extraction with NLP
    1. Technical requirements
    2. Understanding categorical data
      1. Comparing textual, categorical, and ordinal data
      2. Transforming categories into numeric values
      3. Orthogonal embedding using one-hot encoding
      4. Semantics and textual values
    3. Building a simple bag-of-words model
      1. A naïve bag-of-words model using counting
      2. Tokenization – turning a string into a list of words
      3. Stemming – the rule-based removal of affixes
      4. Lemmatization – dictionary-based word normalization
      5. A bag-of-words model in scikit-learn
    4. Leveraging term importance and semantics
      1. Generalizing words using n-grams and skip-grams
      2. Reducing word dictionary size using SVD
      3. Measuring the importance of words using TF-IDF
      4. Extracting semantics using word embeddings
    5. Implementing end-to-end language models
      1. The end-to-end learning of token sequences
      2. State-of-the-art sequence-to-sequence models
      3. Text analytics using Azure Cognitive Services
    6. Summary
  16. Chapter 8: Azure Machine Learning Pipelines
    1. Technical requirements
    2. Using pipelines in ML workflows
      1. Why build pipelines?
      2. What are Azure Machine Learning pipelines?
    3. Building and publishing an ML pipeline
      1. Creating a simple pipeline
      2. Connecting data inputs and outputs between steps
      3. Publishing, triggering, and scheduling a pipeline
      4. Parallelizing steps to speed up large pipelines
      5. Reusing pipeline steps through modularization
    4. Integrating pipelines with other Azure services
      1. Building pipelines with Azure Machine Learning designer
      2. Azure Machine Learning pipelines in Azure Data Factory
      3. Azure Pipelines for CI/CD
    5. Summary
  17. Section 3: The Training and Optimization of Machine Learning Models
  18. Chapter 9: Building ML Models Using Azure Machine Learning
    1. Technical requirements
    2. Working with tree-based ensemble classifiers
      1. Understanding a simple decision tree
      2. Combining classifiers with bagging
      3. Optimizing classifiers with boosting rounds
    3. Training an ensemble classifier model using LightGBM
      1. LightGBM in a nutshell
      2. Preparing the data
      3. Setting up the compute cluster and execution environment
      4. Building a LightGBM classifier
      5. Scheduling the training script on the Azure Machine Learning cluster
    4. Summary
  19. Chapter 10: Training Deep Neural Networks on Azure
    1. Technical requirements
    2. Introduction to Deep Learning
      1. Why Deep Learning?
      2. From neural networks to deep learning
      3. DL versus traditional ML
      4. Using traditional ML with DL-based feature extractors
    3. Training a CNN for image classification
      1. Training a CNN from scratch in your notebook
      2. Generating more input data using augmentation
      3. Training on a GPU cluster using Azure Machine Learning
      4. Improving your performance through transfer learning
    4. Summary
  20. Chapter 11: Hyperparameter Tuning and Automated Machine Learning
    1. Technical requirements
    2. Finding the optimal model parameters with HyperDrive
      1. Sampling all possible parameter combinations using grid search
      2. Testing random combinations using random search
      3. Converging faster using early termination
      4. Optimizing parameter choices using Bayesian optimization
    3. Finding the optimal model with Automated Machine Learning
      1. The unfair advantage of Automated Machine Learning
      2. A classification example with Automated Machine Learning
    4. Summary
  21. Chapter 12: Distributed Machine Learning on Azure
    1. Technical requirements
    2. Exploring methods for distributed ML
      1. Training independent models on small data in parallel
      2. Training a model ensemble on large datasets in parallel
      3. Fundamental building blocks for distributed ML
      4. Speeding up deep learning with data-parallel training
      5. Training large models with model-parallel training
    3. Using distributed ML in Azure
      1. Horovod – a distributed deep learning training framework
      2. Implementing the HorovodRunner API for a Spark job
      3. Training models with Horovod on Azure Machine Learning
    4. Summary
  22. Chapter 13: Building a Recommendation Engine in Azure
    1. Technical requirements
    2. Introduction to recommendation engines
    3. A content-based recommender system
      1. Measuring the similarity between items
      2. Feature engineering for content-based recommenders
      3. Content-based recommendations using gradient boosted trees
    4. Collaborative filtering – a rating-based recommender system
      1. What is a rating? Explicit feedback versus implicit feedback
      2. Predicting the missing ratings to make a recommendation
      3. Scalable recommendations using ALS factorization
    5. Combining content and ratings in hybrid recommendation engines
    6. Automatic optimization through reinforcement learning
    7. Summary
  23. Section 4: Machine Learning Model Deployment and Operations
  24. Chapter 14: Model Deployment, Endpoints, and Operations
    1. Technical requirements
    2. Preparations for model deployments
      1. Understanding the components of an ML model
      2. Registering your models in a model registry
      3. Auto-deployments of registered models
      4. Customizing your deployment environment
      5. Choosing a deployment target in Azure
    3. Deploying ML models in Azure
      1. Building a real-time scoring service
      2. Deploying to Azure Kubernetes Services
      3. Defining a schema for scoring endpoints
      4. Managing model endpoints
      5. Controlled rollouts and A/B testing
      6. Implementing a batch-scoring pipeline
    4. ML operations in Azure
      1. Profiling models for optimal resource configuration
      2. Collecting logs and infrastructure metrics
      3. Tracking telemetry and application metrics
      4. Detecting data drift
    5. Summary
  25. Chapter 15: Model Interoperability, Hardware Optimization, and Integrations
    1. Technical requirements
    2. Model interoperability with ONNX
      1. What is model interoperability and how can ONNX help?
      2. Converting models to ONNX format with ONNX frontends
      3. Native scoring of ONNX models with ONNX backends
    3. Hardware optimization with FPGAs
      1. Understanding FPGAs
      2. Comparing GPUs and FPGAs for deep neural networks
      3. Running DNN inferencing on Intel FPGAs with Azure
    4. Integrating ML models and endpoints with Azure services
      1. Integrating with Azure IoT Edge
      2. Integrating with Power BI
    5. Summary
  26. Chapter 16: Bringing Models into Production with MLOps
    1. Technical requirements
    2. Ensuring reproducible builds and deployments
      1. Version-controlling your code
      2. Registering snapshots of your data
      3. Tracking your model metadata and artifacts
      4. Scripting your environments and deployments
    3. Validating the code, data, and models
      1. Testing data quality with unit tests
      2. Integration testing for ML
      3. End-to-end testing using Azure Machine Learning
      4. Continuous profiling of your model
    4. Building an end-to-end MLOps pipeline
      1. Setting up Azure DevOps
      2. Continuous integration – building code with pipelines
      3. Continuous deployment – deploying models with release pipelines
    5. Summary
  27. Chapter 17: Preparing for a Successful ML Journey
    1. Remembering the importance of data
    2. Starting with a thoughtful infrastructure
    3. Automating recurrent tasks
    4. Expecting constant change
    5. Thinking about your responsibility
      1. Interpreting a model
      2. Fairness in model training
      3. Handling PII data and compliance requirements
    6. Summary
    7. Why subscribe?
  28. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts

Product information

  • Title: Mastering Azure Machine Learning - Second Edition
  • Author(s): Christoph Körner, Marcel Alsdorf
  • Release date: May 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781803232416