Optimization: Gradient Descent and Deep Learning (ML Foundations Series)

Published by Pearson

Content level: Intermediate

State-of-the-Art Approaches for Accurate and Efficient Model Fitting

  • Capstone ML class: Serves as the final class in the 14-part "ML Foundations" series by Jon Krohn, blending material from the fields of linear algebra, calculus, probability, statistics, and algorithms.
  • Gradient descent mastery: Develop a deep understanding of the essential theory behind the ubiquitous gradient descent approach to optimization, along with hands-on experience applying it using PyTorch and TensorFlow.
  • Latest optimization techniques: Learn about state-of-the-art optimizers, such as Adam and Nadam, widely used for training deep neural networks, while also receiving guidance on next steps in your ML journey.

The Machine Learning Foundations series of online trainings provides a comprehensive overview of all of the subjects — mathematics, statistics, and computer science — that underlie contemporary machine learning techniques, including deep learning and other artificial intelligence approaches. Extensive curriculum detail can be found at the course’s GitHub repo.

All of the classes in the ML Foundations series bring theory to life through the combination of vivid full-color illustrations, straightforward Python examples within hands-on Jupyter notebook demos, and comprehension exercises with fully worked solutions.

The focus is on providing you with a practical, functional understanding of the content covered. Context will be given for each topic, highlighting its relevance to machine learning. You will be better positioned to understand cutting-edge machine learning papers and you will be provided with resources for digging even deeper into topics that pique your curiosity.

There are 14 classes in the series, organized into four subject areas:

Linear Algebra (three classes)

  • Linear Algebra for Machine Learning: Intro
  • Linear Algebra for Machine Learning, Level II: Matrix Tensors
  • Linear Algebra for Machine Learning, Level III: Eigenvectors

Calculus (four classes)

  • Calculus for Machine Learning: Intro
  • Calculus for Machine Learning, Level II: Automatic Differentiation
  • Calculus for Machine Learning, Level III: Partial Derivatives
  • Calculus for Machine Learning, Level IV: Gradients & Integrals

Probability and Statistics (four classes)

  • Intro to Probability Theory
  • Probability II and Information Theory
  • Intro to Statistics
  • Statistics II: Regression and Bayesian

Computer Science (three classes)

  • Intro to Data Structures and Algorithms
  • DSA II: Hashing, Trees, and Graphs
  • Optimization

Each of the four subject areas is fairly independent; however, theory within a given subject area generally builds over its three or four classes — topics in later classes of a given subject area often assume an understanding of topics from earlier classes. Work through the individual classes based on your particular interests or your existing familiarity with the material.

(Note that at any given time, only a subset of the ML Foundations classes will be scheduled and open for registration.)

This class, Optimization, is the final class in the 14-part Machine Learning Foundations series. It builds upon the material from each of the other classes in the series — on linear algebra, calculus, probability, statistics, and algorithms — to provide a detailed introduction to training ML models. Through the measured exposition of theory paired with interactive examples, you’ll develop a working understanding of all of the essential theory behind the ubiquitous gradient descent approach to optimization, as well as how to apply it yourself — both at a granular, matrix-operations level and at a quick, abstract level — with TensorFlow and PyTorch. You’ll also learn about the latest optimizers, such as Adam and Nadam, that are widely used for training deep neural networks. By the end of class, now well equipped with all of the foundational knowledge underlying ML, you’ll receive guidance on where to proceed next on your ML journey.
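
To give a flavor of the quick, abstract level, here is a minimal illustrative sketch (our own toy example, not drawn from the course materials) that fits a straight line with PyTorch's built-in torch.optim.SGD optimizer:

    import torch

    # Toy data: y = 2x + 1 plus a little noise
    torch.manual_seed(42)
    x = torch.linspace(0, 1, 50).unsqueeze(1)
    y = 2 * x + 1 + 0.05 * torch.randn_like(x)

    model = torch.nn.Linear(1, 1)        # fits y = wx + b
    cost_fn = torch.nn.MSELoss()         # mean squared error cost
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    for epoch in range(500):
        optimizer.zero_grad()            # clear the previous gradients
        cost = cost_fn(model(x), y)      # forward pass: compute the cost
        cost.backward()                  # backward pass: compute gradients
        optimizer.step()                 # take one step down the gradient

    print(model.weight.item(), model.bias.item())  # approaches 2.0 and 1.0

The class covers both this abstracted style and the matrix-level operations that the optimizer hides.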

What you’ll learn and how you can apply it

  • Discover how the statistical and machine learning approaches to optimization differ, and why you would select one or the other for a given problem you’re solving.
  • Find out how the extremely versatile (stochastic) gradient descent optimization algorithm works, including how to apply it — from a low, in-depth level as well as from a high, abstracted level — within the most popular deep learning libraries, TensorFlow and PyTorch.
  • Get acquainted with the “fancy” optimizers that are available for advanced machine learning approaches (e.g., deep learning) and when you should consider using them.

This live event is for you because...

  • You use high-level software (e.g., scikit-learn, the Keras API, PyTorch Lightning) to train or deploy machine learning algorithms, and would now like to understand the fundamentals underlying the abstractions, enabling you to expand your capabilities
  • You’re a software developer who would like to develop a firm foundation for the deployment of machine learning algorithms into production systems
  • You’re a data scientist who would like to reinforce your understanding of the subjects at the core of your professional discipline
  • You’re a data analyst or A.I. enthusiast who would like to become a data scientist or data/ML engineer, and so you’re keen to deeply understand the field you’re entering from the ground up (very wise of you!)

Prerequisites

Course Set-up

  • During class, we’ll work on Jupyter notebooks interactively in the cloud via Google Colab. This requires zero setup, and instructions will be provided in class.

Recommended Preparation

If you’re feeling extremely ambitious, you can get a head start on the content we’ll be covering in class by viewing Lessons 8-9 of Jon Krohn’s Data Structures, Algorithms, and ML Optimization LiveLessons.

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Segment 1: Optimization Approaches (30 min)

  • The Statistical Approach to Regression: Ordinary Least Squares (see the sketch after this list)
  • When Statistical Approaches to Optimization Break Down
  • The Machine Learning Solution
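
As a taste of the ordinary least squares topic, here is a minimal illustrative sketch (our own toy example, with assumed variable names, not taken from the course notebooks) that solves a regression in one shot via the normal equations:

    import torch

    # Toy design matrix X (a bias column of ones plus one feature) and targets y
    torch.manual_seed(42)
    x = torch.linspace(0, 1, 50).unsqueeze(1)
    X = torch.cat([torch.ones_like(x), x], dim=1)
    y = 2 * x + 1 + 0.05 * torch.randn_like(x)

    # Normal equations: solve (X^T X) w = X^T y for the weights w
    w = torch.linalg.solve(X.T @ X, X.T @ y)
    print(w.squeeze())  # approaches [1.0, 2.0]: intercept and slope

This one-shot statistical approach breaks down when X^T X is not invertible (e.g., collinear features) or when the dataset is too large to process at once — which is where the machine learning solution, iterative optimization by gradient descent, comes in.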

Q&A: 5 minutes

Break: 10 minutes

Segment 2: Gradient Descent (105 min)

  • Objective Functions
  • Cost / Loss / Error Functions
  • Minimizing Cost with Gradient Descent
  • Learning Rate
  • Critical Points, incl. Saddle Points
  • Gradient Descent from Scratch with PyTorch (see the sketch after this list)
  • Checkpoint, Q&A, and Break
  • The Global Minimum and Local Minima
  • Mini-Batches and Stochastic Gradient Descent (SGD)
  • Learning Rate Scheduling
  • Maximizing Reward with Gradient Ascent
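
To preview what "from scratch" means here, below is a minimal illustrative sketch (our own toy example; the variable names are assumptions, not the course's) in which the parameter updates are written out by hand, with PyTorch's autograd supplying the gradients:

    import torch

    # The same toy line-fitting data as above
    torch.manual_seed(42)
    x = torch.linspace(0, 1, 50).unsqueeze(1)
    y = 2 * x + 1 + 0.05 * torch.randn_like(x)

    # Parameters to fit by hand: slope m and intercept b
    m = torch.zeros(1, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    lr = 0.1  # learning rate: the size of each step down the gradient

    for epoch in range(500):
        y_hat = m * x + b                    # forward pass
        cost = torch.mean((y_hat - y) ** 2)  # mean squared error cost
        cost.backward()                      # autograd computes dC/dm and dC/db
        with torch.no_grad():                # update without tracking gradients
            m -= lr * m.grad                 # step against the gradient
            b -= lr * b.grad
            m.grad.zero_()                   # reset gradients for the next epoch
            b.grad.zero_()

    print(m.item(), b.item())  # approaches 2.0 and 1.0

Computing the cost on small random mini-batches of the data, rather than on the full dataset each epoch, turns this into stochastic gradient descent (SGD).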

Q&A: 5 minutes

Break: 10 minutes

Segment 3: Fancy Deep Learning Optimizers (60 min)

  • A Layer of Artificial Neurons in PyTorch
  • Jacobian Matrices
  • Hessian Matrices and Second-Order Optimization
  • Momentum
  • Nesterov Momentum
  • AdaGrad
  • AdaDelta
  • RMSProp
  • Adam (see the sketch after this list)
  • Nadam
  • Training a Deep Neural Net
  • Resources for the Further Study of Machine Learning
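
In libraries such as PyTorch, swapping in one of these fancier optimizers is typically a one-line change. A minimal illustrative sketch (the hyperparameter values shown are common defaults, not course-prescribed settings):

    import torch

    model = torch.nn.Linear(1, 1)  # stand-in for any network's parameters

    # Plain SGD upgraded with (Nesterov) momentum ...
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                                momentum=0.9, nesterov=True)

    # ... versus Adam, which adapts a per-parameter learning rate;
    # torch.optim.NAdam similarly adds Nesterov momentum to Adam
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001,
                                 betas=(0.9, 0.999))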

Q&A: 15 minutes

Course wrap-up and next steps (15 minutes)

Your Instructor

  • Jon Krohn

    Jon Krohn is Co-Founder and Chief Data Scientist at the machine learning company Nebula. He authored the book Deep Learning Illustrated, an instant #1 bestseller that was translated into seven languages. He is also the host of SuperDataScience, the data science industry’s most listened-to podcast. Jon is renowned for his compelling lectures, which he offers at leading universities and conferences, as well as via his award-winning YouTube channel. He holds a PhD from Oxford and has been publishing on machine learning in prominent academic journals since 2010.
