Machine Learning from Scratch
Published by O'Reilly Media, Inc.
Build machine learning algorithms from scratch with Python
Machine learning is becoming more accessible thanks to libraries like scikit-learn and PyTorch. In fact, it’s becoming so accessible that few practitioners actually take the time to understand what happens under the hood. As machine learning becomes increasingly commoditized, those interested in machine learning should take the time to understand these algorithms intimately to stay competitive. The risks of trusting black boxes, from lack of insight to complete misuse, often outweigh the benefits of convenience.
Expert Thomas Nield takes you on a highly practical deep dive into building machine learning algorithms and models from scratch with Python, including linear regression, k-means clustering, logistic regression, naive Bayes, decision trees, and neural networks. You’ll gain a better understanding of how machine learning works and how to use libraries (or build them from scratch) more confidently. You’ll also learn simple but powerful optimization tools to explore ideas quickly in plain Python, without the need for calculus, partial derivatives, or linear algebra.
What you’ll learn and how you can apply it
By the end of this live, hands-on, online course, you’ll understand:
- The fundamental concepts behind different machine learning algorithms, including regression and classification tasks
- The challenges and strengths of each machine learning model
- What makes machine learning tick and different ways to perform supervised machine learning
And you’ll be able to:
- Build linear regression, logistic regression, naive Bayes, decision tree, and neural network models completely from scratch
- Leverage hill climbing to optimize machine learning parameters easily and without calculus
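To give a feel for the hill-climbing objective above, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the course's own code: the function names, step size, iteration count, and toy data are all hypothetical. It fits a line by randomly nudging one parameter at a time and keeping the nudge only when the sum of squared errors drops.

```python
import random


def sum_of_squares(points, m, b):
    """Sum of squared errors of the line y = m*x + b over the points."""
    return sum((y - (m * x + b)) ** 2 for x, y in points)


def hill_climb_line(points, iterations=50_000, step=0.1, seed=0):
    """Fit y = m*x + b by random hill climbing (no calculus required).

    The defaults here are illustrative assumptions, not tuned values.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    m, b = 0.0, 0.0
    best_loss = sum_of_squares(points, m, b)

    for _ in range(iterations):
        # Randomly nudge exactly one of the two parameters.
        dm, db = 0.0, 0.0
        if rng.random() < 0.5:
            dm = rng.gauss(0, step)
        else:
            db = rng.gauss(0, step)

        # Keep the nudge only if it lowers the loss; otherwise discard it.
        new_loss = sum_of_squares(points, m + dm, b + db)
        if new_loss < best_loss:
            m, b, best_loss = m + dm, b + db, new_loss

    return m, b, best_loss


# Toy data lying exactly on y = 2x + 3, so m and b should end up near 2 and 3.
points = [(1, 5), (2, 7), (3, 9), (4, 11), (5, 13)]
m, b, loss = hill_climb_line(points)
print(f"m = {m:.3f}, b = {b:.3f}, loss = {loss:.6f}")
```

The same loop works for any model whose quality can be scored with a single number, which is why it is a convenient stand-in for gradient-based methods when exploring ideas.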
This live event is for you because...
- You’re a data science professional who wants to interpret machine learning beyond a black-box understanding.
- You’re a programmer who wants to see what machine learning is all about and how to do it from scratch.
- You’re not intimidated by code and basic math and want to see how these two can be combined to perform supervised machine learning.
Prerequisites
- Comfort and proficiency with Python, including variables, functions, loops, generators, and classes.
- Basic knowledge of NumPy and/or pandas is recommended, but not required.
Course Setup Instructions:
- Set up a Python environment of your choice. The instructor will be using PyCharm with Python 3.7.
- Course GitHub files: https://github.com/thomasnield/oreilly_machine_learning_from_scratch/
Recommended preparation:
- Attend Python in 5 Weeks: Python Programming for Beginners (live online course with Reuven Lerner)
- If you are new to NumPy or pandas, consider reviewing Chapter 4, “NumPy Basics: Arrays and Vectorized Computation,” and Chapter 5, “Getting Started with pandas,” in Python for Data Analysis, 2nd Edition (book).
Recommended follow-up:
- Attend Essential Math for Data Science in 4 weeks (live online course with Thomas Nield)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Getting started (10 minutes)
- Lecture: Overview and expectations
- Group discussion: The importance of optimization
- Group discussion: Understanding Python performance
- Q&A
Linear regression and k-means clustering (45 minutes)
- Lecture: Fundamentals; minimizing the sum and mean of squares; simple linear regression; multivariable linear regression; k-means clustering
- Hands-on exercise: Complete simple linear regression
- Q&A
- Break (10 minutes)
Multiple linear regression and supervised ML caveats (45 minutes)
- Lecture: Multivariable linear regression; k-means clustering
- Lecture: Overfitting, bias, and computational complexity
- Hands-on exercise: Perform multivariable linear regression
- Q&A
- Break (10 minutes)
Logistic regression (45 minutes)
- Lecture: Logistic regression concepts; simple logistic regression; multivariable logistic regression; logistic regression to categorize text
- Hands-on exercise: Build a logistic regression
- Q&A
- Break (10 minutes)
Naive Bayes (15 minutes)
- Lecture: Categorizing text; how to implement naive Bayes
- Hands-on exercise: Build an email spam classifier; test the model
- Q&A
Decision trees and random forests (50 minutes)
- Lecture: Decision tree fundamentals; building a decision tree; building a random forest
- Hands-on exercise: Score Gini impurity; test the model
- Q&A
- Break (10 minutes)
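For readers unfamiliar with the Gini scoring mentioned in the decision tree exercise, the sketch below shows one common way to compute Gini impurity and use it to pick a split threshold on a single numeric feature. The function names and the toy data are assumptions made for illustration; the course materials may structure this differently.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((count / n) ** 2 for count in counts.values())


def weighted_gini(left, right):
    """Average impurity of two groups, weighted by group size."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


def best_threshold(values, labels):
    """Try midpoints between sorted feature values; return (threshold, score)."""
    pairs = sorted(zip(values, labels))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # identical values cannot be separated by a threshold
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for value, label in pairs if value < threshold]
        right = [label for value, label in pairs if value >= threshold]
        score = weighted_gini(left, right)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical toy data: low feature values are "ham", high values are "spam".
values = [1, 2, 3, 10, 11, 12]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]
print(gini_impurity(labels))           # evenly mixed two-class group
print(best_threshold(values, labels))  # split that separates the classes
```

A decision tree built from scratch would apply `best_threshold` to every feature, split on the lowest-scoring one, and recurse on each side.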
Neural networks (50 minutes)
- Lecture: Neural network fundamentals; building a neural network to classify colors; forward propagation; convex vs non-convex problems
- Hands-on exercise: Choose hyperparameters
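As a rough picture of the forward propagation step named in the lecture, the following sketch pushes one RGB color through a tiny network in plain Python. Everything specific here is an assumption for illustration: the layer sizes, the ReLU and sigmoid activations, the random untrained weights, and the function names are not taken from the course.

```python
import math
import random


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(rgb, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward propagation: inputs -> hidden layer (ReLU) -> one output (sigmoid)."""
    inputs = [channel / 255 for channel in rgb]  # scale each channel to 0..1

    # Each hidden node is a weighted sum of the inputs plus a bias, then ReLU.
    hidden = [
        relu(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]

    # The output node is a weighted sum of the hidden values, then sigmoid.
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return sigmoid(z)


# Untrained, randomly initialized parameters (3 inputs, 3 hidden nodes, 1 output).
rng = random.Random(7)
hidden_weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
hidden_biases = [rng.uniform(-1, 1) for _ in range(3)]
output_weights = [rng.uniform(-1, 1) for _ in range(3)]
output_bias = rng.uniform(-1, 1)

p = forward((255, 200, 40), hidden_weights, hidden_biases, output_weights, output_bias)
print(f"output = {p:.3f}")  # a value between 0 and 1; not meaningful until trained
```

Training would then adjust the weights and biases so that the output matches labeled colors; the number of hidden nodes and the choice of activations are examples of the hyperparameters the exercise refers to.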
Your Instructor
Thomas Nield
Thomas Nield is the founder of Nield Consulting Group and an instructor at O’Reilly Media and the University of Southern California, teaching classes on data analysis, machine learning, mathematical optimization, AI system safety, and practical artificial intelligence. He’s authored multiple books, including Getting Started with SQL and Essential Math for Data Science, both for O’Reilly. He’s also the founder and inventor of Yawman Flight, a company that develops universal handheld controls for flight simulation and unmanned aerial vehicles. Thomas enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it.