Machine Learning from Scratch
Published by O'Reilly Media, Inc.
Machine learning is becoming more accessible thanks to libraries like scikit-learn and TensorFlow. In fact, it is becoming so accessible that few practitioners take the time to understand what happens under the hood. As machine learning becomes increasingly commoditized, those interested in it should take time to understand these algorithms intimately to stay competitive. The risks of trusting black boxes, from lack of insight to complete misuse, will often outweigh the benefits of convenience.
In this course, we will take a highly practical approach to building machine learning algorithms from scratch with Python, including linear regression, logistic regression, Naïve Bayes, decision trees, and neural networks. This will give you a better understanding of how machine learning works and allow you to use libraries (or build them from scratch) more confidently. We will learn some simple but powerful optimization tools to generalize solutions quickly, while avoiding distracting concepts like calculus, partial derivatives, and linear algebra.
What you’ll learn and how you can apply it
By the end of this live, hands-on, online course, you’ll understand:
- The fundamental concepts behind different machine learning algorithms, as well as regression and classification tasks
- The challenges and strengths of each machine learning model
- What makes “machine learning” tick, and different ways to perform regression and classification
And you’ll be able to:
- Build linear regression, logistic regression, Naïve Bayes, decision trees, and neural network models completely from scratch
- Leverage hill climbing to optimize machine learning parameters easily and without calculus
- Develop intuition on how machine learning libraries work
This live event is for you because...
- You’re a data science professional wanting to interpret machine learning beyond a “black box” understanding
- You’re a programmer who wants to see what machine learning is all about, and how to do it from scratch
- You’re not intimidated by some code and basic math, and you want to see how the two can be combined to do regression and classification tasks
Prerequisites
- Comfort and proficiency with Python, including variables, functions, loops, generators, and classes.
- Basic knowledge of NumPy and/or Pandas is recommended, but not required.
Recommended preparation:
- Set up a Python environment of your choice. The instructor will be using PyCharm with Python 3.7.
- GitHub files: https://github.com/thomasnield/oreilly_machine_learning_from_scratch/
- If you are new to NumPy or Pandas, consider reviewing Chapter 4, “NumPy Basics: Arrays and Vectorized Computation,” and Chapter 5, “Getting Started with pandas,” in Python for Data Analysis, 2nd Edition (book).
Recommended follow-up:
- Attend Intro to Mathematical Optimization (live online training course with Thomas Nield)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Getting Started (10 minutes)
- Presentation: Overview and Expectations
- Demo: Using hill climbing to find a square root
- Discussion: The importance of optimization
- Exercise: Hill climbing to find a cube root
- Q&A
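As a rough preview of the square root demo, hill climbing can be sketched as repeatedly proposing a small random change to a guess and keeping it only when it improves the result. The function name, starting guess, step size, and iteration count below are illustrative assumptions, not the instructor's actual code.

```python
import random


def hill_climb_sqrt(target, iterations=100_000, step=0.1, seed=0):
    """Approximate the square root of target with random hill climbing.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    guess = 1.0
    best_loss = abs(guess * guess - target)
    for _ in range(iterations):
        # Propose a small random adjustment to the current guess.
        candidate = guess + rng.gauss(0, step)
        loss = abs(candidate * candidate - target)
        # Keep the adjustment only if candidate**2 is closer to target.
        if loss < best_loss:
            guess, best_loss = candidate, loss
    return guess


print(hill_climb_sqrt(25.0))  # lands close to 5
```

The same loop should handle a cube root by changing the loss to compare `candidate ** 3` against the target, which is one way to approach the exercise.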
Linear Regression and K-Means Clustering (50 minutes)
- Presentation: Fundamentals, minimizing the sum/mean of squares
- Walkthrough: Simple linear regression
- Walkthrough: Multivariable linear regression
- Walkthrough: K-Means clustering
- Exercise: Linear regression
- Q&A
- Break (5 minutes)
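To make "minimizing the sum of squares" concrete, here is a minimal sketch that fits a line y = mx + b with the same hill climbing idea, so no calculus is needed. The data points, the function name `fit_line`, and the step size and iteration count are made up for illustration and are not taken from the course materials.

```python
import random


def fit_line(points, iterations=50_000, step=0.05, seed=0):
    """Fit y = m*x + b by hill climbing on the sum of squared errors.

    Hypothetical helper for illustration; all defaults are assumptions.
    """
    rng = random.Random(seed)

    def sum_of_squares(m, b):
        # Total squared difference between actual and predicted y values.
        return sum((y - (m * x + b)) ** 2 for x, y in points)

    m, b = 0.0, 0.0
    best_loss = sum_of_squares(m, b)
    for _ in range(iterations):
        # Randomly nudge both the slope and the intercept.
        new_m = m + rng.gauss(0, step)
        new_b = b + rng.gauss(0, step)
        loss = sum_of_squares(new_m, new_b)
        # Accept the nudge only if it lowers the sum of squares.
        if loss < best_loss:
            m, b, best_loss = new_m, new_b, loss
    return m, b


# Made-up points that lie exactly on y = 2x + 1.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (4, 9)]
m, b = fit_line(points)
print(f"m = {m:.3f}, b = {b:.3f}")
```

Swapping the sum for a mean changes the scale of the loss but not which line wins, which is why the two are often used interchangeably as objectives.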
Logistic Regression (40 minutes)
- Presentation: Logistic regression concepts
- Walkthrough: Simple logistic regression
- Walkthrough: Multivariable logistic regression
- Walkthrough: Logistic regression to categorize text
- Exercise: Testing the model
- Q&A
Naïve Bayes (35 minutes)
- Walkthrough: Categorizing text demo
- Presentation: How to implement Naïve Bayes
- Walkthrough/Exercise: Building an email spam classifier
- Exercise: Testing the model
- Q&A
- Break (5 minutes)
Decision Trees (40 minutes)
- Presentation: Decision tree fundamentals
- Walkthrough: Building a decision tree
- Exercise: Gini scoring, testing the model
- Q&A
- Break (5 minutes)
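For readers unfamiliar with Gini scoring, the sketch below assumes it refers to the standard Gini impurity used to rate decision tree splits: one minus the sum of squared class proportions, where 0 means a group is perfectly pure. The function names and sample labels are hypothetical, and the course may define its scoring differently.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum(p_i ** 2) over the class proportions p_i in labels."""
    total = len(labels)
    if total == 0:
        return 0.0  # treat an empty group as pure
    counts = Counter(labels)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def weighted_gini(left, right):
    """Score a split as the size-weighted average impurity of its two sides."""
    total = len(left) + len(right)
    return (len(left) / total) * gini_impurity(left) + (
        len(right) / total
    ) * gini_impurity(right)


# Made-up labels: an evenly mixed group versus a split that separates it cleanly.
print(gini_impurity(["spam", "spam", "ham", "ham"]))  # 0.5
print(weighted_gini(["spam", "spam"], ["ham", "ham"]))  # 0.0
```

A tree builder under this assumption would try candidate splits on each feature and keep the one with the lowest weighted score.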
Neural Networks (50 minutes)
- Presentation: Neural network fundamentals
- Walkthrough: Building a neural network to classify colors
- Exercise: Testing the model
- Q&A
Your Instructor
Thomas Nield
Thomas Nield is the founder of Nield Consulting Group and an instructor at O’Reilly Media and the University of Southern California, teaching classes on data analysis, machine learning, mathematical optimization, AI system safety, and practical artificial intelligence. He’s authored multiple books including Getting Started with SQL and Essential Math for Data Science, both for O’Reilly. He’s also the founder and inventor of Yawman Flight, a company that develops universal handheld controls for flight simulation and unmanned aerial vehicles. Thomas enjoys making technical content relatable and relevant to those unfamiliar with or intimidated by it.