Deep Learning

Book description

Ever since computers began beating us at chess, they've been getting better at a wide range of human activities, from writing songs and generating news articles to helping doctors provide healthcare.

Deep learning is the source of many of these breakthroughs, and its remarkable ability to find patterns hiding in data has made it the fastest-growing field in artificial intelligence (AI). Digital assistants on our phones use deep learning to understand and respond intelligently to voice commands; automotive systems use it to safely navigate road hazards; online platforms use it to deliver personalized suggestions for movies and books – the possibilities are endless.

Deep Learning: A Visual Approach is for anyone who wants to understand this fascinating field in depth, but without any of the advanced math and programming usually required to grasp its internals. If you want to know how these tools work and how to use them yourself, the answers are all within these pages. And if you’re ready to write your own programs, there are plenty of supplemental Python notebooks in the accompanying GitHub repository to get you going.

The book’s conversational style, extensive color illustrations, illuminating analogies, and real-world examples expertly explain the key concepts in deep learning, including:

• How text generators create novel stories and articles
• How deep learning systems learn to play and win at human games
• How image classification systems identify objects or people in a photo
• How to think about probabilities in a way that’s useful to everyday life
• How to use the machine learning techniques that form the core of modern AI

Intellectual adventurers of all kinds can use the powerful ideas covered in Deep Learning: A Visual Approach to build intelligent systems that help us better understand the world and everyone who lives in it. It’s the future of AI, and this book allows you to fully envision it.

Table of contents

  1. Title Page
  2. Copyright
  3. Dedication
  4. About the Author
  5. Acknowledgments
  6. Introduction
    1. Who This Book Is For
    2. This Book Has No Complex Math and No Code
    3. There Is Code, If You Want It
    4. The Figures Are Available, Too!
    5. Errata
    6. About This Book
      1. Part I: Foundational Ideas
      2. Part II: Basic Machine Learning
      3. Part III: Deep Learning Basics
      4. Part IV: Beyond the Basics
    7. Final Words
  7. Part I: Foundational Ideas
    1. Chapter 1: An Overview of Machine Learning
      1. Expert Systems
      2. Supervised Learning
      3. Unsupervised Learning
      4. Reinforcement Learning
      5. Deep Learning
      6. Summary
    2. Chapter 2: Essential Statistics
      1. Describing Randomness
      2. Random Variables and Probability Distributions
      3. Some Common Distributions
        1. Continuous Distributions
        2. Discrete Distributions
      4. Collections of Random Values
        1. Expected Value
        2. Dependence
        3. Independent and Identically Distributed Variables
      5. Sampling and Replacement
        1. Selection with Replacement
        2. Selection Without Replacement
      6. Bootstrapping
      7. Covariance and Correlation
        1. Covariance
        2. Correlation
      8. Statistics Don’t Tell Us Everything
      9. High-Dimensional Spaces
      10. Summary
    3. Chapter 3: Measuring Performance
      1. Different Types of Probability
        1. Dart Throwing
        2. Simple Probability
        3. Conditional Probability
        4. Joint Probability
        5. Marginal Probability
      2. Measuring Correctness
        1. Classifying Samples
        2. The Confusion Matrix
        3. Characterizing Incorrect Predictions
        4. Measuring Correct and Incorrect
        5. Accuracy
        6. Precision
        7. Recall
        8. Precision-Recall Tradeoff
        9. Misleading Measures
        10. F1 Score
        11. About These Terms
        12. Other Measures
      3. Constructing a Confusion Matrix Correctly
      4. Summary
    4. Chapter 4: Bayes’ Rule
      1. Frequentist and Bayesian Probability
        1. The Frequentist Approach
        2. The Bayesian Approach
        3. Frequentists vs. Bayesians
      2. Frequentist Coin Flipping
      3. Bayesian Coin Flipping
        1. A Motivating Example
        2. Picturing the Coin Probabilities
        3. Expressing Coin Flips as Probabilities
        4. Bayes’ Rule
        5. Discussion of Bayes’ Rule
      4. Bayes’ Rule and Confusion Matrices
      5. Repeating Bayes’ Rule
        1. The Posterior-Prior Loop
        2. The Bayes Loop in Action
      6. Multiple Hypotheses
      7. Summary
    5. Chapter 5: Curves and Surfaces
      1. The Nature of Functions
      2. The Derivative
        1. Maximums and Minimums
        2. Tangent Lines
        3. Finding Minimums and Maximums with Derivatives
      3. The Gradient
        1. Water, Gravity, and the Gradient
        2. Finding Maximums and Minimums with Gradients
        3. Saddle Points
      4. Summary
    6. Chapter 6: Information Theory
      1. Surprise and Context
        1. Understanding Surprise
        2. Unpacking Context
      2. Measuring Information
      3. Adaptive Codes
        1. Speaking Morse
        2. Customizing Morse Code
      4. Entropy
      5. Cross Entropy
        1. Two Adaptive Codes
        2. Using the Codes
        3. Cross Entropy in Practice
      6. Kullback–Leibler Divergence
      7. Summary
  8. Part II: Basic Machine Learning
    1. Chapter 7: Classification
      1. Two-Dimensional Binary Classification
      2. 2D Multiclass Classification
      3. Multiclass Classification
        1. One-Versus-Rest
        2. One-Versus-One
      4. Clustering
      5. The Curse of Dimensionality
        1. Dimensionality and Density
        2. High-Dimensional Weirdness
      6. Summary
    2. Chapter 8: Training and Testing
      1. Training
      2. Testing the Performance
        1. Test Data
        2. Validation Data
      3. Cross-Validation
      4. k-Fold Cross-Validation
      5. Summary
    3. Chapter 9: Overfitting and Underfitting
      1. Finding a Good Fit
        1. Overfitting
        2. Underfitting
      2. Detecting and Addressing Overfitting
        1. Early Stopping
        2. Regularization
      3. Bias and Variance
        1. Matching the Underlying Data
        2. High Bias, Low Variance
        3. Low Bias, High Variance
        4. Comparing Curves
      4. Fitting a Line with Bayes’ Rule
      5. Summary
    4. Chapter 10: Data Preparation
      1. Basic Data Cleaning
      2. The Importance of Consistency
      3. Types of Data
      4. One-Hot Encoding
      5. Normalizing and Standardizing
        1. Normalization
        2. Standardization
        3. Remembering the Transformation
      6. Types of Transformations
        1. Slice Processing
        2. Samplewise Processing
        3. Featurewise Processing
        4. Elementwise Processing
      7. Inverse Transformations
      8. Information Leakage in Cross-Validation
      9. Shrinking the Dataset
        1. Feature Selection
        2. Dimensionality Reduction
      10. Principal Component Analysis
        1. PCA for Simple Images
        2. PCA for Real Images
      11. Summary
    5. Chapter 11: Classifiers
      1. Types of Classifiers
      2. k-Nearest Neighbors
      3. Decision Trees
        1. Using Decision Trees
        2. Overfitting Trees
        3. Splitting Nodes
      4. Support Vector Machines
        1. The Basic Algorithm
        2. The SVM Kernel Trick
      5. Naive Bayes
      6. Comparing Classifiers
      7. Summary
    6. Chapter 12: Ensembles
      1. Voting
      2. Ensembles of Decision Trees
        1. Bagging
        2. Random Forests
        3. Extra Trees
      3. Boosting
      4. Summary
  9. Part III: Deep Learning Basics
    1. Chapter 13: Neural Networks
      1. Real Neurons
      2. Artificial Neurons
        1. The Perceptron
        2. Modern Artificial Neurons
      3. Drawing the Neurons
      4. Feed-Forward Networks
      5. Neural Network Graphs
      6. Initializing the Weights
      7. Deep Networks
      8. Fully Connected Layers
      9. Tensors
      10. Preventing Network Collapse
      11. Activation Functions
        1. Straight-Line Functions
        2. Step Functions
        3. Piecewise Linear Functions
        4. Smooth Functions
        5. Activation Function Gallery
        6. Comparing Activation Functions
      12. Softmax
      13. Summary
    2. Chapter 14: Backpropagation
      1. A High-Level Overview of Training
        1. Punishing Error
        2. A Slow Way to Learn
        3. Gradient Descent
      2. Getting Started
      3. Backprop on a Tiny Neural Network
        1. Finding Deltas for the Output Neurons
        2. Using Deltas to Change Weights
        3. Other Neuron Deltas
      4. Backprop on a Larger Network
      5. The Learning Rate
        1. Building a Binary Classifier
        2. Picking a Learning Rate
        3. An Even Smaller Learning Rate
      6. Summary
    3. Chapter 15: Optimizers
      1. Error as a 2D Curve
      2. Adjusting the Learning Rate
        1. Constant-Sized Updates
        2. Changing the Learning Rate over Time
        3. Decay Schedules
      3. Updating Strategies
        1. Batch Gradient Descent
        2. Stochastic Gradient Descent
        3. Mini-Batch Gradient Descent
      4. Gradient Descent Variations
        1. Momentum
        2. Nesterov Momentum
        3. Adagrad
        4. Adadelta and RMSprop
        5. Adam
      5. Choosing an Optimizer
      6. Regularization
        1. Dropout
        2. Batchnorm
      7. Summary
  10. Part IV: Beyond the Basics
    1. Chapter 16: Convolutional Neural Networks
      1. Introducing Convolution
        1. Detecting Yellow
        2. Weight Sharing
        3. Larger Filters
        4. Filters and Features
        5. Padding
      2. Multidimensional Convolution
      3. Multiple Filters
      4. Convolution Layers
        1. 1D Convolution
        2. 1×1 Convolutions
      5. Changing Output Size
        1. Pooling
        2. Striding
        3. Transposed Convolution
      6. Hierarchies of Filters
        1. Simplifying Assumptions
        2. Finding Face Masks
        3. Finding Eyes, Noses, and Mouths
        4. Applying Our Filters
      7. Summary
    2. Chapter 17: Convnets in Practice
      1. Categorizing Handwritten Digits
      2. VGG16
      3. Visualizing Filters, Part 1
      4. Visualizing Filters, Part 2
      5. Adversaries
      6. Summary
    3. Chapter 18: Autoencoders
      1. Introduction to Encoding
        1. Lossless and Lossy Encoding
      2. Blending Representations
      3. The Simplest Autoencoder
      4. A Better Autoencoder
      5. Exploring the Autoencoder
        1. A Closer Look at the Latent Variables
        2. The Parameter Space
        3. Blending Latent Variables
        4. Predicting from Novel Input
      6. Convolutional Autoencoders
        1. Blending Latent Variables
        2. Predicting from Novel Input
      7. Denoising
      8. Variational Autoencoders
        1. Distribution of Latent Variables
        2. Variational Autoencoder Structure
      9. Exploring the VAE
        1. Working with the MNIST Samples
        2. Working with Two Latent Variables
        3. Producing New Input
      10. Summary
    4. Chapter 19: Recurrent Neural Networks
      1. Working with Language
        1. Common Natural Language Processing Tasks
        2. Transforming Text into Numbers
        3. Fine-Tuning and Downstream Networks
      2. Fully Connected Prediction
        1. Testing Our Network
        2. Why Our Network Failed
      3. Recurrent Neural Networks
        1. Introducing State
        2. Rolling Up Our Diagram
        3. Recurrent Cells in Action
        4. Training a Recurrent Neural Network
        5. Long Short-Term Memory and Gated Recurrent Networks
      4. Using Recurrent Neural Networks
        1. Working with Sunspot Data
        2. Generating Text
        3. Different Architectures
      5. Seq2Seq
      6. Summary
    5. Chapter 20: Attention and Transformers
      1. Embedding
        1. Embedding Words
        2. ELMo
      2. Attention
        1. A Motivating Analogy
        2. Self-Attention
        3. Q/KV Attention
        4. Multi-Head Attention
        5. Layer Icons
      3. Transformers
        1. Skip Connections
        2. Norm-Add
        3. Positional Encoding
        4. Assembling a Transformer
        5. Transformers in Action
      4. BERT and GPT-2
        1. BERT
        2. GPT-2
        3. Generators Discussion
        4. Data Poisoning
      5. Summary
    6. Chapter 21: Reinforcement Learning
      1. Basic Ideas
      2. Learning a New Game
      3. The Structure of Reinforcement Learning
        1. Step 1: The Agent Selects an Action
        2. Step 2: The Environment Responds
        3. Step 3: The Agent Updates Itself
        4. Back to the Big Picture
        5. Understanding Rewards
      4. Flippers
      5. L-Learning
        1. The Basics
        2. The L-Learning Algorithm
        3. Testing Our Algorithm
        4. Handling Unpredictability
      6. Q-Learning
        1. Q-Values and Updates
        2. Q-Learning Policy
        3. Putting It All Together
        4. The Elephant in the Room
        5. Q-Learning in Action
      7. SARSA
        1. The Algorithm
        2. SARSA in Action
        3. Comparing Q-Learning and SARSA
      8. The Big Picture
      9. Summary
    7. Chapter 22: Generative Adversarial Networks
      1. Forging Money
        1. Learning from Experience
        2. Forging with Neural Networks
        3. A Learning Round
        4. Why Adversarial?
      2. Implementing GANs
        1. The Discriminator
        2. The Generator
        3. Training the GAN
      3. GANs in Action
        1. Building a Discriminator and Generator
        2. Training Our Network
        3. Testing Our Network
      4. DCGANs
      5. Challenges
        1. Using Big Samples
        2. Mode Collapse
        3. Training with Generated Data
      6. Summary
    8. Chapter 23: Creative Applications
      1. Deep Dreaming
        1. Stimulating Filters
        2. Running Deep Dreaming
      2. Neural Style Transfer
        1. Representing Style
        2. Representing Content
        3. Style and Content Together
        4. Running Style Transfer
      3. Generating More of This Book
      4. Summary
      5. Final Thoughts
  11. References
    1. Chapter 1
    2. Chapter 2
    3. Chapter 3
    4. Chapter 4
    5. Chapter 5
    6. Chapter 6
    7. Chapter 7
    8. Chapter 8
    9. Chapter 9
    10. Chapter 10
    11. Chapter 11
    12. Chapter 12
    13. Chapter 13
    14. Chapter 14
    15. Chapter 15
    16. Chapter 16
    17. Chapter 17
    18. Chapter 18
    19. Chapter 19
    20. Chapter 20
    21. Chapter 21
    22. Chapter 22
    23. Chapter 23
  12. Image Credits
    1. Chapter 1
    2. Chapter 10
    3. Chapter 16
    4. Chapter 17
    5. Chapter 18
    6. Chapter 23
  13. Index
  14. Part V: Bonus Chapters
    1. Chapter B1: Scikit-Learn
      1. Python Conventions and Libraries
      2. Estimators
        1. Creation
        2. Learning with fit()
        3. Predicting with predict()
        4. Using decision_function() and predict_proba()
      3. Clustering
      4. Transformations
        1. Inverse Transformations
      5. Data Refinement
      6. Ensembles
      7. Automation
        1. Cross-Validation
        2. Hyperparameter Searching
        3. Exhaustive Grid Search
        4. Random Grid Search
        5. Pipelines
        6. Looking at the Decision Boundary
        7. Applying Pipelined Transformations
      8. Datasets
      9. Utilities
      10. Wrapping Up
      11. References
    2. Chapter B2: Keras Part 1
      1. The Structure of This Chapter
      2. Libraries, Programming, and Debugging
        1. Versions and Programming Style
        2. Python Programming and Debugging
        3. Running Externally
        4. A Workaround Note
      3. Overview
        1. Tensors and Arrays
        2. Setting Up Keras
        3. Shapes of Tensors Holding Images
        4. GPUs and Other Accelerators
      4. Getting Started
        1. Hello, World
      5. Preparing the Data
        1. Reshaping
        2. Loading the Data
        3. Looking at the Data
        4. Train-Test Splitting
        5. Fixing the Data Type
        6. Normalizing the Data
        7. Fixing the Labels
        8. Pre-Processing All in One Place
      6. Making the Model
        1. Turning Grids into Lists
        2. Creating the Model
        3. Compiling the Model
        4. Model Creation Summary
      7. Training the Model
      8. Training and Using Our Model
        1. Looking at the Output
        2. Prediction
        3. Analysis of Training History
      9. Saving and Loading
        1. Saving Everything in One File
        2. Saving Just the Weights
        3. Saving Just the Architecture
        4. Using Pre-Trained Models
        5. Saving the Pre-Processing Steps
      10. Callbacks
        1. Checkpoints
        2. Learning Rate
        3. Early Stopping
      11. Wrapping Up
      12. References
      13. Image Credits
    3. Chapter B3: Keras Part 2
      1. Improving the Model
        1. Counting Up Hyperparameters
        2. Changing One Hyperparameter
        3. Other Ways to Improve
        4. Adding Another Dense Layer
        5. Less Is More
        6. Adding Dropout
        7. Observations
      2. Using Scikit-Learn
        1. Keras Wrappers
        2. Cross-Validation
        3. Cross-Validation with Normalization
        4. Hyperparameter Searching
      3. Convolution Networks
        1. Utility Layers
        2. Preparing the Data for a CNN
        3. Convolution Layers
        4. Using Convolution for MNIST
        5. Patterns
        6. Image Data Augmentation
        7. Synthetic Data
        8. Parameter Searching for Convnets
      4. RNNs
        1. Generating Sequence Data
        2. RNN Data Preparation
        3. Building, Compiling, and Running the RNN
        4. Analyzing RNN Performance
        5. A More Complex Dataset
        6. Deep RNNs
        7. The Value of More Data
        8. Returning Sequences
        9. Stateful RNNs
        10. Time-Distributed Layers
        11. Generating Text
      5. The Functional API
        1. Input Layers
        2. Making a Functional Model
      6. Summary
      7. References
      8. Image Credits

Product information

  • Title: Deep Learning
  • Author: Andrew Glassner
  • Release date: June 2021
  • Publisher: No Starch Press
  • ISBN: 9781718500723