Hands-On Mathematics for Deep Learning

Book description

A comprehensive guide to becoming well-versed in the mathematical techniques for building modern deep learning architectures

Key Features

  • Understand linear algebra, calculus, gradient algorithms, and other concepts essential for training deep neural networks
  • Learn the mathematical concepts needed to understand how deep learning models function
  • Use deep learning to solve problems in vision, image, text, and sequence applications

Most programmers and data scientists struggle with mathematics, having either overlooked or forgotten core mathematical concepts. This book uses Python libraries to help you understand the math required to build deep learning (DL) models.

You'll begin by learning about the core mathematical and modern computational techniques used to design and implement DL algorithms. This book covers essential topics, such as linear algebra, eigenvalues and eigenvectors, singular value decomposition (SVD), and gradient algorithms, to help you understand how to train deep neural networks. Later chapters focus on important neural networks, such as linear neural networks and multilayer perceptrons, with a primary focus on how each model works. As you advance, you will delve into the math used for regularization, multi-layered DL, forward propagation, optimization, and backpropagation techniques to understand what it takes to build full-fledged DL models. Finally, you'll explore convolutional neural network (CNN), recurrent neural network (RNN), and generative adversarial network (GAN) models and their applications.

By the end of this book, you'll have built a strong foundation in neural networks and DL mathematical concepts, which will help you to confidently research and build custom models in DL.

What you will learn

  • Understand the key mathematical concepts for building neural network models
  • Discover core multivariable calculus concepts
  • Improve the performance of deep learning models using optimization techniques
  • Cover optimization algorithms, from basic stochastic gradient descent (SGD) to the advanced Adam optimizer
  • Understand computational graphs and their importance in DL
  • Explore the backpropagation algorithm to reduce output error
  • Cover DL algorithms such as convolutional neural networks (CNNs), sequence models, and generative adversarial networks (GANs)

Who this book is for

This book is for data scientists, machine learning developers, aspiring deep learning developers, or anyone who wants to understand the foundation of deep learning by learning the math behind it. Working knowledge of the Python programming language and machine learning basics is required.

Table of contents

  1. Title Page
  2. Copyright and Credits
    1. Hands-On Mathematics for Deep Learning
  3. About Packt
    1. Why subscribe?
  4. Contributors
    1. About the author
    2. About the reviewers
    3. Packt is searching for authors like you
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
      1. Download the color images
      2. Conventions used
    4. Get in touch
      1. Reviews
  6. Section 1: Essential Mathematics for Deep Learning
  7. Linear Algebra
    1. Comparing scalars and vectors
    2. Linear equations
      1. Solving linear equations in n-dimensions
      2. Solving linear equations using elimination
    3. Matrix operations
      1. Adding matrices
      2. Multiplying matrices
      3. Inverse matrices
      4. Matrix transpose
      5. Permutations
    4. Vector spaces and subspaces
      1. Spaces
      2. Subspaces
    5. Linear maps
      1. Image and kernel
      2. Metric space and normed space
      3. Inner product space
    6. Matrix decompositions
      1. Determinant
      2. Eigenvalues and eigenvectors
      3. Trace
      4. Orthogonal matrices
      5. Diagonalization and symmetric matrices
      6. Singular value decomposition
      7. Cholesky decomposition
    7. Summary
  8. Vector Calculus
    1. Single variable calculus
      1. Derivatives
        1. Sum rule
        2. Power rule
        3. Trigonometric functions
        4. First and second derivatives
        5. Product rule
        6. Quotient rule
        7. Chain rule
        8. Antiderivative
      2. Integrals
        1. The fundamental theorem of calculus
        2. Substitution rule
        3. Areas between curves
        4. Integration by parts
    2. Multivariable calculus
      1. Partial derivatives
        1. Chain rule
      2. Integrals
    3. Vector calculus
      1. Derivatives
      2. Vector fields
      3. Inverse functions
    4. Summary
  9. Probability and Statistics
    1. Understanding the concepts in probability
      1. Classical probability
        1. Sampling with or without replacement
        2. Multinomial coefficient
        3. Stirling's formula
        4. Independence
        5. Discrete distributions
        6. Conditional probability
        7. Random variables
        8. Variance
        9. Multiple random variables
        10. Continuous random variables
      2. Joint distributions
      3. More probability distributions
        1. Normal distribution
        2. Multivariate normal distribution
        3. Bivariate normal distribution
        4. Gamma distribution
    2. Essential concepts in statistics
      1. Estimation
        1. Mean squared error
        2. Sufficiency
        3. Likelihood
        4. Confidence intervals
        5. Bayesian estimation
      2. Hypothesis testing
        1. Simple hypotheses
        2. Composite hypothesis
        3. The multivariate normal theory
      3. Linear models
        1. Hypothesis testing
    3. Summary
  10. Optimization
    1. Understanding optimization and its different types
      1. Constrained optimization
      2. Unconstrained optimization
      3. Convex optimization
      4. Convex sets
      5. Affine sets
      6. Convex functions
      7. Optimization problems
      8. Non-convex optimization
    2. Exploring the various optimization methods
      1. Least squares
      2. Lagrange multipliers
      3. Newton's method
      4. The secant method
      5. The quasi-Newton method
      6. Game theory
      7. Descent methods
        1. Gradient descent
        2. Stochastic gradient descent
        3. Loss functions
        4. Gradient descent with momentum
        5. Nesterov's accelerated gradient
        6. Adaptive gradient descent
        7. Simulated annealing
        8. Natural evolution
    3. Exploring population methods
      1. Genetic algorithms
      2. Particle swarm optimization
    4. Summary
  11. Graph Theory
    1. Understanding the basic concepts and terminology
    2. Adjacency matrix
    3. Types of graphs
      1. Weighted graphs
      2. Directed graphs
      3. Directed acyclic graphs
      4. Multilayer and dynamic graphs
      5. Tree graphs
    4. Graph Laplacian
    5. Summary
  12. Section 2: Essential Neural Networks
  13. Linear Neural Networks
    1. Linear regression
    2. Polynomial regression
    3. Logistic regression
    4. Summary
  14. Feedforward Neural Networks
    1. Understanding biological neural networks
    2. Comparing the perceptron and the McCulloch-Pitts neuron
      1. The MP neuron
      2. Perceptron
      3. Pros and cons of the MP neuron and perceptron
    3. MLPs
      1. Layers
      2. Activation functions
        1. Sigmoid
        2. Hyperbolic tangent
        3. Softmax
        4. Rectified linear unit
        5. Leaky ReLU
        6. Parametric ReLU
        7. Exponential linear unit
      3. The loss function
        1. Mean absolute error
        2. Mean squared error
        3. Root mean squared error
        4. The Huber loss
        5. Cross entropy
        6. Kullback-Leibler divergence
        7. Jensen-Shannon divergence
      4. Backpropagation
    4. Training neural networks
      1. Parameter initialization
        1. All zeros
        2. Random initialization
        3. Xavier initialization
      2. The data
    5. Deep neural networks
    6. Summary
  15. Regularization
    1. The need for regularization
    2. Norm penalties
      1. L2 regularization
      2. L1 regularization
    3. Early stopping
    4. Parameter tying and sharing
    5. Dataset augmentation
    6. Dropout
    7. Adversarial training
    8. Summary
  16. Convolutional Neural Networks
    1. The inspiration behind ConvNets
    2. Types of data used in ConvNets
    3. Convolutions and pooling
      1. Two-dimensional convolutions
      2. One-dimensional convolutions
      3. 1 × 1 convolutions
      4. Three-dimensional convolutions
      5. Separable convolutions
      6. Transposed convolutions
      7. Pooling
      8. Global average pooling
      9. Convolution and pooling size
    4. Working with the ConvNet architecture
    5. Training and optimization
    6. Exploring popular ConvNet architectures
      1. VGG-16
      2. Inception-v1
    7. Summary
  17. Recurrent Neural Networks
    1. The need for RNNs
    2. The types of data used in RNNs
    3. Understanding RNNs
      1. Vanilla RNNs
      2. Bidirectional RNNs
    4. Long short-term memory
    5. Gated recurrent units
    6. Deep RNNs
    7. Training and optimization
    8. Popular architecture
      1. Clockwork RNNs
    9. Summary
  18. Section 3: Advanced Deep Learning Concepts Simplified
  19. Attention Mechanisms
    1. Overview of attention
    2. Understanding neural Turing machines
      1. Reading
      2. Writing
      3. Addressing mechanisms
        1. Content-based addressing mechanism
        2. Location-based addressing mechanism
    3. Exploring the types of attention
      1. Self-attention
      2. Comparing hard and soft attention
      3. Comparing global and local attention
    4. Transformers
    5. Summary
  20. Generative Models
    1. Why we need generative models
    2. Autoencoders
      1. The denoising autoencoder
      2. The variational autoencoder
    3. Generative adversarial networks
      1. Wasserstein GANs
    4. Flow-based networks
      1. Normalizing flows
      2. Real-valued non-volume preserving
    5. Summary
  21. Transfer and Meta Learning
    1. Transfer learning
    2. Meta learning
      1. Approaches to meta learning
      2. Model-based meta learning
        1. Memory-augmented neural networks
        2. Meta Networks
      3. Metric-based meta learning
        1. Prototypical networks
        2. Siamese neural networks
      4. Optimization-based meta learning
        1. Long Short-Term Memory meta learners
        2. Model-agnostic meta learning
    3. Summary
  22. Geometric Deep Learning
    1. Comparing Euclidean and non-Euclidean data
      1. Manifolds
      2. Discrete manifolds
      3. Spectral decomposition
    2. Graph neural networks
    3. Spectral graph CNNs
    4. Mixture model networks
    5. Facial recognition in 3D
    6. Summary
  23. Other Books You May Enjoy
    1. Leave a review - let other readers know what you think

Product information

  • Title: Hands-On Mathematics for Deep Learning
  • Author(s): Jay Dawani
  • Release date: June 2020
  • Publisher(s): Packt Publishing
  • ISBN: 9781838647292