Deep Learning with TensorFlow and Keras - Third Edition

Book description

Build cutting-edge machine and deep learning systems for the lab, production, and mobile devices. Purchase of the print or Kindle book includes a free eBook in PDF format.

Key Features

  • Understand the fundamentals of deep learning and machine learning through clear explanations and extensive code samples
  • Implement graph neural networks, transformers using Hugging Face and TensorFlow Hub, and joint and contrastive learning
  • Learn cutting-edge machine and deep learning techniques

Book Description

Deep Learning with TensorFlow and Keras teaches you neural networks and deep learning techniques using TensorFlow (TF) and Keras. You'll learn how to write deep learning applications in the most powerful, popular, and scalable machine learning stack available.

TensorFlow 2.x focuses on simplicity and ease of use, with updates like eager execution, intuitive higher-level APIs based on Keras, and flexible model building on any platform. This book uses the latest TF 2.x features and libraries to present an overview of supervised and unsupervised machine learning models, and it provides a comprehensive analysis of deep learning and reinforcement learning models through practical examples for the cloud, mobile devices, and large production environments.
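To give a flavor of that higher-level style, the sketch below (not taken from the book; the layer sizes and input shape are purely illustrative assumptions) shows eager execution and a small Keras Sequential model in TF 2.x:

    import tensorflow as tf

    # Eager execution is on by default in TF 2.x, so operations run immediately.
    x = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    print(tf.reduce_sum(x))  # tf.Tensor(10.0, shape=(), dtype=float32)

    # A tiny dense network expressed with the higher-level Keras API.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.summary()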

This book also shows you how to create neural networks with TensorFlow, runs through popular algorithms and techniques (regression, convolutional neural networks (CNNs), transformers, generative adversarial networks (GANs), recurrent neural networks (RNNs), natural language processing (NLP), and graph neural networks (GNNs)), covers working example apps, and then dives into TF in production, TF mobile, and TensorFlow with AutoML.

What you will learn

  • Learn how to use the popular GNNs with TensorFlow to carry out graph mining tasks
  • Discover the world of transformers, from pretraining to fine-tuning to evaluating them
  • Apply self-supervised learning to natural language processing, computer vision, and audio signal processing
  • Combine probabilistic and deep learning models using TensorFlow Probability
  • Train your models on the cloud and put TF to work in real environments
  • Build machine learning and deep learning systems with TensorFlow 2.x and the Keras API

Who this book is for

This hands-on machine learning book is for Python developers and data scientists who want to build machine learning and deep learning systems with TensorFlow. It gives you the theory and practice required to use Keras, TensorFlow, and AutoML to build machine learning systems. Some machine learning knowledge would be useful, but no prior TensorFlow knowledge is assumed.

Table of contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. Get in touch
    4. References
  2. Neural Network Foundations with TF
    1. What is TensorFlow (TF)?
    2. What is Keras?
    3. Introduction to neural networks
    4. Perceptron
      1. Our first example of TensorFlow code
    5. Multi-layer perceptron: our first example of a network
      1. Problems in training the perceptron and their solution
      2. Activation function: sigmoid
      3. Activation function: tanh
      4. Activation function: ReLU
      5. Two additional activation functions: ELU and Leaky ReLU
      6. Activation functions
      7. In short: what are neural networks after all?
    6. A real example: recognizing handwritten digits
      1. One-hot encoding (OHE)
      2. Defining a simple neural net in TensorFlow
      3. Running a simple TensorFlow net and establishing a baseline
      4. Improving the simple net in TensorFlow with hidden layers
      5. Further improving the simple net in TensorFlow with dropout
      6. Testing different optimizers in TensorFlow
      7. Increasing the number of epochs
      8. Controlling the optimizer learning rate
      9. Increasing the number of internal hidden neurons
      10. Increasing the size of batch computation
      11. Summarizing experiments run for recognizing handwritten digits
    7. Regularization
      1. Adopting regularization to avoid overfitting
      2. Understanding batch normalization
    8. Playing with Google Colab: CPUs, GPUs, and TPUs
    9. Sentiment analysis
      1. Hyperparameter tuning and AutoML
    10. Predicting output
    11. A practical overview of backpropagation
    12. What have we learned so far?
    13. Toward a deep learning approach
    14. Summary
    15. References
  3. Regression and Classification
    1. What is regression?
    2. Prediction using linear regression
      1. Simple linear regression
      2. Multiple linear regression
      3. Multivariate linear regression
    3. Neural networks for linear regression
      1. Simple linear regression using TensorFlow Keras
      2. Multiple and multivariate linear regression using the TensorFlow Keras API
    4. Classification tasks and decision boundaries
      1. Logistic regression
      2. Logistic regression on the MNIST dataset
    5. Summary
    6. References
  4. Convolutional Neural Networks
    1. Deep convolutional neural networks
      1. Local receptive fields
      2. Shared weights and bias
      3. A mathematical example
      4. ConvNets in TensorFlow
      5. Pooling layers
        1. Max pooling
        2. Average pooling
      6. ConvNets summary
    2. An example of DCNN: LeNet
      1. LeNet code in TF
      2. Understanding the power of deep learning
    3. Recognizing CIFAR-10 images with deep learning
      1. Improving the CIFAR-10 performance with a deeper network
      2. Improving the CIFAR-10 performance with data augmentation
      3. Predicting with CIFAR-10
    4. Very deep convolutional networks for large-scale image recognition
      1. Recognizing cats with a VGG16 network
      2. Utilizing the tf.Keras built-in VGG16 net module
      3. Recycling pre-built deep learning models for extracting features
    5. Deep Inception V3 for transfer learning
    6. Other CNN architectures
      1. AlexNet
      2. Residual networks
      3. HighwayNets and DenseNets
      4. Xception
    7. Style transfer
      1. Content distance
      2. Style distance
    8. Summary
    9. References
  5. Word Embeddings
    1. Word embedding ‒ origins and fundamentals
    2. Distributed representations
    3. Static embeddings
      1. Word2Vec
      2. GloVe
    4. Creating your own embeddings using Gensim
    5. Exploring the embedding space with Gensim
    6. Using word embeddings for spam detection
      1. Getting the data
      2. Making the data ready for use
      3. Building the embedding matrix
      4. Defining the spam classifier
      5. Training and evaluating the model
      6. Running the spam detector
    7. Neural embeddings – not just for words
      1. Item2Vec
      2. node2vec
    8. Character and subword embeddings
    9. Dynamic embeddings
    10. Sentence and paragraph embeddings
    11. Language model-based embeddings
      1. Using BERT as a feature extractor
    12. Summary
    13. References
  6. Recurrent Neural Networks
    1. The basic RNN cell
      1. Backpropagation through time (BPTT)
      2. Vanishing and exploding gradients
    2. RNN cell variants
      1. Long short-term memory (LSTM)
      2. Gated recurrent unit (GRU)
      3. Peephole LSTM
    3. RNN variants
      1. Bidirectional RNNs
      2. Stateful RNNs
    4. RNN topologies
      1. Example ‒ One-to-many – Learning to generate text
      2. Example ‒ Many-to-one – Sentiment analysis
      3. Example ‒ Many-to-many – POS tagging
    5. Encoder-decoder architecture – seq2seq
      1. Example ‒ seq2seq without attention for machine translation
    6. Attention mechanism
      1. Example ‒ seq2seq with attention for machine translation
    7. Summary
    8. References
  7. Transformers
    1. Architecture
      1. Key intuitions
        1. Positional encoding
        2. Attention
        3. Self-attention
        4. Multi-head (self-)attention
      2. How to compute attention
      3. Encoder-decoder architecture
      4. Residual and normalization layers
      5. An overview of the transformer architecture
      6. Training
    2. Transformers’ architectures
      1. Categories of transformers
        1. Decoder or autoregressive
        2. Encoder or autoencoding
        3. Seq2seq
        4. Multimodal
        5. Retrieval
      2. Attention
        1. Full versus sparse
        2. LSH attention
        3. Local attention
    3. Pretraining
      1. Encoder pretraining
      2. Decoder pretraining
      3. Encoder-decoder pretraining
      4. A taxonomy for pretraining tasks
    4. An overview of popular and well-known models
      1. BERT
      2. GPT-2
      3. GPT-3
      4. Reformer
      5. BigBird
      6. Transformer-XL
      7. XLNet
      8. RoBERTa
      9. ALBERT
      10. StructBERT
      11. T5 and MUM
      12. ELECTRA
      13. DeBERTa
      14. The Evolved Transformer and MEENA
      15. LaMDA
      16. Switch Transformer
      17. RETRO
      18. Pathways and PaLM
    5. Implementation
      1. Transformer reference implementation: An example of translation
      2. Hugging Face
        1. Generating text
        2. Autoselecting a model and autotokenization
        3. Named entity recognition
        4. Summarization
        5. Fine-tuning
      3. TFHub
    6. Evaluation
      1. Quality
        1. GLUE
        2. SuperGLUE
        3. SQuAD
        4. RACE
        5. NLP-progress
      2. Size
        1. Larger doesn’t always mean better
      3. Cost of serving
    7. Optimization
      1. Quantization
      2. Weight pruning
      3. Distillation
    8. Common pitfalls: dos and don’ts
      1. Dos
      2. Don’ts
    9. The future of transformers
    10. Summary
  8. Unsupervised Learning
    1. Principal component analysis
      1. PCA on the MNIST dataset
      2. TensorFlow Embedding API
    2. K-means clustering
      1. K-means in TensorFlow
      2. Variations in k-means
    3. Self-organizing maps
      1. Colour mapping using a SOM
    4. Restricted Boltzmann machines
      1. Reconstructing images using an RBM
      2. Deep belief networks
    5. Summary
    6. References
  9. Autoencoders
    1. Introduction to autoencoders
    2. Vanilla autoencoders
      1. TensorFlow Keras layers ‒ defining custom layers
      2. Reconstructing handwritten digits using an autoencoder
    3. Sparse autoencoder
    4. Denoising autoencoders
      1. Clearing images using a denoising autoencoder
    5. Stacked autoencoder
      1. Convolutional autoencoder for removing noise from images
      2. A TensorFlow Keras autoencoder example ‒ sentence vectors
    6. Variational autoencoders
    7. Summary
    8. References
  10. Generative Models
    1. What is a GAN?
      1. MNIST using GAN in TensorFlow
    2. Deep convolutional GAN (DCGAN)
      1. DCGAN for MNIST digits
    3. Some interesting GAN architectures
      1. SRGAN
      2. CycleGAN
      3. InfoGAN
    4. Cool applications of GANs
    5. CycleGAN in TensorFlow
    6. Flow-based models for data generation
    7. Diffusion models for data generation
    8. Summary
    9. References
  11. Self-Supervised Learning
    1. Previous work
    2. Self-supervised learning
    3. Self-prediction
      1. Autoregressive generation
        1. PixelRNN
        2. Image GPT (iGPT)
        3. GPT-3
        4. XLNet
        5. WaveNet
        6. WaveRNN
      2. Masked generation
        1. BERT
        2. Stacked denoising autoencoder
        3. Context autoencoder
        4. Colorization
      3. Innate relationship prediction
        1. Relative position
        2. Solving jigsaw puzzles
        3. Rotation
      4. Hybrid self-prediction
        1. VQ-VAE
        2. Jukebox
        3. DALL-E
        4. VQ-GAN
    4. Contrastive learning
      1. Training objectives
        1. Contrastive loss
        2. Triplet loss
        3. N-pair loss
        4. Lifted structural loss
        5. NCE loss
        6. InfoNCE loss
        7. Soft nearest neighbors loss
      2. Instance transformation
        1. SimCLR
        2. Barlow Twins
        3. BYOL
        4. Feature clustering
        5. DeepCluster
        6. SwAV
        7. InterCLR
      3. Multiview coding
        1. AMDIM
        2. CMC
      4. Multimodal models
        1. CLIP
        2. CodeSearchNet
        3. Data2Vec
    5. Pretext tasks
    6. Summary
    7. References
  12. Reinforcement Learning
    1. An introduction to RL
      1. RL lingo
      2. Deep reinforcement learning algorithms
        1. How does the agent choose its actions, especially when untrained?
        2. How does the agent maintain a balance between exploration and exploitation?
        3. How to deal with the highly correlated input state space
        4. How to deal with the problem of moving targets
      3. Reinforcement learning success in recent years
    2. Simulation environments for RL
    3. An introduction to OpenAI Gym
      1. Random agent playing Breakout
      2. Wrappers in Gym
    4. Deep Q-networks
      1. DQN for CartPole
      2. DQN to play a game of Atari
      3. DQN variants
        1. Double DQN
        2. Dueling DQN
        3. Rainbow
    5. Deep deterministic policy gradient
    6. Summary
    7. References
  13. Probabilistic TensorFlow
    1. TensorFlow Probability
    2. TensorFlow Probability distributions
      1. Using TFP distributions
        1. Coin flip example
        2. Normal distribution
      2. Bayesian networks
      3. Handling uncertainty in predictions using TensorFlow Probability
        1. Aleatory uncertainty
        2. Epistemic uncertainty
        3. Creating a synthetic dataset
        4. Building a regression model using TensorFlow
        5. Probabilistic neural networks for aleatory uncertainty
        6. Accounting for the epistemic uncertainty
    3. Summary
    4. References
  14. An Introduction to AutoML
    1. What is AutoML?
    2. Achieving AutoML
    3. Automatic data preparation
    4. Automatic feature engineering
    5. Automatic model generation
    6. AutoKeras
    7. Google Cloud AutoML and Vertex AI
      1. Using the Google Cloud AutoML Tables solution
      2. Using the Google Cloud AutoML Text solution
      3. Using the Google Cloud AutoML Video solution
      4. Cost
    8. Summary
    9. References
  15. The Math Behind Deep Learning
    1. History
    2. Some mathematical tools
      1. Vectors
      2. Derivatives and gradients everywhere
      3. Gradient descent
      4. Chain rule
      5. A few differentiation rules
      6. Matrix operations
    3. Activation functions
      1. Derivative of the sigmoid
      2. Derivative of tanh
      3. Derivative of ReLU
    4. Backpropagation
      1. Forward step
      2. Backstep
        1. Case 1: From hidden layer to output layer
        2. Case 2: From hidden layer to hidden layer
      3. Cross entropy and its derivative
      4. Batch gradient descent, stochastic gradient descent, and mini-batch
        1. Batch gradient descent
        2. Stochastic gradient descent
        3. Mini-batch gradient descent
      5. Thinking about backpropagation and ConvNets
      6. Thinking about backpropagation and RNNs
    5. A note on TensorFlow and automatic differentiation
    6. Summary
    7. References
  16. Tensor Processing Unit
    1. C/G/T processing units
      1. CPUs and GPUs
      2. TPUs
    2. Four generations of TPUs, plus Edge TPU
      1. First generation TPU
      2. Second generation TPU
      3. Third generation TPU
      4. Fourth generation TPUs
      5. Edge TPU
    3. TPU performance
    4. How to use TPUs with Colab
      1. Checking whether TPUs are available
      2. Keras MNIST TPU end-to-end training
    5. Using pretrained TPU models
    6. Summary
    7. References
  17. Other Useful Deep Learning Libraries
    1. Hugging Face
    2. OpenAI
      1. OpenAI GPT-3 API
      2. OpenAI DALL-E 2
      3. OpenAI Codex
    3. PyTorch
    4. ONNX
    5. H2O.ai
      1. H2O AutoML
      2. AutoML using H2O
      3. H2O model explainability
        1. Partial dependence plots
        2. Variable importance heatmap
        3. Model correlation
    6. Summary
  18. Graph Neural Networks
    1. Graph basics
    2. Graph machine learning
    3. Graph convolutions – the intuition behind GNNs
    4. Common graph layers
      1. Graph convolution network
      2. Graph attention network
      3. GraphSAGE (sample and aggregate)
      4. Graph isomorphism network
    5. Common graph applications
      1. Node classification
      2. Graph classification
      3. Link prediction
    6. Graph customizations
      1. Custom layers and message passing
      2. Custom graph dataset
        1. Single graphs in datasets
        2. Set of multiple graphs in datasets
    7. Future directions
      1. Heterogeneous graphs
      2. Temporal graphs
    8. Summary
    9. References
  19. Machine Learning Best Practices
    1. The need for best practices
    2. Data best practices
      1. Feature selection
      2. Features and data
        1. Augmenting textual data
    3. Model best practices
      1. Baseline models
      2. Pretrained models, model APIs, and AutoML
      3. Model evaluation and validation
      4. Model improvements
    4. Summary
    5. References
  20. TensorFlow 2 Ecosystem
    1. TensorFlow Hub
      1. Using pretrained models for inference
    2. TensorFlow Datasets
      1. Load a TFDS dataset
      2. Building data pipelines using TFDS
    3. TensorFlow Lite
      1. Quantization
      2. FlatBuffers
      3. Mobile converter
      4. Mobile optimized interpreter
      5. Supported platforms
      6. Architecture
      7. Using TensorFlow Lite
      8. A generic example of an application
      9. Using GPUs and accelerators
      10. An example of an application
    4. Pretrained models in TensorFlow Lite
      1. Image classification
      2. Object detection
      3. Pose estimation
      4. Smart reply
      5. Segmentation
      6. Style transfer
      7. Text classification
      8. Large language models
      9. A note about using mobile GPUs
    5. An overview of federated learning at the edge
      1. TensorFlow FL APIs
    6. TensorFlow.js
      1. Vanilla TensorFlow.js
      2. Converting models
      3. Pretrained models
      4. Node.js
    7. Summary
    8. References
  21. Advanced Convolutional Neural Networks
    1. Composing CNNs for complex tasks
      1. Classification and localization
      2. Semantic segmentation
      3. Object detection
      4. Instance segmentation
    2. Application zoos with tf.Keras and TensorFlow Hub
      1. Keras Applications
      2. TensorFlow Hub
    3. Answering questions about images (visual Q&A)
    4. Creating a DeepDream network
    5. Inspecting what a network has learned
    6. Video
      1. Classifying videos with pretrained nets in six different ways
    7. Text documents
      1. Using a CNN for sentiment analysis
    8. Audio and music
      1. Dilated ConvNets, WaveNet, and NSynth
    9. A summary of convolution operations
      1. Basic CNNs
      2. Dilated convolution
      3. Transposed convolution
      4. Separable convolution
      5. Depthwise convolution
      6. Depthwise separable convolution
    10. Capsule networks
      1. What is the problem with CNNs?
      2. What is new with capsule networks?
    11. Summary
    12. References
  22. Other Books You May Enjoy
  23. Index

Product information

  • Title: Deep Learning with TensorFlow and Keras - Third Edition
  • Author(s): Amita Kapoor, Antonio Gulli, Sujit Pal
  • Release date: October 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781803232911