Book description
An example-rich guide for beginners starting their reinforcement learning and deep reinforcement learning journey with distinct state-of-the-art algorithms
Key Features
- Covers a vast spectrum of basic-to-advanced RL algorithms with mathematical explanations of each algorithm
- Learn how to implement algorithms with code by following examples with line-by-line explanations
- Explore the latest RL methodologies such as DDPG, PPO, and the use of expert demonstrations
Book Description
With significant enhancements in the quality and quantity of algorithms in recent years, this second edition of Hands-On Reinforcement Learning with Python has been revamped into an example-rich guide to learning state-of-the-art reinforcement learning (RL) and deep RL algorithms with TensorFlow 2 and the OpenAI Gym toolkit.
In addition to exploring RL basics and foundational concepts such as the Bellman equation, Markov decision processes, and dynamic programming algorithms, this second edition dives deep into the full spectrum of value-based, policy-based, and actor-critic RL methods. It explores state-of-the-art algorithms such as DQN, TRPO, PPO, ACKTR, DDPG, TD3, and SAC in depth, demystifying the underlying math and demonstrating implementations through simple code examples.
The book has several new chapters dedicated to new RL techniques, including distributional RL, imitation learning, inverse RL, and meta RL. You will learn to leverage Stable Baselines, an improvement on OpenAI's Baselines library, to effortlessly implement popular RL algorithms. The book concludes with an overview of promising research approaches such as meta-learning and imagination-augmented agents.
By the end of this book, you will be skilled in effectively employing RL and deep RL in your real-world projects.
What you will learn
- Understand core RL concepts including the methodologies, math, and code
- Train an agent to solve Blackjack, FrozenLake, and many other problems using OpenAI Gym
- Train an agent to play Ms Pac-Man using a Deep Q Network
- Learn policy-based, value-based, and actor-critic methods
- Master the math behind DDPG, TD3, TRPO, PPO, and many others
- Explore new avenues such as distributional RL, meta RL, and inverse RL
- Use Stable Baselines to train an agent to walk and play Atari games
Who this book is for
If you're a machine learning developer with little or no experience with neural networks, are interested in artificial intelligence, and want to learn about reinforcement learning from scratch, this book is for you.
Basic familiarity with linear algebra, calculus, and the Python programming language is required. Some experience with TensorFlow would be a plus.
Table of contents
- Preface
- Fundamentals of Reinforcement Learning
  - Key elements of RL
  - The basic idea of RL
  - The RL algorithm
  - How RL differs from other ML paradigms
  - Markov Decision Processes
  - Fundamental concepts of RL
  - Applications of RL
  - RL glossary
  - Summary
  - Questions
  - Further reading
- A Guide to the Gym Toolkit
- The Bellman Equation and Dynamic Programming
- Monte Carlo Methods
  - Understanding the Monte Carlo method
  - Prediction and control tasks
  - Monte Carlo prediction
  - Monte Carlo control
  - Is the MC method applicable to all tasks?
  - Summary
  - Questions
- Understanding Temporal Difference Learning
- Case Study – The MAB Problem
- Deep Learning Foundations
  - Biological and artificial neurons
  - ANN and its layers
  - Exploring activation functions
  - Forward propagation in ANNs
  - How does an ANN learn?
  - Putting it all together
  - Recurrent Neural Networks
  - LSTM to the rescue
  - What are CNNs?
  - The architecture of CNNs
  - Generative adversarial networks
  - Total loss
  - Summary
  - Questions
  - Further reading
- A Primer on TensorFlow
- Deep Q Network and Its Variants
- Policy Gradient Method
- Actor-Critic Methods – A2C and A3C
- Learning DDPG, TD3, and SAC
- TRPO, PPO, and ACKTR Methods
- Distributional Reinforcement Learning
  - Why distributional reinforcement learning?
  - Categorical DQN
  - Quantile Regression DQN
  - Distributed Distributional DDPG
  - Summary
  - Questions
  - Further reading
- Imitation Learning and Inverse RL
- Deep Reinforcement Learning with Stable Baselines
  - Installing Stable Baselines
  - Creating our first agent with Stable Baselines
  - Vectorized environments
  - Integrating custom environments
  - Playing Atari games with a DQN and its variants
  - Lunar lander using A2C
  - Swinging up a pendulum using DDPG
  - Training an agent to walk using TRPO
  - Training a cheetah bot to run using PPO
  - Implementing GAIL
  - Summary
  - Questions
  - Further reading
- Reinforcement Learning Frontiers
- Appendix 1 – Reinforcement Learning Algorithms
  - Reinforcement learning algorithm
  - Value Iteration
  - Policy Iteration
  - First-Visit MC Prediction
  - Every-Visit MC Prediction
  - MC Prediction – the Q Function
  - MC Control Method
  - On-Policy MC Control – Exploring starts
  - On-Policy MC Control – Epsilon-Greedy
  - Off-Policy MC Control
  - TD Prediction
  - On-Policy TD Control – SARSA
  - Off-Policy TD Control – Q Learning
  - Deep Q Learning
  - Double DQN
  - REINFORCE Policy Gradient
  - Policy Gradient with Reward-To-Go
  - REINFORCE with Baseline
  - Advantage Actor Critic
  - Asynchronous Advantage Actor-Critic
  - Deep Deterministic Policy Gradient
  - Twin Delayed DDPG
  - Soft Actor-Critic
  - Trust Region Policy Optimization
  - PPO-Clipped
  - PPO-Penalty
  - Categorical DQN
  - Distributed Distributional DDPG
  - DAgger
  - Deep Q learning from demonstrations
  - MaxEnt Inverse Reinforcement Learning
  - MAML in Reinforcement Learning
- Appendix 2 – Assessments
  - Chapter 1 – Fundamentals of Reinforcement Learning
  - Chapter 2 – A Guide to the Gym Toolkit
  - Chapter 3 – The Bellman Equation and Dynamic Programming
  - Chapter 4 – Monte Carlo Methods
  - Chapter 5 – Understanding Temporal Difference Learning
  - Chapter 6 – Case Study – The MAB Problem
  - Chapter 7 – Deep Learning Foundations
  - Chapter 8 – A Primer on TensorFlow
  - Chapter 9 – Deep Q Network and Its Variants
  - Chapter 10 – Policy Gradient Method
  - Chapter 11 – Actor-Critic Methods – A2C and A3C
  - Chapter 12 – Learning DDPG, TD3, and SAC
  - Chapter 13 – TRPO, PPO, and ACKTR Methods
  - Chapter 14 – Distributional Reinforcement Learning
  - Chapter 15 – Imitation Learning and Inverse RL
  - Chapter 16 – Deep Reinforcement Learning with Stable Baselines
  - Chapter 17 – Reinforcement Learning Frontiers
- Other Books You May Enjoy
- Index
Product information
- Title: Deep Reinforcement Learning with Python - Second Edition
- Author(s):
- Release date: September 2020
- Publisher(s): Packt Publishing
- ISBN: 9781839210686