Foundations of Deep Reinforcement Learning: Theory and Practice in Python

9. Algorithm Summary

There are three defining characteristics of the algorithms we have introduced in this book. First, is an algorithm on-policy or off-policy? Second, what types of action spaces can it be applied to? And third, what functions does it learn?

REINFORCE, SARSA, A2C, and PPO are all on-policy algorithms, whereas DQN and Double DQN + PER are off-policy. SARSA, DQN, and Double DQN + PER are value-based algorithms that learn to approximate the Q^π function. Because they select actions by comparing Q-value estimates across all available actions, they are applicable only to environments with discrete action spaces.
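One way to see why value-based methods are tied to discrete action spaces is to look at how an action is picked from Q-value estimates. The following is a minimal sketch, not the book's implementation: it assumes an epsilon-greedy rule, and the function name `epsilon_greedy` and the sample Q-values are hypothetical. The greedy step enumerates every action, which is only possible when the action set is finite.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action index from a list holding one Q-value estimate per discrete action.

    Hypothetical helper for illustration; assumes a finite, enumerable action set.
    """
    if rng.random() < epsilon:
        # Explore: choose uniformly among the discrete actions.
        return rng.randrange(len(q_values))
    # Exploit: compare the estimates of all actions and take the largest.
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Illustrative (made-up) Q-value estimates for three discrete actions.
q_values = [0.1, 0.7, 0.3]
greedy_action = epsilon_greedy(q_values, epsilon=0.0)
```

With `epsilon=0.0` the rule is purely greedy, so `greedy_action` is the index of the largest estimate. With a continuous action space there is no list to enumerate, so this selection step has no direct equivalent.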

REINFORCE is a pure policy-based algorithm and so only learns a policy π. A2C and PPO are hybrid methods which learn a policy π and the V^π function. REINFORCE, A2C, and PPO can all be applied to ...
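The two characteristics stated in full above (on-policy versus off-policy, and which functions are learned) can be collected into a small lookup table. This is only an illustrative sketch: the names `AlgorithmTraits`, `ALGORITHMS`, and `select` are hypothetical, and the entries record nothing beyond what the text states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlgorithmTraits:
    on_policy: bool  # True for on-policy, False for off-policy
    learns: tuple    # functions the algorithm learns


# Entries mirror the summary text; "policy" stands for π, "Q" for Q^π, "V" for V^π.
ALGORITHMS = {
    "REINFORCE": AlgorithmTraits(on_policy=True, learns=("policy",)),
    "SARSA": AlgorithmTraits(on_policy=True, learns=("Q",)),
    "DQN": AlgorithmTraits(on_policy=False, learns=("Q",)),
    "Double DQN + PER": AlgorithmTraits(on_policy=False, learns=("Q",)),
    "A2C": AlgorithmTraits(on_policy=True, learns=("policy", "V")),
    "PPO": AlgorithmTraits(on_policy=True, learns=("policy", "V")),
}


def select(on_policy=None, learns=None):
    """Return sorted names of algorithms matching the given traits (None = ignore)."""
    return sorted(
        name
        for name, traits in ALGORITHMS.items()
        if (on_policy is None or traits.on_policy == on_policy)
        and (learns is None or learns in traits.learns)
    )


on_policy_names = select(on_policy=True)
value_based_names = select(learns="Q")
hybrid_names = select(learns="V")
```

For example, `select(on_policy=False)` returns the two off-policy algorithms, and `select(learns="V")` returns the hybrid methods that learn both a policy and a value function.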

Foundations of Deep Reinforcement Learning: Theory and Practice in Python by Laura Graesser, Wah Loon Keng
