A. Deep Reinforcement Learning Timeline

  • 1947: Monte Carlo Sampling

  • 1958: Perceptron

  • 1959: Temporal Difference Learning

  • 1983: ASE-ACE, the first Actor-Critic algorithm

  • 1986: Backpropagation algorithm

  • 1989: CNNs

  • 1989: Q-Learning

  • 1991: TD-Gammon

  • 1992: REINFORCE

  • 1992: Experience Replay

  • 1994: SARSA

  • 1999: Nvidia invents the GPU

  • 2007: CUDA released

  • 2012: Arcade Learning Environment (ALE)

  • 2013: DQN

  • 2015 Feb: DQN reaches human-level control in Atari

  • 2015 Feb: TRPO

  • 2015 Jun: Generalized Advantage Estimation

  • 2015 Sep: Deep Deterministic Policy Gradient (DDPG) [81]

  • 2015 Sep: Double DQN

  • 2015 Nov: Dueling DQN [144]

  • 2015 Nov: Prioritized Experience Replay

  • 2015 Nov: TensorFlow

  • 2016 Feb: A3C

  • 2016 Mar: AlphaGo beats Lee Sedol 4-1

  • 2016 Jun: OpenAI Gym

  • 2016 Jun: Generative ...
