16. Rewards

This short chapter looks at reward design. We discuss the role of rewards in an RL problem and some important design choices. In particular, we consider a reward signal's scale, magnitude, frequency, and potential for exploitation. The chapter ends with a set of simple design guidelines.

16.1 The Role of Rewards

Reward signals define the objective that an agent should maximize. A reward is a scalar, emitted by the environment, that assigns credit to a particular transition (s, a, s′) resulting from the agent’s action a.
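To make the definition concrete, here is a minimal sketch of a reward as a scalar function of a transition (s, a, s′). The toy environment is an assumption made purely for illustration and is not taken from the book: a one-dimensional row of cells, a hypothetical `GOAL` cell, and the helper names `reward`, `step`, and `total_reward`. The objective is taken to be the plain undiscounted sum of rewards, which is only one possible choice.

```python
from typing import List

State = int   # index of a cell in a 1-D row (illustrative assumption)
Action = int  # -1 = move left, +1 = move right (illustrative assumption)

GOAL: State = 4  # hypothetical goal cell, also the rightmost cell


def reward(s: State, a: Action, s_next: State) -> float:
    """Return the scalar credit assigned to one transition (s, a, s')."""
    # Sparse signal: credit is given only when the transition lands on the goal.
    return 1.0 if s_next == GOAL else 0.0


def step(s: State, a: Action) -> State:
    """Deterministic toy dynamics: move one cell, clipped to [0, GOAL]."""
    return max(0, min(GOAL, s + a))


def total_reward(s0: State, actions: List[Action]) -> float:
    """Sum the rewards collected along a fixed sequence of actions."""
    s, total = s0, 0.0
    for a in actions:
        s_next = step(s, a)
        total += reward(s, a, s_next)  # one scalar per transition
        s = s_next
    return total


if __name__ == "__main__":
    print(total_reward(2, [+1, +1]))  # moves toward the goal
    print(total_reward(2, [-1, -1]))  # moves away from the goal
```

An agent maximizing this quantity would prefer action sequences that reach the goal, which is the sense in which the reward signal defines the objective.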

Reward design is one of the fundamental problems in RL, and it is known to be difficult for several reasons. First, it takes deep knowledge of the environment to have an intuition about proper credit ...

