5 Evaluating agents’ behaviors

In this chapter

You will learn about estimating policies when learning from feedback that is simultaneously sequential and evaluative.
You will develop algorithms for evaluating policies in reinforcement learning environments when the transition and reward functions are unknown.
You will write code for estimating the value of policies in environments in which the full reinforcement learning problem is on display.

I conceive that the great part of the miseries of mankind are brought upon them by false estimates they have made of the value of things.

— Benjamin Franklin, Founding Father of the United States; an author, politician, inventor, and civic activist

You know how challenging it is to balance immediate ...

Grokking Deep Reinforcement Learning by Miguel Morales