
Chapter 44: Markov Decision Processes

pp. 1807-1852

Authors

, École Polytechnique Fédérale de Lausanne

Summary

Markov decision processes (MDPs) are at the core of reinforcement learning theory. Like Markov chains, MDPs involve an underlying Markovian process that evolves from one state to another, with the probability of visiting a new state depending only on the most recent state. Unlike Markov chains, MDPs involve agents and the actions those agents take. As a result, the next state depends not only on the preceding state but also on the action chosen there. MDPs therefore provide a powerful framework for exploring state spaces and for learning from actions and rewards.
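
To make the contrast with Markov chains concrete, here is a minimal Python sketch of a toy MDP. Everything in it is an assumption made for illustration and is not taken from the chapter: the two states `s0` and `s1`, the two actions `stay` and `move`, the probabilities and rewards in the table `P`, and the helper names `step` and `rollout`. The point it shows is that the transition distribution is indexed by the pair (state, action) rather than by the state alone.

```python
import random

# Hypothetical two-state, two-action MDP, invented for illustration only.
# P[state][action] is a list of (next_state, probability, reward) triples.
P = {
    "s0": {
        "stay": [("s0", 0.9, 0.0), ("s1", 0.1, 1.0)],
        "move": [("s0", 0.2, 0.0), ("s1", 0.8, 1.0)],
    },
    "s1": {
        "stay": [("s1", 1.0, 0.0)],
        "move": [("s0", 1.0, 0.0)],
    },
}


def step(state, action, rng):
    """Sample (next_state, reward) from the current state and chosen action."""
    outcomes = P[state][action]
    weights = [prob for _, prob, _ in outcomes]
    next_state, _, reward = rng.choices(outcomes, weights=weights, k=1)[0]
    return next_state, reward


def rollout(policy, start, horizon, seed=0):
    """Follow `policy` (a function state -> action) for `horizon` steps."""
    rng = random.Random(seed)
    state, total_reward = start, 0.0
    trajectory = []
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = step(state, action, rng)
        trajectory.append((state, action, reward, next_state))
        total_reward += reward
        state = next_state
    return trajectory, total_reward


if __name__ == "__main__":
    # A fixed policy that always picks "move", whatever the state.
    trajectory, total = rollout(lambda s: "move", start="s0", horizon=5)
    for transition in trajectory:
        print(transition)
    print("total reward:", total)
```

Fixing the policy, as in the usage lines at the bottom, collapses this toy MDP back into an ordinary Markov chain over the states, which is one way to see how the two models relate.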
