
18 - Expectation maximisation methods for solving (PO)MDPs and optimal control problems

from VI - Agent-based models

Published online by Cambridge University Press: 07 September 2011

Marc Toussaint
Affiliation:
Universität Berlin
Amos Storkey
Affiliation:
University of Edinburgh
Stefan Harmeling
Affiliation:
Biological Cybernetics
David Barber
Affiliation:
University College London
A. Taylan Cemgil
Affiliation:
Boğaziçi Üniversitesi, Istanbul
Silvia Chiappa
Affiliation:
University of Cambridge

Summary

Introduction

As this book demonstrates, the development of efficient probabilistic inference techniques has made considerable progress in recent years, in particular with respect to exploiting the structure (e.g., factored, hierarchical or relational) of discrete and continuous problem domains. In this chapter we show that these techniques can also be used to solve Markov decision processes (MDPs) or partially observable MDPs (POMDPs) when these are formulated in terms of a structured dynamic Bayesian network (DBN).

The problems of planning in stochastic environments and inference in state space models are closely related, in particular in view of the challenges they both face: scaling to large state spaces spanned by multiple state variables, and realising planning (or inference) in continuous or mixed continuous-discrete state spaces. Both fields have developed techniques to address these problems. In the field of planning, for instance, these include work on factored Markov decision processes [5, 17, 9, 18], abstractions [10], and relational models of the environment [37]. On the other hand, recent advances in inference techniques show how structure can be exploited both for exact inference and for making efficient approximations. Examples are message-passing algorithms (loopy belief propagation, expectation propagation), variational approaches, approximate belief representations (particles, assumed density filtering, Boyen–Koller) and arithmetic compilation (see, e.g., [22, 23, 7]).

In view of these similarities, one may ask whether existing techniques for probabilistic inference can be translated directly to solving stochastic planning problems.
