This chapter is about representing state-transition systems and using them in acting. The first section gives formal definitions of state-transition systems and planning problems and presents a simple acting algorithm. The second section describes state-variable representations of state-transition systems, and the third section describes several acting procedures that use this representation. The fourth section describes classical representation, an alternative to state-variable representation that is often used in the AI planning literature.
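To make the state-variable idea concrete, here is a minimal Python sketch (not the book's pseudocode) of states as assignments to state variables, actions with preconditions and effects, and a simple plan-then-execute acting loop; the names `State`, `Action`, `lookahead`, and `perform` are illustrative assumptions.

```python
# Minimal sketch, assuming a deterministic model. States map state variables
# to values; actions carry preconditions and effects over those variables.
from dataclasses import dataclass

State = dict  # e.g. {"loc(robot)": "room1", "door": "closed"}

@dataclass
class Action:
    name: str
    precond: dict  # state-variable values required before execution
    effects: dict  # state-variable values that hold after execution

def applicable(a: Action, s: State) -> bool:
    return all(s.get(var) == val for var, val in a.precond.items())

def apply_action(a: Action, s: State) -> State:
    s2 = dict(s)
    s2.update(a.effects)
    return s2

def run_lookahead(s: State, goal: dict, lookahead, perform):
    """Plan from the current state, execute the plan's first action, repeat."""
    while not all(s.get(var) == val for var, val in goal.items()):
        plan = lookahead(s, goal)   # any planner for the deterministic model
        if not plan:
            return False            # no plan found: acting fails
        s = perform(plan[0])        # execute one action and observe the new state
    return True
```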
The chapters in Part I are about acting, planning, and learning using deterministic state-transition (or "classical planning") models. The relative ease of constructing and using such models can make them desirable even though most real-world environments do not satisfy all of their underlying assumptions. The chapters in this part also introduce several concepts that will be used throughout the book, such as state-variable representation.
This part of the book is about planning, acting, and learning approaches in which time is explicit. It describes several algorithms and methods for handling durative and concurrent activities with respect to predicted dynamics. Acting with temporal models raises dispatching and temporal controllability issues that rely heavily on planning concepts.
This chapter provides a comprehensive overview of the foundational concepts essential for scalable Bayesian learning and Monte Carlo methods. It introduces Monte Carlo integration and its relevance to Bayesian statistics, focusing on techniques such as importance sampling and control variates. The chapter outlines key applications, including logistic regression, Bayesian matrix factorization, and Bayesian neural networks, which serve as illustrative examples throughout the book. It also offers a primer on Markov chains and stochastic differential equations, which are critical for understanding the advanced methods discussed in later chapters. Additionally, the chapter introduces kernel methods in preparation for their application in scalable Markov Chain Monte Carlo (MCMC) diagnostics.
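As a concrete illustration of the first of these ideas, the short sketch below estimates an expectation by plain Monte Carlo and by importance sampling; the Gaussian target, proposal, and test function are illustrative choices rather than examples from the book.

```python
# A sketch of Monte Carlo integration and importance sampling for E_p[f(X)];
# the target p, proposal q, and test function f are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
f = lambda x: x ** 2                       # E_p[f(X)] = 1 when p = N(0, 1)

# Plain Monte Carlo: sample directly from the target p = N(0, 1).
x = rng.standard_normal(10_000)
mc_estimate = np.mean(f(x))

# Importance sampling: sample from a proposal q = N(0, 2^2), reweight by p/q.
y = rng.normal(0.0, 2.0, size=10_000)
w = stats.norm.pdf(y, 0.0, 1.0) / stats.norm.pdf(y, 0.0, 2.0)
is_estimate = np.mean(w * f(y))            # unbiased when q covers p's support

print(mc_estimate, is_estimate)            # both should be close to 1
```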
Nondeterministic models, like probabilistic models (see Part III), drop the assumption that an action applied in a state leads to only one state. The main difference from probabilistic models is that nondeterministic models have no information about the probability distribution of transitions. Despite this, the main motivation for acting, planning, and learning with nondeterministic models is the same as for probabilistic approaches, namely the need to model uncertainty: the future is rarely entirely predictable. Nondeterministic models might be thought of as a special case of probabilistic models with a uniform probability distribution, but this is not the case: in a nondeterministic model we do not know that the distribution is uniform; we simply have no information about the distribution at all.
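A small sketch of the distinction, with hypothetical state and action names: a probabilistic model attaches a distribution to each transition, while a nondeterministic model records only the set of possible successor states.

```python
# Probabilistic model: gamma(s, a) is a probability distribution over successors.
prob_gamma = {
    ("at_door", "push"): {"door_open": 0.8, "at_door": 0.2},
}

# Nondeterministic model: gamma(s, a) is only the set of possible successors;
# no probabilities are attached, not even an implicit uniform distribution.
nondet_gamma = {
    ("at_door", "push"): {"door_open", "at_door"},
}
```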
HTN planning algorithms require a set of HTN methods that provide knowledge about potential problem-solving strategies. Typically these methods are written by a domain expert, but this chapter is about some ways to learn HTN methods from examples. It describes how to learn HTN methods in learning-by-demonstration situations in which a learner is given examples of plans for various tasks, and also in situations where the learner is given only the plans and must infer what tasks the plans accomplish. The chapter also speculates briefly about prospects for a “planning-to-learn” approach in which a learner generates its own examples using a classical planner.
This chapter focuses on continuous-time MCMC algorithms, particularly those based on piecewise deterministic Markov processes (PDMPs). It introduces PDMPs as a scalable alternative to traditional MCMC, with a detailed explanation of their simulation, invariant distribution, and limiting processes. Various continuous-time samplers, including the bouncy particle sampler and zig-zag process, are compared in terms of efficiency and performance. The chapter also addresses practical aspects of simulating PDMPs, including techniques for exploiting model sparsity and data subsampling. Extensions to these methods, such as handling discontinuous target distributions or distributions defined on spaces of different dimensions, are discussed.
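As a toy illustration of a PDMP sampler, the sketch below simulates a one-dimensional zig-zag process targeting a standard normal, for which event times can be drawn exactly by inverting the integrated switching rate; this is an illustrative example under those assumptions, not the book's implementation.

```python
# One-dimensional zig-zag sketch for a standard normal target, U(x) = x^2 / 2.
# The switching rate along the current ray is max(0, v*x + s), so the event
# time solves an explicit quadratic (exact simulation, no thinning needed).
import numpy as np

def zigzag_gaussian(n_events=10_000, seed=0):
    rng = np.random.default_rng(seed)
    t, x, v = 0.0, 0.0, 1.0
    skeleton = [(t, x, v)]            # (event time, position, velocity)
    for _ in range(n_events):
        a = v * x
        e = rng.exponential()
        tau = -a + np.sqrt(max(a, 0.0) ** 2 + 2.0 * e)  # inverts the integrated rate
        t += tau
        x += v * tau                  # deterministic linear motion between events
        v = -v                        # velocity flips at each event
        skeleton.append((t, x, v))
    return skeleton

# Once the process has mixed, positions along the continuous-time trajectory
# (not just at the events) are distributed according to N(0, 1).
```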
Learning for nondeterministic models can take advantage of most of the techniques developed for probabilistic models (Chapter 10). In particular, reinforcement learning (RL) does not need the probabilities of action transitions, so RL techniques such as Q-learning, parametric Q-learning, and deep Q-learning can be applied to nondeterministic models as well. However, these algorithms do not produce explicit descriptive models of actions. In this chapter, we therefore discuss some intuitions about, and challenges of, extending the techniques for learning deterministic action specifications to nondeterministic models. Note, however, that learning lifted action schemas in nondeterministic models is still an open problem.
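To make the point concrete, here is a minimal tabular Q-learning sketch: the update uses only observed transitions (s, a, r, s'), so it never needs transition probabilities and applies to nondeterministic models as well; the environment interface (reset, step, actions) is hypothetical, introduced only for illustration.

```python
# Minimal tabular Q-learning sketch; env.reset / env.step / env.actions are
# assumed interfaces, not part of any particular library.
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.95, epsilon=0.1):
    Q = defaultdict(float)                      # Q[(state, action)]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy choice over the actions applicable in s
            if random.random() < epsilon:
                a = random.choice(env.actions(s))
            else:
                a = max(env.actions(s), key=lambda act: Q[(s, act)])
            s2, r, done = env.step(a)           # observed outcome; no probabilities used
            best_next = 0.0 if done else max(Q[(s2, act)] for act in env.actions(s2))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2
    return Q
```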
Temporal models are quite rich, allowing concurrency and temporal constraints to be handled. But developing temporal models is a bottleneck that machine learning techniques can help ease. In this chapter, we first briefly address the problem of learning heuristics for temporal planning (Section 19.1). We then consider the issue of learning durative action schemas and temporal methods (Section 19.2). The chapter outlines the proposed approaches, which are based on techniques seen earlier in the book, without going into detailed descriptions of the corresponding procedures.
This chapter addresses the issues of acting with temporal models. It presents methods for handling dynamic controllability (Section 18.1), dispatching (Section 18.2), and execution and refinement of a temporal plan (Section 18.3). It proposes methods for acting with a reactive temporal refinement engine (Section 18.4), planning with Monte Carlo rollouts (Section 18.5), and integrating planning and acting (Section 18.6).