Dynamic Programming Problems
A dynamic programming problem is an optimization problem in which decisions have to be taken sequentially over several time periods. To make the problem non-trivial, it is usually assumed that periods are “linked” in some fashion, viz., that actions taken in any particular period affect the decision environment (and thereby the reward possibilities) in all future periods. In practice, this is typically achieved by positing the presence of a “state” variable, representing the environment, which restricts the set of actions available to the decision-maker at any point in time, but which also moves through time in response to the decision-maker's actions. These twin features of the state variable provide the problem with “bite”: actions that look attractive from the standpoint of immediate reward (for instance, a Caribbean vacation) might have the effect of forcing the state variable (the consumer's wealth or savings) into values from which the continuation possibilities (future consumption levels) are not as pleasant. The modelling and study of this trade-off between current payoffs and future rewards is the focus of the theory of dynamic programming.
In this book, we focus on two classes of dynamic programming problems—Finite-Horizon Markovian Dynamic Programming Problems, which are the subject of this chapter, and Infinite-Horizon Stationary Discounted Dynamic Programming Problems, which we examine in the next chapter.
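The trade-off described above can be made concrete with a small numerical sketch. The following is not from the text, but a minimal finite-horizon example under assumed specifics: the state is an integer wealth level, the action is how much of it to consume each period, the per-period reward is the square root of consumption (a stand-in for any concave utility), and unconsumed wealth carries over. Backward induction then recovers the optimal value and policy at every state and date.

```python
from math import sqrt

# Illustrative toy problem (not from the text): state = integer wealth w,
# action = consumption c in {0, ..., w}, reward = sqrt(c), and the state
# moves to w - c next period. Backward induction over T periods.

def solve(T, w_max):
    # V[t][w] = best total reward from period t onward, starting with wealth w.
    V = [[0.0] * (w_max + 1) for _ in range(T + 1)]  # V[T][w] = 0 at the horizon
    policy = [[0] * (w_max + 1) for _ in range(T)]
    for t in range(T - 1, -1, -1):
        for w in range(w_max + 1):
            best, best_c = float("-inf"), 0
            for c in range(w + 1):  # consuming c sends the state to w - c
                val = sqrt(c) + V[t + 1][w - c]
                if val > best:
                    best, best_c = val, c
            V[t][w] = best
            policy[t][w] = best_c
    return V, policy

V, policy = solve(T=3, w_max=6)
print(policy[0][6], V[0][6])  # optimal first-period consumption and value
```

Because the reward is concave, the optimal plan spreads consumption evenly: with wealth 6 and three periods, it consumes 2 each period, even though consuming everything at once would maximize the immediate reward. This is precisely the sense in which the state variable gives the problem its “bite.”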