Hostname: page-component-89b8bd64d-7zcd7 Total loading time: 0 Render date: 2026-05-06T00:54:11.318Z Has data issue: false hasContentIssue false

Markov Branching Decision Chains with Interest-Rate-Dependent Rewards*

Published online by Cambridge University Press:  27 July 2009

Ying Huang
Affiliation:
Philips Laboratories, 345 Scarborough Road, Briarcliff Manor, New York 10510
Arthur F. Veinott Jr
Affiliation:
Department of Operations Research, Terman Engineering Center, Stanford University, Stanford, California 94305

Abstract

Finite-state-and-action Markov branching decision chains are studied with bounded endogenous expected population sizes and interest-rate-dependent one-period rewards that are analytic in the interest rate at zero. The existence of a stationary strong-maximum-present-value policy is established. Miller and Veinott's [1969] strong policy-improvement method is generalized to find in finite time a stationary n-present-value optimal policy and, when the one-period rewards are rational in the interest rate, a stationary strong-maximum-present-value policy. This extends previous studies of Blackwell [1962], Miller and Veinott [1969], Veinott [1974], and Rothblum [1974, 1975], in which the one-period rewards are independent of the interest rate, and Denardo [1971] in which semi-Markov decision chains with small interest rates are studied. The problem of finding a stationary n-present-value optimal policy is also formulated as a staircase linear program in which the objective function and right-hand sides, but not the constraint matrix, depend on the interest rate, and solutions for all small enough positive interest rates are sought. The optimal solutions of the primal and dual are polynomials in the reciprocal of the interest rate. A constructive rule is given for finding a stationary n-present-value optimal policy from an optimal solution of the asymptotic linear program. This generalizes the linear programming approaches for finding maximum-reward-rate and maximum-present-value policies for Markov decision chains studied by Manne [1960], d'Epenoux [1960, 1963], Balinski [1961], Derman [1962], Denardo and Fox [1968], Denardo [1970], Derman and Veinott [1972], Veinott [1973], and Hordijk and Kallenberg [1979, 1984].

Information

Type
Research Article
Copyright
Copyright © Cambridge University Press 1995

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable