
Policy Improvement and the Newton-Raphson Algorithm

Published online by Cambridge University Press:  27 July 2009

P. Whittle
Affiliation:
Statistical Laboratory, University of Cambridge
N. Komarova
Affiliation:
All-Union Correspondence Polytechnic Institute, Moscow, USSR

Abstract

We show that the calculation of the infinite-horizon value function for a linear/quadratic Markov decision process by policy improvement is exactly equivalent to solution of the equilibrium Riccati equation by the Newton-Raphson method. The assertion extends to risk-sensitive and non-Markov formulations and thus shows, for example, that the Newton-Raphson method provides an iterative algorithm for the canonical factorization of operators which shows second-order convergence and has a variational basis.
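The equivalence stated in the abstract can be checked numerically on a small example. The sketch below is not taken from the paper: it assumes a discrete-time, risk-neutral formulation with dynamics x' = Ax + Bu and stage cost x'Qx + u'Ru, and the matrices A, B, Q, R and all function names are hypothetical choices made for illustration. One iteration of policy improvement (evaluate the current linear policy by solving a Lyapunov equation, then update the gain) is compared with one Newton-Raphson step on the residual of the equilibrium Riccati equation.

```python
import numpy as np

# Hypothetical problem data (illustrative only, not from the paper).
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])


def solve_dlyap(M, C):
    """Solve X = M' X M + C for X by vectorising the linear equation."""
    n = M.shape[0]
    lhs = np.eye(n * n) - np.kron(M.T, M.T)
    return np.linalg.solve(lhs, C.reshape(-1)).reshape(n, n)


def gain(P):
    """Feedback gain K of the policy u = -K x that is greedy for value P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def riccati_residual(P):
    """F(P) = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA - P; zero at equilibrium."""
    return Q + A.T @ P @ A - A.T @ P @ B @ gain(P) - P


def policy_improvement_step(P):
    """Improve the policy against P, then evaluate the improved policy."""
    K = gain(P)
    A_K = A - B @ K
    return solve_dlyap(A_K, Q + K.T @ R @ K)


def newton_step(P):
    """Newton-Raphson step on F(P) = 0.

    The derivative of F at P in direction H is A_K' H A_K - H, so the
    Newton correction H solves H = A_K' H A_K + F(P).
    """
    A_K = A - B @ gain(P)
    return P + solve_dlyap(A_K, riccati_residual(P))


# Start from the value of the zero-control policy (A is stable here).
P_pi = solve_dlyap(A, Q)
P_newton = P_pi.copy()
max_gap = 0.0
for _ in range(8):
    P_pi = policy_improvement_step(P_pi)
    P_newton = newton_step(P_newton)
    max_gap = max(max_gap, float(np.max(np.abs(P_pi - P_newton))))

final_residual = float(np.linalg.norm(riccati_residual(P_pi)))
```

Under these assumptions the two sequences of iterates agree to rounding error, because the Riccati residual at P equals Q + K'RK + A_K' P A_K - P when K is the greedy gain for P. The risk-sensitive and non-Markov extensions mentioned in the abstract are not covered by this sketch.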

Information

Type
Articles
Copyright
Copyright © Cambridge University Press 1988
