
Bayesian dynamic programming

Published online by Cambridge University Press:  01 July 2016

Ulrich Rieder*
Affiliation:
University of Hamburg

Abstract

We consider a non-stationary Bayesian dynamic decision model with general state, action and parameter spaces. It is shown that this model can be reduced to a non-Markovian (resp. Markovian) decision model with completely known transition probabilities. Under rather weak convergence assumptions on the expected total rewards, some general results are presented concerning the restriction to deterministic generalized Markov policies, the criteria of optimality and the existence of Bayes policies. These facts are based on the above transformations and on results of Hinderer and Schäl.
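The reduction described in the abstract can be illustrated on a toy problem (this example is not from the paper; the two-armed Bernoulli bandit, the Beta prior and the value function below are illustrative assumptions). Augmenting the state with the posterior over the unknown parameter turns the Bayesian problem into a Markov decision model with completely known transition probabilities, which can then be solved by ordinary backward induction:

```python
from functools import lru_cache

SAFE = 0.5  # known per-step reward of a "safe" arm (assumed value)

@lru_cache(maxsize=None)
def bayes_value(a, b, horizon):
    """Bayes value of the augmented state.

    The posterior Beta(a, b) over the risky arm's unknown success
    probability is itself the Markov state: the Bayes update is folded
    into the (fully known) transition law, as in the reduction above.
    """
    if horizon == 0:
        return 0.0
    p = a / (a + b)  # posterior mean of the success probability
    # Safe arm: deterministic reward, posterior unchanged.
    cont_safe = SAFE + bayes_value(a, b, horizon - 1)
    # Risky arm: reward 1 w.p. p (posterior -> Beta(a+1, b)),
    # reward 0 w.p. 1-p (posterior -> Beta(a, b+1)).
    cont_risky = (p * (1.0 + bayes_value(a + 1, b, horizon - 1))
                  + (1.0 - p) * bayes_value(a, b + 1, horizon - 1))
    # Backward induction over the completely known transition model.
    return max(cont_safe, cont_risky)
```

With a uniform Beta(1, 1) prior both arms have the same myopic value, yet for horizons greater than one the Bayes-optimal policy pulls the risky arm first: the posterior update has value, which is exactly what the posterior-augmented state makes visible.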

Information

Type
Research Article
Copyright
Copyright © Applied Probability Trust 1975 
