
On the correctness of monadic backward induction

Published online by Cambridge University Press:  29 October 2021

NURIA BREDE
Affiliation:
University of Potsdam, Potsdam, Germany; Potsdam Institute for Climate Impact Research, Potsdam, Germany (e-mail: brede@uni-potsdam.de)
NICOLA BOTTA
Affiliation:
Potsdam Institute for Climate Impact Research, Potsdam, Germany; Chalmers University of Technology, Göteborg, Sweden (e-mail: botta@pik-potsdam.de)

Abstract

In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman’s backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as a solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite-horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper, we define a notion of correctness for monadic SDPs and identify three conditions that allow us to prove a correctness result for monadic backward induction that is comparable to textbook correctness proofs for ordinary backward induction. The conditions that we impose are fairly general and can be cast in category-theoretical terms using the notion of Eilenberg–Moore algebra. They hold in familiar settings like those of deterministic or stochastic SDPs, but we also give examples in which they fail. Our results show that backward induction can safely be employed for a broader class of SDPs than usually treated in textbooks. However, they also rule out certain instances that were considered admissible in the context of Botta et al.’s generic framework. Our development is formalised in Idris as an extension of the Botta et al. framework and the sources are available as supplementary material.
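
To make the idea of backward induction with a pluggable notion of uncertainty and a measure function more concrete, the following is a minimal Python sketch. It is not the Idris formalisation described above. It assumes that uncertainty is modelled as a finite probability distribution (a list of outcome/probability pairs) and that the measure is the expected value. All names (`dist_map`, `expected_value`, `backward_induction`, `nxt`, `reward`) and the two-state toy problem are hypothetical and chosen only for illustration.

```python
# Sketch only: uncertainty is assumed to be a finite probability distribution,
# represented as a list of (outcome, probability) pairs.

def dist_map(f, dist):
    """Apply f to every outcome of a distribution (hypothetical helper)."""
    return [(f(x), p) for x, p in dist]


def expected_value(dist):
    """Measure: aggregate a distribution of rewards into a single number."""
    return sum(v * p for v, p in dist)


def backward_induction(states, controls, nxt, reward, n_steps,
                       measure=expected_value):
    """Compute decision rules for n_steps decision steps, last step first.

    Returns (policies, value): policies[t] maps each state to the control
    chosen at step t, and value maps each state to the measured total reward
    of following those policies from that state.
    """
    value = {x: 0.0 for x in states}  # zero steps left: nothing to collect
    policies = []
    for _ in range(n_steps):
        new_value, policy = {}, {}
        for x in states:
            def q(y, x=x):
                # Measured total reward of taking control y in state x now
                # and then following the policies computed so far.
                totals = dist_map(lambda x2: reward(x, y, x2) + value[x2],
                                  nxt(x, y))
                return measure(totals)
            best = max(controls(x), key=q)
            policy[x] = best
            new_value[x] = q(best)
        policies.insert(0, policy)  # earlier steps go to the front
        value = new_value
    return policies, value


# Toy problem (assumed for illustration): two states, reward 1 for landing in 1.
states = [0, 1]

def controls(x):
    return ["stay", "move"]

def nxt(x, y):
    if y == "stay":
        return [(x, 1.0)]
    # "move" from 0 succeeds with probability 0.5; from 1 it always drops to 0.
    return [(1, 0.5), (0, 0.5)] if x == 0 else [(0, 1.0)]

def reward(x, y, x2):
    return 1.0 if x2 == 1 else 0.0

policies, value = backward_induction(states, controls, nxt, reward, n_steps=2)
print(policies)
print(value)
```

The `measure` parameter can be swapped for another aggregation function. Whether the decision rules computed this way are then still optimal is exactly the kind of question the correctness conditions in the paper address; this sketch does not check them.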

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2021. Published by Cambridge University Press

References

Audebaud, P. & Paulin-Mohring, C. (2009) Proofs of randomized algorithms in Coq. Sci. Comput. Program. 74(8), 568–589.
Bellman, R. (1957) Dynamic Programming. Princeton University Press.
Bertsekas, D. P. & Shreve, S. E. (1996) Stochastic Optimal Control: The Discrete Time Case. Athena Scientific.
Bertsekas, D., Nedić, A. & Ozdaglar, A. (2003) Convex Analysis and Optimization. Athena Scientific Optimization and Computation Series. Athena Scientific.
Bertsekas, D. P. (1995) Dynamic Programming and Optimal Control. Athena Scientific.
Bird, R. (2014) Thinking Functionally with Haskell. Cambridge University Press.
Bird, R. & Gibbons, J. (2020) Algorithm Design with Haskell. Cambridge University Press.
Botta, N., Mandel, A., Hofmann, M., Schupp, S. & Ionescu, C. (2013) Mathematical specification of an agent-based model of exchange. In Proceedings of the AISB Convention 2013, “Do-Form: Enabling Domain Experts to use Formalized Reasoning” Symposium.
Botta, N., Jansson, P. & Ionescu, C. (2018) The impact of uncertainty on optimal emission policies. Earth Syst. Dyn. 9(2), 525–542.
Botta, N., Brede, N., Jansson, P. & Richter, T. (in press) Extensional equality preservation and verified generic programming. J. Funct. Program. (Accepted for publication August 2021). https://arxiv.org/abs/2008.02123.
Botta, N., Jansson, P. & Ionescu, C. (2017a) Contributions to a computational theory of policy advice and avoidability. J. Funct. Program. 27, e23.
Botta, N., Jansson, P., Ionescu, C., Christiansen, D. R. & Brady, E. (2017b) Sequential decision problems, dependent types and generic solutions. Log. Meth. Comput. Sci. 13(1).
Brady, E. (2013) Idris, a general-purpose dependently typed programming language: Design and implementation. J. Funct. Program. 23(9), 552–593.
Brady, E. (2017) Type-Driven Development in Idris. Manning Publications Co.
Brede, N. & Botta, N. (2021) On the Correctness of Monadic Backward Induction. Git repository.
De Moor, O. (1995) A generic program for sequential decision processes. In PLILPS ’95 Proceedings of the 7th International Symposium on Programming Languages: Implementations, Logics and Programs, pp. 1–23. Springer.
De Moor, O. (1999) Dynamic programming as a software component. In Proc. 3rd WSEAS Int. Conf. Circuits, Systems, Communications and Computers (CSCC 1999), pp. 48.
Diederich, A. (2001) Sequential decision making. In International Encyclopedia of the Social & Behavioral Sciences, Smelser, N. J. & Baltes, P. B. (eds), pp. 13917–13922. Pergamon.
Erwig, M. & Kollmansberger, S. (2006) Functional Pearls: Probabilistic functional programming in Haskell. J. Funct. Program. 16(1), 21–34.
Finus, M., van Ierland, E. & Dellink, R. (2003) Stability of Climate Coalitions in a Cartel Formation Game. FEEM Working Paper No. 61.2003.
Gintis, H. (2007) The dynamics of general equilibrium. Econ. J. 117, 1280–1309.
Giry, M. (1981) A categorial approach to probability theory. In Categorical Aspects of Topology and Analysis, Banaschewski, B. (ed). Lecture Notes in Mathematics 915, pp. 68–85. Springer.
Heitzig, J. (2012) Bottom-Up Strategic Linking of Carbon Markets: Which Climate Coalitions Would Farsighted Players Form? SSRN Environmental Economics eJournal.
Helm, C. (2003) International emissions trading with endogenous allowance choices. J. Public Econ. 87, 2737–2747.
Ionescu, C. (2009) Vulnerability Modelling and Monadic Dynamical Systems. PhD thesis, Freie Universität Berlin.
Jacobs, B. (2011) Probabilities, distribution monads, and convex categories. Theor. Comput. Sci. 412(28), 3323–3336.
MacLane, S. (1978) Categories for the Working Mathematician. 2nd edn. Graduate Texts in Mathematics. Springer.
Mercure, J.-F., Sharpe, S., Vinuales, J., Ives, M., Grubb, M., Pollitt, H., Knobloch, F. & Nijsse, F. (2020) Risk-opportunity analysis for transformative policy design and appraisal. C-EENRG Working Papers 2020-4, 1–40.
Puterman, M. L. (2014) Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons.
TiPES. (2019–2023) TiPES H2020 Project Website. https://www.tipes.dk/.
Wadler, P. (2015) Propositions as types. Commun. ACM 58(12), 75–84.
Supplementary material: File

Brede and Botta supplementary material
