On the correctness of monadic backward induction

NURIA BREDE; NICOLA BOTTA

doi:10.1017/S0956796821000228

Published online by Cambridge University Press: 29 October 2021

NURIA BREDE: Affiliation: University of Potsdam, Potsdam, Germany; Potsdam Institute for Climate Impact Research, Potsdam, Germany (e-mail: brede@uni-potsdam.de)
NICOLA BOTTA: Affiliation: Potsdam Institute for Climate Impact Research, Potsdam, Germany; Chalmers University of Technology, Göteborg, Sweden (e-mail: botta@pik-potsdam.de)

Abstract

In control theory, to solve a finite-horizon sequential decision problem (SDP) commonly means to find a list of decision rules that result in an optimal expected total reward (or cost) when taking a given number of decision steps. SDPs are routinely solved using Bellman’s backward induction. Textbook authors (e.g. Bertsekas or Puterman) typically give more or less formal proofs to show that the backward induction algorithm is correct as a solution method for deterministic and stochastic SDPs. Botta, Jansson and Ionescu propose a generic framework for finite-horizon, monadic SDPs together with a monadic version of backward induction for solving such SDPs. In monadic SDPs, the monad captures a generic notion of uncertainty, while a generic measure function aggregates rewards. In the present paper, we define a notion of correctness for monadic SDPs and identify three conditions that allow us to prove a correctness result for monadic backward induction that is comparable to textbook correctness proofs for ordinary backward induction. The conditions that we impose are fairly general and can be cast in category-theoretical terms using the notion of Eilenberg–Moore algebra. They hold in familiar settings like those of deterministic or stochastic SDPs, but we also give examples in which they fail. Our results show that backward induction can safely be employed for a broader class of SDPs than usually treated in textbooks. However, they also rule out certain instances that were considered admissible in the context of Botta et al.’s generic framework. Our development is formalised in Idris as an extension of the Botta et al. framework and the sources are available as supplementary material.
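
To make the statement above concrete, the following is a minimal Python sketch (not the authors' Idris formalisation) of backward induction on a toy stochastic SDP. It assumes one familiar instance of the generic setting: the "monad" is a finite probability distribution represented as a list of (value, probability) pairs, and the measure function is the expected value. The state space, controls, transition function `nxt`, `reward`, and all other names are hypothetical choices made for illustration. The helper `brute_force_best` enumerates every policy sequence, so comparing it with the value of the sequence returned by `bi` checks the kind of optimality that a correctness result for backward induction asserts, here only for this one small example.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy problem: three states, two controls (move left or right).
STATES = (0, 1, 2)
CONTROLS = (-1, 1)

def nxt(t, x, y):
    """Assumed transition: move by y with probability 3/4 (clamped), else stay."""
    target = min(max(x + y, 0), 2)
    return [(target, Fraction(3, 4)), (x, Fraction(1, 4))]

def reward(t, x, y, x_next):
    """Assumed reward: 1 for reaching state 2, minus a cost of 1/4 for moving right."""
    gain = Fraction(1) if x_next == 2 else Fraction(0)
    cost = Fraction(1, 4) if y == 1 else Fraction(0)
    return gain - cost

def measure(dist):
    """Expected value of a finite distribution of (value, probability) pairs."""
    return sum((p * v for v, p in dist), Fraction(0))

def val(t, ps, x):
    """Measured total reward of following policy sequence ps from state x at step t."""
    if not ps:
        return Fraction(0)
    y = ps[0][x]
    return measure([(reward(t, x, y, x2) + val(t + 1, ps[1:], x2), p)
                    for x2, p in nxt(t, x, y)])

def best_ext(t, ps):
    """A policy for step t that is optimal given the tail policy sequence ps."""
    def q(x, y):
        return measure([(reward(t, x, y, x2) + val(t + 1, ps, x2), p)
                        for x2, p in nxt(t, x, y)])
    return {x: max(CONTROLS, key=lambda y: q(x, y)) for x in STATES}

def bi(t, n):
    """Backward induction: build a policy sequence for n steps starting at step t."""
    if n == 0:
        return []
    ps = bi(t + 1, n - 1)
    return [best_ext(t, ps)] + ps

def brute_force_best(t, n, x):
    """Best value at x over all policy sequences of length n (exhaustive search)."""
    policies = [dict(zip(STATES, ys))
                for ys in product(CONTROLS, repeat=len(STATES))]
    return max(val(t, seq, x) for seq in product(policies, repeat=n))

if __name__ == "__main__":
    opt = bi(0, 3)
    for x in STATES:
        print(x, val(0, opt, x), brute_force_best(0, 3, x))
```

Exact rational arithmetic (`Fraction`) is used so that the comparison between the two values is not blurred by floating-point rounding. Swapping in a different `measure` changes the problem being solved, and the abstract's point is that not every such choice keeps backward induction correct.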

Type: Research Article
Information: Journal of Functional Programming, Volume 31, 2021, e26

DOI: https://doi.org/10.1017/S0956796821000228
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2021. Published by Cambridge University Press

References

Audebaud, P. & Paulin-Mohring, C. (2009) Proofs of randomized algorithms in Coq. Sci. Comput. Program. 74(8), 568–589.

Bellman, R. (1957) Dynamic Programming. Princeton University Press.

Bertsekas, D. P. & Shreve, S. E. (1996) Stochastic Optimal Control: The Discrete Time Case. Athena Scientific.

Bertsekas, D., Nedić, A. & Ozdaglar, A. (2003) Convex Analysis and Optimization. Athena Scientific Optimization and Computation Series. Athena Scientific.

Bertsekas, D. P. (1995) Dynamic Programming and Optimal Control. Athena Scientific.

Bird, R. (2014) Thinking Functionally with Haskell. Cambridge University Press.

Bird, R. & Gibbons, J. (2020) Algorithm Design with Haskell. Cambridge University Press.

Botta, N., Mandel, A., Hofmann, M., Schupp, S. & Ionescu, C. (2013) Mathematical specification of an agent-based model of exchange. In Proceedings of the AISB Convention 2013, “Do-Form: Enabling Domain Experts to use Formalized Reasoning” Symposium.

Botta, N., Jansson, P. & Ionescu, C. (2018) The impact of uncertainty on optimal emission policies. Earth Syst. Dyn. 9(2), 525–542.

Botta, N. (2016–2021) IdrisLibs. https://gitlab.pik-potsdam.de/botta/IdrisLibs.

Botta, N., Brede, N., Jansson, P. & Richter, T. (in press) Extensional equality preservation and verified generic programming. J. Funct. Program. (Accepted for publication August 2021). https://arxiv.org/abs/2008.02123.

Botta, N., Jansson, P. & Ionescu, C. (2017a) Contributions to a computational theory of policy advice and avoidability. J. Funct. Program. 27, e23.

Botta, N., Jansson, P., Ionescu, C., Christiansen, D. R. & Brady, E. (2017b) Sequential decision problems, dependent types and generic solutions. Log. Meth. Comput. Sci. 13(1).

Brady, E. (2013) Idris, a general-purpose dependently typed programming language: Design and implementation. J. Funct. Program. 23(9), 552–593.

Brady, E. (2017) Type-Driven Development in Idris. Manning Publications Co.

Brede, N. & Botta, N. (2021) On the Correctness of Monadic Backward Induction. Git repository.

De Moor, O. (1995) A generic program for sequential decision processes. In PLILPS ’95 Proceedings of the 7th International Symposium on Programming Languages: Implementations, Logics and Programs, pp. 1–23. Springer.

De Moor, O. (1999) Dynamic programming as a software component. In Proc. 3rd WSEAS Int. Conf. Circuits, Systems, Communications and Computers (CSCC 1999), pp. 4–8.

Diederich, A. (2001) Sequential decision making. In International Encyclopedia of the Social & Behavioral Sciences, Smelser, N. J. & Baltes, P. B. (eds), pp. 13917–13922. Pergamon.

Erwig, M. & Kollmansberger, S. (2006) Functional Pearls: Probabilistic functional programming in Haskell. J. Funct. Program. 16(1), 21–34.

Finus, M., van Ierland, E. & Dellink, R. (2003) Stability of Climate Coalitions in a Cartel Formation Game. FEEM Working Paper No. 61.2003.

Gintis, H. (2007) The dynamics of general equilibrium. Econ. J. 117, 1280–1309.

Giry, M. (1981) A categorical approach to probability theory. In Categorical Aspects of Topology and Analysis, Banaschewski, B. (ed). Lecture Notes in Mathematics 915, pp. 68–85. Springer.

Heitzig, J. (2012) Bottom-Up Strategic Linking of Carbon Markets: Which Climate Coalitions Would Farsighted Players Form? SSRN Environmental Economics eJournal.

Helm, C. (2003) International emissions trading with endogenous allowance choices. J. Public Econ. 87, 2737–2747.

Ionescu, C. (2009) Vulnerability Modelling and Monadic Dynamical Systems. PhD thesis, Freie Universität Berlin.

Jacobs, B. (2011) Probabilities, distribution monads, and convex categories. Theor. Comput. Sci. 412(28), 3323–3336.

MacLane, S. (1978) Categories for the Working Mathematician. 2nd edn. Graduate Texts in Mathematics. Springer.

Mercure, J.-F., Sharpe, S., Vinuales, J., Ives, M., Grubb, M., Pollitt, H., Knobloch, F. & Nijsse, F. (2020) Risk-opportunity analysis for transformative policy design and appraisal. C-EENRG Working Papers 2020-4, 1–40.

Puterman, M. L. (2014) Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley & Sons.

TiPES. (2019–2023) TiPES H2020 Project Website. https://www.tipes.dk/.

Wadler, P. (2015) Propositions as types. Commun. ACM 58(12), 75–84.

Brede and Botta supplementary material
