Fault-tolerant functional reactive programming (extended version)

IVAN PEREZ; ALWYN E. GOODLOE

doi:10.1017/S0956796820000118

Fault-tolerant functional reactive programming (extended version)

Part of: ICFP2018

Published online by Cambridge University Press: 07 May 2020

IVAN PEREZ

and

ALWYN E. GOODLOE

Show author details

IVAN PEREZ: Affiliation:
National Institute of Aerospace, Hampton, VA, 23666, USA, (e-mail: ivan.perez@nianet.org)
ALWYN E. GOODLOE: Affiliation:
NASA Langley Research Center, Hampton, VA, 23681, USA, (e-mail: a.goodloe@nasa.gov)

Article contents

Abstract
References

Rights & Permissions

Abstract

Core share and HTML view are not available for this content. However, as you have access to this content, a full PDF is available via the ‘Save PDF’ action button.

Highly critical application domains, like medicine and aerospace, require the use of strict design, implementation, and validation techniques. Functional languages have been used in these domains to develop synchronous dataflow programming languages for reactive systems. Causal stream functions and functional reactive programming (FRP) capture the essence of those languages in a way that is both elegant and robust. To guarantee that critical systems can operate under high stress over long periods of time, these applications require clear specifications of possible faults and hazards, and how they are being handled. Modeling failure is straightforward in functional languages, and many functional reactive abstractions incorporate support for failure or termination. However, handling unknown types of faults, and incorporating fault tolerance into FRP, requires a different construction and remains an open problem. This work demonstrates how to extend an existing functional reactive framework with fault tolerance features. At value level, we tag faulty signals with reliability and probability information and use random testing to inject faults and validate system properties encoded in temporal logic. At type level, we tag components with the kinds of faults they may exhibit and use type-level programming to obtain compile-time guarantees of key aspects of fault tolerance. Our approach is powerful enough to be used in systems with realistic complexity, and flexible enough to be used to guide system analysis and design, validate system properties in the presence of faults, perform runtime monitoring, and study the effects of different fault tolerance mechanisms.

Information

Type: Research Article
Information: Journal of Functional Programming , Volume 30 , 2020 , e12

DOI: https://doi.org/10.1017/S0956796820000118 [Opens in a new window]
Creative Commons: This is a work of the US Government and is not subject to copyright protection within the United States. Published by Cambridge University Press.
Copyright: © NASA 2020. Published by Cambridge University Press

References

Avižienis, A. (1967) Design of fault-tolerant computers. In Proceedings of the November 14-16, 1967, Fall Joint Computer Conference, AFIPS ’67 (Fall). New York, NY, USA: ACM, pp. 733–743.CrossRef Google Scholar

Avižienis, A. (1976) Fault-tolerant systems. IEEE Trans. Computers 25(12), 1304–1312.CrossRef Google Scholar

Bärenz, M. & Perez, I. (2018) Rhine: FRP with type-level clocks. In Proceedings of the 11th ACM SIGPLAN International Symposium on Haskell, Haskell 2018. New York, NY, USA: ACM, pp. 145–157.CrossRef Google Scholar

Bärenz, M., Perez, I. & Nilsson, H. (2016) Mathematical properties of monadic stream functions. Available at: http://cs.nott.ac.uk/~ixp/papers/msfmathprops.pdf Google Scholar

Bedingfield, K., Leach, R. D. & Alexander, M. B. (1996, August) Spacecraft System Failures and Anomalies Attributed to the Natural Space Environment. Technical report, National Aeronautics and Space Administration. NASA Reference Publication 1390.Google Scholar

Berry, G., Bouali, A., Fornari, X., Ledinot, E., Nassor, E. & De Simone, R. (2000) Esterel: A formal method applied to avionic software development. Sci. Comput. Program. 36(1), 5–25.CrossRef Google Scholar

Boussinot, F. & De Simone, R. (1991) The ESTEREL language. Proc. IEEE 79(9), 1293–1304.CrossRef Google Scholar

Bracker, J. & Nilsson, H. (2015) Polymonad programming in Haskell. Proceedings of the 27th Symposium on the Implementation and Application of Functional Programming Languages, IFL ’15. New York, NY, USA: ACM, pp. 3:1–3:12.CrossRef Google Scholar

Bracker, J. & Nilsson, H. (2016) Supermonads: One notion to bind them all. Proceedings of the 9th International Symposium on Haskell, Haskell 2016. New York, NY, USA: ACM, pp. 158–169.CrossRef Google Scholar

Brady, E. (2013) Programming and reasoning with algebraic effects and dependent types. ACM SIGPLAN Not. 48, 133–144.CrossRef Google Scholar

Butler, R. W. (2008, February). A Primer on Architectural Level Fault Tolerance. Technical report. NASA/TM-2008-215108, L-19403. NASA Langley Research Center.Google Scholar

Butler, R. W. & Finelli, G. B. (1993) The infeasibility of quantifying the reliability of life-critical real-time software. IEEE Trans. Softw. Eng. 19(1), 3–12.CrossRef Google Scholar

Butler, R. W. (1996) An Introduction to Requirements Capture Using PVS: Specification of a Simple Autopilot. Technical report.Google Scholar

Courtney, A., Nilsson, H. & Peterson, J. (2003) The Yampa arcade. Haskell Workshop, pp. 7–18.CrossRef Google Scholar

Crow, J. & Di Vito, B. (1998) Formalizing Space Shuttle software requirements: Four case studies. ACM Trans. Softw. Eng. Methodol. (TOSEM) 7(3), 296–332.CrossRef Google Scholar

Di Vito, B. L. (1996) Formalizing new navigation requirements for NASA’s space shuttle. In International Symposium of Formal Methods Europe. Springer, pp. 160–178.CrossRef Google Scholar

Di Vito, B. L. (1999) A model of cooperative noninterference for integrated modular avionics. Depend. Comput. Crit. Appl. 7, 269–286.CrossRef Google Scholar

Di Vito, B. L. & Roberts, L. W. (1996) Using Formal Methods to Assist in the Requirements Analysis of the Space Shuttle GPS Change Request. Technical report.Google Scholar

Dormoy, F.-X. (2008) Scade 6: A model based solution for safety critical software development. Proceedings of the 4th European Congress on Embedded Real Time Software (ERTS 2008), pp. 1–9.Google Scholar

Driscoll, K. (2008) Real system failures. https://c3.nasa.gov/dashlink/static/media/other/ObservedFailures4.html Google Scholar

Dutertre, B. & Stavridou, V. (1997) Formal requirements analysis of an avionics control system. IEEE Trans. Softw. Eng. 23(5), 267–278.CrossRef Google Scholar

Elliott, C. & Hudak, P. (1997) Functional reactive animation. ACM SIGPLAN Not. 32(8), pp. 263–273.CrossRef Google Scholar

Epstein, J., Black, A. P. & Peyton-Jones, S. (2011) Towards Haskell in the cloud. ACM SIGPLAN Not. 46, pp. 118–129. ACM.CrossRef Google Scholar

Erwig, M. & Kollmansberger, S. (2006) Functional pearls: Probabilistic functional programming in Haskell. J. Func. Program. 16(1), 21–34.CrossRef Google Scholar

Halbwachs, N., Caspi, P., Raymond, P. & Pilaud, D. (1991) The synchronous data flow programming language LUSTRE. Proc. IEEE 79(9), 1305–1320.CrossRef Google Scholar

Huch, F. & Norbisrath, U. (2000). Distributed programming in Haskell with ports. In Symposium on Implementation and Application of Functional Languages. Springer, pp. 107–121.Google Scholar

Hughes, J. (2000) Generalising monads to arrows. Sci. Comput. Program. 37(1), 67–111.CrossRef Google Scholar

Katsumata, S.-y. (2014) Parametric effect monads and semantics of effect systems. ACM SIGPLAN Not. 49(1), 633–645.CrossRef Google Scholar

Kiselyov, O., Sabry, A. & Swords, C. (2013) Extensible effects: An alternative to monad transformers. ACM SIGPLAN Not. 48, 59–70.CrossRef Google Scholar

Klein, G., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., Kolanski, R., Norrish, M., et al. (2009) seL4: Formal verification of an OS kernel. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles. ACM, pp. 207–220.CrossRef Google Scholar

Klein, G., Andronick, J., Elphinstone, K., Murray, T., Sewell, T., Kolanski, R. & Heiser, G. (2014) Comprehensive formal verification of an OS microkernel. ACM Trans. Comput. Syst. (TOCS), 32(1), 2.CrossRef Google Scholar

Kshemkalyani, A. D. & Singhal, M. (2008) Distributed Computing. Cambridge University.CrossRef Google Scholar

Lamport, L., Shostak, R. & Pease, M. (1982) The Byzantine generals problem. ACM Trans. Program. Lang. Syst. (TOPLAS) 4(3), 382–401.CrossRef Google Scholar

Liang, S., Hudak, P. & Jones, M. (1995) Monad transformers and modular interpreters. In Proceedings of the 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, pp. 333–343.CrossRef Google Scholar

Lincoln, P. & Rushby, J. (1993) A formally verified algorithm for interactive consistency under a hybrid fault model. The Twenty-Third International Symposium on Fault-Tolerant Computing, 1993. FTCS-23. Digest of Papers. IEEE, pp. 402–411.CrossRef Google Scholar

Lincoln, P. & Rushby, J. (1994) Formal verification of an interactive consistency algorithm for the Draper FTP architecture under a hybrid fault model. In In Proceedings of the Ninth Annual Conference on Computer Assurance, 1994. COMPASS 1994 Safety, Reliability, Fault Tolerance, Concurrency and Real Time, Security. IEEE, pp. 107–120.CrossRef Google Scholar

Maier, P., Stewart, R. & Trinder, P. (2014) The HdpH DSLs for scalable reliable computation. In Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell, Haskell 2014. New York, NY, USA: ACM, pp. 65–76.CrossRef Google Scholar

Nilsson, H., Courtney, A. & Peterson, J. (2002) Functional reactive programming, continued. In Haskell Workshop, pp. 51–64.CrossRef Google Scholar

Orchard, D. & Petricek, T. (2014) Embedding effect systems in Haskell. In Proceedings of the 2014 ACM SIGPLAN Symposium on Haskell, Haskell 2014. New York, NY, USA: ACM, pp. 13–24.CrossRef Google Scholar

Orchard, D. & Petricek, T. (2015) Embedding effect systems in Haskell. ACM SIGPLAN Not. 49(12), 13–24.CrossRef Google Scholar

Orchard, D. A., Petricek, T. & Mycroft, A. (2014) The semantic marriage of monads and effects. CoRR, abs/1401.5391.Google Scholar

Owre, S., Rushby, J. M. & Shankar, N. (1992) PVS: A prototype verification system. In 11th International Conference on Automated Deduction (CADE), Kapur, D. (ed.) Lecture Notes in Artificial Intelligence, vol. 607. Saratoga, NY: Springer-Verlag, pp. 748–752.CrossRef Google Scholar

Owre, S., Rajan, S., Rushby, J. M., Shankar, N. and Srivas, M. (1996) PVS: Combining specification, proof checking, and model checking. In International Conference on Computer Aided Verification. Springer, pp. 411–414.CrossRef Google Scholar

Owre, S., Shankar, N., Rushby, J. M. & Stringer-Calvert, D. W. J. (1999) PVS Language Reference. Technical report.Google Scholar

Pagano, B., Andrieu, O., Moniot, T., Canou, B., Chailloux, E., Wang, P., Manoury, P. & Colaço, J.-L. (2009) Experience report: Using objective caml to develop safety-critical embedded tools in a certification framework. In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming, ICFP 2009. New York, NY, USA: ACM, pp. 215–220.CrossRef Google Scholar

Perez, I. (2017) Back to the future: Time travel in FRP. Proceedings of the 10th ACM SIGPLAN International Haskell Symposium, Haskell 2017. New York, NY, USA: ACM.CrossRef Google Scholar

Perez, I. (2018) Fault-tolerant functional reactive programming (functional pearl). Proc. ACM Program. Lang., 2(ICFP), 96:1–96:30.CrossRef Google Scholar

Perez, I. & Nilsson, H. (2017) Testing and debugging functional reactive programming. Proc. ACM Program. Lang. 1(ICFP), 2:1–2:27.CrossRef Google Scholar

Perez, I. & Nilsson, H. Runtime verification and validation of functional reactive systems. J. Func. Program. Submitted (Under evaluation).Google Scholar

Perez, I., Bärenz, M. & Nilsson, H. (2016) Functional reactive programming, refactored. In Proceedings of the 9th International Symposium on Haskell, Haskell 2016. New York, NY, USA: ACM, pp. 33–44.CrossRef Google Scholar

Perez, I., Goodloe, A. & Edmonson, W. (2019) Fault-tolerant swarms. In 7th IEEE International Conference on Space Mission Challenges for Information Technology (SMC-IT 2019). IEEE.CrossRef Google Scholar

Pfeifer, H., Schwier, D. & Von Henke, F. W. (1999) Formal verification for time-triggered clock synchronization. Depend. Comput. Crit. Appl. 7, 207–226.Google Scholar

Pike, L., Goodloe, A., Morisset, R. & Niller, S. (2010). Copilot: A hard real-time runtime monitor. In International Conference on Runtime Verification. Springer, pp. 345–359.CrossRef Google Scholar

Pike, L., Niller, S. & Wegmann, N. (2012) Runtime verification for ultra-critical systems. In Runtime Verification, Khurshid, S. & Sen, K. (eds), Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 310–324.CrossRef Google Scholar

RTCA. (2011) Software Considerations in Airborne Systems and Equipment Certification (178C). Technical report.Google Scholar

Rushby, J. (2006) PVS Bibliography. http://pvs.csl.sri.com/documentation.shtml Google Scholar

Spivey, M. (1990) A functional theory of exceptions. Sci. Comput. Program. 14(1), 25–42.CrossRef Google Scholar

Stewart, R. (2013) Reliable Massively Parallel Symbolic Computing: Fault Tolerance for a Distributed Haskell. Ph.D. thesis, Heriot-Watt University.Google Scholar

Stewart, R., Trinder, P. & Maier, P. (2013) Supervised workpools for reliable massively parallel computing. In Trends in Functional Programming, Loidl, H.-W. & Peña, R. (eds), Berlin, Heidelberg: Springer Berlin Heidelberg, pp. 247–262.CrossRef Google Scholar

Stewart, R., Maier, P. & Trinder, P. (2016) Transparent fault tolerance for scalable functional computation. J. Func. Program. 26.Google Scholar

Trinder, P., Pointon, R. & Loidl, H.-W. (2000) Towards runtime system level fault tolerance for a distributed functional language. Trends Func. Program. 2, 103.Google Scholar

Vinoski, S. (2007) Reliability with Erlang. IEEE Internet Comput 11(6).Google Scholar

Wadler, P. (1985) How to replace failure by a list of successes. In Conference on Functional Programming Languages and Computer Architecture. Springer, pp. 113–128.CrossRef Google Scholar

Xu, L., Bai, Y., Cheng, K., Ge, L., Nie, D., Zhang, L. & Liu, W. (2016) Towards fault-tolerant real-time scheduling in the seL4 microkernel. In 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS). IEEE, pp. 711–718.CrossRef Google Scholar

Submit a response

Discussions

No Discussions have been published for this article.

Article contents

Fault-tolerant functional reactive programming (extended version)

Abstract

Information

References

Discussions

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests