A correct-by-construction conversion from lambda calculus to combinatory logic

Abstract This pearl defines a translation from well-typed lambda terms to combinatory logic, where both the preservation of types and the correctness of the translation are enforced statically.


Introduction
The correspondence between Curry's type-free lambda calculus and Schönfinkel's combinatory algebras is among the oldest known and the most aesthetically pleasing facts about the lambda calculus.
It is not so hard to define a translation from combinatory logic to lambda terms that preserves reduction behaviour.The following three lambda terms correspond to the combinators S, K, and I, respectively.λf g x .(f x) (g x) λx y .x λx .x Interestingly, there is also a translation in the other direction, from lambda terms to combinatory logic.The key ingredient in this translation scheme is known as bracket abstraction or combinatory abstraction.Given a variable x and term in combinatory logic t, we can define the term x .t by means of the following three cases: As its name suggests, the term in combinatory logic computed in this fashion simulates the reduction behaviour of a lambda abstraction in combinatory logic.
These translations are typically defined on untyped lambda terms.In this pearl, we try a different tack and explore how to prove that the translation from the simply typed lambda calculus to combinatory logic preserves both types and semantics.This is not a new result, but rather than prove these properties post hoc, we ensure the translation is correct by construction using the dependently typed programming language Agda (Norell, 2007).

Lambda calculus
To set the scene, we start by defining an evaluator for the simply typed lambda calculus.This evaluator features in numerous papers and introductions on programming with dependent types (McBride, 2004;Norell, 2009Norell, , 2013;;Abel, 2016), yet we include it here in its entirety for the sake of completeness.

Types
The types of our lambda calculus consist of a single base type (ι) and functions between types, denoted using the function space operator (⇒): We can map these types to their Agda counterparts.Here the interpretation of the base type, ι, is mapped to some type A : Set, which we pass as a parameter to this development; the functions and proofs that follow do not depend on the interpretation of our base type in any meaningful way.
Before defining lambda terms, we need one last definition.We will represent contexts or type environments as lists of types: Typically, we will use variable names drawn from the Greek alphabet to refer to types (such as σ and τ ) and contexts ( and ).

Terms
Before we define the terms of the simply typed lambda calculus, we need to decide on how to treat variables.We begin by defining the following inductive family, modelling valid references to a type σ in a given context : We can now define the datatype for well-typed, well-scoped lambda terms as follows: Each constructor mirrors a familiar typing rule: applications require the function's domain and argument's type to coincide; lambda abstractions introduce a new variable in the context of the lambda's body; the var constructor may be used to refer to any variable that is currently in scope.

Evaluation
The dependent types in the definition of Term pay dividends once we try to define an evaluator for lambda terms.Before we can do so, however, we need to introduce a datatype for environments: Note that this function is total.The type indices ensure that there is no valid variable in the empty context; correspondingly, the lookup function need never worry about returning a value when the environment is empty.
We can now define an evaluator for the simply typed lambda calculus: That this code type checks at all is somewhat surprising at first.It maps app constructors to Agda's application and lam constructors to Agda's built-in lambda construct.Once again, the type indices ensure that the evaluation of the lam construct must return a function (and hence we may introduce a lambda).Similarly in the case for applications, evaluating t 1 will return a function whose domain coincides with the type of the value arising from the evaluation of t 2 .Finally, the environment of type Env passed as an argument contains just the right values for all the variables drawn from the context .

Translation to combinatory logic
Before we can define the translation from lambda terms to combinators, we need to fix our target language.As a first attempt, we might try something along the following lines, turning the grammar from the introduction into an Agda datatype: Yet if we aim for our translation to be type-preserving, the very least we can do is decorate our combinators with the same type information as our lambda terms: The types of both the app and var constructors are the same as we saw for the lambda terms.The types of the primitive combinators are determined by their desired reduction behaviour.Note that -as our Comb lacks lambdas and cannot introduce new variables -the context is now a parameter rather than an index as we saw for the Term datatype.This is the essence of combinatory logic: a language with variables but without binders.
Yet we will strive to do even better.We will define a translation that preserves both the types and dynamic semantics of our lambda terms.To achieve this, we index our combinators with both their types and their intended semantics, given by a function of type Env → Val u.This will enable us to define a translation from a lambda term to a term in combinatory logic that has the same semantics as its input lambda term.This yields the final version of our datatype for combinatory logic: Here the type of each base combinator (S, K, and I) contains both its type and semantics.For example, the I combinator has type σ ⇒ σ and corresponds to the lambda term λ x → x.None of the combinators rely on the additional environment parameter env.This environment is used in the var constructor; just as we saw in our evaluator for lambda terms, this environment stores a value for each variable.Finally, the app constructor applies one combinator term to another.The type information for both the var and app constructors coincides with their counterparts from the Term data type; their intended semantics can be read off from the evaluator for lambda terms, [[ t ]], that we defined previously.
The key difference between lambda terms and SKI combinators is the lack of lambdas in the latter.To handle the bracket abstraction translation from the introduction, we define the abs function that maps one combinator term to another: = app K S abs K = app K K abs I = app K I abs (app t 1 t 2 ) = app (app S (abs t 1 )) (abs t 2 ) abs (var zero) = I abs (var (succ i)) = app K (var i) This behaviour of the abs function should be clear from its type: given a Comb term of type τ using variables drawn from the context σ :: , the abs function returns a combinator of type σ ⇒ τ using variables drawn from the context .Essentially, any occurrences of the var Top are replaced with the identity I; the new argument is distributed over applications using the S combinator; any other variables or base combinators discard this new argument by introducing an additional K combinator.
With this definition in place, we can now define our type-preserving correct-byconstruction translation.That is, we aim to define a translation with the following type: Here a lambda term of type σ in the context is mapped to a combinator of type σ using variables drawn from the context in such a way that the evaluation of t and semantics of the combinator are identical, namely [[ t ]].The definition of this translation is now entirely straightforward.
translate (app t 1 t 2 ) = app (translate t 1 ) (translate t 2 ) translate (lam t) = abs (translate t) translate (var i) = var i To see why this code type checks, note that both the (dynamic) semantics of both the app and var constructors of the Comb datatype coincide precisely with their semantics as lambda terms, [[ app t 1 t 2 ]] and [[ var i ]], respectively.Finally, if translating the body of a lambda produces some Comb term f, the abs function produces a combinator term with the semantics λ env x → f (Cons x env).The similarity between the type of the abs function and the lam branch of our evaluator is no coincidence.
There is a subtle difference between this translation scheme and the one presented in the introduction.In particular, when a variable does not occur anywhere, the bracket abstraction sketched in the introduction immediately introduces a K combinator, whereas the abs function will use the S combinator in every application -even if the variable is unused in both branches.This may lead to unnecessarily large combinatorial terms.Furthermore, the SKI-combinators are not the only possible choice of combinatorial basis.In particular, the S combinator always passes its third argument to the first two -even if it is unused in one of the branches.Can we do better?

An optimising translation
There is an alternative implementation of bracket abstraction, using two additional combinators B and C, that Turner (1979) attributes to Curry.The reduction behaviour of B and C is defined as follows: In contrast to the S combinator, the B combinator only passes its third argument to its second argument.The C combinator, on the other hand, only passes its third argument to its first argument.This avoids unnecessarily duplicating the third argument x, when it is only used by one of the two terms in an application.When the variable is not used at all, we can introduce the K combinator as suggested by the translation scheme from the introduction.As a result, normalising terms may require fewer reduction steps.
We can readily extend our Comb datatype with new constructors for these two combinators: When translating an application, we now need to select between four possible choices: K, B, C, and S, depending how variables are used.How can we make this choice, while still guaranteeing that types and semantics are preserved accordingly?
The key insight is that the translation scheme, implemented by the abs function above, already informs us whether or not a variable is used: any variable occurrence or combinator https://doi.org/10.1017/S0956796823000084Published online by Cambridge University Press that does not use the most recently bound variable starts with an application of the K combinator.Rather than indiscriminately apply the S combinator on subterms, we can instead differentiate where variables are actually used.To this end, we define the following specialised function for applying the S combinator: Unlike the previous naive translation, this definition avoids unnecessary occurrences of the K combinator, simplifying the resulting definition whenever possible.Only the very last case, when neither t 1 nor t 2 start with an application of K, introduces the S combinator.
The other cases introduce an outermost K, B, or C combinator, depending on where the 'bound' variable occurs.
To complete the translation, we need to adapt the abs function: adding new cases for B and C, and calling the sapp function instead of applying S directly.
The types and remaining cases definitions, however, remain unchanged.

Reflection
Although the translation schemes are reasonably straightforward, finding the implementation presented here was not.Writing dependently typed programs in this style -folding a program's specification into its type -may feel like a bit of a parlour trick, where the right choice of definitions ensures the entire construction is correct.Yet reading through these definitions after the fact -like so often with Agda programs -does not tell the complete story of how they were constructed.
Verifying the type safe translation from lambda terms to SKI combinators is a question I have set my students in the past.Proving this translation correct requires defining an evaluation function for combinatory terms and then proving that the translation is semantics preserving.Interestingly, this proof requires an axiom -functional extensionality -in the case for lambdas, as we need to prove two functions equal.Yet the structure of proof is simple enough: it relies exclusively on induction hypotheses and a property of the abs function.It is this observation that makes it possible to incorporate the correctness proofs in the definitions themselves -where the required property of the abs function is combined with its definition.This observation is an instance of the recomputation lemma of algebraic ornaments (McBride, 2010).Extending the translation scheme to use the B and C combinators is a bit harder.The code accompanying this paper demonstrates how to use the 'co-De Bruijn' representation of variables to define the optimising translation (McBride, 2018).Ralf Hinze suggested defining the translation directly using the sapp function.
Historically, combinatory logic arose from the desire to find a foundation for mathematics that avoided the issues surrounding variable binding (Schönfinkel, 1924;Curry et al., 1958).The translation between between lambda calculus and combinatory logic is well documented in numerous textbooks (see Barendregt, 1984, Chapter 7;Hindley & Seldin, 1986, Chapter 2;Sørensen & Urzyczyn, 2006, Chapter 5.4;Mimram, 2020, Chapter 3.6).There is a close connection between combinatory logic and Hilbert-style proof systemscognoscenti will recognise the correspondence between the first three axiom schemes and the types that can be assigned to the three combinators above.Since then, Turner (1979) has explored how to compile functional programs to combinatory logic (see also Peyton Jones, 1987, Chapter 16;Diller, 1988).This idea has been extended further by Hughes (1982) and many others, even leading to design of custom hardware for efficiently rewriting terms in combinatory logic (Stoye, 1983(Stoye, , 1985;;Scheevel, 1986).The lambda terms corresponding to the S and K combinators have made a recent reappearance as the operations defining the Reader applicative functor (McBride & Paterson, 2008).
As our starting point, we have taken the 'traditional' simply typed lambda calculus.More recent work by Kiselyov (2018) shows how a slight modification to the typing rules allows for a denotational semantics as combinators directly.Formalising this in a proof assistant, however, is left as an exercise for the reader.