GADTs Are Not (Even Partial) Functors

Generalized Algebraic Data Types (GADTs) are a syntactic generalization of the usual algebraic data types (ADTs), such as lists, trees, etc. ADTs’ standard initial algebra semantics (IAS) in the category Set of sets justify critical syntactic constructs — such as recursion, pattern-matching, and fold — for programming with them. In this paper, we show that semantics for GADTs that specialize to the IAS for ADTs are necessarily unsatisfactory. First, we show that the functorial nature of such semantics for GADTs in Set introduces ghost elements, i.e., elements not writable in syntax. Next, we show how such ghost elements break parametricity. We observe that the situation for GADTs contrasts dramatically with that for ADTs, whose IAS coincides with the parametric model constructed via their Church encodings in System F. Our analysis reveals that the fundamental obstacle to giving a functorial IAS for GADTs is the inherently partial nature of their map functions. We show that this obstacle cannot be overcome by replacing Set with other categories that account for this partiality.


Introduction
As functional languages have become increasingly sophisticated, so too have the data types they support. Algebraic data types (ADTs), i.e., data types that are essentially tree types, are well known to support an initial algebra semantics (IAS) in any category with enough structure (Manes and Arbib 1986). The IAS interpretation of an ADT comprises exactly the interpretations of the terms that are writable in its syntax. IAS interpretations of ADTs provide semantic justification for useful computational tools for programming with them, such as pattern matching, recursion rules, and induction rules. Also fundamental is that the interpretation of the type constructor defined by an ADT can be extended to a functor whose action on morphisms interprets the ADT's syntactic map function. Categorical models in which ADTs have IAS interpretations include those of (Johann and Ghani 2007; Johann et al. 2021; Manes and Arbib 1986). In the model of (Johann et al. 2021), the syntactic generalization of ADTs known as nested types (Bird and Meertens 1998) also has IAS interpretations (Manes and Arbib 1986; nLab authors 2019). Examples of nested types include perfect trees and bushes; see Section 3 below.
In addition to ADTs and nested types, modern functional languages such as Haskell, Agda, and OCaml support generalized algebraic data types (GADTs) (Peyton Jones et al. 2006). As their name suggests, GADTs syntactically generalize nested types (and thus further generalize ADTs), so that in any such category, any extension to GADTs of the standard IAS for nested types in which the interpretations of the type constructors defined by GADTs extend to functors must be trivial.
Taken together, the two main results of this paper lead us to conclude that GADTs cannot be interpreted as functors in any reasonable computational setting. So if a functorial semantics like the one we seek is possible, then it will necessarily be in a setting much more exotic than the ones GADT programmers are accustomed to using to understand their programs.
This paper is an extended version of the LSFA 2021 paper (Johann et al. 2021). The results of Sections 2 through 5 were first reported by (Johann et al. 2021), although they have been reworked here. The work in Section 6 is entirely new.

where each t_ij is a type depending only on a. Such a data type can be thought of as a "container" for data of type a. The data in an ADT are arranged at various positions in its underlying shape, which is determined by the types of its constructors C_1, ..., C_n. An ADT's constructors are used to build the data values of the data type, as well as to analyze those values using pattern matching. ADTs are used extensively in functional programming to structure computations, to express invariants of the data over which computations are defined, and to ensure the type safety of programs specifying those computations.

Syntax and Semantics of ADTs
List types are the quintessential ADTs. The shape of the container underlying the type is determined by the types of its two constructors Nil :: List a and Cons :: a → List a → List a.
These constructors specify that the data in a list of type List a are arranged linearly. The shape underlying the type List a is therefore given by the set N of natural numbers, with each natural number representing a choice of length for a list structure, and the positions in a structure of shape n given by natural numbers ranging from 0 to n − 1. Since the type argument to every occurrence of the type constructor List in the right-hand side of the above definition is the same as the type instance being defined on its left-hand side, the type List a enforces the invariant that all of the data in a structure of this type have the same type a. In a similar way, the tree type Tree a = Leaf a | Node (Tree a) a (Tree a) of binary trees has as its underlying shape the type of binary trees of units, and the positions in a structure of this type are given by sequences of L (for "left") and R (for "right") navigating a path through the structure. The type Tree a enforces the invariant that all of the data at the nodes and leaves in a structure of this type have the same type a.
Since the shape of an ADT structure, i.e., a structure whose type is an instance of an ADT, is independent of the type of data it contains, ADTs can be defined polymorphically. As a result, an ADT structure containing data of type a can be transformed into another ADT structure of the exact same shape containing data of another type b simply by applying a given function f :: a → b to each of its elements. In other words, for every ADT T a, its underlying type constructor T can be made an instance of Haskell's Functor class by defining a type-and-data-uniform, structure-preserving, data-changing function map_T for it. Then, given a type-independent way of rearranging an ADT structure's shape T a into the shape for another ADT structure T′ a, we get the same structure of type T′ b regardless of whether we first rearrange the original structure of type T a into one of type T′ a and then use map_{T′} to convert that resulting structure to one of type T′ b, or we first use map_T to convert the original structure of type T a to one of type T b and then rearrange that resulting structure into one of type T′ b. For example, if f :: a → b, t :: Tree a, and g :: Tree c → List c is a polymorphic function (note that g's type is implicitly universally quantified over c) that arranges trees into lists in a type-independent way, then we have the following rearrange-transform property:

  map_List f (g t) = g (map_Tree f t)
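The rearrange-transform property can be checked on a concrete instance. The following is a minimal sketch in Haskell: the names mapTree, flatten, lhs, rhs, and sample are illustrative assumptions rather than names from the text, and flatten (an in-order traversal) is just one possible choice for the type-independent rearrangement g.

```haskell
-- Binary trees with data at both leaves and nodes, as in the text.
data Tree a = Leaf a | Node (Tree a) a (Tree a)

-- The structure-preserving, data-changing map for Tree.
mapTree :: (a -> b) -> Tree a -> Tree b
mapTree f (Leaf x)     = Leaf (f x)
mapTree f (Node l x r) = Node (mapTree f l) (f x) (mapTree f r)

-- One possible type-independent rearrangement g: an in-order traversal.
-- (Hypothetical choice; any polymorphic Tree c -> [c] would do.)
flatten :: Tree c -> [c]
flatten (Leaf x)     = [x]
flatten (Node l x r) = flatten l ++ [x] ++ flatten r

-- The two sides of the rearrange-transform property.
lhs, rhs :: (a -> b) -> Tree a -> [b]
lhs f t = map f (flatten t)       -- rearrange first, then map over the list
rhs f t = flatten (mapTree f t)   -- map over the tree first, then rearrange

sample :: Tree Int
sample = Node (Leaf 1) 2 (Node (Leaf 3) 4 (Leaf 5))
```

For any f, lhs f sample and rhs f sample should agree. Checking one instance is of course not a proof; it only illustrates the equation.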

ADTs as functors
The standard way to understand ADTs is as least fixpoints of (first-order) functors on the category interpreting types. This category is typically taken to be the category Set, whose objects are sets and whose morphisms are functions between them, and we will do so here unless otherwise specified. But, as shown in (Johann and Polonsky 2019), we can interpret types as objects in any locally presentable category (Adámek and Rosický 1994) without affecting the development below.
Syntactically, ADTs can be represented as fixpoints in any language with primitives for sum types, product types, and recursion. If μ is the fixpoint operator in such a language, and if 1 is its unit type, then the fixpoint representations of the ADTs List a and Tree a are

  List a = μX. 1 + a × X    (2)

and

  Tree a = μX. a + X × a × X    (3)

respectively. That is, List a can be seen as a fixpoint of F_{List a}, where F_{List a} X = 1 + a × X. Indeed, every element of List a is either empty or is obtained by consing an element of type a onto an already-existing structure of type List a, so that the above fixpoint equation is nothing more than a rewriting of the Haskell data type declaration for List a. Analogously, Tree a can be seen as a fixpoint of F_{Tree a}, where F_{Tree a} X = a + X × a × X, by (3). The intent here is that List a, i.e., the syntactic fixpoint μF_{List a}, should be interpreted by the semantic fixpoint of the functor interpreting the type constructor F_{List a}, and that Tree a, i.e., the syntactic fixpoint μF_{Tree a}, should be interpreted by the semantic fixpoint of the functor interpreting the type constructor F_{Tree a}.
A similar situation obtains for other ADTs. The fixpoint equations above are entirely sensible at the level of types. But to ensure that the syntactic fixpoint representing an ADT actually denotes a semantic object computed as a semantic fixpoint, the corresponding semantic fixpoint calculation must converge. If, as is typical, we interpret our types as sets, then the fixpoint being taken must be of a functor on the category Set of sets and functions between them, rather than merely of a function between sets (nLab authors 2019). That is, the function interpreting the type constructor F constructing the body of a syntactic fixpoint must not only have an action on sets but must also have a functorial action on functions between sets. Reflecting this requirement back into syntax gives that F must support a function map satisfying the functor laws. That is, F must be an instance of Haskell's Functor class (with the aforementioned caveat about the functor laws).
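The first-order fixpoint reading of List a can be sketched directly in Haskell. In the sketch below, Fix plays the role of μ and ListF a plays the role of the body functor of List a; the names Fix, In, ListF, cata, mapList, fromHaskell, and toHaskell are assumptions made for illustration, and Haskell types are only an informal stand-in for sets.

```haskell
-- Fixpoint of a first-order functor f (the operator written μ in the text).
newtype Fix f = In (f (Fix f))

-- Body functor of List a: X |-> 1 + a × X.
data ListF a x = NilF | ConsF a x

-- Its action on functions: identity on the unit and on a, f on the X position.
instance Functor (ListF a) where
  fmap _ NilF        = NilF
  fmap f (ConsF a x) = ConsF a (f x)

type List a = Fix (ListF a)

-- The fold that initiality of the fixpoint provides.
cata :: Functor f => (f r -> r) -> Fix f -> r
cata alg (In t) = alg (fmap (cata alg) t)

-- map for List, written as a fold over the fixpoint.
mapList :: (a -> b) -> List a -> List b
mapList f = cata step
  where
    step NilF        = In NilF
    step (ConsF a r) = In (ConsF (f a) r)

-- Conversions to and from built-in lists, for experimentation only.
fromHaskell :: [a] -> List a
fromHaskell = foldr (\a r -> In (ConsF a r)) (In NilF)

toHaskell :: List a -> [a]
toHaskell = cata step
  where
    step NilF        = []
    step (ConsF a r) = a : r
```

Note that cata needs the Functor instance for ListF a, which mirrors the requirement above that the body of the fixpoint be a functor and not merely a function on types.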
Requiring F to be a functor is critical for the interpretation μF of the ADT T a = μF to exist. But to ensure that μF is itself a functor, so that the type constructor T associated with T a also supports its own map function map_T, we can require that F be a functor on the category Set^Set of functors and natural transformations on Set. That is, we can require that F be a higher-order functor on Set. Writing H in place of F to emphasize that it is higher-order, and reflecting this requirement back into syntax, we have that T a = (μH) a for a "type constructor constructor" H that supports suitable map functions.
A concrete example is given by the ADT List a. This type is modeled as the fixpoint μF_{List a} of the first-order functor whose action on sets is given by F_{List a} X = 1 + a × X and whose action on functions is given by F_{List a} f = id_1 + id_a × f. The type constructor List is modeled by the functor that is the fixpoint μH of the higher-order functor H whose action on a functor F is given by the functor H F whose actions on sets and functions between them are given by H F X = 1 + X × F X and H F f = id_1 + f × F f, respectively, and whose action on a natural transformation η is the natural transformation whose component at X is given by (H η)_X = id_1 + id_X × η_X. Reflecting the functorial action of μH back into syntax gives exactly Haskell's built-in map function as the type-and-data-uniform, structure-preserving, data-changing function associated with List.
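The higher-order reading can also be sketched in Haskell. This is only an approximation under stated assumptions: Mu, In, H, the class HFunctor, and its methods hfmap and hnat are hypothetical names chosen here, with hfmap standing for the action of H F on functions and hnat for the action of H on natural transformations.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Higher-order fixpoint: Mu h is a type constructor, not a type.
newtype Mu h a = In (h (Mu h) a)

-- H F X = 1 + X × F X
data H f x = HNil | HCons x (f x)

-- A hypothetical class capturing the two actions of a higher-order functor.
class HFunctor h where
  -- Action of (h f) on functions, assuming f is itself a functor.
  hfmap :: Functor f => (a -> b) -> h f a -> h f b
  -- Action of h on natural transformations.
  hnat :: (forall x. f x -> g x) -> h f a -> h g a

instance HFunctor H where
  hfmap _ HNil         = HNil
  hfmap f (HCons x xs) = HCons (f x) (fmap f xs)   -- id_1 + f × F f
  hnat _   HNil         = HNil
  hnat eta (HCons x xs) = HCons x (eta xs)         -- id_1 + id_X × η_X

-- The fixpoint of a higher-order functor is itself a functor.
instance HFunctor h => Functor (Mu h) where
  fmap f (In t) = In (hfmap f t)

type List = Mu H

-- Conversions to and from built-in lists, for experimentation only.
fromHaskell :: [a] -> List a
fromHaskell = foldr (\x r -> In (HCons x r)) (In HNil)

toHaskell :: List a -> [a]
toHaskell (In HNil)        = []
toHaskell (In (HCons x r)) = x : toHaskell r
```

With these definitions, fmap on Mu H behaves like the built-in map on lists, which is the syntactic reflection described above.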
Note that the rearrange-transform property for fixpoint representations of ADTs is simply the reflection back into syntax of the instance of naturality for the type-independent function that rearranges structures of type T a into ones of type T′ a and the structure-preserving, data-changing functions map_T f and map_{T′} f for a function f :: a → b, where T and T′ are the type constructors associated with these ADTs, respectively.
Whenever a data type has an interpretation as the least fixpoint of a functor (or, equivalently by Lambek's Lemma, as the carrier of the initial algebra of that functor) we say that the data type has an initial algebra semantics (IAS). As mentioned in Section 1, ADTs and nested types are well known to have IAS (Johann and Ghani 2007; Johann et al. 2021; Manes and Arbib 1986). Having an IAS is the gold standard semantics for a data type: an IAS guarantees that the data type supports pattern matching on its constructors, an induction rule that can be used to prove properties on it, an elimination rule guaranteeing that functions over it can be written by recursion, etc. These essential programming tools are just reflections back into syntax of fundamental properties of IAS.

Syntax and Semantics of GADTs
Generalized algebraic data types (GADTs) (Peyton Jones et al. 2006) relax the restriction on the type instances appearing in a data type definition. The special form of GADTs known as nested types (Bird and Meertens 1998) allows the data constructors of a GADT to take as arguments data whose types involve type instances of the GADT other than the one being defined. However, the return type of each constructor of a nested type must still be precisely the one being defined. This is illustrated by the definition PTree a = PLeaf a | PNode (PTree (a × a)) of the nested type PTree a of perfect trees, which introduces the data constructors PLeaf :: a → PTree a and PNode :: PTree (a × a) → PTree a. It enforces not only the invariant that all of the data in a structure of type PTree a is of the same type a but also the invariant that all perfect trees have lengths that are powers of 2. GADTs allow their constructors both to take as arguments and return as results data whose types involve type instances of the GADT other than the one being defined. An example of a GADT is Seq, whose definition is

  data Seq a where
    Const :: a → Seq a
    Pair  :: Seq a → Seq b → Seq (a × b)    (4)

Since the return type of the data constructor Pair is not of the form Seq a for any variable a, Seq is a proper GADT, i.e., a GADT that is not a nested type.
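In GHC Haskell, the two declarations discussed above might be written as follows. This is a sketch, with Haskell's tuple type (a, b) standing in for a × b; the helper sizeP and the sample values perfect4 and pairSeq are illustrative assumptions.

```haskell
{-# LANGUAGE GADTs #-}

-- Nested type: the recursive occurrence is at instance (a, a),
-- but every constructor still returns PTree a.
data PTree a = PLeaf a | PNode (PTree (a, a))

-- Proper GADT: Pair returns the non-variable instance Seq (a, b).
data Seq a where
  Const :: a -> Seq a
  Pair  :: Seq a -> Seq b -> Seq (a, b)

-- Number of data elements of type a in a perfect tree. The recursive call
-- is at type PTree (a, a), so the explicit signature is required.
sizeP :: PTree a -> Int
sizeP (PLeaf _) = 1
sizeP (PNode t) = 2 * sizeP t

-- A perfect tree holding 4 = 2^2 integers.
perfect4 :: PTree Int
perfect4 = PNode (PNode (PLeaf ((1, 2), (3, 4))))

-- A sequence whose type index is forced to be a pair type by Pair.
pairSeq :: Seq (Int, Bool)
pairSeq = Pair (Const 1) (Const True)
```

The definition of sizeP makes the power-of-2 invariant visible: each PNode doubles the count.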
By contrast with the ADT List a, where the type parameter a is integral to the type being defined, the type parameter a appears in both PTree a and Seq a as a "dummy" parameter used only to give the kind * → * of the type constructors PTree and Seq. This is explicitly captured in the alternative "kind signature" Haskell syntax, which represents PTree and Seq as

  data PTree :: * → * where
    PLeaf :: a → PTree a
    PNode :: PTree (a × a) → PTree a

and

  data Seq :: * → * where
    Const :: a → Seq a
    Pair  :: Seq a → Seq b → Seq (a × b)

respectively. A GADT, even a nested type, thus does not define a family of inductive types, one for each type argument, like an ADT does, but instead defines an entire family of types that must be constructed simultaneously. That is, a GADT defines an inductive family of types. Letting *^n → * denote the kind * → * → ··· → * → * with n argument positions, we take the general form of a GADT to be

  data G :: *^n → * where
    c_1 :: F_1 ā_1 → G (K̄_1 ā_1)
    ...
    c_m :: F_m ā_m → G (K̄_m ā_m)    (5)

Here, for each i ∈ {1, ..., m}, F_i is a type constructor with type signature *^{n_i} → * that can involve G itself, and each component K_ij of K̄_i is either a projection or a type constructor in the language without GADTs whose type signature is *^{n_i} → *. In general, a type constructor T with type signature *^k → * is said to have arity k. The overline notation denotes a finite list whose length is exactly the arity of the type constructor being applied to it. The number of type constructors in each K̄_i is thus n, and the number of type variables in ā_i is thus n_i. In addition, we require that each type constructor F_i is constructed inductively according to the following grammar, where p ranges from 1 to the length of ā: This grammar is subject to the following restrictions. In the fourth clause, neither G nor any (other) proper GADT appears in F, and L is a closed type. In the fifth clause, neither G nor any (other) proper GADT appears in any of the n type constructors in F.
These requirements prevent the nesting of proper GADTs, which would not only render the ambient language inconsistent (Norell 2022) but would also make it impossible to obtain a parametricity theorem for any language extended with GADTs. In the sixth clause, H :: *^k → * is a data type constructor defined by any nested type (including truly nested types). It thus subsumes the case in which F ā is a closed type.
All of the particular GADTs considered in this paper conform to the syntax in (5). In addition to specifying the syntax of the GADTs we consider, we also assume that GADTs come equipped with constructs for defining functions (uniquely) over them. More precisely, each n-ary GADT G comes together with a rule of the following form:

  Given a type expression E ā, whose n free type variables are the components of ā, and terms t_i :: ∀ā_i. F_i ā_i → E (K̄_i ā_i) for each i ∈ {1, ..., m}, there is a unique term t :: ∀ā. G ā → E ā such that t ∘ c_i = t_i for each i ∈ {1, ..., m}.    (7)
This will be important in the proof of Theorem 18 below.
Proper GADTs are used in precisely those situations in which different behaviors at different instances of a data type are desired. This is achieved by allowing the programmer to give the type signatures of the GADT's data constructors independently, as is made explicit by the alternative syntax above, and then using pattern matching to force the desired type refinement. The same technique can be used to support type-indexed inductive families in dependent type theory. Note, however, that the impredicative nature of languages supporting GADTs entails that they behave quite differently from those supporting type-indexed inductive families.
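Type refinement by pattern matching can be seen in a small function over Seq. The sketch below assumes the GHC rendering of Seq with tuples for products; the function name value is a hypothetical choice.

```haskell
{-# LANGUAGE GADTs #-}

data Seq a where
  Const :: a -> Seq a
  Pair  :: Seq a -> Seq b -> Seq (a, b)

-- Read back the value a sequence denotes. In the Pair clause, matching
-- refines the index a to a pair type (b, c), which is what makes returning
-- a tuple well typed even though the result type is written as plain a.
value :: Seq a -> a
value (Const x)  = x
value (Pair s t) = (value s, value t)
```

For instance, value (Pair (Const 1) (Pair (Const 'x') (Const True))) evaluates to (1, ('x', True)). The same clause would be rejected for an ordinary ADT, where matching on a constructor gives no information about the type parameter.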
Applications of GADTs include generic programming, modeling programming languages via higher-order abstract syntax, maintaining invariants in data structures, and expressing constraints in embedded domain-specific languages. GADTs have also been used, e.g., to implement tagless interpreters (Pasalic and Linger 2004; Peyton Jones et al. 2006; Pottier and Régis-Gianas 2006), to improve memory performance (Minsky 2015), and to design APIs (Penner 2020).

GADTs are not functors
In this section, we show that, by contrast with the situation for ADTs and nested types, proper GADTs do not support map functions. That is, since proper GADTs are not uniform in their type parameters, they cannot be regarded as type-independent containers that can be filled with data of any type the way that ADTs and nested types can. But since a GADT's map function is just the reflection back into syntax of the functorial action of the functor interpreting it, this entails that, by contrast with the situation for ADTs and nested types, proper GADT syntax cannot be interpreted as functors on Set. We will consider various approaches to recovering functorial interpretations of GADTs in the remainder of the paper.
Example 1. The GADT Seq defined in (4) comprises sequences of any type a, and sequences obtained by pairing the data in two already-existing sequences. Syntactically, Seq contains no elements other than these. We naturally expect a GADT's interpretation, like those for ADTs and nested types, to contain only those data elements that are representable by its syntax. However, the definition in (4) specifies that an element of Seq of the form Pair t_1 t_2 must have the shape of a sequence of data of pair type rather than a sequence of data of an arbitrary type. This means that the clause of map for the Pair constructor should feed map a function f :: (a × b) → c and a term of the form Pair t_1 t_2 for t_1 :: Seq a and t_2 :: Seq b, and produce a term Pair t_3 t_4 for some appropriately typed terms t_3 and t_4. However, it is not clear how to achieve this since c need not be a product type. And even if c were known to be of the form w × z, we still wouldn't necessarily have a way to produce data of type w × z from only f :: a × b → w × z and t_1 and t_2 unless we knew, e.g., that f was a product (f_1 :: a → w) × (f_2 :: b → z) of functions. Since Seq does not support a map function, it cannot be made into a functor.
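The obstacle in Example 1 can be made concrete by writing down how far a map for Seq gets. The following sketch is one way to picture the partiality, not a construction from the text: mapSeqPartial and mapPair are hypothetical names, and Nothing marks the cases in which no Seq result can be built from the available data.

```haskell
{-# LANGUAGE GADTs #-}

data Seq a where
  Const :: a -> Seq a
  Pair  :: Seq a -> Seq b -> Seq (a, b)

-- A partial map: defined on Const, but undefined on Pair, because an
-- arbitrary f :: (a, b) -> c gives no way to rebuild a Pair at type Seq c.
mapSeqPartial :: (a -> c) -> Seq a -> Maybe (Seq c)
mapSeqPartial f (Const x)  = Just (Const (f x))
mapSeqPartial _ (Pair _ _) = Nothing

-- If the function is known to be a product f × g, the outermost Pair can
-- be rebuilt, but the components are still only partially mappable.
mapPair :: (a -> w) -> (b -> z) -> Seq (a, b) -> Maybe (Seq (w, z))
mapPair f g (Const (x, y)) = Just (Const (f x, g y))
mapPair f g (Pair s t)     = Pair <$> mapSeqPartial f s <*> mapSeqPartial g t
```

Replacing Nothing in the Pair clause of mapSeqPartial by a total definition is exactly what the example argues cannot be done in general.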
As already noted, the fact that the syntax of GADTs allows non-variable type arguments in the return types of their data constructors establishes a strong connection between a GADT's shape and the data it contains. With ADTs, we first choose the shape of the container and then fill that container with data of whatever type we like; critically, the choice of shape is independent of the data to be stored. With GADTs, however, the shape of the container may actually depend on (the type of) the data to be contained. For example, Const can create data of any shape Seq a, but Pair can produce data of shape Seq a only if a is a pair type. As a result, modifying the data in a GADT's container may change the shape of that container, or even produce an ill-typed result.
To determine the possible shapes of a GADT's container, we must pattern match on the type of the data to be contained. For this, it is essential that a GADT calculus supports an equality type Eq. This type is a singleton set when its two type arguments are the same and is the empty set otherwise. That is, it is the syntactic reflection of the semantic equality function Eq. Then, every GADT can be written in terms of Eq (Cheney and Hinze 2003; Hinze 2003; McBride 1999; Schrijvers et al. 2009; Sheard and Pasalic 2004), so that Eq is, in some sense, the quintessential GADT. Unfortunately, however, Eq itself cannot be interpreted as a functor on Set, as the next example shows.
Example 2. The equality GADT Eq is defined by

  data Eq a b where
    Refl :: Eq a a    (8)

This GADT cannot be made into a functor: if Eq supported a function map_Eq :: (a → c) → (b → d) → Eq a b → Eq c d, then an eliminator eqElim :: Eq a b → b → a would allow us to construct an element eqElim (map_Eq id_0 absurd Refl) 1 of the empty type 0, where 1 is (by abuse of notation) the unique element of the unit type 1, and absurd :: 0 → 1 is the empty function. But this is not possible.
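The argument of Example 2 can be replayed in Haskell as a function that turns any candidate map for the equality GADT into an inhabitant of the empty type. This is a sketch under assumptions: the type is named Equal here only because Eq is already a class name in the Prelude, the direction chosen for eqElim is the one that makes the term in the example type check, and MapEqual, bogus, and emptyFn are hypothetical names.

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}

import Data.Void (Void)

-- The equality GADT (called Eq in the text).
data Equal a b where
  Refl :: Equal a a

-- Transport along an equality proof.
eqElim :: Equal a b -> b -> a
eqElim Refl x = x

-- Hypothetical: the type that a map function for Equal would have.
type MapEqual = forall a b c d. (a -> c) -> (b -> d) -> Equal a b -> Equal c d

-- Given any total function of that type, the empty type would be inhabited.
-- No such argument can actually be supplied, so bogus is never run.
bogus :: MapEqual -> Void
bogus mapEqual = eqElim (mapEqual id emptyFn Refl) ()
  where
    -- The empty function from the empty type to the unit type.
    emptyFn :: Void -> ()
    emptyFn _ = ()
```

That bogus type checks is the point: it shows that the existence of a total map of the stated type is incompatible with Void being empty.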
Since a proper GADT can always be written in terms of Eq, its map function must necessarily involve Eq's map function, too. But since Eq does not support a map function, it is immediate that GADTs cannot, in general, support map functions either. For Seq, this can be seen by noting that, given a function f :: (a × b) → c, the term map_Eq id_{a×b} f Refl would have type Eq (a × b) c. But, as above, we have no way to produce a term of this type in the absence of a functorial map function for Eq, and thus no way to produce a term of type Seq c using the Pair constructor, as is required by the clause of map_Seq for Pair. A similar analysis applies to other proper GADTs.

Recovering Functoriality
One way to read the results of Section 3.1 is as saying that, if we want to interpret types in Set, then we must be willing to accept that the interpretations of proper GADTs will necessarily contain "extra" elements that are not reachable in syntax. We will call such extra elements in the semantics ghost elements. The functorial completion (Johann and Polonsky 2019) of a GADT adds ghost elements to complete the interpretation of its syntax from a function on types to a functor on types. As we have seen, functoriality is absolutely essential to the initial algebra semantics of data types.
Since being a functor entails supporting a map function satisfying the functor laws, the functorial interpretation of a data type must include the entire "map closure" of its syntax. Intuitively, this means that the functorial interpretation of Eq, for example, contains not just interpretations of those data elements representable by its syntax, but also interpretations of all data elements of the form map_Eq f g s for all types t_1, t_2, t_3, and t_4, all functions f :: t_1 → t_3 and g :: t_2 → t_4 definable in the language, and all s :: Eq t_1 t_2, as well as all interpretations of data elements of the form map_Eq h k s for all appropriately typed functions h and k and each element s already added to the data type, and so on. Functorial completion for Eq adds, in particular, interpretations of the problematic data elements of the form map_Eq h k Refl from Example 2, even though these may not themselves be of the form Refl. All of these elements are ghost elements for Eq. Similarly, we see that the functorial interpretation of Seq contains not just interpretations of those data elements representable by its syntax, but also interpretations of all data elements of the form map_Seq f s for all types t_1 and t_2, all functions f :: t_1 → t_2 definable in the language, and all s :: Seq t_1, as well as interpretations of all data elements of the form map_Seq g s for each appropriately typed function g and each element s already added to the data type, and so on. Functorial completion for Seq adds, in particular, interpretations of the problematic data elements of the form map_Seq g (Pair t_1 t_2) from Section 3.1, even though these may not themselves be of the form Pair t_3 t_4 for any terms t_3 and t_4. All of these elements are ghost elements for Seq. Importantly, functorial completion adds no ghost data elements to the interpretations of GADTs that are ADTs or other nested types.

https://doi.org/10.1017/S0960129524000161 Published online by Cambridge University Press
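One way to picture the map closure of Seq in Haskell is to let the Pair constructor carry a pending function on its pair of results, so that mapping never has to inspect the target type. The encoding below is an illustrative assumption, not a definition from the text: SeqC, ConstC, PairC, pair, value, and ghost are hypothetical names, and Haskell's existential types only approximate the set-theoretic construction.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- A map-closed variant of Seq: PairC stores two sequences together with
-- a function out of the pair type they determine.
data SeqC a
  = ConstC a
  | forall b c. PairC ((b, c) -> a) (SeqC b) (SeqC c)

-- Mapping over PairC just postcomposes with the stored function.
instance Functor SeqC where
  fmap g (ConstC x)    = ConstC (g x)
  fmap g (PairC h s t) = PairC (g . h) s t

-- The syntactic Pair constructor corresponds to storing the identity.
pair :: SeqC b -> SeqC c -> SeqC (b, c)
pair = PairC id

-- Read back the value a sequence denotes.
value :: SeqC a -> a
value (ConstC x)    = x
value (PairC h s t) = h (value s, value t)

-- An element at instance Int that is built with PairC, even though Int is
-- not a pair type. It arises only by mapping, in the manner of a ghost element.
ghost :: SeqC Int
ghost = fmap fst (pair (ConstC 1) (ConstC True))
```

Here value ghost evaluates to 1, but ghost is not of the form ConstC n, and no Pair of the original Seq lives at type Int.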
We will now make the above precise, by showing formally that the functorial completion of a proper GADT necessarily contains more data elements than those representable in the GADT's syntax because it adds ghost elements to the GADT's interpretation. When interpreted as their functorial completions, GADTs can, like ADTs, be modeled as fixpoints of higher-order functors. Syntactically, higher-orderness is essential: since the type arguments to the GADT being defined are not necessarily uniform across all of its instances in the types of its data constructors, GADTs cannot be seen as first-order fixpoints the way ADTs can. Semantically, (higher-order) functors are essential: as in the standard semantics for ADTs, functoriality guarantees the existence of the (higher-order) fixpoints being computed (nLab authors 2019).
To illustrate the process of computing the functorial completion of a proper GADT, consider again the GADT Seq. Because its type argument varies in the instances of Seq appearing in the type of its data constructor Pair, Seq cannot be modeled as the fixpoint of any first-order functor. As shown in (Johann and Polonsky 2019), it can, however, be modeled as a solution to the higher-order fixpoint equation

  Seq c = c + (Lan_{λ(a,b). a×b} λ(a,b). Seq a × Seq b) c

where Lan_K F is the left Kan extension of the functor F along the functor K. In general, the left Kan extension Lan_K F : E → D of F : C → D along K : C → E is the best functorial approximation to F that factors through K. Intuitively, "best functorial approximation" means that Lan_K F is the smallest functor that both extends the image of K to D and agrees with F on C, in the sense that, for any other such functor G, there is a morphism of functors (i.e., a natural transformation) from Lan_K F to G. Formally, this is captured by the following definition (MacLane 1971):

Definition 3. If F : C → D and K : C → E are functors, then the left Kan extension of F along K is a functor Lan_K F : E → D together with a natural transformation η : F → (Lan_K F) ∘ K such that, for every functor G : E → D and natural transformation γ : F → G ∘ K, there exists a unique natural transformation δ : Lan_K F → G such that (δK) ∘ η = γ.

To represent GADTs as fixpoints in a setting in which types are interpreted as sets, a calculus must support a primitive construct Lan in such a way that the type constructor Lan_K̄ F is the syntactic reflection of the left Kan extension of the functor interpreting F along the functor interpreting K̄. If F and the n components of K̄ all have type signature *^k → *, then Lan_K̄ F has type signature *^n → *. It also comes together with a term eta :: ∀b̄. F b̄ → (Lan_K̄ F) (K̄ b̄) and the following rule: Given a type expression E ā, whose n free type variables are the components of ā, and a term u :: ∀b̄. F b̄ → E (K̄ b̄), there is a unique term t :: ∀ā. (Lan_K̄ F) ā → E ā such that t ∘ eta = u. For our purposes, the categories C, D, and E must all be of the form Set^m for some m, and the functors F and K must be finitely accessible. This ensures that the left Kan extensions Lan_K F all exist, since each category Set^m is locally finitely presentable (see (Johann and Polonsky 2019) for a detailed account). Using Lan we can then rewrite the type of a constructor C :: F ā → G (K̄ ā) as C :: (Lan_K̄ F) ā → G ā since, by Definition 3, morphisms (i.e., natural transformations) from F to G ∘ K are in one-to-one correspondence with those from Lan_K F to G. That is, writing F ⇒ G for the set of natural transformations from a functor F to a functor G, we have

  (Lan_K F) ⇒ G ≅ F ⇒ (G ∘ K)    (9)

The calculus must also support a primitive type constructor μ that is the syntactic reflection of the (now higher-order) fixpoint operator on Set^Set. Using μ and Lan we can then represent a GADT as a higher-order fixpoint. For example, we can represent the GADT Seq as

  Seq = μφ. λc. c + (Lan_{λ(a,b). a×b} λ(a,b). φ a × φ b) c

The fact that Lan_K F is the best functorial approximation to F factoring through K means that the type constructor Lan_K̄ F computes the smallest collection of data that is generated by the corresponding GADT data constructor's syntax and also supports a map function. Such a fixpoint representation of any GADT thus comprises the smallest data type that both includes the data specified by that GADT's syntax and also supports a map function. When viewed as fixpoints, then, proper GADTs are underspecified by their syntax. We can use Definition 3 to make precise the intuition that functorial completion adds new elements to interpretations of proper GADTs. This will be shown explicitly in the next example.

Example 4. Consider the GADT G defined by

  data G a where
    C :: G 1    (10)

Despite its simplicity, the GADT defined in (10) serves as an informative case study highlighting the difference between simply interpreting a proper GADT's syntax and interpreting a proper GADT's syntax as a functor, even if we are content to consider only the data elements it contains and ignore whether or not it supports a map function. The constructor C :: G 1 is equivalently expressed as C :: 1 → G 1 or, using (9), as C :: (Lan_{λu.1} λu.1) a → G a, so the body of the fixpoint representation of G is (Lan_{λu.1} λu.1) a. But since the recursion variable φ does not appear in this body, the interpretation of the fixpoint is just the interpretation of the body itself. The interpretation of G a is therefore (Lan_{λu.1} λu.1) A, where A interprets a. It turns out, however, that, for any set A, (Lan_{λu.1} λu.1) A is, in fact, exactly A. Indeed, Proposition 7.1 of (Bush et al. 2003) gives that (Lan_{λu.1} λu.1) A can be computed as

  {(U, f, *) | f : 1 → A} / ∼

where U is the unique object of Set^0, * is the unique element of the singleton set 1, and ∼ is the smallest equivalence relation such that (U, f, *) and (U, f′, *) are related if the diagram relating f and f′ commutes, i.e., if f = f′. Since the relation generating ∼ is already an equivalence relation, we have that (U, f, *) ∼ (U, f′, *) iff f = f′. Thus, up to isomorphism, (Lan_{λu.1} λu.1) A = {f : 1 → A}, i.e., (Lan_{λu.1} λu.1) A = A.

Notice that this is different from what we expect just by looking at G's syntax. Indeed, we expect exactly one data element at instance G 1 and no elements at any other instances. However, the interpretation of the fixpoint representation of G has data elements at every instance other than G 0. These additional data elements can be obtained by reflecting back into syntax the elements map_G f_a c ∈ G A resulting from applying the functorial action map_G of G's interpretation to the functions f_a : 1 → A determined by the elements a of A ≠ ∅ and the interpretation c of C.
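A single-parameter version of Lan, together with eta and the mediating map given by its rule, can be sketched in Haskell using the standard existential presentation of left Kan extensions. Everything below is an assumption-laden illustration: the data type Lan is defined locally rather than imported, extend, GC, c, ghost, and observe are hypothetical names, and Const () is used as a stand-in for the constant functor λu.1 (its phantom argument plays the role of the unique object of Set^0).

```haskell
{-# LANGUAGE GADTs, RankNTypes #-}

import Data.Functor.Const (Const (..))

-- Left Kan extension of f along k: an f-structure at some hidden index b,
-- together with a function out of k b.
data Lan k f a where
  Lan :: (k b -> a) -> f b -> Lan k f a

-- Lan k f is a functor in a regardless of whether k and f are.
instance Functor (Lan k f) where
  fmap g (Lan h x) = Lan (g . h) x

-- eta :: ∀b. F b → (Lan_K F) (K b)
eta :: f b -> Lan k f (k b)
eta = Lan id

-- The mediating map: a term u :: ∀b. F b → E (K b) extends along eta,
-- provided E is a functor.
extend :: Functor e => (forall b. f b -> e (k b)) -> Lan k f a -> e a
extend u (Lan h x) = fmap h (u x)

-- Functorial completion of the one-constructor GADT G from the example.
type GC = Lan (Const ()) (Const ())

-- The interpretation of the constructor C, at instance GC ().
c :: GC ()
c = Lan getConst (Const ())

-- An element at instance Char, obtained only by mapping over c.
ghost :: GC Char
ghost = fmap (const 'x') c

-- GC a behaves like (() -> a), so its elements can be read back as values.
observe :: GC a -> a
observe (Lan h _) = h (Const ())
```

In this sketch observe ghost evaluates to 'x', which matches the observation that the completion of G has an element for each element of each nonempty instance.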

More fundamentally, we have
Example 5. Syntactically, the GADT Eq defined in (8) comprises the data elements Refl :: Eq c c for each type c, and no others. In other words, Eq's effect is to test the equality of its two arguments. To compute the interpretation of the binary GADT Eq's fixpoint representation, we first note that Eq's sole constructor Refl :: Eq c c is equivalently expressed as Refl :: 1 → Eq c c or, using (9), as Refl :: (Lan_{λc.(c,c)} λc.1) (a, b) → Eq a b. The interpretation of Eq a b is therefore (Lan_{λc.(c,c)} λc.1) (A, B), where A and B interpret a and b. It turns out, however, that, for any sets A and B, (Lan_{λc.(c,c)} λc.1) (A, B) is, in fact, the singleton set 1. Indeed, Proposition 7.1 of (Bush et al. 2003) gives that (Lan_{λc.(c,c)} λc.1) (A, B) can be computed as

  {(C, f, g, *) | C a set, f : C → A, g : C → B} / ∼

where ∼ is the smallest equivalence relation such that (C, f, g, *) and (C′, f′, g′, *) are related if there exists h : C → C′ such that f = f′ ∘ h and g = g′ ∘ h. Every tuple (C, f, g, *) is related in this way to the tuple whose first component is the empty set, via the unique function ∅ → C, so there is exactly one equivalence class.

Notice that this is different from what we expect just looking at Eq's syntax: We expect exactly one data element at instance Eq a a for each type a and no elements at any other instances. However, the interpretation of the fixpoint representation of Eq a b has data elements at every instance. These additional data elements can be obtained by reflecting back into syntax the elements map_Eq (π_1, π_2) r ∈ Eq (A, B) resulting from applying the functorial action map_Eq of Eq's interpretation to π_1, π_2, and the interpretation r of Refl in Eq (A × B, A × B).
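The completion of Eq described in Example 5 can be pictured in Haskell as a pair of functions out of a hidden common source. This is a sketch under assumptions: EqC, mapEqC, reflC, ghost, and ghostEmpty are hypothetical names, and the existential type only approximates the quotient used in the set-theoretic computation.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.Void (Void, absurd)

-- A map-closed variant of Eq: two functions out of a hidden type c.
data EqC a b = forall c. EqC (c -> a) (c -> b)

-- The functorial action in both arguments is postcomposition.
mapEqC :: (a -> a') -> (b -> b') -> EqC a b -> EqC a' b'
mapEqC f g (EqC p q) = EqC (f . p) (g . q)

-- The syntactic Refl corresponds to a pair of identities.
reflC :: EqC a a
reflC = EqC id id

-- A ghost element at an instance with distinct arguments, obtained by
-- mapping the two projections over Refl at the product type.
ghost :: EqC Int Bool
ghost = mapEqC fst snd (reflC :: EqC (Int, Bool) (Int, Bool))

-- Even the instance at the empty type and the unit type is inhabited,
-- which is why no eliminator like eqElim can exist for EqC.
ghostEmpty :: EqC Void ()
ghostEmpty = EqC id absurd
```

The element ghostEmpty is the counterpart of the term map_Eq id_0 absurd Refl from Example 2: it exists in the completion precisely because the completion gives up the ability to transport data along its elements.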

Functorial interpretations in Set are insufficient
We have now seen that even though the functorial completions of G and Eq give the smallest extending functor interpreting GADTs, they must still introduce ghost elements. And since all proper GADTs can be written in terms of Eq, the same is true for them. But why should a programmer accept elements in the interpretation of a proper GADT that are not reachable in syntax? Forced to choose, a programmer would likely find the idea that a proper GADT contains data not specified by its syntax more than a little disturbing. What, they might ask, should a data type contain other than data that are constructed using its data constructors? That is, why should a proper GADT's interpretation contain ghost elements that are not specified by its syntax, and are only accessible via applications of its interpretation's functorial action?
From a semanticist's point of view, on the other hand, functorial completions of GADTs are entirely reasonable.Indeed, a semanticist would likely find the nonfunctorial nature of a GADT's syntax unnerving at best.After all, they would likely argue, if a GADT is supposed to be a data type, then the data in it shouldn't change or become ill-typed just because a function is mapped over it.The fact that this happens to GADTs when regarded just as their syntax actually highlights how GADTs do not generalize the essential, container-ish nature of ADTs at all.A semanticist might therefore conclude that GADTs are, at the very least, seriously misnamed.
Note, however, that even if we do accept ghost elements in the interpretation of GADTs, they still don't behave as expected.Indeed, as we show in Section 5, a language interpreting GADTs as their functorial completions cannot have parametric models (Reynolds 1983), as we would expect them to do.This means that languages with GADTs do not necessarily enjoy consequences of parametricity such as representation independence (Ahmed et al. 2009;Dreyer et al. 2012), equivalences between programs (Hur and Dreyer 2011), deep induction principles (Johann and Ghiorzi 2022; Johann and Polonsky 2020), and useful ("free") theorems about programs derived from their types alone (Wadler 1989).

Functorial Completion Does Not Support Parametric Semantics
Relational parametricity encodes a powerful notion of type uniformity, or representation independence, for data types in functional languages. It formalizes the intuition that a polymorphic program must act uniformly on all of its possible type instantiations by requiring that every such program preserves all relations between pairs of types at which it is instantiated. Parametricity was originally put forth by Reynolds (Reynolds 1983) for System F. It was later popularized as Wadler's "theorems for free" (Wadler 1989), so called because it can deduce properties of programs solely from their types, i.e., with no knowledge whatsoever of the text of the programs involved. Most of Wadler's free theorems are consequences of naturality for polymorphic list-processing functions. However, parametricity can also derive results that go beyond just naturality, such as inhabitation results. It can also be used to prove the equivalence of Church encodings and fixpoint representations of ADTs and nested types by validating shortcut fusion and other program equivalences for them (Johann 2002; Wadler 1989).
To show that interpreting GADTs as their functorial completions cannot lead to a parametric semantics, and thus that the semantics they do lead to are unsatisfactory for reasoning about programs involving GADTs, we will need to interpret data types not just in Set, but in a suitable category of relations as well. The following definition is standard:
Definition 6. The category Rel has: (i) as objects, triples (A, B, R), where A and B are sets and R ⊆ A × B; (ii) as morphisms from (A, B, R) to (A', B', R'), pairs (f, g) of functions f : A → A' and g : B → B' such that (f a, g b) ∈ R' whenever (a, b) ∈ R; (iii) as the identity on (A, B, R), the pair (id_A, id_B); and (iv) composition given by (f', g') ∘ (f, g) = (f' ∘ f, g' ∘ g), where the composition being defined on the left-hand side is in Rel, and the two componentwise compositions on the right-hand side are in Set.
We write R ∈ Rel(A, B) for (A, B, R) ∈ Rel. If R ∈ Rel(A, B) then we write π_1 R and π_2 R for the domain A and codomain B of R, respectively. We write I_A = (A, A, {(x, x) | x ∈ A}) for the equality relation on the set A.
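For finite sets, the morphism condition of Definition 6 can be checked mechanically. The sketch below is an illustration only: the record type Rel and the functions isMorphism and identityRel are names introduced here, and carriers are represented as Haskell lists, which is an assumption of the sketch rather than part of the definition.

```haskell
-- A relation (A, B, R) with finite carriers, R given as a list of pairs.
data Rel a b = Rel { domain :: [a], codomain :: [b], related :: [(a, b)] }

-- (f, g) is a morphism from r to r' exactly when it sends every
-- related pair of r to a related pair of r'.
isMorphism :: (Eq a', Eq b')
           => (a -> a') -> (b -> b') -> Rel a b -> Rel a' b' -> Bool
isMorphism f g r r' =
  all (\(x, y) -> (f x, g y) `elem` related r') (related r)

-- The equality relation I_A on a finite set A.
identityRel :: [a] -> Rel a a
identityRel xs = Rel xs xs [ (x, x) | x <- xs ]
```

For instance, the pair (id, id) is a morphism from I_A to I_B whenever A ⊆ B, while a pair of constant functions with different values is not a morphism between equality relations.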
The key idea underlying parametricity is to give each type G[a]⁷ with one free variable a a set interpretation G_0 taking sets to sets and a relational interpretation G_1 taking relations R ∈ Rel(A, B) to relations G_1 R ∈ Rel(G_0 A, G_0 B), and to interpret each term t(a, x) :: G[a] with one free term variable x :: F[a] as a function t associating to each set A a morphism t_A : F_0 A → G_0 A. Here, F_0 is the set interpretation of F. These interpretations are given inductively on the structures of G and t in such a way that they imply two fundamental theorems. The first is an Identity Extension Lemma, which states that G_1 I_A = I_{G_0 A}, and is the essential property that makes a model relationally parametric rather than just induced by a logical relation. The second is an Abstraction Theorem, which states that, for any R ∈ Rel(A, B), the pair (t_A, t_B) is a morphism in Rel from (F_0 A, F_0 B, F_1 R) to (G_0 A, G_0 B, G_1 R). The Identity Extension Lemma is similar to the Abstraction Theorem except that it holds for all elements of a type's interpretation, not just those that interpret terms. Similar theorems are required for types and terms with any number of free variables. In particular, if t is closed (i.e., has no free term variables) then t_A ∈ G_0 A for all A ∈ Set, and (t_A, t_B) ∈ G_1 R for all A, B ∈ Set and all R ∈ Rel(A, B).
Before showing that languages interpreting GADTs as their functorial completions cannot have parametric models, we first show that languages interpreting them as the interpretations of their Church encodings can. The Church encoding of an ADT or nested type (see, e.g., (Geuvers 2014; Koopman et al. 2014; Pierce 2002)) represents that data type as a term in the higher-order polymorphic lambda calculus F_ω (Barendregt 1984), an extension of System F whose type expressions include functions from types to types (i.e., type constructors) and whose terms can abstract over types of all kinds. In particular, expressions of any kind can be universally quantified over variables of any kind. This makes it possible to give Church encodings of ADTs and nested types in F_ω that are similar to, e.g., the standard Church encoding of natural numbers in System F. GADTs can be encoded in the same way. For instance, the Church encoding of the type Seq a is […]

The fact that interpreting GADTs as the interpretations of their Church encodings does admit parametric models (Atkey 2012), whereas interpreting them as their functorial completions does not, means that these two interpretations of GADTs cannot possibly be equivalent. Taken together, Examples 7 and 8 will show that they actually behave very differently with respect to parametricity. This contrasts sharply with the fact that both ADTs and nested types have the same parametricity properties regardless of whether they are interpreted as the interpretations of their Church encodings or as their functorial completions. This is yet another way in which the functorial completion semantics for GADTs is unsatisfactory: it specializes to the standard IAS for ADTs, but it doesn't share the same parametricity properties.
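As a concrete point of reference, the Church encoding of List a in F_ω, namely ∀f.(∀b. f b) → (∀b. b → f b → f b) → f a, can be transcribed into Haskell with higher-rank types. This is a minimal sketch: the names ChurchList, fromList, and toList are introduced here for illustration, and wrapping the encoding in a newtype is a Haskell convenience rather than part of the encoding.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Church encoding of lists, quantifying over a type constructor f.
newtype ChurchList a = ChurchList
  { runChurchList :: forall f. (forall b. f b)
                            -> (forall b. b -> f b -> f b)
                            -> f a }

-- Encode an ordinary list by folding with the supplied nil and cons.
fromList :: [a] -> ChurchList a
fromList xs = ChurchList (\nil cons -> foldr cons nil xs)

-- Decode by instantiating f to the list type constructor.
toList :: ChurchList a -> [a]
toList (ChurchList k) = k [] (:)
```

Decoding after encoding returns the original list, which is the Haskell-level shadow of the equivalence between Church encodings and fixpoint representations for ADTs.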
The existence of parametric models that interpret GADTs as the interpretations of their Church encodings follows from, e.g., the existence of the parametric model of F_ω constructed in (Atkey 2012). In that model, types are interpreted "in parallel" in (types corresponding to) Set and Rel in the usual way, including the familiar "cutting down" of the interpretations of ∀-types to just those elements that are "parametric" (Reynolds 1983; Wadler 1989) to ensure that the Identity Extension Lemma holds. If the set interpretation of Eq is the function Eq and if the relational interpretation of a closed type a with interpretation A is I_A as intended, then the parametricity property for a GADT is an inhabitation result saying that the set interpreting any instance of that GADT contains exactly the interpretations of the data elements that can be formed using its data constructors and whose type is that instance. The parametricity property for the Church encoding of a GADT G gives that G a is inhabited iff data elements of the instance of G at a can be formed using G's data constructors. In particular, the parametricity property for the GADT G from (10) gives that G a contains a single data element if a is semantically equivalent to 1 and none otherwise. Indeed, we have

Example 7. Let t be a closed term of type G a for the GADT G defined in (10), let G = (G_0, G_1) be the interpretation of the Church encoding of G, let t be the interpretation of t, and let R ∈ Rel(A, B). Then t_A ∈ G_0 A and t_B ∈ G_0 B and, by the Abstraction Theorem (Theorem 3) in (Atkey 2012), t_A and t_B must be related in G_1 R. However, under the semantics given in (Atkey 2012), which includes the aforementioned interpretations of Eq and closed types, the relational interpretation G_1 R of G is itself I_{G_0 1} when R is the relational interpretation I_1 of 1, and the empty relation whenever R differs from I_1. Reflecting back into syntax we deduce that there can be no term in the type that is the Church encoding of G a unless a is semantically equivalent to 1.
For functorial completion interpretations of GADTs, the story is completely different. If, as intended, the set interpretation of Lan_K F is Lan_K F, where K is the set interpretation of K and F is the set interpretation of F, then the exact same reasoning gives that the relational interpretation of Lan_K F is Lan_K F, where K is the relational interpretation of K and F is the relational interpretation of F. But under these interpretations, there can be no parametric model. The following counterexample establishes this surprising result.
Example 8. In any parametric model, we must give both a set interpretation and a relational interpretation for every type as described at the start of this section. In particular, for every GADT G we must give an interpretation G = (G_0, G_1) such that, for every relation R ∈ Rel(A, B), we have G_1 R ∈ Rel(G_0 A, G_0 B). Intuitively, when G is viewed as a fixpoint, its data elements include those given by its functorial completion. Since G_1 is a functor, given any relation S ∈ Rel(C, D) and any morphism m : S → R, G_1 R must contain all elements of the form G_1 m x for x ∈ G_1 S. But the two components m_1 : C → A and m_2 : D → B of m cannot be given independently of one another, since Definition 6 entails that (m_1 c, m_2 d) must be in R whenever (c, d) is in S for (m_1, m_2) to be a well-defined morphism of relations. The domain of G_1 R thus depends on both A and B, rather than simply on A. Likewise, the codomain of G_1 R also depends on both A and B. The domain and codomain therefore cannot simply be G_0 A and G_0 B, respectively. This suggests that GADTs might fail to have relational interpretations, and thus might fail to have parametric models, as described in the previous paragraph.
We can make this informal argument formal by providing a concrete counterexample. Consider again the GADT G given by (10). The set functorial completion interpretation of G is Lan_{λu.1} λu.1, i.e., is, by the reasoning of Example 4, the identity functor on Set. By the exact same reasoning, this time in Rel rather than in Set, the relational functorial completion interpretation of G is Lan_{λu.I_1} λu.I_1, where λu.I_1 is the constantly I_1-valued functor from the category Rel^0 with a single object to Rel. Indeed, this interpretation is still a left Kan extension, but now it is the left Kan extension determined by the functor interpreting λu.1 in Rel. For the Identity Extension Lemma to hold, for every relation R ∈ Rel(A, B) we would need (Lan_{λu.I_1} λu.I_1) R to be a relation between the sets (Lan_{λu.1} λu.1) A and (Lan_{λu.1} λu.1) B, i.e., between the sets A and B. However, this need not be the case.
Consider the relation R = (1, 2, 1 × 2), where 1 × 2 relates the single element of 1 to both elements of the two-element set 2. We expect (Lan_{λu.I_1} λu.I_1) R to be a relation with domain 1. Since left Kan extensions preserve projections (Riehl 2016), we can compute the domain as

π_1 ((Lan_{λu.I_1} λu.I_1) R) = (Lan_{λu.I_1} λu.1) R

(Note that the left Kan extension of λu.1 along λu.I_1 is a functor from Rel to Set.) By the same reasoning as in Example 4, Proposition 7.1 of (Bush et al. 2003) gives that (Lan_{λu.I_1} λu.1) R can be computed as the set {m | m : I_1 → R} of morphisms from I_1 to R in Rel. But this set is {(!, k_0), (!, k_1)}, where k_0, k_1 : 1 → 2 are the constantly 0-valued and 1-valued functions in Set, respectively, and therefore π_1 ((Lan_{λu.I_1} λu.I_1) R) is not 1, as would be needed for the Identity Extension Lemma to hold. Since the Identity Extension Lemma does not hold for models in which GADTs are interpreted by their functorial completions, such models cannot possibly be parametric.
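The two-element set appearing in this counterexample can be recomputed by brute force. The following sketch enumerates all morphisms in Rel between finite relations; the type Rel and the functions allFunctions, apply, and morphisms are names chosen for this illustration, with functions between finite sets represented as association lists (an assumption of the sketch).

```haskell
import Control.Monad (replicateM)

-- A relation (A, B, R) with finite carriers.
data Rel a b = Rel { domain :: [a], codomain :: [b], related :: [(a, b)] }

-- All functions from one finite set to another, as association lists.
allFunctions :: [a] -> [b] -> [[(a, b)]]
allFunctions xs ys = map (zip xs) (replicateM (length xs) ys)

-- Apply a function given by its graph.
apply :: Eq a => [(a, b)] -> a -> b
apply graph x = case lookup x graph of
  Just y  -> y
  Nothing -> error "apply: argument outside the domain"

-- All pairs (f, g) that preserve relatedness, i.e., all morphisms s -> r.
morphisms :: (Eq a, Eq b, Eq c, Eq d)
          => Rel a b -> Rel c d -> [([(a, c)], [(b, d)])]
morphisms s r =
  [ (f, g)
  | f <- allFunctions (domain s) (domain r)
  , g <- allFunctions (codomain s) (codomain r)
  , all (\(x, y) -> (apply f x, apply g y) `elem` related r) (related s)
  ]

-- I_1, the relation R = (1, 2, 1 x 2), and the relation (1, empty, empty).
i1 :: Rel () ()
i1 = Rel [()] [()] [((), ())]

rFull, rEmpty :: Rel () Bool
rFull  = Rel [()] [False, True] [((), False), ((), True)]
rEmpty = Rel [()] [] []
```

Here morphisms i1 rFull has two elements, matching (!, k_0) and (!, k_1), while morphisms i1 rEmpty is empty; in both cases the result differs from the one-element domain that the Identity Extension Lemma would require.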
It is actually possible to construct a simpler counterexample to the Identity Extension Lemma for functorial completion interpretations of GADTs using the relation R = (1, ∅, ∅). However, this relation is somewhat artificial, in the sense that its domain is larger than is strictly necessary to define an empty relation. Since it is also too degenerate to properly expose the mismatch between left Kan extensions at the level of sets and left Kan extensions at the level of relations, we give the above example using the relation (1, 2, 1 × 2) instead.
One way to read the result of this subsection is as in (Johann et al. 2021): if we interpret types in Set and n-ary GADTs as functors from Set n to Set, then, as with software engineering's iron triangle, we can have any two of GADTs, functoriality, and parametricity we like, but we cannot have all three.

Functorial Interpretations in PSet and Beyond
We have seen in Section 4.1 that interpreting GADTs as their functorial completions is unsatisfactory. We note, however, that partiality is inherent in the syntax of GADTs. Indeed, one way to understand Examples 1 and 2 is as showing that the map functions for Seq, as well as for Eq (and thus those for all proper GADTs), do not map arbitrary functions over their elements. That is, proper GADTs' map functions are only partially defined (Johann and Cagne 2022; Johann and Ghani 2008; Johann and Polonsky 2019). To interpret GADTs as functors without adding ghost elements to their interpretations, the category in which we interpret them must therefore account for partial functions.
To find a semantics of GADTs that accounts for the partiality of their map functions, we need to allow the functorial actions of GADTs' interpretations to yield partial functions. However, the interpretations of the functions that are representable in syntax and don't involve mapping over elements of GADTs should still be total. That is, we seek categories C that capture partiality in the computationally relevant sense that once a computation diverges it does not become defined again, and in which Set embeds in such a way that it can be considered the subcategory of "total morphisms" of C for some reasonable notion of totality. Obviously, the category PSet of sets and partial functions between them is such a category C since morphisms there propagate undefinedness. But rather than focusing on the specific category PSet, we present here an abstract framework that encompasses much more. The advantage of introducing a framework is twofold. First, our main result (Theorem 18) holds in categories more general than PSet, including exotic categories C in which Set embeds nicely as a subcategory of total morphisms. Second, the framework does not require that the category of total morphisms is Set specifically. The latter entails that Theorem 18 holds not just when the subcategory of total morphisms is Set, but when it is any locally presentable category in which ADTs and nested types can be given their standard IAS.
The reader may wonder why we develop an abstract framework for our results about interpreting GADTs as functors in PSet, but were content to restrict attention to the specific category Set in the previous sections even though, as noted there, we could also have developed the results in those sections for arbitrary locally presentable categories. The main reason is that how to move from Set to locally presentable categories is well-known in the literature, so moving to the more abstract setting in Sections 2 through 5 provides no additional insights that cannot be gleaned from Set. On the other hand, how to move from PSet to more general categories C of the kind we seek now has not previously been known, so the categorical framework for partiality that we provide here is itself part of the contribution of this paper.
We now identify those categories capturing computationally relevant partiality. We then show in Theorem 18 that any semantics in such a category is trivial if we insist that the interpretations of GADTs in them must extend to functors. Given a category C, we write Mor(C) for its (possibly large) set of morphisms. We begin by recalling two classic definitions, the first of which "categorifies" the notion of an ideal in monoids.
Definition 9. A cosieve in a category C is a (possibly large) subset S ⊆ Mor(C) such that for all morphisms f : A → B and g : B → E of C, if f ∈ S then g ∘ f ∈ S.
Definition 10. A wide subcategory of a category C is a subcategory of C that contains all objects of C.
If D is any subcategory of C, we write \overline{D} for the complement of D, i.e., for the (possibly large) set Mor(C) \ Mor(D). We can then introduce the following new definition:
Definition 11. A structure of computational partiality on a category C is a wide subcategory D of C such that \overline{D} is a cosieve.
In a category C equipped with a structure of computational partiality D, we call the morphisms of D total and those of \overline{D} properly partial. Morphisms of C might be referred to simply as partial. It is not hard to see that Set is a structure of computational partiality on PSet and that the terminology introduced in this paragraph accords with what is used there. Indeed, the intuition behind Definition 11 is that \overline{D} is the collection of computations that are actually undefined on a non-empty set of objects of C (equivalently, of D). Following that intuition, both identities and compositions of total functions must be total functions, and when a function yields an error on an input there is no way to come back from the error by postcomposing with another function. Other categorical frameworks capturing partiality include p-categories (Robinson and Rosolini 1988), (bi)categories of partial maps (Carboni 1987), categories of partial morphisms (Curien and Obtułowicz 1989), and restriction categories (Cockett and Lack 2002). These all give rise to structures of computational partiality.
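The intuition that an error cannot be undone by postcomposition is familiar from programming with Maybe. As a rough sketch, partial functions can be modelled in Haskell as Kleisli arrows a -> Maybe b, with total functions embedded via Just; the names Partial, total, and recipP below are introduced only for this illustration, and modelling PSet this way is an assumption of the sketch.

```haskell
import Control.Monad ((>=>))

-- A partial function: Nothing marks an input on which it is undefined.
type Partial a b = a -> Maybe b

-- Embedding of total functions, playing the role of Set inside PSet.
total :: (a -> b) -> Partial a b
total f = Just . f

-- A properly partial function: the reciprocal is undefined at 0.
recipP :: Partial Double Double
recipP x = if x == 0 then Nothing else Just (1 / x)

-- Postcomposing with any function keeps the result undefined at 0,
-- which is the cosieve condition of Definition 11 in this setting.
shifted :: Partial Double Double
shifted = recipP >=> total (+ 1)
```

Composites of embedded total functions stay total, while shifted remains undefined at 0 no matter which function is composed after recipP.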
Recall that a split monomorphism s : A → B in a category C is a monomorphism in C for which there exists a morphism r : B → A such that rs = id_A. We have the following basic fact about split monomorphisms in a category equipped with a structure of computational partiality:
Lemma 12. In a category equipped with a structure of computational partiality, split monomorphisms are always total.
Proof. Let s : A → B be a split monomorphism in a category C, and suppose r : B → A is such that rs = id_A. If s were properly partial in a structure of computational partiality on C, then rs, and thus id_A, would be properly partial as well. But id_A is total by definition, so it cannot be. Thus, s must be total.
Our aim is to consider interpretations of (languages supporting) GADTs in a category C equipped with a structure of computational partiality D. However, to do so we require a bit more structure. Specifically, D must have finite products, and those products in D must extend to C in the sense introduced and justified below.
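Definition 13. Let C be a category equipped with a structure of computational partiality D, and write ι : D → C for the inclusion functor from D to C. Suppose further that D has finite products, and write ×^D_n : D^n → D for the n-ary product functor from D^n to D. Then the products of D extend to C if, for each n ≥ 0, there exists a functor ⊗^C_n : C^n → C such that the following square commutes:

⊗^C_n ∘ ι^n = ι ∘ ×^D_n

We write ×^D and ⊗^C infix rather than ×^D_2 and ⊗^C_2 prefix when n = 2. Crucially, even when C has products, the object ⊗^C_n A_i need not be a product of A_1, ..., A_n in C. The following examples help clarify this observation.

Example 14. For any category C, C itself is (trivially) a structure of computational partiality on C. Moreover, if C has finite products, then they extend (trivially) to C. For each n, ⊗^C_n is simply the n-ary product functor of C.

Example 15. The category Set is a structure of computational partiality on the category PSet whose products extend to PSet. We consider the case for n = 2 explicitly; those for other values of n are analogous. Define the functor _ ⊗ _ : PSet^2 → PSet whose action on objects is given by A_1 ⊗ A_2 = A_1 × A_2, where _ × _ denotes the usual cartesian product in Set, and whose action on morphisms sends partial functions f_1 : A_1 → B_1 and f_2 : A_2 → B_2 to the partial function f_1 ⊗ f_2 : A_1 × A_2 → B_1 × B_2 given by

(f_1 ⊗ f_2)(a_1, a_2) = (f_1(a_1), f_2(a_2))   if f_1(a_1) and f_2(a_2) are defined
(f_1 ⊗ f_2)(a_1, a_2) = undefined              otherwise

Then if f_1 and f_2 are total functions, f_1 ⊗ f_2 is total as well and actually coincides with the cartesian product of f_1 and f_2 in Set. That is, for the inclusion ι : Set → PSet, we have ι(f_1 × f_2) = ι f_1 ⊗ ι f_2. Notice, however, that f_1 ⊗ f_2 is not the product of f_1 and f_2 in PSet. Indeed, the product of two objects A_1 and A_2 in PSet is the disjoint union A_1 + (A_1 × A_2) + A_2. Writing i, j, and k for the three canonical injections into this disjoint union, the product f_1 × f_2 in PSet is the partial function from A_1 + (A_1 × A_2) + A_2 to B_1 + (B_1 × B_2) + B_2 whose action on elements of the form j(a_1, a_2) is

(f_1 × f_2)(j(a_1, a_2)) = j(f_1(a_1), f_2(a_2))   if f_1(a_1) and f_2(a_2) both defined
(f_1 × f_2)(j(a_1, a_2)) = i(f_1(a_1))             if f_1(a_1) defined and f_2(a_2) undefined
(f_1 × f_2)(j(a_1, a_2)) = k(f_2(a_2))             if f_1(a_1) undefined and f_2(a_2) defined
(f_1 × f_2)(j(a_1, a_2)) = undefined               otherwise

The square expressing j ∘ (f_1 ⊗ f_2) = (f_1 × f_2) ∘ j thus need not commute in PSet.

Example 16. Consider the category ωpCPO of complete partial orders with bottom elements and strict Scott-continuous functions between them. Write ωpCPO_t for the wide subcategory containing only those morphisms f : D → D' such that f^{-1}(⊥_{D'}) = {⊥_D}, where ⊥_D is the bottom element of D and ⊥_{D'} is the bottom element of D'. The product ×^{ωpCPO_t}_n D_i of objects D_1, ..., D_n in ωpCPO_t is the subset of the cartesian product in Set of the sets underlying the D_i s comprising those tuples (x_1, ..., x_n) such that either x_i = ⊥_{D_i} for all i or x_i ≠ ⊥_{D_i} for all i. The product ×^{ωpCPO_t}_n f_i of morphisms f_1, ..., f_n […] for some i, and maps (x_1, ..., x_n) to (f_1(x_1), ..., f_n(x_n)) otherwise. The category ωpCPO_t is thus a structure of computational partiality on ωpCPO whose products extend to ωpCPO. Indeed, if f_i : D_i → D'_i, i = 1, ..., n, are strict Scott-continuous functions, then there is a strict Scott-continuous function ⊗^{ωpCPO}_n f_i […] for some i, and maps (x_1, ..., x_n) to (f_1(x_1), ..., f_n(x_n)) otherwise. Note that if each f_i is in ωpCPO_t then ⊗^{ωpCPO}_n f_i is as well and actually coincides with the product of the f_i s in ωpCPO_t. Moreover, the construction above is clearly functorial in the f_i s. Thus, for the inclusion ι : ωpCPO_t → ωpCPO, we have ⊗^{ωpCPO}_n ∘ ι^n = ι ∘ ×^{ωpCPO_t}_n. Note, however, that ⊗^{ωpCPO}_n f_i is not the product of f_1, ..., f_n in ωpCPO. The product ×^{ωpCPO}_n D_i of D_1, ..., D_n in ωpCPO is the cartesian product of the sets underlying the D_i s, ordered componentwise, and the product […]. Writing j_{X_1,...,X_n} for the canonical inclusion of ×^{ωpCPO_t}_n X_i into ×^{ωpCPO}_n X_i, we thus see that the square […]

Now fix a category C equipped with a structure of computational partiality D. Suppose that D has finite products and that they extend to C.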
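The contrast drawn in the PSet example between the functor ⊗ and the genuine product of PSet can be sketched in Haskell, again modelling partial functions as a -> Maybe b. Everything named here (tensor, Prod with constructors I, J, K, and prodMap) is introduced for illustration. The clauses of prodMap for the summands I and K, and the case where only the second component is defined, follow the universal property of the product in PSet and should be read as assumptions of this sketch rather than as the paper's text.

```haskell
-- f1 (x) f2: defined exactly when both components are defined.
tensor :: (a1 -> Maybe b1) -> (a2 -> Maybe b2) -> (a1, a2) -> Maybe (b1, b2)
tensor f1 f2 (x1, x2) = (,) <$> f1 x1 <*> f2 x2

-- The product of A1 and A2 in PSet: the disjoint union A1 + (A1 x A2) + A2,
-- with I, J, K playing the role of the injections i, j, k.
data Prod a b = I a | J a b | K b
  deriving (Eq, Show)

-- The product of two partial functions in PSet.
prodMap :: (a1 -> Maybe b1) -> (a2 -> Maybe b2) -> Prod a1 a2 -> Maybe (Prod b1 b2)
prodMap f1 _  (I x)   = I <$> f1 x
prodMap _  f2 (K y)   = K <$> f2 y
prodMap f1 f2 (J x y) = case (f1 x, f2 y) of
  (Just u,  Just v)  -> Just (J u v)
  (Just u,  Nothing) -> Just (I u)
  (Nothing, Just v)  -> Just (K v)
  (Nothing, Nothing) -> Nothing
```

When the second function is undefined on its argument, tensor returns Nothing while prodMap applied to the corresponding J element still returns an I element, which is one way to see that the comparison square between ⊗ and × fails to commute.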
We want to define a good notion of an interpretation of (a language supporting) GADTs in (C, D). Although we want to remain as language-agnostic as possible, so that our result can be replayed in as many different settings as possible, we must still make some reasonable assumptions about the type theory underlying the ambient language. Such a type theory necessarily describes how to construct types from a given context of type variables, and how to construct typed terms from a given context of typed term variables. It should also come with a mechanism (such as an operational semantics) that verifies that a term is terminating. Non-terminating terms represent those programs that loop infinitely or are undefined on certain inputs. Terminating terms represent those programs that compute actual values in finite time.
The key idea guiding Definition 17 below is to interpret contexts (and thus types) by objects of C and to interpret terms by morphisms of C in such a way that the interpretations of terminating terms are actually in D. With this in mind, we are led to interpret a context Γ comprising variables of types a_1, ..., a_n as the product of the [[a_i]] for i = 1, ..., n. Now, in any context Γ, we can always define the term that picks out the i-th variable: it is a terminating term (it represents the program that takes n inputs and simply returns the i-th), so its interpretation must be a morphism π_i : [[Γ]] → [[a_i]] in D. To see why [[Γ]], together with the π_i s, should actually constitute a product in D, we have to turn to the rules that govern the production of morphisms of contexts in the underlying type theory. A morphism from a context Γ to a context Δ is given by a term of type a in context Γ for each type a appearing in Δ. Such a morphism of contexts is considered terminating only if each of the terms defining it is terminating; this corresponds to the intuitive expectation that the simultaneous run of the programs represented by the terms terminates successfully if and only if each of the runs terminates individually. This programming language feature can be expressed in the semantics by requiring that [[Γ]] has the following property: for any object C of C, every tuple of morphisms f_i : C → [[a_i]] in D is of the form f_i = π_i ∘ f for some morphism f : C → [[Γ]] in D, i = 1, ..., n, and f is uniquely determined by the f_i s. In other words, [[Γ]], together with the π_i s, is a product of the [[a_i]]s in D.

Now that we have established that it is natural to interpret each context Γ as the product ×^D_n [[a_i]] in D of the types a_i, i = 1, ..., n, it comprises, we explain why it is equally natural to expect the products of D to extend to C. We focus on the case n = 2 for illustrative purposes. Suppose we are given terms t_1 of type b_1 and t_2 of type b_2 in context Γ, and suppose t_1 and t_2 are interpreted by morphisms [[t_1]] : [[Γ]] → [[b_1]] and [[t_2]] : [[Γ]] → [[b_2]] in C. Write Δ for the context containing only a variable of type b_1 and a variable of type b_2. We need to be able to construct the interpretation of the morphism of contexts from Γ to Δ given by the pair (t_1, t_2). That is, we want to construct from [[t_1]] and [[t_2]] a morphism [[(t_1, t_2)]] : [[Γ]] → [[Δ]]. Moreover, when t_1 and t_2 are terminating terms, their interpretations are morphisms of D and we want [[(t_1, t_2)]] to be in D as well. In addition, in that case we want to recover the usual interpretation of morphisms of contexts, i.e., we want [[(t_1, t_2)]] to be the morphism ⟨[[t_1]], [[t_2]]⟩ induced by the product in D. When [[t_1]] and [[t_2]] are not total, this construction cannot be done unless the product _ ×^D _ of D extends to C as a functor _ ⊗ _. In this case, we can define [[(t_1, t_2)]] to be the morphism […]

When C is Set (or any other locally presentable category), Definition 17 recovers the functorial semantics for GADTs in Set (or, more generally, in C) used in Sections 2 through 5 by taking D to be C as in Example 14. This perfectly captures the fact that all morphisms are total in that semantics.
We can now prove that, like functorial interpretations of GADTs in Set, functorial interpretations of GADTs in more general categories equipped with structures of computational partiality are also insufficient.
Theorem 18. Let C be a category equipped with a structure of computational partiality D. Suppose [[_]] is an interpretation of GADTs in (C, D) relative to which each GADT is manifested by a functor. Then [[a]] ≅ 1 for all closed types a containing terminating terms.
Proof. Among the GADTs in our language is the GADT Eq defined in (8). In addition to the function eqElim in Example 2, we can also use the recursion rule for GADTs to define its companion function […] reduces to the identity function on type a_1. By the uniqueness property of functions defined over GADTs given in (7), λp y → eqElim p (eqElim^{-1} p y) reduces to the identity function on type a_1 for any input p. Semantically this entails that, if ϕ_a is the canonical isomorphism a ≅ 1 × a, and if p : 1 → [[Eq a_1 a_2]] is any total function, then […]

Now, let a be a closed type and let t be a terminating closed term of type a. We abuse notation and write […]. The composition rs is necessarily id_1 because it is total and 1 is terminal in D. This explicitly gives that [[a]] ≅ 1, as announced in the statement of the theorem.
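The syntactic ingredients of this proof have a direct Haskell rendering. The sketch below assumes that eqElim has type Eq a b → a → b and that its companion runs in the opposite direction; since Example 2 is not reproduced in this section, both signatures are assumptions, as are the names Equal and eqElimInv.

```haskell
{-# LANGUAGE GADTs #-}

-- The GADT Eq from (8), renamed to avoid the Prelude class Eq.
data Equal a b where
  Refl :: Equal c c

-- Assumed shape of eqElim: transport a value along an equality proof.
eqElim :: Equal a b -> a -> b
eqElim Refl x = x

-- Assumed companion function, transporting in the other direction.
eqElimInv :: Equal a b -> b -> a
eqElimInv Refl y = y
```

Pattern matching on Refl identifies the two type indices, so both composites of eqElim p and eqElimInv p behave as identity functions, which is the syntactic fact the proof interprets semantically.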

Conclusion and Related Work
The first part of this paper shows that GADTs do not have satisfactory IAS as functors on Set: the functorial completion semantics, which would give the smallest functorial IAS for them, do not have the expected parametricity properties so no others do either. Recognizing that the underlying reason for this is that GADTs' map functions are inherently partial naturally led us to consider analogous semantics on PSet. But we have shown in the second part of the paper that, unfortunately, GADTs do not have IAS interpretations as functors on PSet either. These results show that if we hope to find IAS interpretations for GADTs as functors on some category, and if we hope that the resulting IAS will specialize to the standard one for ADTs and nested types, then we will have to look in categories far more esoteric than Set and PSet.
The fundamental obstruction we have exposed in Section 6 is that GADTs' map functions are partial in ways that are not compatible with composition. Indeed, as Theorem 18 shows, inconsistencies arise when a composition can be mapped over an element of a GADT even though the first function in the composition cannot. These kinds of pitfalls can be avoided either (i) by abandoning the partiality modeled by categories equipped with structures of computational partiality, or (ii) by abandoning the classical notion of functoriality. But (i) seems neither feasible nor desirable, since categories equipped with structures of computational partiality appear to capture exactly the computationally relevant notion of partiality that is needed. On the other hand, (ii) involves changing the notion of compositionality for GADTs' map functions so that, in Theorem 18, a GADT's interpretation needn't send the composition of t_1 and t_2 to the composition of their images under that interpretation. The challenge here is that if G is an n-ary GADT then the interpretation of G cannot simply be a functor. It must still have actions on objects and morphisms of C^n like functors do, and must still send identities to identities, but it need not respect composition. Instead, the image of the composition of t_1 and t_2 under G's interpretation must only be "more defined" than the composition of the images of t_1 and t_2 under that interpretation. We will therefore consider, in future work, semantics in categories equipped with an ordering on morphisms and such that the interpretations of GADTs are normal lax functors instead of functors. Jay (Jay 1991) lays out the basic theory of such categories and lax functors that we plan to exploit to define the kind of semantics for GADTs we seek.
There are treatments of GADTs beyond those discussed in the main body of this paper. Atkey's parametric model for F_ω from (Atkey 2012) represents data types (including GADTs) as Church encodings. It requires the user to supply a map function for the (higher-order) type constructor whose fixpoint characterizes the data type. But, importantly, functoriality of an underlying type constructor does not imply functoriality of its fixpoint, so the data type itself still need not necessarily support a map function in Atkey's model. Similarly, (Vytiniotis and Weirich 2010) present a parametric model for an extension of F_ω that supports type equality and thus can encode GADTs, but this model still does not guarantee functoriality; accordingly, the parametric properties of GADTs described in the precursor work (Vytiniotis and Weirich 2006) to (Vytiniotis and Weirich 2010) are all inhabitation results rather than naturality results. In (Mandelbaum and Stump 2009) GADTs are represented as Scott encodings rather than Church encodings but, again, only inhabitation results are cited for them. GADTs are treated explicitly as fixpoints of discrete functors in (Johann and Ghani 2008), as initial algebras of dependent polynomial functors in (Gambino and Hyland 2004; Hamana and Fiore 2011), and as indexed containers in (Morris and Altenkirch 2009). The latter two treatments move toward seeing GADTs as data types in a dependent type theory. A categorical parametric model of dependent types has been given by (Atkey et al. 2014), but, as with the models mentioned above, this model also does not guarantee that GADTs have functorial semantics.
functor laws (i.e., preservation of identity functions and composition of functions) even though this is not enforced by the compiler and is instead left to the good intentions of the programmer.
4 In the rest of the paper, we will refer to the least fixpoints of endofunctors simply as fixpoints. Other fixpoints are of no interest to us because they are not carriers of initial algebras.
5 Throughout this paper, we use sans serif font for program text and math italic font for semantic objects.
6 The map function for H is intended to satisfy syntactic reflections of the functor laws in Set^Set (i.e., preservation of identity natural transformations and composition of natural transformations) and the map function for H F is intended to satisfy syntactic reflections of the functor laws in Set, even though there is no mechanism in Haskell for enforcing this.
7 The notation G[a] indicates that G is a type with one hole which has been filled with the type a.

Example 4. Syntactically, the GADT G defined by

data G a where
  C :: G 1        (10)

comprises a single data element, namely C :: G 1. As the definition of G makes clear, G's effect is simply to test its argument for equality against the unit type 1. To compute the interpretation of G's fixpoint representation we first note that the type of G's solitary constructor C :: G 1 is equivalently expressed as C :: 1 → G 1 or, using (9), as C :: (Lan_{λu.1} λu.1) a → G a, where λu.1 is the syntactic reflection of the constantly 1-valued functor from the category Set^0 with a single object to Set. We can therefore represent G as

G a = (μφ. λb. (Lan_{λu.1} λu.1) b) a

The interpretation of G is obtained by computing the fixpoint of the interpretation of the body λb.(Lan_{λu.1} λu.1) b of the syntactic fixpoint μφ.λb.(Lan_{λu.1} λu.1) b and applying the result to a.
(Lan_{λc.(c,c)} λc.1) a b → Eq a b, where λc.1 is the syntactic reflection of the constantly 1-valued functor from the category Set to itself, and λc.(c, c) is that of the diagonal functor from Set to Set^2 mapping every set C to the pair (C, C). We can therefore represent Eq as

Eq a b = (μφ. λd e. (Lan_{λc.(c,c)} λc.1) d e) a b

The interpretation of Eq is obtained by computing the fixpoint of the interpretation of the body λd e.(Lan_{λc.(c,c)} λc.1) d e of the syntactic fixpoint μφ.λd e.(Lan_{λc.(c,c)} λc.1) d e and applying the result to a and b. But since the recursion variable φ does not appear in this body, the interpretation of the fixpoint is just the interpretation of the body itself. The interpretation of Eq a b is therefore (Lan_{λc.(c,c)} λc.1) (A, B), where A and B interpret a and b, respectively. It turns out, however, that, for any sets A and B, (Lan_{λc.(c,c)} λc.1)

For example, the Church encoding of the type List a in F_ω is ∀f.(∀b. f b) → (∀b. b → f b → f b) → f a and that of PTree a is ∀f.(∀b. b → f b) → (∀b. f (b × b) → f b) → f a.

https://doi.org/10.1017/S0960129524000161 Published online by Cambridge University Press
D, and write ι : D → C for the inclusion functor from D to C. Suppose further that D has finite products, and write D_n : D^n → D for the n-ary product functor from D^n to D. Then the products of D extend to C if, for each n ≥ 0, there exists a functor C_n : C^n → C such that the following square commutes:

We write ×_D and ⊗_C infix rather than D_2 and C_2 prefix when n = 2. Crucially, even when C has products, the object C_n A_i need not be a product of A_1, ..., A_n in C. The following examples help clarify this observation.

For any category C, C itself is (trivially) a structure of computational partiality on C. Moreover, if C has finite products, then they extend (trivially) to C. For each n, C

The category Set is a structure of computational partiality on the category PSet whose products extend to PSet. We consider the case for n = 2 explicitly; those for other values of n are analogous. Define the functor _ ⊗ _ : PSet^2 → PSet whose action on objects is given by A_1 ⊗ A_2 = A_1 × A_2, where _ × _ denotes the usual cartesian product in Set, and whose action on morphisms sends partial functions f_1

not commute in PSet.

Consider the category ωpCPO of complete partial orders with bottom elements and strict Scott-continuous functions between them. Write ωpCPO_t for the wide subcategory containing only those morphisms f : D → D′ such that f^{-1}(⊥_{D′}) = {⊥_D}, where ⊥_D is the bottom element of D and ⊥_{D′} is the bottom element of D′. The product (ωpCPO_t)_n D_i of objects D_1, ..., D_n in ωpCPO_t is the subset of the cartesian product in Set of the sets underlying the D_i s comprising those tuples (x_1, ..., x_n) such that either x_i = ⊥_{D_i} for all i or x_i ≠ ⊥_{D_i} for all i. The product (ωpCPO_t)_n f_i of morphisms f_1

n, and f is uniquely determined by the f_i s. In other words, [[Γ]], together with the π_i s, is a product of the [[a_i]]s in D.
Now that we have established that it is natural to interpret each context Γ as the product D_n [[a_i]] in D of the types a_i, i = 1, ..., n, it comprises, we explain why it is equally natural to expect the products of D to extend to C. We focus on the case n = 2 for illustrative purposes. Suppose we are given terms t_1 of type b_1 and t_2 of type b_2 in context Γ, and suppose t_1 and t_2 are interpreted by morphisms [[t_1]