Coherent differentiation

The categorical models of the differential lambda-calculus are additive categories because of the Leibniz rule, which requires the summation of two expressions. This means that, as far as the differential lambda-calculus and differential linear logic are concerned, these models feature finite non-determinism, and indeed these languages are essentially non-deterministic. We introduce a categorical framework for differentiation which does not require additivity and is compatible with deterministic models such as coherence spaces and with probabilistic models such as probabilistic coherence spaces. Based on this semantics we sketch the syntax of a deterministic version of the differential lambda-calculus.


Introduction
The differential λ-calculus was introduced in [ER03], starting from earlier investigations on the semantics of Linear Logic (LL) in models based on various kinds of topological vector spaces [Ehr05,Ehr02]. Later on we proposed in [ER04,Ehr18] an extension of LL featuring differential operations which appear as an additional structure on the exponentials (the resource modalities of LL), offering a perfect duality to the standard rules of dereliction, weakening and contraction. The differential λ-calculus and differential LL are about computing formal derivatives of programs, and from this point of view they are deeply connected to the kind of formal differentiation of programs used in Machine Learning for propagating gradients (that is, differentials viewed as vectors of partial derivatives) within formal neural networks. As shown by the recent works [BMP20,MP21], formal transformations of programs related to the differential λ-calculus can be used for efficiently implementing gradient back-propagation in a purely functional framework. The differential λ-calculus and differential linear logic are also useful as the foundation for an approach to finite approximations of programs based on Taylor expansion [ER08,BM20], which provides a precise analysis of the use of resources during the execution of a functional program, deeply related to implementations of the λ-calculus in abstract machines such as the Krivine Machine [ER06].
One should insist on the fact that in the differential λ-calculus derivatives are not taken w.r.t. a ground type of real numbers as in [BMP20,MP21] but can be computed w.r.t. elements of all types. For instance it makes sense to compute the derivative of a function M : (ι ⇒ ι) → ι w.r.t. its argument, which is a function from ι, the type of integers, to itself, thus suggesting the possibility of using this formalism for optimization purposes in a model such as probabilistic coherence spaces [DE11] (PCS) where a program of type ι → ι is seen as an analytic function transforming probability distributions on the integers. In [Ehr19] it is also shown how such derivatives can be used to compute the expectation of the number of steps in the execution of a program. A major obstacle to the extension of programming languages with such derivatives is the fact that probabilistic coherence spaces are not a model of the differential λ-calculus, in spite of the fact that the morphisms, being analytic, are obviously differentiable. Since the main goal of this paper is to circumvent this obstacle, let us first understand it better.
These differential extensions of the λ-calculus and of LL require the possibility of adding terms of the same type. For instance, to define the operational semantics of the differential λ-calculus, given a term t such that x : A ⊢ t : B and a term u such that Γ ⊢ u : A, one has to define a term ∂t/∂x • u such that Γ, x : A ⊢ ∂t/∂x • u : B, which can be understood as a linear substitution of u for x in t and is actually a formal differentiation: x has no reason to occur linearly in t, so this operation involves the creation of linear occurrences of x in t, and this is done by applying the rules of ordinary differential calculus. The most important case is when t is an application t = (t₁) t₂ where Γ, x : A ⊢ t₁ : C ⇒ B and Γ, x : A ⊢ t₂ : C. In that case we set

  ∂((t₁) t₂)/∂x • u = (∂t₁/∂x • u) t₂ + (D t₁ • (∂t₂/∂x • u)) t₂

where we use differential application, a syntactic construct of the language: given Γ ⊢ s : C ⇒ B and Γ ⊢ v : C, we have Γ ⊢ Ds • v : C ⇒ B. This crucial definition involves a sum, corresponding to the fact that x can appear free both in t₁ and in t₂: this is the essence of the "Leibniz rule" (fg)′ = f′g + fg′, which has nothing to do with multiplication but everything to do with the fact that both f and g can have non-zero derivatives w.r.t. a common variable they share (logically this sharing is implemented by a contraction rule).
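The sum produced by the Leibniz rule can be seen concretely in forward-mode automatic differentiation over dual numbers, which the paper later compares its summable pairs to. This is only an illustrative sketch in Python (the `Dual` class and the `derivative` helper are ours, not part of the calculus):

```python
from dataclasses import dataclass

# Forward-mode "dual numbers": a value paired with a linear (derivative) part.
# This is an analogy for the Leibniz rule discussed above, not the syntax
# of the differential lambda-calculus itself.
@dataclass
class Dual:
    val: float  # the point x
    der: float  # the linear part u

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Leibniz rule: (f g)' = f' g + f g'. The sum appears because both
        # factors may depend on the same (shared) variable.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def derivative(f, x):
    # Differentiate f at x by pushing the pair (x, 1) through f.
    return f(Dual(x, 1.0)).der

# d/dx (x * x) = 2x: each occurrence of x contributes one summand.
assert derivative(lambda x: x * x, 3.0) == 6.0
```

Note how the two summands of `der` in `__mul__` come from the two occurrences of the shared variable, exactly the sharing that contraction implements logically.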
For this reason the syntax of the differential λ-calculus and of differential linear logic features an addition operation on terms of the same type, and accordingly the categorical models of these formalisms are based on additive categories. Operationally such sums correspond to a form of finite non-determinism: for instance, if the language has a ground type of integers ι with constants n such that Γ ⊢ n : ι for each n ∈ N, we are allowed to consider sums such as 42 + 57, corresponding to the non-deterministic superposition of the two integers (and not at all to their sum 99 in the usual sense!). This can be considered as a weakness of this approach since, even if one has nothing against non-determinism per se, it is not satisfactory to be obliged to enforce it in order to allow differential operations which a priori have nothing to do with it. So the fundamental question is: does every logical approach to differentiation require non-determinism?
We ground our negative answer to this question on the observation, made in [Ehr19], that in the category of PCS the morphisms of the associated cartesian closed category are analytic functions and therefore admit all iterated derivatives (at least in the "interior" of the domain where they are defined). Consider for instance in this category an analytic f : 1 → 1, where 1 (the ⊗ unit of LL) is the [0, 1] interval, meaning that f(x) = Σ_{n∈N} a_n xⁿ with coefficients a_n ≥ 0 such that f maps [0, 1] to [0, 1]. Its derivative f′ has no reason to map [0, 1] to [0, 1] and can even be unbounded on [0, 1) and undefined at x = 1 (and there are programs whose interpretation behaves in that way). However, if (x, u) ∈ [0, 1]² satisfies x + u ∈ [0, 1], then f(x) + f′(x)u ≤ f(x + u) ∈ [0, 1]. This is actually true of any analytic morphism f between two PCSs X and Y: we can see the differential of f as mapping a summable pair (x, u) of elements of X to the summable pair (f(x), f′(x) • u) of elements of Y. Seeing the differential as such a pair of functions is central in differential geometry, as it allows, thanks to the chain rule, to turn it into a functor mapping a smooth map f : X → Y (where X and Y are now manifolds) to the function Tf : TX → TY which maps (x, u) to (f(x), f′(x) • u), where TX is the tangent bundle of X, a manifold whose elements are the pairs (x, u) of a point x of X and a vector u tangent to X at x. The concept of tangent category was introduced in [Ros84,CC14] precisely to describe this construction and its properties categorically. In spite of this formal similarity, our central concept of summability cannot be compared with tangent categories in terms of generality: first because, when (x, u) ∈ TX, it makes no sense to add x and u or to consider u alone (independently of x); and second because, given (x, u₀), (x, u₁) ∈ TX, the local sum (x, u₀ + u₁) ∈ TX is always defined in the tangent bundle, whereas in our summability setting the pair (u₀, u₁) has no reason to be summable.
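The tangent-bundle functor T discussed here can be sketched numerically: a hypothetical helper `T` pairs a function with its hand-written derivative, and functoriality T(g ∘ f) = Tg ∘ Tf is exactly the chain rule. This is only an illustration of the discussion, not the categorical functor S:

```python
import math

# T f maps a pair (x, u) of a point and a tangent vector to (f(x), f'(x) * u).
# Derivatives are supplied by hand; the names T, Tf, Tg, Tgf are ours.
def T(f, df):
    return lambda xu: (f(xu[0]), df(xu[0]) * xu[1])

f, df = math.sin, math.cos
g, dg = math.exp, math.exp

Tf = T(f, df)
Tg = T(g, dg)
Tgf = T(lambda x: g(f(x)),           # the composite g . f
        lambda x: dg(f(x)) * df(x))  # its derivative, by the chain rule

# Functoriality: T(g . f) and Tg . Tf agree on every (point, tangent) pair.
x, u = 0.5, 2.0
lhs = Tgf((x, u))
rhs = Tg(Tf((x, u)))
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
```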
Content. We base our approach on a concept of summable pair that we axiomatize as a general categorical notion in Section 2: a summable category is a category L with 0-morphisms, together with a functor S : L → L equipped with three natural transformations from SX to X: two projections and a sum operation. The first projection also exists in the "tangent bundle" functor of a tangent category, but the two other morphisms do not. Such a summability structure induces a monad structure on S (a similar phenomenon occurs in tangent categories). In Section 3 we consider the case where the category is a cartesian SMC equipped with a resource comonad ! in the sense of LL, and we present differentiation as a distributive law between the monad S and the comonad !. This allows us to extend S to a strong monad D on the Kleisli category L_! which implements differentiation of non-linear maps. In Section 4 we study the case where the functor S can be defined using a more basic structure of L based on the object 1 & 1, where & is the cartesian product and 1 is the unit of ⊗: this is actually what happens in the concrete situations we have in mind. Then the existence of the summability structure becomes a property of L and not an additional structure. We also study the differential structure in this setting, showing that it boils down to a simple !-coalgebra structure on 1 & 1.
As a running example along the presentation of our categorical constructions we use the category of coherence spaces, historically the first model of LL [Gir87]. There are many reasons for this choice. It is one of the most popular models of LL and of functional languages; it is a typical example of a model of LL which is not an additive category (in contrast with the relational model or models of profunctors); a priori it does not exhibit the usual features of a model of the differential calculus (no coefficients, no vector spaces, etc.); and it strongly suggests that our coherent approach to the differential λ-calculus might be applied to programming languages which have nothing to do with probabilities, deep learning or non-determinism. In Section 5 we describe the differential structure of the coherence space model, showing that it provides an example of a canonically summable differential category. We observe that, in the uniform setting of Girard's coherence spaces, our differentiation does not satisfy the Taylor formula, but that this formula does hold if we use instead non-uniform coherence spaces, whose differential structure we describe.
In Section 6 we consider the situation where the underlying SMC is closed, that is, has internal hom objects. In that case an additional condition on the summability structure is required, expressing intuitively that the sum of two morphisms is computed pointwise.
Last, in Section 7, we outline a syntax for a differential λ-calculus corresponding to this semantics. This concluding section should only be considered as an appetizer for a more substantial paper on a differential and deterministic extension of PCF, which will be available soon.

Related work.
As already mentioned, our approach has strong similarities with tangent categories, which have been a major source of inspiration; we explained the differences above. There are also strong connections with differential categories [BCLS20], the main difference again being that differential categories are left-additive, which is generally not the case of L_! in our setting, for the reasons explained above. There are also interesting similarities with [CLL20] (still in an additive setting): our distributive law ∂_X might play a role similar to that of the distributive law introduced in Section 5 of that paper. This needs further investigation.
Recently, [KP20] exhibited a striking connection between Gödel's Dialectica interpretation and the differential λ-calculus and differential linear logic, with applications to gradient back-propagation in differential programming. One distinctive feature of Pédrot's approach to Dialectica [Péd15] is the use of a "multiset parameterized type" M whose purpose is apparently to provide some control on the summations allowed when performing Pédrot's analogue of the Leibniz rule (under the Dialectica/differential correspondence of [KP20]); it might therefore play a role similar to our summability functor S. The precise technical connection is not clear at all, but we believe that this analogy will lead to a unified framework for the Dialectica interpretation and coherent differentiation of programs and proofs, involving denotational semantics, proof theory and differential programming.
The differential λ-calculus that we obtain in Section 7 features strong similarities with the calculus introduced in [BMP20,MP21] for dealing with gradient propagation in a functional setting. Both calculi handle tuples of terms in the spirit of tangent categories, which makes the chain rule functorial and thus allows differential terms to be reduced without creating explicit summations.

Preliminary notions and results
This section provides some more or less standard technical material useful for understanding the paper. It can be skipped and consulted when needed, in a call-by-need manner.

Finite multisets
A finite multiset on a set A is a function m : A → N such that the set supp(m) = {a ∈ A | m(a) ≠ 0} is finite; we use M_fin(A) for the set of all finite multisets of elements of A. The cardinality of m is #m = Σ_{a∈A} m(a). We use [ ] for the empty multiset (so that supp([ ]) = ∅).
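Concretely, finite multisets and the notions just defined can be modelled with Python's `collections.Counter`; a sketch (the variable names are ours):

```python
from collections import Counter

# A finite multiset over A as a Counter: m[a] is the multiplicity of a.
m = Counter(['a', 'a', 'b'])               # the multiset [a, a, b]

supp = {a for a, n in m.items() if n != 0}  # supp(m): non-zero multiplicities
card = sum(m.values())                      # #m: sum of all multiplicities

assert supp == {'a', 'b'}
assert card == 3
assert Counter() == Counter([])             # [ ] is the empty multiset
```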

The SMCC of pointed sets
Let Set 0 be the category of pointed sets.We use 0 X or simply 0 for the distinguished point of the object X.
The terminal object is the singleton {0}. The cartesian product X & Y is the ordinary cartesian product, with 0_{X&Y} = (0_X, 0_Y). The tensor product X ⊗ Y is defined as the smash product, in which all pairs having a zero component are identified with the distinguished point 0_{X⊗Y} = (0_X, 0_Y). The unit of the tensor product is the object 1 = {0, ∗} of Set₀. This category is enriched over itself, the distinguished point of Set₀(X, Y) being the constantly-0_Y function. Actually, it is monoidal closed with X ⊸ Y = Set₀(X, Y) and 0_{X⊸Y} defined by 0_{X⊸Y}(x) = 0_Y for all x ∈ X. A mono in Set₀ is a morphism of Set₀ which is injective as a function.
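Assuming the tensor product is the usual smash product of pointed sets (an assumption consistent with the closed structure X ⊸ Y = Set₀(X, Y)), it can be sketched as follows; the `smash` helper and the encoding of base points are ours:

```python
# Pointed sets carry a distinguished point, encoded here as ZERO. The smash
# product keeps the pairs whose components are both non-zero, and collapses
# everything else to the base point (ZERO, ZERO). This is a sketch of an
# assumed definition, not quoted from the text.
ZERO = 0

def smash(X, Y):
    return ({(x, y) for x in X for y in Y if x != ZERO and y != ZERO}
            | {(ZERO, ZERO)})

X = {ZERO, 'a'}
Y = {ZERO, 'b', 'c'}

assert smash(X, Y) == {(ZERO, ZERO), ('a', 'b'), ('a', 'c')}

# The unit 1 = {0, *}: smashing with it does not change the size of Y.
One = {ZERO, '*'}
assert len(smash(One, Y)) == len(Y)
```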
Unless explicitly stipulated, all the categories L we consider in this paper are enriched over pointed sets, so this assumption will not be mentioned any more. In the case of symmetric monoidal categories, this also means that the tensor product of morphisms is "bilinear" w.r.t. the pointed structure, that is: f ⊗ 0 = 0, and by symmetry we have 0 ⊗ f = 0.

Monoidal and resource categories
A symmetric monoidal category (SMC) is a category L equipped with a bifunctor ⊗ : L² → L, a monoidal unit 1 which is an object of L, and associated natural isomorphisms (unitors λ and ρ, associator and symmetry) satisfying the usual Mac Lane coherence commutations. Given objects X₀, ..., X_{n−1} and i < j in {0, ..., n−1}, we use γ_{i,j} for the canonical swapping iso in X₀ ⊗ ⋯ ⊗ X_{n−1} exchanging the factors X_i and X_j. A commutative comonoid in L is a tuple (C, w_C, c_C) where w_C ∈ L(C, 1) and c_C ∈ L(C, C ⊗ C) satisfy the usual coherence commutations; the category L_⊗ of commutative comonoids has these tuples as objects, and a morphism from (C, w_C, c_C) to (D, w_D, c_D) is an element of L(C, D) which commutes with the two structure maps.

Theorem 1.1. For any SMC L the category L_⊗ is cartesian. The terminal object is (1, Id₁, (λ₁)⁻¹) (remember that λ₁ = ρ₁), simply denoted as 1, and for any object C the unique morphism C → 1 in L_⊗ is w_C; the product of two comonoids is carried by the tensor product of the underlying objects, with projections and structure maps built from the comonoid structures. The proof is straightforward.

In a commutative monoid M, multiplication is a monoid morphism M × M → M. The following is in the vein of this simple observation.
Proof. The second statement amounts to the following commutation, which results from the commutativity of C. The first statement is similarly immediate.

Resource categories
The notion of resource category is more general than that of a Seely category in the sense of [Mel09]: we keep only the part of the structure and axioms that we need to define our notion of differential structure, so as to keep our setting as general as possible.
An object X of an SMC L is exponentiable if the functor − ⊗ X has a right adjoint, denoted as X ⊸ −. In that case, we use ev ∈ L((X ⊸ Y) ⊗ X, Y) for the counit of the adjunction and, given f ∈ L(Z ⊗ X, Y), we use cur f ∈ L(Z, X ⊸ Y) for the associated morphism.
We say that the SMC L is closed (is an SMCC) if any object of L is exponentiable.
A category L is a resource category if
• L is an SMC;
• L is cartesian, with terminal object ⊤ (so that 0 is the unique element of L(X, ⊤)), cartesian product of X₀, X₁ denoted (X₀ & X₁, pr₀, pr₁), and pairing of morphisms ⟨f₀, f₁⟩;
• L is equipped with a resource comonad, that is, a tuple (!, der, dig, m⁰, m²) where ! is a functor L → L which is a comonad with counit der and comultiplication dig, and m⁰, m² are the Seely isomorphisms, subject to conditions that we do not recall here (see for instance [Mel09]), apart from the one which explains how dig interacts with them.
Then ! inherits a lax symmetric monoidality µ⁰, µ² on L (considered as an SMC). This means that one can define µ⁰ ∈ L(1, !1) and µ²_{X₀,X₁} ∈ L(!X₀ ⊗ !X₁, !(X₀ ⊗ X₁)) satisfying suitable coherence commutations; explicitly, these morphisms are built from the Seely isomorphisms and the comonad structure.

Proof. This results from the definition of µ² and from the following commutation, which results from the observation that !0 ∈ L(!!X, !⊤) can be written !0 = !(0 der_X).
For any X ∈ L it is possible to define a contraction morphism contr_X ∈ L(!X, !X ⊗ !X) and a weakening morphism weak_X ∈ L(!X, 1), turning !X into a commutative comonoid. These morphisms are defined using the cartesian structure and the Seely isomorphisms: weak_X = (m⁰)⁻¹ !0 and contr_X = (m²_{X,X})⁻¹ !⟨Id_X, Id_X⟩.
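For a concrete picture, in the relational model of LL (mentioned in the introduction), !X consists of the finite multisets over X, and contraction relates a multiset to each of its two-part splittings, while weakening relates only the empty multiset to the unit. A sketch of this one particular model (the `splittings` helper is ours, and this is an illustration, not the general categorical definition above):

```python
from collections import Counter
from itertools import product

# In the relational model, contr relates a multiset m to every pair (m1, m2)
# with m1 + m2 = m. We enumerate these splittings explicitly.
def splittings(m):
    items = sorted(m.items())
    # For each element, choose how many of its copies go to the left part.
    choices = [range(n + 1) for _, n in items]
    out = []
    for pick in product(*choices):
        m1 = Counter({a: k for (a, _), k in zip(items, pick) if k > 0})
        m2 = m - m1
        out.append((m1, m2))
    return out

m = Counter(['a', 'a', 'b'])          # the multiset [a, a, b]
splits = splittings(m)

assert len(splits) == (2 + 1) * (1 + 1)        # 3 * 2 = 6 splittings
assert all(m1 + m2 == m for m1, m2 in splits)  # each split re-sums to m
```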
Lemma 1.3. The two following diagrams commute in any resource category L.
For the second diagram, we compute using: the naturality of der and of m²; the commutation of an auxiliary diagram, easily proved by post-composing the two equated morphisms with the projections; the naturality of dig together with diagram (1); and finally the monoidality properties of the Seely isomorphisms.

Coalgebras of the resource comonad
A !-coalgebra is a pair P = (P, h_P) where P is an object of L and h_P ∈ L(P, !P) satisfies the two usual commutations: der_P h_P = Id_P and dig_P h_P = !h_P h_P.
Given coalgebras P and Q, a coalgebra morphism from P to Q is an f ∈ L(P, Q) such that the following square commutes (that is, h_Q f = !f h_P). The category so defined is the Eilenberg-Moore category L^! associated with the comonad !. We will use the following standard result, for which we refer to [Mel09].
An immediate consequence of this theorem is the following observation.
Proposition 1.1. Let P be an object of L^!, and let u ∈ L^!(P, 1) and d ∈ L^!(P, P ⊗ P) be such that

Lafont categories and the free exponential
In many interesting models of LL, the exponential resource modality is completely determined by the tensor product; in that case one says that the exponential is free. We provide the precise definition of such categories and give some of their properties that we shall use in the paper. Let L be an SMC. Remember from [Mel09] that L is a Lafont category if the forgetful functor U : L_⊗ → L has a right adjoint E : L → L_⊗ which maps an object X to a commutative comonoid (!X, weak_X, contr_X). In that case we use (!, der, dig) for the comonad UE associated with this adjunction, called the free exponential of the SMC L.
More explicitly, this means that for any object X of L, for any commutative comonoid C = (C, w_C : C → 1, c_C : C → C ⊗ C) and any f ∈ L(C, X), there is exactly one morphism f^⊗ ∈ L/X((C, f), (!X, der_X)) which is a comonoid morphism. In other words, there is exactly one morphism f^⊗ ∈ L(C, !X) such that the three following diagrams commute.
Let L be a Lafont category. For any commutative comonoid C there is exactly one morphism δ_C ∈ L(C, !C) such that the following diagrams commute.
Proof. The first part of the statement is just a special case of the universal property, with X = C and f = Id_X. For the second part we only have to prove that both morphisms are comonoid morphisms, because both are defined by composing morphisms in that category. The equation f₁ = f₂ then follows by universality, observing that, for i = 1, 2, the required diagrams commute, which readily results from the naturality of der and from the definition of a comonad.
Here are two important special cases of the above. First, there is exactly one comonoid morphism µ⁰ = (Id₁)^⊗ ∈ L(1, !1); second, applying the universal property to der_{X₀} ⊗ der_{X₁} ∈ L(!X₀ ⊗ !X₁, X₀ ⊗ X₁), there is exactly one comonoid morphism µ²_{X₀,X₁} ∈ L(!X₀ ⊗ !X₁, !(X₀ ⊗ X₁)). These two morphisms turn ! into a lax monoidal comonad on the SMC L.
The correspondence C ↦ (C, δ_C) can be turned into a functor A : L_⊗ → L^! acting as the identity on morphisms. Indeed, let f ∈ L_⊗(C, D); it suffices to prove that δ_D f = !f δ_C ∈ L(C, !D). Let f₀ = δ_D f and f₁ = !f δ_C. By the universal property, it suffices to prove that the three following diagrams commute for i = 0, 1. These commutations follow from the commutations satisfied by δ_C and δ_D and from the fact that f ∈ L_⊗(C, D).

Conversely, given a !-coalgebra P = (P, h_P), one can define a commutative comonoid structure on P by two morphisms that we denote respectively as w_P and c_P. This correspondence P ↦ M(P) = (P, w_P, c_P) can be turned into a functor M : L^! → L_⊗ acting as the identity on morphisms.
Theorem 1.3.For any Lafont SMC L, the functors A and M define an isomorphism of categories between L ⊗ and L ! .
Proof. Let C ∈ L_⊗ and let P = A(C), so that P = C and h_P = δ_C. Let D = M(P), so that D = C; one checks, using the diagrams defining δ_C, that the comonoid structures of D and C coincide.

Conversely, let P ∈ L^!. Let C = M(P), so that C = P, w_C = weak_P h_P and c_C = (der_P ⊗ der_P) contr_P h_P. Let Q = A(C) = (P, δ_C). To prove that δ_C = h_P it suffices to show that the following diagrams commute, which results from the definition of C and from the fact that P is a coalgebra. Let us check for instance the last one:

(h_P ⊗ h_P) c_C = (h_P ⊗ h_P)(der_P ⊗ der_P) contr_P h_P
= (der_{!P} ⊗ der_{!P})(!h_P ⊗ !h_P) contr_P h_P
= (der_{!P} ⊗ der_{!P}) contr_{!P} !h_P h_P
= (der_{!P} ⊗ der_{!P}) contr_{!P} dig_P h_P
= (der_{!P} ⊗ der_{!P})(dig_P ⊗ dig_P) contr_P h_P
= contr_C h_P

where we have used in particular the fact that, for any X ∈ L, one has dig_X ∈ L_⊗(E(X), E(!X)) because the comonad ! is induced by the adjunction U ⊣ E. This shows that M and A define a bijective correspondence on objects and, since both functors act as the identity on morphisms, our contention is proven.
In that way we retrieve the fact that L^! is cartesian, since L_⊗ is always cartesian by Theorem 1.1 (even if L is not Lafont). Remember that, in the general (not necessarily Lafont) case, the fact that L^! is cartesian can be proven under the additional assumption that L is a resource category. Remember also that a cartesian Lafont SMC is automatically a resource category, see [Mel09].
Lemma 1.5. Let C₀, C₁ ∈ L_⊗. Remember that we use C₀ ⊗ C₁ for the cartesian product of C₀ and C₁ in L_⊗ (see Theorem 1.1). Then we have

Proof. One just checks that the right-hand morphisms satisfy the three diagrams of Lemma 1.4.
Theorem 1.4. Let L be a Lafont category and let C ∈ L_⊗. Then the following diagrams commute.

Proof. We deal with the second diagram, the argument for the first one being completely similar. By Lemma 1.1 we have c_C ∈ L_⊗(C, C ⊗ C) and hence (since A is the identity on morphisms) δ_{C⊗C} c_C = !c_C δ_C, which is exactly the diagram under consideration by Lemma 1.5.

Resource Lafont categories
A resource Lafont category is a resource category L where the exponential arises in the way explained above; in that case one says that ! is the free exponential (it is unique up to unique iso since it is defined by a universal property). This is equivalent to conditions expressing that each (!X, weak_X, contr_X) is the free commutative comonoid over X. Indeed, when these conditions hold, the Seely isomorphisms are uniquely determined by the universal property of the Lafont SMC L. The lax monoidality (µ⁰, µ²) induced by these Seely isomorphisms coincides with the one directly induced by the Lafont property (again by universality). This is why we used the same notation for both.

Summable categories
Let L be a category; composition in L is denoted by simple juxtaposition. We develop a categorical axiomatization of a concept of finite summability in L, in the spirit of the partially additive categories of [AM80]. The main idea is to equip L with a functor S which has the flavor of a monad (and will actually be shown to carry a canonical monad structure) and intuitively maps an object X to the object SX of all pairs (x₀, x₁) of elements of X whose sum x₀ + x₁ is well defined. Another feature of our approach is the crucial role given to such pairs, which are the values on which derivatives are computed, very much in the spirit of Clifford's dual numbers. However, contrary to dual numbers, our structures also axiomatize the actual summation of such pairs.

◮ Example 2.1. In order to illustrate the definitions and constructions of the paper we will use the category Coh of coherence spaces [Gir87] as a running example. An object of this category is a pair E = (|E|, ¨_E) where |E| is a set (the web of E) and ¨_E is a symmetric and reflexive relation on |E| (the coherence relation).
The set of cliques of a coherence space E is Cl(E) = {x ⊆ |E| | ∀a, a′ ∈ x, a ¨_E a′}. Equipped with ⊆ as order relation, Cl(E) is a cpo. Given coherence spaces E and F, we define the coherence space E ⊸ F by |E ⊸ F| = |E| × |F|, with (a, b) ¨_{E⊸F} (a′, b′) if a ¨_E a′ implies both b ¨_F b′ and (b = b′ ⇒ a = a′). In that way we have turned the class of coherence spaces into a category Coh with Coh(E, F) = Cl(E ⊸ F), and Coh is enriched over pointed sets, with 0 = ∅. This category is cartesian. Given t ∈ Coh(E, F) and x ∈ Cl(E), we set t x = {b ∈ |F| | ∃a ∈ x, (a, b) ∈ t} ∈ Cl(F), and we write x₀ + x₁ for x₀ ∪ x₁ when x₀, x₁ ∈ Cl(E) are disjoint and x₀ ∪ x₁ ∈ Cl(E), so that this sum is not always defined. With these notations observe that t (x₀ + x₁) = t x₀ + t x₁ (where the right-hand side is defined as soon as the left-hand side is), explaining somehow the terminology "linear maps" for these morphisms. ◭

Definition 2.1. A pre-summability structure on L is a tuple (S, π₀, π₁, σ) where S : L → L is a functor which preserves the enrichment of L (that is, S0 = 0) and π₀, π₁ and σ are natural transformations from S to the identity functor such that, for any two morphisms f, g ∈ L(Y, SX), if π_i f = π_i g for i = 0, 1, then f = g. In other words, π₀ and π₁ are jointly monic.
◮ Example 2.2. We give a pre-summability structure on coherence spaces. Given a coherence space E, the coherence space SE is defined by SE = (1 & 1) ⊸ E, where 1 is the coherence space whose web is a chosen singleton {∗}. We shall see in Section 4 that it is often possible to define S in that particular way.
Lemma 2.2. Cl(SE) is isomorphic to the poset of all pairs (x₀, x₁) ∈ Cl(E)² such that x₀ + x₁ is defined and belongs to Cl(E), equipped with the product order.
Given s ∈ Coh(E, F), we set Ss = {((i, a), (i, b)) | i ∈ {0, 1} and (a, b) ∈ s}. Then it is easy to check that Ss ∈ Coh(SE, SF) and that S is a functor. The additional structure is defined as follows: π_i = {((i, a), a) | a ∈ |E|} ∈ Coh(SE, E) for i = 0, 1, and σ = {((i, a), a) | i ∈ {0, 1} and a ∈ |E|} ∈ Coh(SE, E). ◭

From now on we assume that we are given such a pre-summability structure. We say that f₀, f₁ ∈ L(X, Y) are summable if there is a g ∈ L(X, SY) such that π_i g = f_i for i = 0, 1. By definition of a pre-summability structure there is at most one such g; when it exists, we denote it as ⟨f₀, f₁⟩_S and we set f₀ + f₁ = σ ⟨f₀, f₁⟩_S ∈ L(X, Y). We sometimes call ⟨f₀, f₁⟩_S the witness of the summability of f₀ and f₁, and f₀ + f₁ their sum.
◮ Example 2.3. In the case of coherence spaces, saying that s₀, s₁ ∈ Coh(E, F) are summable simply means that s₀ ∩ s₁ = ∅ and s₀ ∪ s₁ ∈ Coh(E, F), and in that case the witness is defined exactly in the same way as in Lemma 2.2. ◭

Assume that f₀, f₁ ∈ L(X, Y) are summable and that g ∈ L(U, X) and h ∈ L(Y, Z). Then h f₀ g and h f₁ g are summable, with witness (Sh) ⟨f₀, f₁⟩_S g ∈ L(U, SZ) and sum h (f₀ + f₁) g ∈ L(U, Z).
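The running example can be checked mechanically on a small coherence space; the helpers below are ours, and `summable` implements exactly the condition of Example 2.3 (disjoint cliques whose union is still a clique):

```python
from itertools import combinations

# A small coherence space: web {a, b, c}, with a coherent with b, and c
# incoherent with both a and b (the relation is reflexive and symmetric).
web = {'a', 'b', 'c'}
coh = {('a', 'b'), ('b', 'a')} | {(p, p) for p in web}

def is_clique(x):
    # A clique is a set of pairwise coherent points of the web.
    return all((p, q) in coh for p, q in combinations(x, 2))

def summable(x0, x1):
    # Two cliques are summable when they are disjoint and their union
    # is again a clique; the pair (x0, x1) is then the witness.
    return x0.isdisjoint(x1) and is_clique(x0 | x1)

assert summable({'a'}, {'b'})       # {a} + {b} = {a, b} is a clique
assert not summable({'a'}, {'a'})   # not disjoint
assert not summable({'a'}, {'c'})   # {a, c} is not a clique
```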
The proof boils down to the naturality of π₀, π₁ and σ. An easy consequence is that the application of S to a morphism can be written as a witness: Sf = ⟨f π₀, f π₁⟩_S.
Now using this notion of pre-summability structure we start introducing additional conditions to define a summability structure.
Notice that this witness is an involutive iso.

(S-zero) For any f ∈ L(X, Y), the morphisms f and 0 ∈ L(X, Y) are summable and their sum is f, that is σ ⟨f, 0⟩_S = f. By (S-com) this implies that 0 and f are summable with 0 + f = f. Notice that we have four morphisms; this is an easy consequence of the fact that π₀, π₁ are jointly monic.
We can now state our last axiom.

(S-assoc) The following diagram commutes.
Let us see what this condition has to do with associativity of summation.
We define a general notion of summable family of morphisms (f_i)_{i=1}^n in L(X, Y), together with its sum f₁ + ⋯ + f_n, by induction on n. Of course we use the standard notation Σ_{i=1}^n f_i for this sum.

Lemma 2.10. If f₁, ..., f_n are summable then f₂, ..., f_n, f₁ are summable, with the same sum.

Proof. This is obvious if n ≤ 1, so we can assume n ≥ 2. By Lemma 2.9, f₂, ..., f_n are summable; moreover f₂ + ⋯ + f_n and f₁ are summable by Lemma 2.5, and hence f₂, ..., f_n, f₁ are summable (by definition) with sum equal to Σ_{i=1}^n f_i.
Proof. By our assumption, f₁, ..., f_{n−2} is summable (let us call g its sum), g, f_{n−1} are summable and g + f_{n−1}, f_n are summable. Moreover (g + f_{n−1}) + f_n = Σ_{i=1}^n f_i. It follows by Lemma 2.8 that f_{n−1}, f_n are summable and hence f_n, f_{n−1} are summable with f_n + f_{n−1} = f_{n−1} + f_n by Lemma 2.5. So we know by Lemma 2.8 that g, f_n + f_{n−1} are summable and hence, by the same lemma, that g, f_n are summable and that g + f_n, f_{n−1} are summable with (g + f_n) + f_{n−1} = Σ_{i=1}^n f_i.

Proposition 2.1. For any p ∈ S_n (the symmetric group) and any family of morphisms (f_i)_{i=1}^n, the family (f_i)_{i=1}^n is summable iff the family (f_{p(i)})_{i=1}^n is summable, and then Σ_{i=1}^n f_i = Σ_{i=1}^n f_{p(i)}.

Proof. Remember that S_n is generated by the transposition (1, ..., n−2, n, n−1) and the circular permutation (2, ..., n, 1), and apply Lemmas 2.11 and 2.10.

So we define an unordered finite family (f_i)_{i∈I} to be summable if any of its enumerations (f_{i₁}, ..., f_{i_n}) is summable, and then we set Σ_{i∈I} f_i to be the corresponding sum. Such a family is summable iff, for any family of pairwise disjoint sets (I_j)_{j∈J} such that ∪_{j∈J} I_j = I:
• the family (Σ_{i∈I_j} f_i)_{j∈J} is summable,
and then we have Σ_{i∈I} f_i = Σ_{j∈J} Σ_{i∈I_j} f_i.
Proof. By induction on k = #J ≥ 1. If k = 1 the property trivially holds, so assume k > 1. Upon choosing enumerations we can assume that I = {1, ..., n} and J = {1, ..., k}, with n, k ∈ N. Thanks to Proposition 2.1 we can choose these enumerations in such a way that I_k = {l + 1, ..., n} for some l ∈ {1, ..., n}. Then, by an iterated application of the definition of summability and of Lemma 2.8, we know that the families f₁, ..., f_l and f_{l+1}, ..., f_n are summable. We conclude the proof by applying the inductive hypothesis to (I_j)_{j=1}^{k−1}, which satisfies ∪_{j=1}^{k−1} I_j = {1, ..., l}.
Remark 2.1. These properties strongly suggest considering summability as an n-ary notion, axiomatized in an operadic way. However, in the sequel we shall see that the differential operations use SX as a space of pairs, and there it is not clear that such an operadic approach would be as convenient. This is why we stick (at least for the time being) to this "binary" axiomatization.
Another interesting consequence of (S-assoc) is that S preserves summability.
Proof. We must prove that π_i c S⟨f₀, f₁⟩_S = Sf_i. For this we use the fact that π₀, π₁ ∈ L(SY, Y) are jointly monic. We have π_j π_i c S⟨f₀, f₁⟩_S = π_i π_j S⟨f₀, f₁⟩_S = π_i ⟨f₀, f₁⟩_S π_j = f_i π_j, and we have π_j Sf_i = f_i π_j by naturality of π_j. We will use the notations ι₀ = ⟨Id, 0⟩_S ∈ L(X, SX) and ι₁ = ⟨0, Id⟩_S ∈ L(X, SX).
π_i (Sf) ⟨Id, 0⟩_S is equal to f if i = 0 and to 0 if i = 1, since f 0 = 0. On the other hand, π_i ⟨Id, 0⟩_S f is equal to f if i = 0 and to 0 if i = 1, since 0 f = 0. The naturality follows by the fact that π₀, π₁ are jointly monic.
Notice that if L has products X & Y and coproducts X ⊕ Y then we have canonical morphisms [ι₀, ι₁] ∈ L(X ⊕ X, SX) and ⟨π₀, π₁⟩ ∈ L(SX, X & X), where [ι₀, ι₁] is the co-pairing of ι₀ and ι₁, locating SX somewhere in between the coproduct and the product of X with itself. Notice that, in the case of coherence spaces, SX is in general neither the product X & X nor the coproduct X ⊕ X.
In contrast, if L has biproducts, then we necessarily have SX = X & X = X ⊕ X with obvious structural morphisms, and L is additive.Of course this is not the situation we are primarily interested in!
Similarly, using the naturality of π₁, one proves the corresponding equation; the commutations involving τ and ι₀ are proved in the same way, and the last equation results from π_i π_j c = π_j π_i.

◮ Example 2.5. In our coherence space running example, this monad structure admits an explicit description on webs. ◭

Just as in tangent categories, this monad structure will be crucial for expressing that the differential (Jacobian) is a linear morphism.

Differentiation in a summable symmetric monoidal category
Let L be a symmetric monoidal category (SMC), with monoidal product ⊗, unit 1 and structural isomorphisms (unitors ρ and λ, associator and symmetry). Most often these isos will be kept implicit to simplify the presentation. Concerning the compatibility of the summability structure with the monoidal structure, our axiom stipulates distributivity.
Assume that L is also equipped with a summability structure.We say that L is a summable SMC if the following property holds, which expresses that the tensor distributes over the (partially defined) sum.
Proof. The fact that ϕ¹ is a strength means that the following two diagrams commute; let us prove for instance the second one. The fact that (S, ι₀, τ, ϕ¹) is a commutative monad means that, moreover, the following diagram commutes, which results from a stronger property, namely that the following diagram commutes, and from Theorem 2.3. The last commutation is proved as follows. We then set L_{X,Y} ∈ L(SX ⊗ SY, S(X ⊗ Y)) to be the morphism induced by the strength and its symmetric counterpart; it is well known that in such a commutative monad situation the associated tuple (S, ι₀, τ, L) is a symmetric monoidal monad on the SMC L.
Definition 3.1.When the summability structure of the SMC L satisfies (S⊗-dist) we say that L is a summable SMC.

Differential structure
We say that a resource category L (see Section 1.3) is summable if it is summable as an SMC and satisfies the following additional condition of compatibility with the cartesian product. (S&-pres) The functor S preserves all finite cartesian products. In other words 0 ∈ L(S⊤, ⊤) and ⟨Spr 0 , Spr 1 ⟩ ∈ L(S(X 0 & X 1 ), SX 0 & SX 1 ) are isos.
A differential structure on a summable resource category L consists of a natural transformation ∂ X ∈ L(!SX, S!X) which satisfies the following conditions.
This first condition allows us to extend the functor ! to the Kleisli category L S of the monad S. In this Kleisli category, a morphism X → Y can be seen as a pair (f 0 , f 1 ) of two summable morphisms in L(X, Y ), and composition is defined by g • f = (g 0 f 0 , g 1 f 0 + g 0 f 1 ), a definition which is very reminiscent of the multiplication of dual numbers.
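This analogy with dual numbers can be made concrete. The sketch below (ours, not from the paper) takes the degenerate case where linear morphisms are 1-dimensional, i.e. scalars, so a Kleisli morphism is a pair of numbers; the composition law above then coincides with dual-number multiplication (c + dε)(a + bε) = ca + (da + cb)ε with ε² = 0.

```python
# Illustrative sketch: composition in the Kleisli category of S, with
# 1-dimensional linear maps represented as scalars.  A morphism is a
# summable pair (f0, f1); composition follows g . f = (g0 f0, g1 f0 + g0 f1).

def kleisli_compose(g, f):
    g0, g1 = g
    f0, f1 = f
    return (g0 * f0, g1 * f0 + g0 * f1)

def dual_mul(g, f):
    # multiplication of dual numbers, for comparison
    return (g[0] * f[0], g[1] * f[0] + g[0] * f[1])

f = (2.0, 3.0)   # hypothetical pair (f0, f1)
g = (5.0, 7.0)
print(kleisli_compose(g, f))  # (10.0, 29.0)
print(dual_mul(g, f))         # (10.0, 29.0)
```

Both computations agree, which is the point: the Kleisli composition of L S is the dual-number law with "multiplication of scalars" replaced by "composition of linear morphisms".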
This second condition allows us to extend the functor S to the Kleisli category L ! . We obtain in that way the functor D : L ! → L ! defined as follows: on objects, we set DX = SX, and on morphisms Df = (Sf ) ∂ X for f ∈ L ! (X, Y ). The purpose of the two commutations is precisely to make this operation functorial, and this functoriality is a categorical version of the chain rule of calculus, exactly as in tangent categories since, as we shall see, this functor D essentially computes the derivative of f . Remark 3.1. It is very likely that the natural transformation ∂ X can be seen as one of the six kinds of distributive laws between the monad S and the comonad ! described in [PW02], Section 8.
Proof. This is an easy consequence of the naturality of ∂ and of the definition of weak X and contr X , which are based on the cartesian products and on the Seely isomorphisms.
This diagram involves the canonical flip c introduced before the statement of (S-assoc) and expresses a kind of commutativity of the second derivative. Definition 3.2. A differentiation in a summable resource category L is a natural transformation ∂ X ∈ L(!SX, S!X) which satisfies (∂-local), (∂-lin), (∂-chain), (∂-&) and (∂-Schwarz). A summable resource category given together with a differentiation is a differential summable resource category.

Derivatives and partial derivatives in the Kleisli category
The Kleisli category L ! of the comonad (!, der, dig) is well known to be cartesian. In general it is not a differential cartesian category in the sense of [AL20] because it is not required to be additive. Our running example of coherence spaces is an example of such a category which is not a differential category.
There is an inclusion functor Lin ! : L → L ! which maps X to X and f ∈ L(X, Y ) to f der X ∈ L ! (X, Y ); it is faithful but not full in general and allows us to see any morphism of L as a "linear morphism" of L ! .
We have already mentioned the functor D : L ! → L ! ; remember that DX = SX and Df = (Sf ) ∂ X when f ∈ L ! (X, Y ). Then we have D • Lin ! = Lin ! • S, which allows us to extend simply the monad structure of S to D. Theorem 3.3. The morphisms ζ X ∈ L ! (X, DX) and θ X ∈ L ! (D 2 X, DX) are natural and turn the functor D into a monad on L ! .
Proof. The only non-obvious property is naturality, the monadic diagram commutations resulting from those of (S, ι 0 , σ) on L and from the functoriality of Lin ! . The proof can certainly be adapted from [PW02]; we provide it for convenience. Let f ∈ L ! (X, Y ), that is f ∈ L(!X, Y ). We must first prove that the required naturality square commutes. Since S preserves cartesian products, we can easily equip this monad (D, ζ, θ) on L ! with a commutative strength where η = ⟨Spr 0 , Spr 1 ⟩ −1 is the canonical iso of (S&-pres). It is possible to prove the following commutation in L, relating the strength of S (w.r.t. ⊗) with the strength of D (w.r.t. &) through the Seely isomorphisms defined from ψ 1 using the symmetry of &.

Deciphering the diagrams
After this rather terse list of categorical axioms, it is fair to provide the reader with intuitions about their meaning; this is the purpose of this section.
One should think of the objects of L as partial commutative monoids (with additional structure depending on the considered category), and of SX as the object of pairs (x, u) of elements x, u ∈ X such that x + u ∈ X is defined. The morphisms in L are linear in the sense that they preserve 0 and these partially defined sums, whereas the morphisms of L ! should be thought of as functions which are not linear but admit a "derivative". More precisely f ∈ L ! (X, Y ) can be seen as a function X → Y and, given (x, u) ∈ SX, we have Df (x, u) = (f (x), df (x) dx • u) where df (x) dx • u is just a notation for the second component of the pair Df (x, u) which, by construction, is such that the sum f (x) + df (x) dx • u is a well defined element of Y . Now we assume that this derivative df (x) dx • u obeys the standard rules of differential calculus and we shall see that the above axioms about ∂ correspond to these rules.
Remark 3.2. The equations we are using in this section as intuitive justifications for the diagrams of Section 3.1 refer to the standard laws and properties of the differential calculus that we assume the reader to be acquainted with. They do hold exactly as written here in the model Pcoh, where derivatives are computed exactly as in Calculus, as we will show in a forthcoming paper.
Remark 3.3. We use the well-established notation df (x) dx • u which must be understood properly: in particular the expression df (x) dx • u is a function of x (the point where the derivative is computed) and of u (the linear parameter of the derivative). When required we use df (x) dx (x 0 ) • u for the evaluation of this derivative at a point x 0 ∈ X.
• (∂-local) means that the first component of Df (x, u) is f (x), justifying our intuitive notation above. Notice that it prevents differentiation from being trivial: one cannot simply set df (x) dx • u = 0 for all f and all x, u.
• (∂-chain) means that d(g(f (x))) dx • u = dg(y) dy (f (x)) • ( df (x) dx • u), which is exactly the chain rule.
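The functorial reading of the chain rule, D(g ∘ f) = Dg ∘ Df with Df(x, u) = (f(x), df(x)/dx • u), can be checked concretely with forward-mode dual numbers. The sketch below is our own illustration (polynomial maps on the reals, not the paper's categories); the class `Dual` and the lifting `D` are hypothetical names.

```python
# Sketch: Df(x, u) = (f(x), f'(x) * u) via dual numbers; functoriality of D
# is the chain rule D(g o f) = Dg o Df.

class Dual:
    def __init__(self, x, u=0.0):
        self.x, self.u = x, u
    def __add__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.x + o.x, self.u + o.u)
    __radd__ = __add__
    def __mul__(self, o):
        o = o if isinstance(o, Dual) else Dual(o)
        return Dual(self.x * o.x, self.x * o.u + self.u * o.x)
    __rmul__ = __mul__

def D(f):
    """Lift f to pairs: (x, u) |-> (f(x), f'(x) * u)."""
    def Df(xu):
        d = f(Dual(*xu))
        return (d.x, d.u)
    return Df

f = lambda x: x * x          # f'(x) = 2x
g = lambda y: y * y * y      # g'(y) = 3y^2

x, u = 2.0, 1.0
lhs = D(lambda t: g(f(t)))((x, u))   # D(g o f)
rhs = D(g)(D(f)((x, u)))             # Dg o Df
print(lhs, rhs)                      # both (64.0, 192.0)
```

Here (g ∘ f)(x) = x⁶, so the tangent component 6x⁵u = 192 agrees on both sides, as functoriality demands.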
• The "second derivative" D 2 f can be computed by applying the standard rules of differential calculus, using the fact that f (x) does not depend on u and that df (x) dx • u is linear in u. We have used (∂-lin) to prove Theorem 3.3, whose main content is the naturality of ζ and θ. The naturality of θ means that the derivative commutes with the (partially defined) sums in its second parameter, and similarly the naturality of ζ means that df (x) dx • 0 = 0. So the condition (∂-lin) means that the derivative is a function which is linear with respect to its second parameter.
• We have assumed that L is cartesian and hence L ! is also cartesian. Intuitively X 0 & X 1 is the space of pairs (x 0 , x 1 ) with x i ∈ X i , and our assumption (S&-pres) means that S(X 0 & X 1 ) is the space of pairs ((x 0 , x 1 ), (u 0 , u 1 )) such that (x i , u i ) ∈ SX i , with sum (x 0 + u 0 , x 1 + u 1 ). This can be seen by computing π 1 Df using that diagram, the two components of these sums corresponding to the two partial derivatives, see Section 3.2.
Then Theorem 3.2 means that df (x, x) dx • u is the sum of the two partial derivatives of f evaluated at (x, x) along u, which is the essence of the Leibniz rule of Calculus.
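This diagonal reading of the Leibniz rule can be checked on the simplest bilinear example, f(x₀, x₁) = x₀·x₁. The sketch below is our own illustration: a first-order "jet" (x, u) carries a point and a tangent, `mul` implements d(x₀x₁) = x₁dx₀ + x₀dx₁, and the total derivative along the diagonal is the sum of the two partial derivatives obtained by freezing one argument at a time.

```python
# Sketch: the Leibniz rule as the diagonal case of partial derivatives.

def mul(a, b):                       # product of two jets (x, u)
    return (a[0] * b[0], a[0] * b[1] + a[1] * b[0])

x, u = 3.0, 1.0
# total derivative of x |-> x * x along u:
total = mul((x, u), (x, u))          # (x^2, 2*x*u)
# partial derivatives: vary one argument at a time
d0 = mul((x, u), (x, 0.0))           # (x^2, x*u)  i.e. d/dx0
d1 = mul((x, 0.0), (x, u))           # (x^2, x*u)  i.e. d/dx1
assert total[1] == d0[1] + d1[1]     # df(x,x)/dx . u = sum of partials
print(total, d0, d1)                 # (9.0, 6.0) (9.0, 3.0) (9.0, 3.0)
```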
• The object S 2 X consists of pairs ((x, u), (x ′ , u ′ )) such that x, u, x ′ and u ′ are globally summable.
Then c ∈ L(S 2 X, S 2 X) maps ((x, u), (x ′ , u ′ )) to ((x, x ′ ), (u, u ′ )). Therefore, using the same computation of D 2 f ((x, u), (x ′ , u ′ )) as in the case of (∂-lin), we see that (∂-Schwarz) expresses that d 2 f (x) dx 2 • (u, u ′ ) = d 2 f (x) dx 2 • (u ′ , u). So this diagram means that the second derivative (aka. Hessian) is a symmetric bilinear function, a property of sufficiently regular differentiable functions often referred to as Schwarz's Theorem.

A differentiation in coherence spaces
Now we exhibit such a differentiation in Coh. We define !E as follows: |!E| is the set of finite multisets m of elements of |E| such that supp(m) ∈ Cl(E) (such an m is called a finite multiclique). Given s ∈ Coh(E, F ) we define !s elementwise on multicliques; it actually belongs to Cl(!E ⊸ !F ) because s ∈ Cl(E ⊸ F ). The comonad structure of this functor and the associated commutative comonoid structure are given by the usual multiset operations. Composition in Coh ! can be described directly as follows. The functions f : Cl(E) → Cl(F ) definable in that way are exactly the stable functions: f is stable if for any x ∈ Cl(E) and any b ∈ f (x) there is exactly one minimal subset x 0 of x such that b ∈ f (x 0 ), and moreover this x 0 is finite. When moreover this x 0 is always a singleton, f is said to be linear, and such linear functions are in bijection with Coh(E, F ) (given t ∈ Coh(E, F ), the associated linear function Cl(E) → Cl(F ) is the map x → t • x).
Notice that for a given stable function f : Cl(E) → Cl(F ) there can be infinitely many s ∈ Coh ! (E, F ) inducing f , since the definition of the induced function does not take into account the multiplicities in the multisets m such that (m, b) ∈ s. With this identification we define ∂ E ⊆ |!SE ⊸ S!E| as in Equation (2). We think it useful to check directly that ∂ E ∈ Coh(!SE, S!E), although this check is not necessary since we shall see in Section 4.6 that this property results from a much simpler one. Let ((m j0 , m j1 ), (i j , m j )) ∈ ∂ E for j = 0, 1 and assume that (m 00 , m 01 ) ¨!SE (m 10 , m 11 ) . (3) By symmetry, there are 3 cases to consider.
• Last assume that i 0 = 1 and i 1 = 0. So we have m 01 = [a] with a ∉ supp(m 00 ) and m 0 = m 00 + [a]; m 11 = [ ] and m 1 = m 10 . By (3) we know that supp(m 0 + m 1 ) ∈ Cl(E). Coming back to the definition of the coherence in SF (for a coherence space F ), we must also prove that m 0 ≠ m 1 : this results from (3), which entails that a ∉ supp(m 1 ) = supp(m 10 ) whereas we know that a ∈ supp(m 0 ).
We postpone the proofs of the other commutations, as they will be reduced in Section 4.6 to much simpler properties, because Coh is canonically summable. Given x ∈ Cl(E), we can define a coherence space E x (the local sub-coherence space at x). Remark 3.5. This shows in particular that df (x) dx ∈ Coh(E x , F s(x) ) since df (x) dx = π 1 • g • ι 1 , and also that this derivative is stable with respect to the point x where it is computed; thus differentiation of stable functions can be iterated. However Remark 3.4 indicates a peculiarity of this derivative which has as consequence that the morphisms in Coh ! do not coincide with the Taylor expansion that one can define by iterating this derivative (the expansion of s is s whereas the expansion of s ′ is ∅). This is an effect of the uniformity of the construction !E, that is, of the fact that for m ∈ M fin (|E|) to be in |!E|, it is required that supp(m) be a clique. This can be remedied, without breaking the main feature of our construction, namely that it is compatible with the determinism of the model, by using non-uniform coherence spaces instead, where |!E| = M fin (|E|) [BE01,Bou11], see Section 5.1. In some sense, stable functions on Girard's coherence spaces are smooth but not analytic.

Canonically summable categories
The concept of summable category applies typically to models of Linear Logic in the sense of Seely (see [Mel09]): such a model is based on an SMC L whose morphisms are intuitively considered as linear, and the summability structure makes this linearity more explicit. In the models we primarily want to apply our theory to, typically (probabilistic) coherence spaces, the summability structure boils down to a more basic structure which is always present in such a model: the functor S is defined on objects by SX = (1 & 1 ⊸ X), and similarly for morphisms. A priori, given a categorical model of LL L, this functor does not necessarily define a summability structure. The purpose of this section is to examine under which conditions this is the case, and to express the differential structure introduced above in this particular and important setting.
Let L be a cartesian SMC where the object I = 1 & 1 is exponentiable, that is, the functor S I : X → X ⊗ I has a right adjoint S I : X → (I ⊸ X). We use ev ∈ L((I ⊸ X) ⊗ I, X) for the corresponding evaluation morphism and, given f ∈ L(Y ⊗ I, X), we use cur f for the associated Curry transpose of f which satisfies cur f ∈ L(Y, I ⊸ X). Being a right adjoint, S I preserves all limits existing in L (and in particular the cartesian product).
We shall use the construction provided by the following lemma.
Lemma 4.1. Let ϕ ∈ L(1, I). For any object X of L let nt(ϕ) X ∈ L(I ⊸ X, X) be the following composition of morphisms. Proof. Naturality results from the naturality of ρ and the functoriality of I ⊸ _. Let us prove the second part of the lemma. For i = 0, 1 we have a morphism π i ∈ L(1, I) given by π 0 = ⟨Id 1 , 0⟩ and π 1 = ⟨0, Id 1 ⟩. We also have a diagonal morphism ∆ = ⟨Id 1 , Id 1 ⟩ ∈ L(1, I). Using these we define the following natural transformations S I X → X. Definition 4.1. The category L is canonically summable if (S I , π 0 , π 1 , σ) is a summability structure.
Remark 4.1. Canonical summability is a property of L and not an additional structure, which is however defined in a rather implicit manner. We exhibit three elementary conditions that are necessary and sufficient for guaranteeing canonical summability.
Proof. Assume that X ⊗ π 0 , X ⊗ π 1 are jointly epic and let f j ∈ L(X, S I Y ) for j = 0, 1 be such that π i f 0 = π i f 1 for i = 0, 1, so that f j = cur f ′ j for some f ′ j , j = 0, 1. We have, by Lemma 4.1,
So we have f ′ 0 = f ′ 1 by our assumption on the π j 's, and hence f 0 = f 1 . Assume conversely that π 0 , π 1 are jointly monic and let f 0 , f 1 ∈ L(X ⊗ I, Y ) be such that f 0 (X ⊗ π i ) = f 1 (X ⊗ π i ) for i = 0, 1. By Lemma 4.1 again we have f j (X ⊗ π i ) = π i (cur f j ) ρ X , hence cur f 0 = cur f 1 , and hence f 0 = f 1 , which proves that X ⊗ π 0 , X ⊗ π 1 are jointly epic.
⊲ (S-com). Let f = cur g ∈ L(S I X, S I X) where g is the following composition of morphisms. ⊲ (S-assoc). We define c X ∈ L(S 2 I X, S 2 I X) by c X = cur(cur(ev (ev ⊗ I) (S 2 I X ⊗ γ I,I ))), where the transposed morphism is typed as follows.
A computation similar to the previous ones shows that π i π j c = π j π i as required. ⊲ (S⊗-dist). Let (f 00 , f 01 ) be a summable pair of morphisms in L(X 0 , Y 0 ), so that we have the witness ⟨f 00 , f 01 ⟩ S ∈ L(X 0 , S I Y 0 ), and let f 1 ∈ L(X 1 , Y 1 ), where h ′ is the following composition of morphisms. This shows that f 00 ⊗ f 1 and f 01 ⊗ f 1 are summable, with the expected sum, by a similar computation. There are cartesian SMCs where I is exponentiable and which are not canonically summable. The category Set 0 provides probably the simplest example of that situation. ◮ Example 4.1. We refer to Section 1.2. We have the functor S I : Set 0 → Set 0 defined by S I X = (I ⊸ X). An element of S I X is a function z : {0, ∗} 2 → X such that z(0, 0) = 0. The projections π i : S I X → X are characterized by π 0 (z) = z(∗, 0) and π 1 (z) = z(0, ∗), so ⟨π 0 , π 1 ⟩ is not injective since ⟨π 0 , π 1 ⟩(z) = (z(∗, 0), z(0, ∗)) does not depend on z(∗, ∗), which can take any value. So (S I , π 0 , π 1 , σ) is not even a pre-summability structure in Set 0 . This failure of injectivity is due to the fact that I lacks an addition which would satisfy (∗, 0) + (0, ∗) = (∗, ∗) and which, preserved by z, would enforce injectivity. ◭ There are also cartesian SMCs where I is exponentiable and where (CS-epi) holds, but where (S I , π 0 , π 1 , σ) does not satisfy (S-witness).
◮ Example 4.2. Let B be the category whose objects are the finite dimensional real Banach spaces. By this we mean pairs (V, ‖·‖ V ) where V is a finite dimensional real vector space and ‖·‖ V is a norm on V . In B, a morphism f : V → W is a linear map such that ∀v ∈ V ‖f (v)‖ W ≤ ‖v‖ V . This category is a cartesian symmetric monoidal closed category with U ⊸ V defined as the space of all linear maps f : U → V . Indeed, since we consider only finite dimensional spaces, all linear maps are continuous (for the product topology induced by any choice of basis, which is the same as the one induced by the norm) and hence bounded. The tensor product classifies bilinear maps (with norm defined by sups as for linear maps) and satisfies ‖u ⊗ v‖ U⊗V = ‖u‖ U ‖v‖ V for all u ∈ U and v ∈ V . The unit of this tensor product is 1 = R with ‖u‖ 1 = |u|. The cartesian product is the standard direct product of vector spaces with ‖(u, v)‖ U&V = max(‖u‖ U , ‖v‖ V ). Notice that there is also a coproduct U ⊕ V , with the same underlying vector space and ‖(u, v)‖ U⊕V = ‖u‖ U + ‖v‖ V . The functor S I is defined as above; the natural transformations π i are the obvious projections and σ(u 0 , u 1 ) = u 0 + u 1 . Then, in 1 = R: • −1/2 and 1/2 are summable in 1 with −1/2 + 1/2 = 0, and 0 and 1 are summable in 1, • but 1/2 and 1 are not summable in 1.
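Under our reading of this example (an assumption, not stated verbatim in the text): a morphism 1 → 1 in B is a scalar f with |f| ≤ 1, and a pair (f 0, f 1) is summable exactly when a witness in I ⊸ R exists, i.e. when |f 0| + |f 1| ≤ 1, since the sum norm is dual to the max norm on I = 1 & 1. A one-line check reproduces the three bullet points:

```python
# Sketch (our reading of the example): summability of scalars in B(1, 1).

def summable(f0, f1):
    # witness in I -o R exists iff the sum of absolute values is <= 1
    return abs(f0) + abs(f1) <= 1.0

print(summable(-0.5, 0.5))  # True:  -1/2 + 1/2 = 0
print(summable(0.0, 1.0))   # True:  0 and 1 are summable
print(summable(0.5, 1.0))   # False: 1/2 and 1 are not summable
```

This makes the "positivity" phenomenon visible: 1/2 + 1 is a perfectly good morphism pointwise in absolute value 3/2 > 1, so no witness fits in the unit ball.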
So B is not canonically summable. ◭ This example shows that the condition (S-witness) cannot be disposed of, and speaks not only of associativity of the partial sum, but also of some kind of "positivity" of morphisms in L.

The comonoid structure of I
We assume that L is a canonically summable cartesian SMC. The morphisms π 0 , π 1 ∈ L(1, I) are summable with π 0 + π 1 = ∆, with witness Id ∈ L(I, I). As a consequence of (S⊗-dist) we obtain a morphism L ∈ L(I, I ⊗ I). Theorem 4.2. Equipped with pr 0 ∈ L(I, 1) as counit and L ∈ L(I, I ⊗ I) as comultiplication, I is a cocommutative comonoid in the SMC L.
Proof. To prove the required commutations, we use (CS-epi). Here are two examples of these computations.

Strong monad structure of S I
Therefore S I has a canonical comonad structure given by ρ (X ⊗ pr 0 ) ∈ L(S I X, X) and α (X ⊗ L) ∈ L(S I X, S 2 I X). Through the adjunction S I ⊣ S I the functor S I inherits a monad structure which is exactly the same as the monad structure of Section 2.1. This monad structure (ι 0 , τ ) can be described via the Curry transposes of the following morphisms (the monoidality isos are implicit).
This gives the strength of S I (the same as the one defined in the general setting of Section 3). We have seen in Section 3 that, equipped with this strength, S I is a commutative monad, and recalled that there is therefore an associated lax monoidality L X0,X1 ∈ L(S I X 0 ⊗ S I X 1 , S I (X 0 ⊗ X 1 )) which can be seen as arising from L by transposing the following morphism (again we keep the monoidal isos implicit).

Canonically summable SMCC
In an SMCC, the conditions of Theorem 4.1 admit a slightly simpler formulation.
The functor S I defined by S I E = (I ⊸ E) (and similarly for morphisms) coincides exactly with the functor S described in Example 2.2. Therefore the associated summability is the one described in Example 2.3. Let s 0 , s 1 ∈ Coh(I, E) with corresponding t 0 , t 1 ∈ Coh(1, E). Assume that t 0 and t 1 are summable, that is t 0 ∩ t 1 = ∅ and t 0 ∪ t 1 ∈ Coh(1, E); we must prove that s 0 ∩ s 1 = ∅ and s 0 ∪ s 1 ∈ Coh(I, E). Let (j i , a i ) ∈ s i for i = 0, 1. We have (∗, a i ) ∈ t i and hence a 0 ≠ a 1 , from which it follows that (j 0 , a 0 ) ≠ (j 1 , a 1 ). Moreover, since t 0 ∪ t 1 ∈ Coh(1, E) we have a 0 ¨E a 1 , so that (j 0 , a 0 ) and (j 1 , a 1 ) are coherent in I ⊸ E and s 0 ∪ s 1 ∈ Coh(I, E). Hence s 0 and s 1 are summable. ◭
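The criterion just used can be stated concretely: in Coh, two cliques are summable exactly when they are disjoint and their union is again a clique. A small sketch (ours; the three-token space is a hypothetical example):

```python
# Illustration: summability of cliques in a coherence space.

def is_clique(coh, xs):
    return all(coh(a, b) for a in xs for b in xs)

def summable_cliques(coh, t0, t1):
    return not (t0 & t1) and is_clique(coh, t0 | t1)

# hypothetical E with web {a, b, c}: a coherent with b, c coherent
# with nothing but itself
coh = lambda p, q: p == q or {p, q} == {"a", "b"}
print(summable_cliques(coh, {"a"}, {"b"}))  # True
print(summable_cliques(coh, {"a"}, {"c"}))  # False: union not a clique
print(summable_cliques(coh, {"a"}, {"a"}))  # False: not disjoint
```

The last case shows why the sum is only partial: a clique is never summable with itself, which is precisely what keeps the model free of the additive structure of differential LL.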

Differentiation in a canonically summable category
Let L be a resource category (see the beginning of Section 3.1) which is canonically summable. Doubtless the following lemma is a piece of categorical folklore; it relies only on the adjunction S I ⊣ S I and on the functoriality of !. Let η X ∈ L(X, S I S I X) and ε X ∈ L(S I S I X, X) be the unit and counit of this adjunction. Let ϕ X ∈ L(!S I X, S I !X) be a natural transformation; then we define a natural transformation ϕ − X ∈ L(S I !X, !S I X) as the following composition of morphisms. Conversely, given a natural transformation ψ X ∈ L(S I !X, !S I X), we define a natural transformation ψ + X ∈ L(!S I X, S I !X) as the following composition of morphisms !S I X → S I S I !S I X → S I !S I S I X → S I !X. Proof. Simple computation using the basic properties of adjunctions and the naturality of the various morphisms involved.
Lemma 4.4. Let ∂ X ∈ L(S I !X, !S I X) be a natural transformation. The associated natural transformation ∂ + X ∈ L(!S I X, S I !X) satisfies (∂-chain) iff the two following diagrams commute; in other words ∂ X is a co-distributive law S I !X → !S I X. These conditions will be called (C∂-chain).
Proof. The proof consists of computations using naturality and the adjunction properties. As an example, assume the second commutation and let us prove the second diagram of (∂-chain). The other computations are similar.
Let ∂ X ∈ L(!X ⊗ I, !(X ⊗ I)) satisfying (C∂-chain). We introduce additional conditions. We keep implicit some of the monoidal isos associated with ⊗ to increase readability.
Let ∂ X ∈ L(!X ⊗ I, !(X ⊗ I)) be a natural transformation. The two following conditions are equivalent.
• ∂ satisfies the conditions of Definition 4.2; • ∂ + is a differentiation in (L, S I ) (in the sense of Definition 3.2). We show now that this differential structure boils down to a much simpler one.
4.5 A !-coalgebra structure on I induced by a canonical differential structure

Let ∂ X ∈ L(!X ⊗ I, !(X ⊗ I)) be a natural transformation which satisfies the conditions of Definition 4.2.
The next result will be technically useful in the sequel and has also its own interest as it deals with differentiation with respect to a tensor product, showing essentially that it boils down to differentiation with respect to one of the components of the tensor product.
Theorem 4.5.The following diagram commutes, for all objects X 0 , X 1 of L.
The morphism µ 2 X0,X1 is defined as the following composition of morphisms. We start by rewriting the right-hand expression, introducing notations f 0 , f 1 , . . . for subexpressions. By Lemma 4.5 we have a first equation, where q 0 = ρ X0 (pr 0 ⊗ pr 0 ); then we use the naturality of dig, next (C∂-chain), and, by Lemma 4.5 again (applied under the functor !), we conclude by definition of µ 2 X0,X1 . We define δ ∈ L(I, !I) as the following composition of morphisms. Proof. The first equality holds by Theorem 4.5. We obtain the announced equation from µ 2 X,1 (!X ⊗ µ 0 ) = !(ρ X ) −1 ρ !X , the naturality of ∂ and the properties of ρ. Theorem 4.6. The morphism δ is a !-coalgebra structure on I. Moreover the following commutations hold: (∂ca-local) and (∂ca-lin). Proof. We have the required commutations, using the fact that (1, µ 0 ) is a !-coalgebra. We have proven that (I, δ) is a !-coalgebra.
4.6 From a coalgebra structure on I to a canonical differential structure.
Assume now conversely that L is a canonically summable resource category where I is exponentiable and that we have a morphism δ ∈ L(I, !I).Then we can define a morphism ∂ X ∈ L(!X ⊗ I, !(X ⊗ I)) as the following composition of morphisms.
This morphism is natural in X by the naturality of µ 2 . Proof. ⊲ (C∂-chain). We have the required commutation by the properties of the lax monoidality (µ 2 , µ 0 ). ⊲ (C∂-lin). We have the corresponding commutation, which ends the proof of the theorem.
We can summarize the results obtained in this section as follows.
Theorem 4.8. Let L be a resource category which is canonically summable. Then there is a bijective correspondence between • the differential structures (∂ X ) X∈L on the canonical summability structure (S I , π 0 , π 1 , σ) of L, • and the !-coalgebra structures δ on I which satisfy (∂ca-local) and (∂ca-lin).
When the second condition holds, the associated differentiation ∂ X ∈ L(!S I X, S I !X) is cur d, where d is the following composition of morphisms.
Remark 4.2. This correspondence can certainly be made functorial; this is postponed to further work.
Theorem 4.9. If L is a Lafont resource category which is canonically summable then there is exactly one differential structure on the canonical summability structure of L.
Proof. Since (I, pr 0 , L) is a commutative comonoid, we know by Lemma 1.4 that there is exactly one morphism δ ∈ L(I, !I) such that the following diagrams commute.

5 The differential structure of coherence spaces

Equipped with the multiset exponential introduced in Section 3.4, it is well known that Coh is a Lafont resource category, as observed initially by Van de Wiele (unpublished, see [Mel09]). Since Coh is canonically summable, we already know that it has a unique differential structure by Theorem 4.9. We will show that we retrieve in that way the differential structure outlined in Section 3.4. Remember that I = 1 & 1, so that |I| = {0, 1} with i ¨I i ′ for all i, i ′ ∈ {0, 1}. The comonoid structure of I = 1 & 1 is given by pr 0 = {(0, ∗)} ∈ Coh(I, 1) and L = {(0, (0, 0)), (1, (1, 0)), (1, (0, 1))} ∈ Coh(I, I ⊗ I). The n-ary comultiplication of this comonoid is L (n) ∈ Coh(I, I ⊗n ). The unique δ ∈ Coh(I, !I) specified by Theorem 4.9 can then be described explicitly. The proviso that a ∉ {a 1 , . . . , a n } arises from uniformity of the exponential. Finally, upon identifying |!I ⊸ E| suitably, we retrieve exactly the definition announced in Equation (2). The fact that this is a natural transformation satisfying all the commutations required to turn Coh into a differential summable category results from Theorem 2 and Theorem 4.4.

Differentiation in non-uniform coherence spaces
In Remark 3.5 we have pointed out that the uniform definition of !E in coherence spaces makes our differentials "too thin" in general, although they are non-trivial and satisfy all the required rules of the differential calculus. We show briefly how this situation can be remedied using non-uniform coherence spaces.
A non-uniform coherence space (NUCS) is a triple E = (|E|, ˝E , ˇE ) where |E| is a set and ˝E and ˇE are two disjoint binary symmetric relations on |E| called strict coherence and strict incoherence. The important point of this definition is not what is written but what is not: contrary to usual coherence spaces, we do not require the complement of the union of these two relations to be the diagonal: it can be any (symmetric, of course) binary relation on |E|, which we call neutrality and denote as ≡ E (warning: it need not even be an equivalence relation!). Then we define coherence as ¨E = ˝E ∪ ≡ E and incoherence as ˚E = ˇE ∪ ≡ E , and any pair of relations among these 5 (with suitable relations between them such as ≡ E ⊆ ˚E ), apart from the trivially complementary ones (ˇE , ¨E ) and (˝E , ˚E ), is sufficient to define such a structure.
Cliques are defined as usual, and (Cl(E), ⊆) is a cpo (a dI-domain actually), but now there can be some a ∈ |E| such that a ˇE a, and hence {a} ∉ Cl(E) (we show below that this really happens). Given NUCS E and F we define E ⊸ F , and then a category NCoh by NCoh(E, F ) = Cl(E ⊸ F ), taking the diagonal relations as identities and ordinary composition of relations as composition of morphisms. This is a cartesian SMCC with tensor product given by |E 0 ⊗ E 1 | = |E 0 | × |E 1 | and (a 00 , a 01 ) ¨E0⊗E1 (a 10 , a 11 ) if a 0j ¨Ej a 1j for j = 0, 1, and ≡ E0⊗E1 is defined similarly; the unit is 1 with |1| = {∗} and ∗ ≡ 1 ∗ (so that 1 ⊥ = 1, meaning that the model satisfies a strong form of the MIX rule of LL). The object of linear morphisms from E to F is of course E ⊸ F and NCoh is ∗-autonomous with 1 as dualizing object. The dual E ⊥ is given by |E ⊥ | = |E|, ˝E⊥ = ˇE and ˇE⊥ = ˝E . The cartesian product & i∈I E i of a family (E i ) i∈I of NUCS is given by |& i∈I E i | = ∪ i∈I {i} × |E i | with (i 0 , a 0 ) ≡ &i∈I Ei (i 1 , a 1 ) if i 0 = i 1 = i and a 0 ≡ Ei a 1 , and (i 0 , a 0 ) ¨&i∈I Ei (i 1 , a 1 ) if i 0 = i 1 = i ⇒ a 0 ¨Ei a 1 . We do not give the definition of the operations on morphisms as they are the most obvious ones (the projections of the product are the relations pr i = {((i, a), a) | i ∈ I and a ∈ |E i |}). Notice that in the object Bool = 1 ⊕ 1 = (1 & 1) ⊥ , the two elements 0, 1 of the web satisfy 0 ˇBool 1, so that {0, 1} ∉ Cl(Bool), which is expected in a model of deterministic computations.
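Since coherence is everything except strict incoherence (¨E is the complement of ˇE), a clique check in a NUCS only needs the strict incoherence relation. The sketch below is our own encoding (unordered pairs as frozensets, so a token a with a ˇE a is represented by the singleton {a}); it reproduces the Bool example just described.

```python
# Sketch of the NUCS clique check: only strict incoherence matters.

def is_clique(strict_incoh, xs):
    return all(frozenset((a, b)) not in strict_incoh
               for a in xs for b in xs)

# Bool = 1 (+) 1: web {0, 1} with 0 strictly incoherent with 1
bool_incoh = {frozenset((0, 1))}
print(is_clique(bool_incoh, {0}))     # True
print(is_clique(bool_incoh, {1}))     # True
print(is_clique(bool_incoh, {0, 1}))  # False: no superposition of values
```

Adding a singleton frozenset({a}) to the incoherence set models a token with a ˇE a, whose singleton {a} is then not a clique, exhibiting the phenomenon mentioned above.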
The comonoid structure (pr 0 , L) is exactly the same as in Coh and therefore the morphism δ ∈ NCoh(I, !I) (whose existence and properties result from the fact that NCoh is Lafont) is defined exactly as in Coh. The functor S I can be described as follows: |S I E| = {0, 1} × |E| and (i 0 , a 0 ) ≡ S I E (i 1 , a 1 ) if i 0 = i 1 and a 0 ≡ E a 1 , and (i 0 , a 0 ) ¨SI E (i 1 , a 1 ) if (a 0 ¨E a 1 and a 0 ≡ E a 1 ⇒ i 0 = i 1 ). Given s ∈ L(E, F ) we have S I s = {((i, a), (i, b)) | i ∈ {0, 1} and (a, b) ∈ s}. By the same computation as in Coh (but now without the uniformity restrictions of Coh), we obtain a morphism in NCoh(!S I E, S I !E) which satisfies all the required properties by Theorem 2 and Theorem 4.4.
Remark 5.1. This means that the issue with Girard's uniform coherence spaces with respect to differentiation, which we explained in Remarks 3.4 and 3.5, disappears in the non-uniform coherence space setting, at least if we use Boudes' exponentials, so that any morphism coincides with its Taylor expansion in this model. This non-uniform model preserves the main feature of coherence spaces, namely that in the type Bool, for instance, the only possible values are true and false (and not the non-deterministic superposition of these values), as we have seen above with the description of 1 ⊕ 1.
Remark 5.2. The category Rel of sets and relations, being a model of differential linear logic, is a special case of summable differential resource category. That model is actually exactly the same as NCoh where objects are stripped of their coherence structure: the logical constructs in Rel coincide with the constructs we perform on the webs of the objects of NCoh. For instance, given a set X, the object !X in Rel is simply M fin (X). And similarly for the operations on morphisms: as constructions on relations, they are exactly the same as in NCoh. This identification extends even to ∂ X . So one of the outcomes of this paper is the fact that the constructions of differential linear logic in Rel are compatible with the coherence structure of NCoh, if we are careful enough with morphism addition. It is the whole point of our categorical axiomatization to explain what this carefulness means.

Summability in a SMCC
Assume now that L is a summable resource category which is closed with respect to its monoidal product ⊗, so that L ! is cartesian closed. We use X ⊸ Y for the internal hom object and ev ∈ L((X ⊸ Y ) ⊗ X, Y ) for the evaluation morphism. If f ∈ L(Z ⊗ X, Y ) we use cur f for its transpose in L(Z, X ⊸ Y ).
We can define a natural morphism ϕ ⊸ ∈ L(S(X ⊸ Y ), X ⊸ SY ). Proof. The first two equations come from the fact that π i ϕ 0 = π i ⊗ X. The last one results from Lemma 3.1.
Then we introduce a further axiom, required in the case of an SMCC. Its intuitive meaning is that two morphisms f 0 , f 1 are summable if they map any element to a pair of summable elements, and that their sum is computed pointwise. (S⊗-fun) The morphism ϕ ⊸ is an iso. Lemma 6.2. If (S⊗-fun) holds then f 0 , f 1 ∈ L(Z ⊗ X, Y ) are summable iff cur f 0 and cur f 1 are summable. Moreover when this property holds we have cur(f 0 + f 1 ) = cur f 0 + cur f 1 .
Proof. Assume that f 0 , f 1 are summable, so that we have the witness ⟨f 0 , f 1 ⟩ S ∈ L(Z ⊗ X, SY ) and hence cur⟨f 0 , f 1 ⟩ S ∈ L(Z, X ⊸ SY ); so let h = (ϕ ⊸ ) −1 cur⟨f 0 , f 1 ⟩ S ∈ L(Z, S(X ⊸ Y )). By Lemma 6.1 we have π i h = (X ⊸ π i ) cur⟨f 0 , f 1 ⟩ S = cur f i for i = 0, 1. Conversely, if cur f 0 , cur f 1 are summable we have the witness ⟨cur f 0 , cur f 1 ⟩ S ∈ L(Z, S(X ⊸ Y )) and hence ϕ ⊸ ⟨cur f 0 , cur f 1 ⟩ S ∈ L(Z, X ⊸ SY ), so that g = ev((ϕ ⊸ ⟨cur f 0 , cur f 1 ⟩ S ) ⊗ X) ∈ L(Z ⊗ X, SY ). Then by naturality of ev and by Lemma 6.1 we get π i g = f i for i = 0, 1 and hence f 0 , f 1 are summable.
Proof. In this case, we know from Section 4.2 that ϕ ⊸ is the double transpose of the following morphism of L, and therefore is an iso.
We know that L ! is a cartesian closed category, with internal hom-object (X ⇒ Y, Ev) (with X ⇒ Y = (!X ⊸ Y ) and Ev defined using ev). Then if L is a differential summable resource category which is closed w.r.t. ⊗ and satisfies (S⊗-fun), we have a canonical iso between D(X ⇒ Y ) and X ⇒ DY , and two morphisms f 0 , f 1 ∈ L ! (Z & X, Y ) are summable (in L) iff Cur f 0 , Cur f 1 ∈ L ! (Z, X ⇒ Y ) are summable, and then Cur f 0 + Cur f 1 = Cur(f 0 + f 1 ).

Sketch of a syntax
We outline a tentative syntax corresponding to the semantic framework of this paper and strongly inspired by it. Our choice of notations is fully coherent with the notations chosen to describe the model, suggesting a straightforward denotational interpretation. This section should only be considered as an introduction to another paper which will introduce a differential version of PCF fully compatible with our new semantics.
The typing rules. We provide some of the typing rules in Figure 1. The most important feature of this typing system is that it does not contain the rule deriving Γ ⊢ M 0 + M 1 : A from Γ ⊢ M 0 : A and Γ ⊢ M 1 : A, typical of the original differential λ-calculus of [ER03]. So the most tricky rules have to do with term addition: some such rules are required since sums are allowed in the syntax, and actually occur during the reduction. We arrived at the three rules mentioned in this figure, where → lin is a very simple rewriting system expressing that sums commute with the linear constructs of the syntax, for instance (M 0 + M 1 )N → lin (M 0 )N + (M 1 )N .
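The → lin system can be sketched as a one-step term rewriter. The toy model below is ours (the tuple-based term constructors `app`, `sum`, `zero`, `var` are hypothetical, not the paper's syntax); the key design point it illustrates is that the function side of an application is linear, so sums and 0 commute with it, while there is deliberately no rule for a sum on the argument side.

```python
# Toy model of one ->lin step on first-order terms.

def lin_step(t):
    # (M0 + M1)N ->lin (M0)N + (M1)N
    if t[0] == "app" and t[1][0] == "sum":
        _, (_, m0, m1), n = t
        return ("sum", ("app", m0, n), ("app", m1, n))
    # (0)M ->lin 0
    if t[0] == "app" and t[1] == ("zero",):
        return ("zero",)
    return t  # no rule applies (e.g. a sum in argument position)

term = ("app", ("sum", ("var", "M0"), ("var", "M1")), ("var", "N"))
print(lin_step(term))
# ('sum', ('app', ('var', 'M0'), ('var', 'N')),
#  ('app', ('var', 'M1'), ('var', 'N')))
```

A term like `("app", ("var", "M"), sum_term)` is left untouched, reflecting that application is not linear in its argument.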
It is then possible to prove that if Γ, x : A ⊢ M : B and Γ ⊢ N : A, then Γ ⊢ M[N/x] : B; and if Γ ⊢ P : DA, then Γ ⊢ ∂(x, P, M) : DB.
Reduction rules. Our rewriting system contains the rules of the already mentioned system →lin, which expresses that most constructs are linear with respect to 0 and to sums of terms, for instance D(M₀ + M₁) →lin DM₀ + DM₁ or (0)M →lin 0; the only non-linear construct is the argument side of application. Here are some of the other reduction rules:

Some additional rules are also required, expressing in particular how constructs applied at different depths commute. Semantically, the definition of ∂(x, N, M) and the reduction rules are justified by the fact that when L is a differential summable resource SMCC, the category L! is cartesian closed and the functor D acts on it as a strong monad; of course the type DA will be interpreted by DX where X is the interpretation of A. The syntactic construct DM corresponds to the "internalization" (X ⇒ Y) → (DX ⇒ DY) made possible by the strength of D (see Section 3.2). The reduction rules concerning πᵢ are based on the basic properties of the functor S and on the definition of the "multiplication" θ of the monad D.
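The "multiplication" θ can be illustrated in a concrete dual-number reading of S, an assumption made here only for exposition (the paper's S is an abstract functor and + is only partially defined), where πᵢ are the two projections and θ has the Leibniz-shaped characterization π₀θ = π₀π₀ and π₁θ = π₁π₀ + π₀π₁:

```python
# Dual-number illustration (an assumption for exposition, not the paper's
# general setting): an element of S X is a pair (x0, x1) of a value and a
# derivative component. The monad multiplication theta : S(S X) -> S X is
# then characterized by
#   pi0(theta(m)) = pi0(pi0(m))
#   pi1(theta(m)) = pi1(pi0(m)) + pi0(pi1(m))
def theta(m):
    (x0, x1), (y0, _y1) = m  # m = ((x0, x1), (y0, y1)) in S(S X)
    return (x0, x1 + y0)
```

This makes visible why reducing a projection πᵢ applied to a θ is exactly where sums can appear.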
With these reduction rules, one can prove a form of subject reduction: if Γ ⊢ M : A and M → M′, then Γ ⊢ M′ : A.
Remark 7.1. The only rule introducing sums of terms is the reduction of πᵈ₁(θᵈ(M)). Since the terms θᵈ(M) are created only by the definition of ∂(x, N, (P)Q), we retrieve the fact that, in the differential λ-calculus, sums are introduced by the definition of ∂(s)t/∂x · u. Therefore the reduction of a term which contains no πᵈᵢ's will lead to a sum-free term. It is only when we want to "read" some information about the differential content of this term that we apply to it some πᵈᵢ, which will possibly create sums when interacting with the θ's contained in the term, typically created by the reduction. These θ's are markers of the places where sums will be created. But we can try to be clever and create only as many sums as necessary, whereas the differential λ-calculus creates all possible sums immediately in the course of the reduction. This possible parsimony in the creation of sums is very much in the spirit of the effectiveness considerations of [BMP20, MP21].
Remark 7.2. This is only the purely functional core of a differential programming language where the ground type ι is unspecified. We will extend the language with constants n : ι for n ∈ ℕ, and with successor, predecessor and conditional constructs, turning ι into a type of natural numbers. Since these primitives (as well as many others, such as arbitrary recursive types) are easy to interpret in our coherent differential models (such as Coh, NCoh or PCS), they can be integrated smoothly in the language as well. Notice finally that, contrary to what happens in Automatic Differentiation, the operation + on terms is not related to an operation of addition on a ground numerical data type: in AD, one of the ground types is R and the + on terms extends the usual addition of real numbers pointwise. In AD, derivatives are accordingly defined with respect to this structure on ground types, whereas in our setting derivatives are taken with respect to the summability structure.

Recursion
One major feature of the models of the differential λ-calculus that we can tackle with the new approach developed in this paper is that they can have fixpoint operators in L!(X ⇒ X, X) implementing general recursion. This is often impossible in an additive category (typically the categories of topological vector spaces where the differential λ-calculus is usually interpreted): given a closed term M of type A, one can define a term λxᴬ (x + M) : A ⇒ A which cannot have a fixpoint in general if addition is not idempotent.
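The obstruction can be made explicit by a one-line computation in additive notation, writing $m$ for the interpretation of the closed term $M$ (a sketch, not a statement taken from the paper):

```latex
x = x + m \;\Longrightarrow\; x = x + \underbrace{m + \cdots + m}_{n}
  \quad\text{for all } n \in \mathbb{N},
```

so in a vector space, where addition is cancellative, any fixpoint forces $m = 0$; with an idempotent addition (as in Rel), $x = x + m$ merely says that $x$ dominates $m$, and least such fixpoints exist.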
In contrast, consider for instance the category Pcoh [DE11]. It is a differential canonically summable resource SMCC where addition is not idempotent and where all least fixpoint operators are available. Accordingly, we can extend our language with a construct YM typed by Γ ⊢ YM : A if Γ ⊢ M : A ⇒ A, with the usual reduction rule YM → (M)YM, so that morphisms defined by such fixpoints can also be differentiated. It turns out that we can easily extend the definition of ∂(x, N, P) to the case where P = YM with Γ, x : A ⊢ M : B ⇒ B and Γ ⊢ N : DA. The correct definition seems to be ∂(x, N, YM) = Y(λyᴰᴮ θ₀((D∂(x, N, M))y)).

Conclusion
This coherent setting for the formal differentiation of functional programs should make it possible to integrate differentiation as an ordinary construct in any functional programming language, without breaking the determinism of its evaluation, contrary to the original differential λ-calculus, whose operational meaning was unclear essentially because of its non-determinism. Moreover, the differential construct features commutative monadic structures, strongly suggesting that it be considered as an effect. The fact that this differentiation is compatible with models such as (non-uniform) coherence spaces, which have nothing to do with "analytic" differentiation, suggests that it could also be used for other operational goals, more internal to the scope of general-purpose functional languages, such as incremental computing.
In an SMC L (with the usual notations), a commutative comonoid is a tuple C = (C, w_C, c_C) where C ∈ L, w_C ∈ L(C, 1) and c_C ∈ L(C, C ⊗ C) are such that the following diagrams commute.
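Written equationally, these are the standard commutative comonoid laws (with $\lambda$, $\alpha$, $\gamma$ the left unitor, associator and symmetry of $\mathcal L$; this notational choice is an assumption):

```latex
(\mathsf{w}_C \otimes C)\,\mathsf{c}_C = \lambda_C^{-1},
\qquad
\alpha_{C,C,C}\,(\mathsf{c}_C \otimes C)\,\mathsf{c}_C
  = (C \otimes \mathsf{c}_C)\,\mathsf{c}_C,
\qquad
\gamma_{C,C}\,\mathsf{c}_C = \mathsf{c}_C,
```

that is, counit, coassociativity and cocommutativity of the comultiplication $\mathsf{c}_C$.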
the relational composition of t and s) belongs to Cl(E ⊸ G) and the diagonal relation Id E belongs to Cl(E ⊸ E).
and then the associated natural transformation ∂_E ∈ Coh(!S_I E, ) is cur d, where d is the following composition of morphisms:

Figure 1: Typing rules