Automatic differentiation for ML-family languages: Correctness via logical relations

Published online by Cambridge University Press: 21 October 2024

Fernando Lucatelli Nunes*
Affiliation:
Department of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands; Department of Mathematics, University of Coimbra, CMUC, Coimbra, Portugal
Matthijs Vákár
Affiliation:
Department of Information and Computing Sciences, Utrecht University, Utrecht, the Netherlands
*Corresponding author: Fernando Lucatelli Nunes; Email: fernandolucatellinunes@gmail.com

Abstract

We give a simple, direct, and reusable logical relations technique for languages with term and type recursion and partially defined differentiable functions. We demonstrate it by working out the case of automatic differentiation (AD) correctness: namely, we present a correctness proof of a dual numbers style AD code transformation for realistic functional languages in the ML-family. We also show how this code transformation provides us with correct forward- and reverse-mode AD.

The starting point is to interpret a functional programming language as a suitable freely generated categorical structure. In this setting, by the universal property of the syntactic categorical structure, the dual numbers AD code transformation and the basic $\boldsymbol{\omega } \mathbf{Cpo}$-semantics arise as structure preserving functors. The proof follows, then, by a novel logical relations argument.

The key to much of our contribution is a powerful monadic logical relations technique for term recursion and recursive types. It provides us with a semantic correctness proof based on a simple approach for denotational semantics, making use only of the very basic concrete model of $\omega$-cpos.
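
As an informal illustration of the dual-numbers idea mentioned above, the following OCaml sketch pairs each real with a tangent and propagates both through primitive operations, recursion, and a conditional on a real. It is not the paper's macro $\mathcal{D}$, which is a source-to-source transformation on a typed language; here the same idea is shown as a small runtime library. All names (`dual`, `const`, `add`, `mul`, `pow`, `halve_until_small`, `derivative`) are assumptions made for this example.

```ocaml
(* A dual number pairs a primal value v with a tangent (derivative) d. *)
type dual = { v : float; d : float }

(* A constant has zero tangent. *)
let const c = { v = c; d = 0.0 }

(* Primitive operations propagate tangents by the sum and product rules. *)
let add x y = { v = x.v +. y.v; d = x.d +. y.d }
let mul x y = { v = x.v *. y.v; d = x.d *. y.v +. x.v *. y.d }

(* Term recursion: x^n by repeated multiplication. *)
let rec pow x n = if n <= 0 then const 1.0 else mul x (pow x (n - 1))

(* Recursion guarded by a conditional on a real: control flow looks
   only at the primal component, never at the tangent. *)
let rec halve_until_small x =
  if x.v > 1.0 then halve_until_small (mul (const 0.5) x) else x

(* Forward-mode derivative of f at a: seed the tangent with 1. *)
let derivative f a = (f { v = a; d = 1.0 }).d

let () =
  Printf.printf "%g\n" (derivative (fun x -> pow x 3) 2.0);
  Printf.printf "%g\n" (derivative halve_until_small 3.0)
```

With these definitions, differentiating $x^3$ at $2$ yields $12$, and `halve_until_small` at $3$ yields $0.25$, since the input is halved twice before the loop exits. The second function is only piecewise differentiable, which hints at why the correctness statement has to account for conditionals and partiality.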

Information

Type
Special Issue: Differential Structures
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Figure 1. Typing rules for a basic source language with real conditionals, where $\mathtt{R}\subset \mathbb{R}$ is a fixed set of real numbers containing $0$.

Figure 2. Typing rules for term recursion and iteration.

Figure 3. Basic $\beta \eta$-equational theory for our language. We write $\beta \eta$-equality as $\equiv$ to distinguish it from equality in let-bindings. We write ${\# x_1,\ldots, x_n}{\equiv }$ to indicate that the variables are fresh in the left-hand side. In the top right rule, $x$ may not be free in $r$. Equations hold on pairs of computations of the same type.

Figure 4. Extra typing rules for the target language with iteration and recursion, where we denote $\mathbb{N} ^\ast := \mathbb{N} - \{ 0 \}$, $\mathbf{real} ^1 := \mathbf{real}$ and $\mathbf{real} ^{i+1} := \mathbf{real} ^i \times \mathbf{real}$.

Figure 5. Assignment that gives the universal property of the source language.

Figure 6. Assignment that gives the universal property of the target language.

Figure 7. AD macro $\mathcal{D}\;{(-)}$ defined on types and computations. All newly introduced variables are chosen to be fresh. We provide a more efficient way of differentiating $\mathbf{sign}\,$ in Appendix B.

Figure 8. AD assignment.

Figure 9. Semantics’ assignment for each primitive operation $\mathrm{op} \in \mathrm{Op}_n$ ($n\in \mathbb{N}$) and each constant $c\in \mathtt{R}$.

Figure 10. Typing rules for the recursive types extension.

Figure 11. The standard $\beta \eta$-equational theory for recursive types in CBV.

Figure 12. The definitions of AD on recursive types.

Figure A1. Typing rules for our fine-grain CBV language with iteration and real conditionals. We use a typing judgement $\vdash ^v$ for values and $\vdash ^c$ for computations.

Figure A2. Standard $\beta \eta$-laws for fine-grain CBV. We write ${\# x_1,\ldots, x_n}{\equiv }$ to indicate that the variables are fresh in the left-hand side. In the top right rule, $x$ may not be free in $r$. Equations hold on pairs of terms of the same type.

Figure A3. A forward-mode AD macro defined on types as $\mathcal{D}\;{(-)}$, values as $\mathcal{D}_{\mathcal{V}}(-)$, and computations as $\mathcal{D}_{\mathcal{C}}(-)$. All newly introduced variables are chosen to be fresh.