Numerical approximation of McKean–Vlasov SDEs via stochastic gradient descent

Published online by Cambridge University Press:  23 March 2026

Ankush Agarwal*
Affiliation:
Western University
Andrea Amato*
Affiliation:
Università di Bologna
Stefano Pagliarani*
Affiliation:
Università di Bologna
Gonçalo dos Reis*
Affiliation:
The University of Edinburgh
*Postal address: Statistical and Actuarial Sciences, Western University, Canada. Email: ankush.agarwal@glasgow.ac.uk
**Postal address: Università di Bologna, Italy
***Postal address: Università di Bologna, Italy
****Postal address: School of Mathematics, The University of Edinburgh, UK. Email: g.dosreis@ed.ac.uk

Abstract

We propose a novel approach to numerically approximate McKean–Vlasov stochastic differential equations (MV-SDEs) using stochastic gradient descent (SGD) while avoiding the use of interacting particle systems (IPSs) and the associated simulation costs required to achieve the ‘propagation of chaos’ limit. The SGD technique is deployed to solve a Euclidean minimization problem, obtained by first representing the MV-SDE as a minimization problem over the set of continuous functions of time, and then approximating the domain with a finite-dimensional sub-space. Convergence is established by proving certain intermediate stability and moment estimates of the relevant stochastic processes, including the tangent processes. Numerical experiments illustrate the competitive performance of our SGD-based method compared with the IPS benchmarks. This work offers a theoretical foundation for using the SGD method in the context of numerical approximation of MV-SDEs, and provides analytical tools to study its stability and convergence.
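For orientation, the interacting particle system (IPS) approach that the abstract contrasts with can be sketched as follows. This is a minimal illustration, not the authors' benchmark code. The drift is an assumed Kuramoto-type interaction, $\mathrm{d}X_t = \mathbb{E}[\sin(X'_t - X_t)]\,\mathrm{d}t + \sigma\,\mathrm{d}W_t$ with $X'$ an independent copy of $X$, and the function name `simulate_ips_kuramoto` and its default particle count are hypothetical. The defaults $T=0.5$, $x_0=0.5$, $\sigma=0.5$, and $h=10^{-2}$ are borrowed from the captions below.

```python
import numpy as np


def simulate_ips_kuramoto(n_particles=2000, T=0.5, h=1e-2, x0=0.5, sigma=0.5, seed=0):
    """Euler-Maruyama simulation of an interacting particle system.

    Assumed model (an illustration, not taken from the article):
        dX^i = (1/N) * sum_j sin(X^j - X^i) dt + sigma dW^i.
    Returns the time grid and the empirical means of sin(X_t) and cos(X_t).
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(T / h))
    t = np.linspace(0.0, n_steps * h, n_steps + 1)

    x = np.full(n_particles, x0, dtype=float)  # all particles start at x0
    mean_sin = np.empty(n_steps + 1)
    mean_cos = np.empty(n_steps + 1)
    mean_sin[0], mean_cos[0] = np.sin(x).mean(), np.cos(x).mean()

    for k in range(n_steps):
        s, c = mean_sin[k], mean_cos[k]
        # sin(y - x) = sin(y)cos(x) - cos(y)sin(x), so the empirical
        # interaction term only needs the two averages s and c.
        drift = s * np.cos(x) - c * np.sin(x)
        noise = sigma * np.sqrt(h) * rng.standard_normal(n_particles)
        x = x + drift * h + noise
        mean_sin[k + 1], mean_cos[k + 1] = np.sin(x).mean(), np.cos(x).mean()

    return t, mean_sin, mean_cos
```

Calling `t, mean_sin, mean_cos = simulate_ips_kuramoto()` returns two empirical curves on the time grid. Under the assumed model, the law of the solution enters the drift only through such a pair of functions of time, which is the kind of finite-dimensional object an optimization over curves can target without simulating a large particle system.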

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Applied Probability Trust

Table 1. Main notation for spaces and classic operators.

Table 2. Main notation for primary processes and auxiliary functions.

Algorithm 1: SGD-MVSDE

Algorithm 2: Mini-batch-SGD-MVSDE

Figure 1. Flow-chart of Algorithm 1.

Table 3. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Average number of iterations m (execution time in seconds) over 1000 independent runs of the algorithm to achieve relative error accuracy $\varepsilon_m<1\%$ with the best combination of $(r_0,\rho)$ for each pair (n,M). Here $T=0.5$, $x_0=0.5$, and $\sigma=0.5$, and the Monte Carlo benchmark exhibited an execution time of 1.98 seconds.

Table 4. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Average number of iterations m (execution time in seconds) over 1000 independent runs of the algorithm to achieve relative error accuracy $\varepsilon_m<1\%$ with the best combination of $(r_0,\rho)$ for each pair (n,M). Here $T=1$, $x_0=0.5$, and $\sigma=0.5$, and the Monte Carlo benchmark exhibited an execution time of 3.60 seconds.

Table 5. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Average number of iterations m (execution time in seconds) over 1000 independent runs of the algorithm to achieve relative error accuracy $\varepsilon_m<1\%$ with the best combination of $(r_0,\rho)$ for each pair (n,M). Here $T=2$, $x_0=0.5$, and $\sigma=0.5$, and the Monte Carlo benchmark exhibited an execution time of 6.59 seconds.

Figure 2. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Comparison of the output curves $\mathscr{L} \textbf{a}_m=(\mathscr{L} \textbf{a}_m^1, \mathscr{L} \textbf{a}_m^2)$ of the SGD algorithm versus the benchmark curves ${\bar{\gamma}}^{\textrm{MC}}=({\bar{\gamma}}^{\textrm{MC}}_1 , {\bar{\gamma}}^{\textrm{MC}}_2)$ for all values of n, for time-step size $h=10^{-2}$, $T=0.5$, $M=1000$, $x_0=0.5$, and $\sigma=0.5$.

Figure 3. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Comparison of the output curves $\mathscr{L} \textbf{a}_m=(\mathscr{L} \textbf{a}_m^1, \mathscr{L} \textbf{a}_m^2)$ of the SGD algorithm versus the benchmark curves ${\bar{\gamma}}^{\textrm{MC}}=({\bar{\gamma}}^{\textrm{MC}}_1 , {\bar{\gamma}}^{\textrm{MC}}_2)$ for all values of n, for time-step size $h=10^{-2}$, $T=1$, $M=1000$, $x_0=0.5$, and $\sigma=0.5$.

Table 6. Polynomial drift MV-SDE. Average number of iterations (execution time in seconds) over 1000 independent runs of the algorithm to achieve relative error accuracy $\varepsilon_m<1\%$ with the best combination of $(r_0,\rho)$ for each pair (T,M). Here $x_0=1$, $\delta=0.8$ and the Monte Carlo benchmark exhibited an execution time of 2.57 seconds for $T=0.1$, 9.78 seconds for $T=0.5$, and 19.53 seconds for $T=1$.

Figure 4. Kuramoto–Shinomoto–Sakaguchi MV-SDE. Comparison of the output curves $\mathscr{L} \textbf{a}_m=(\mathscr{L} \textbf{a}_m^1, \mathscr{L} \textbf{a}_m^2)$ of the SGD algorithm versus the benchmark curves ${\bar{\gamma}}^{\textrm{MC}}=({\bar{\gamma}}^{\textrm{MC}}_1 , {\bar{\gamma}}^{\textrm{MC}}_2)$ for all values of n, for time-step size $h=10^{-2}$, $T=2$, $M=1000$, $x_0=0.5$, and $\sigma=0.5$.

Figure 5. Polynomial drift MV-SDE. Comparison of the output curves $\mathscr{L} \textbf{a}_m=(\mathscr{L} \textbf{a}_m^1, \mathscr{L} \textbf{a}_m^2)$ of the SGD algorithm versus the benchmark curves ${\bar{\gamma}}^{\textrm{MC}}=({\bar{\gamma}}^{\textrm{MC}}_1 , {\bar{\gamma}}^{\textrm{MC}}_2)$ for all values of T, for time-step size $h=10^{-2}$, $n=3$, $M=1000$, $x_0=1$, and $\delta=0.8$.

Figure 6. Monte Carlo method densities $\tilde{w}^{(K),\textrm{MC}}_T(x)$ over the interval $[\!-3,4]$, for $T=1$ and $K = 3, 5, 10, 20$. The benchmark vector ${\bar{\gamma}}^{\textrm{MC}}(T)$ was computed with $N_{0}=10^7$ particles. In the model (30) we had $X_0 \sim \mathcal{N}_{(0,1)}$ and $\sigma = 0.1$.

Figure 7. Comparison of $\tilde{w}^{(K),\textrm{SGD}}_T(x)$ and $\tilde{w}^{(K),\textrm{MC}}_T(x)$ over the interval $[\!-3,4]$, for $T=1$ and $K = 10$. The benchmark vector ${\bar{\gamma}}^{\textrm{MC}}(T)$ was computed with $N_{0}=10^7$ particles. The SGD output $\mathscr{L} \textbf{a}_m(T)$ was obtained after $m=172$ iterations (35 seconds of computation time), with parameters $n = 3$, $M = 100$, $r_0 = 5$, and $\rho = 0.9$. In the model (30) we had $X_0 \sim \mathcal{N}_{(0,1)}$ and $\sigma = 0.1$.