On McKean–Vlasov branching diffusion processes

Julien Claisse; Jiazhi Kang; Xiaolu Tan

doi:10.1017/jpr.2026.10097

Published online by Cambridge University Press: 15 May 2026

Julien Claisse*, Université Paris-Dauphine, PSL University
Jiazhi Kang**, The Chinese University of Hong Kong
Xiaolu Tan***, The Chinese University of Hong Kong
*Email address: claisse@ceremade.dauphine.fr
**Email address: jzkang@math.cuhk.edu.hk
***Email address: xiaolu.tan@cuhk.edu.hk

Abstract

We study a nonlinear branching diffusion process in the sense of McKean, i.e. where particles are subjected to a mean-field interaction. We consider first a strong formulation of the problem and we provide an existence and uniqueness result by using contraction arguments. Then we consider the notion of weak solution and its equivalent martingale problem formulation. In this setting, we provide a general weak existence result, as well as a propagation of chaos property, i.e. the McKean–Vlasov branching diffusion is the limit of a large-population branching diffusion process with mean-field interaction.

Keywords

Branching diffusion; McKean–Vlasov process; mean field; propagation of chaos

MSC classification

Primary: 60J80: Branching processes (Galton-Watson, birth-and-death, etc.); 60H20: Stochastic integral equations; 60F05: Central limit and other weak theorems

Secondary: 60G46: Martingales and classical analysis; 60J60: Diffusion processes

Information

Type: Original Article
Information: Journal of Applied Probability, First View, pp. 1-25

DOI: https://doi.org/10.1017/jpr.2026.10097
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright: © The Author(s), 2026. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

The McKean–Vlasov stochastic differential equation (SDE) was introduced to describe the limit of a large-population system, where the dynamics of each particle are governed by an SDE whose coefficients depend on an interaction term given by the empirical distribution induced by the whole population. In the symmetric setting, where each particle SDE has the same coefficient functions and is driven by an independent Brownian motion, the interaction term converges to the distribution of a representative particle as the number of particles tends to infinity. Let us mention the pioneering work of McKean [Reference McKean26] and Kac [Reference Kac21], and the pedagogical lecture notes of Sznitman [Reference Sznitman30] for the early development of the subject. Recently, McKean–Vlasov SDEs have drawn much attention, partially due to the development of mean-field game theory introduced independently and simultaneously in [Reference Huang, Malhamé and Caines17, Reference Lasry and Lions25].

While standard McKean–Vlasov SDEs describe a large-population system, the population size stays unchanged from the beginning to the end. In this paper we will study the McKean–Vlasov SDE in a setting where the population size evolves according to a branching process. The mathematical modelling of population dynamics has been intensively studied for more than a century. In particular, the branching process theory has been greatly developed due to its various applications in biology, ecology, medicine, etc., for which let us simply refer to [Reference Bansaye and Méléard1]. We are particularly interested in the branching diffusion process, which is a branching process where each particle has a feature, such as its spatial position, whose dynamic is ruled by a diffusion process. Such processes were first studied in [Reference Skorokhod28] and [Reference Ikeda, Nagasawa and Watanabe18] in the 1960s, notably to represent a class of nonlinear partial differential equations (PDEs) as an extension of the Feynman–Kac formula. It has also been extended to represent a larger class of semilinear PDEs, and to serve within a Monte Carlo method to solve the corresponding PDEs; see, e.g., [Reference Henry-Labordere, Oudjane, Tan, Touzi and Warin15, Reference Henry-Labordere, Tan and Touzi16]. Additionally, the control of branching diffusion processes has been studied in a context without mean-field interaction; see, e.g., [Reference Claisse4, Reference Kharroubi and Ocello22, Reference Nisio27, Reference Ustunel31]. We also mention the recent work in [Reference Claisse, Ren and Tan6] where the authors study a controlled branching diffusion process with interaction in a mean-field game setting.

In this paper we will study a class of McKean–Vlasov branching diffusion processes where the coefficient functions depend on an interaction term given by the distribution of the process itself. Equivalently, it is an extension of the McKean–Vlasov SDE, where the population size dynamic is ruled by a branching process. Such processes have already been studied in [Reference Fontbona and Méléard12, Reference Fontbona and Muñoz-Hernández13], in a setting where the branching process is a birth–death process and the interaction is given by a convolution type term. While a weak-convergence-type propagation of chaos result is obtained in [Reference Fontbona and Méléard12], a convergence rate result under the dual bounded-Lipschitz norm is given by [Reference Fontbona and Muñoz-Hernández13] using a coupling argument. In both [Reference Fontbona and Méléard12, Reference Fontbona and Muñoz-Hernández13], the finite population branching processes are constructed explicitly, while the limit McKean–Vlasov branching process is implicitly described by the Fokker–Planck equation. Here we will describe the limit McKean–Vlasov branching diffusion process by using a more explicit SDE formulation and then study its well-posedness. Moreover, we will consider a branching diffusion with more general interaction under a path-dependent setting, in the sense that the coefficient functions (drift, volatility, and branching parameters) are general functions of the whole path of the process and the interaction measure, and the interaction term is a measure on the path space as well. As in the classical SDE theory, we will first apply the contraction argument in order to obtain an existence and uniqueness result for strong solutions. Next, we consider the notion of weak solution as well as the related martingale problem formulation to obtain a general existence result. 
Based on this well-posedness result, an optimal control problem of the McKean–Vlasov branching diffusion processes is studied in [Reference Claisse, Kang, Lan and Tan5], and the corresponding Hamilton–Jacobi–Bellman master equation is studied in [Reference Ekren, He, Lan and Tan10]. Moreover, we provide a propagation of chaos result, that is, the McKean–Vlasov branching diffusion describes the limit of a large-population branching diffusion process with interaction given by the empirical distribution induced by the whole population. The result is based on the weak convergence technique and hence qualitative. Let us also mention that a quantitative propagation of chaos result with convergence rate is further obtained in [Reference Cao, Ren and Tan3] under additional regularity conditions.

The rest of the paper is organized as follows. In Section 2, we formulate the McKean–Vlasov branching diffusion and then provide our main results, including strong existence and uniqueness, weak existence, and propagation of chaos. In Section 3, we provide the proof of strong existence and uniqueness by contraction arguments. Finally, the proofs of weak existence as well as a propagation of chaos result by a weak convergence technique are completed in Section 4.

1.1. Notation

Let us first introduce some notation used in the rest of the paper.

First, let $(X, \rho)$ be a non-empty metric space. We denote by $\mathcal{P}(X)$ (resp. $\mathcal{M}(X)$ ) the space of all Borel probability measures (resp. non-negative finite measures) on X. Given a measure $\mu\in\mathcal{M}(X)$ and a mapping $f\colon X\to\mathbb{R}$ , we denote the integration by $\langle\mu,f\rangle \,:\!=\, \int_X f(x)\,\mu({\mathrm{d}} x)$ . Further, for $p\geq1$ , we denote by $\mathcal{P}_p(X)$ the space of probability measures on X with finite moment of order p, i.e. $\mathcal{P}_p(X)\,:\!=\,\big\{\mu\in\mathcal{P}(X)\colon\int_{X}\rho(x,x_0)^p\,\mu({\mathrm{d}} x)<+\infty\big\}$ for some (and thus all) fixed point $x_0\in X$ . Let us equip $\mathcal{P}_p(X)$ with the Wasserstein metric $\mathcal{W}_p$ given by

\begin{align*} \mathcal{W}_p(\mu,\nu) \,:\!=\, \inf_{\lambda\in\Lambda(\mu,\nu)}\Bigg(\int_{X\times X}\rho(x,y)^p\, \lambda({\mathrm{d}} x,{\mathrm{d}} y)\Bigg)^{1/p},\end{align*}

where $\Lambda(\mu,\nu)$ is the collection of all Borel probability measures $\lambda$ on $X\times X$ whose marginals are $\mu$ and $\nu$ respectively.
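To get a feeling for the definition, here is a minimal Python sketch in the simplest possible setting, which is an assumption of this illustration and not the setting of the paper: $X=\mathbb{R}$ with $\rho(x,y)=|x-y|$ , and $\mu$ , $\nu$ the empirical measures of two samples of equal size. In that case the monotone coupling (sorted points paired in order) is known to be optimal, so no optimisation is needed. The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=1):
    """W_p between the empirical measures of two equal-size samples on the real line."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # On the real line with cost |x - y|^p (p >= 1), pairing the i-th smallest
    # point of xs with the i-th smallest point of ys is an optimal coupling.
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return cost ** (1.0 / p)


if __name__ == "__main__":
    # Shifting every atom by 1 moves the measure by exactly 1 in W_1.
    print(wasserstein_1d([0.0, 1.0], [1.0, 2.0]))
```

On a general metric space such as $\mathcal{D}_E$ no such shortcut is available, and the infimum over couplings has to be handled abstractly, as is done in the rest of the paper.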

Next, to describe the genealogy of the branching process, we use the classical Ulam–Harris–Neveu notation. Let $\mathbb{K}\,:\!=\, \{\emptyset\}\cup\bigcup_{n=1}^{+\infty}\mathbb{N}^n$ be the set of labels. Given $k, k'\in\mathbb{K}$ with $k=k_1\ldots k_n$ and $k'=k'_1\ldots k'_m$ , we define the concatenation of labels $kk'\,:\!=\, k_1\ldots k_nk'_1\ldots k'_m$ , and write $k\prec k'$ if there exists $\tilde{k}$ such that $k'=k\tilde{k}$ , which means that $k'$ is a descendant of $k$ . Let us define

\begin{align*} E \,:\!=\, \bigg\{\sum_{k\in K}\delta_{(k,x^k)}\colon K\subset\mathbb{K}\ \text{is finite},\, x^k\in\mathbb{R}^d, \text{ for all } k, k' \in K,\, k\nprec k'\bigg\},\end{align*}

where the Dirac measure $\delta_{(k,x^k)}$ corresponds to a particle identified by a label k and a position $x^k$ . The condition on the set of labels K ensures consistency with the genealogy of a population. Motivated by the study of superprocesses, it has become common to represent branching diffusions as measure-valued processes; see, e.g., [Reference Dawson, Maisonneuve, Spencer and Dawson7, 11]. Let $\mathbb{K}$ be equipped with the discrete topology; then E is a closed subspace of $\mathcal{M}(\mathbb{K}\times\mathbb{R}^d)$ under the weak convergence topology and thus E is a Polish space. We provide a metric $d_E$ on E which is consistent with the weak convergence topology: for all $e_1, e_2\in E$ such that $e_1 = \sum_{k\in K_1} \delta_{(k, x^k)}$ and $e_2 = \sum_{k\in K_2} \delta_{(k, y^k)}$ ,

(1)

\begin{equation} d_E(e_1,e_2) \,:\!=\, \sum_{k\in K_1\cap K_2}(|x^k-y^k|\wedge1) + \#(K_1\triangle K_2),\end{equation}

where $K_1\triangle K_2 \,:\!=\, (K_1\setminus K_2) \cup (K_2\setminus K_1)$ and $\#(K_1\triangle K_2)$ is the number of elements in $K_1\triangle K_2$ . Let $f\,:\!=\,(\,f^k)_{k\in\mathbb{K}}\colon\mathbb{K}\times\mathbb{R}^d\rightarrow\mathbb{R}$ be a function and $e\,:\!=\,\sum_{k\in K}\delta_{(k,x^k)}\in E$ ; we write $\langle e,f\rangle = \sum_{k\in K}f^k(x^k)$ . We also fix the reference point $e_0 \in E$ , the null measure, which means that the associated set of particles is empty.
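The objects introduced above translate directly into simple data structures. The following Python sketch is an illustration only: it encodes a label as a tuple of integers (the empty tuple playing the role of $\emptyset$ ), an element of E as a dictionary from labels to positions, and the metric $d_E$ of (1). The names `is_prefix`, `is_valid_configuration`, and `d_E` are hypothetical, and the genealogical condition is read here as applying to pairs of distinct labels.

```python
import math


def is_prefix(k, k2):
    """True if k2 = k + k_tilde for some label k_tilde, i.e. k precedes k2."""
    return len(k) <= len(k2) and k2[:len(k)] == k


def is_valid_configuration(e):
    """Check that no label in e is an ancestor of a different label in e."""
    labels = list(e)
    return not any(a != b and is_prefix(a, b) for a in labels for b in labels)


def d_E(e1, e2):
    """Metric (1): truncated distances on common labels plus unmatched labels."""
    common = e1.keys() & e2.keys()
    moved = sum(min(math.dist(e1[k], e2[k]), 1.0) for k in common)
    unmatched = len(e1.keys() ^ e2.keys())  # size of the symmetric difference
    return moved + unmatched


if __name__ == "__main__":
    e1 = {(1,): (0.0, 0.0), (2,): (5.0, 5.0)}
    e2 = {(1,): (0.3, 0.4), (3,): (1.0, 1.0)}
    print(d_E(e1, e2))   # one common label moved by 0.5, two unmatched labels
    print(d_E(e1, {}))   # distance to the null measure e_0 is the particle count
```

The second call illustrates the identity $d_E(e, e_0) = \langle e, \mathbf{1}\rangle$ , which is what links first moments under $d_E$ to the expected number of particles.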

Finally, we fix a constant $T > 0$ and an integer $d \ge 1$ throughout the paper, and denote by $\mathcal{C}^d\,:\!=\,C([0,T],\mathbb{R}^d)$ the space of all $\mathbb{R}^d$ -valued continuous paths on [0, T] equipped with the uniform convergence norm $\|\omega\| \,:\!=\, \sup\nolimits_{0 \le t \le T} |\omega_t|\wedge 1$ . Next, we denote by $\mathcal{D}_E\,:\!=\,D([0,T], E)$ the Skorokhod space of all E-valued càdlàg paths on [0, T]. For the weak formulation, we work with the classical Skorokhod metric on $\mathcal{D}_E$ , while, for the strong formulation, we consider the uniform convergence metric on $\mathcal{D}_E$ given by

(2)

\begin{equation} d(\omega^1, \omega^2) \,:\!=\, \sup\nolimits_{0\leq t \leq T}d_E(\omega^1_t,\omega^2_t) \quad \text{for all } \omega^1, \omega^2 \in \mathcal{D}_E,\end{equation}

and the truncated metric $d_t(\omega^1,\omega^2) \,:\!=\, d(\omega^1_{t\wedge\cdot},\omega^2_{t\wedge\cdot})$ for all $\omega^1, \omega^2 \in \mathcal{D}_E$ and $t\in[0,T]$ , so that

(3)

\begin{equation} \mathcal{P}_1(\mathcal{D}_E) = \Bigg\{\mu\in\mathcal{P}(\mathcal{D}_E) \colon \int_{\mathcal{D}_E}\sup\nolimits_{t\in[0,T]}\langle\omega_t,\mathbf{1}\rangle\,\mu({\mathrm{d}}\omega) < \infty\Bigg\}\end{equation}

since $d_E(\omega_t, e_0) = \langle\omega_t,\mathbf{1}\rangle$ , where $\mathbf{1}\colon\mathbb{K}\times\mathbb{R}^d \longrightarrow \mathbb{R}$ is the constant function defined by $\mathbf{1}(k,x) = 1$ for all $k \in \mathbb{K}$ , $x\in\mathbb{R}^d$ .

Let us also define the stopping operator $\pi_t \colon \mathcal{C}^d \longrightarrow \mathcal{C}^d$ (resp. $\pi_t\colon \mathcal{D}_E \longrightarrow \mathcal{D}_E$ ) by $\pi_t(\omega)\,:\!=\,\omega_{t\wedge\cdot}$ for all $\omega \in \mathcal{C}^d$ (resp. $\omega \in \mathcal{D}_E$ ). Moreover, given a probability measure $m \in \mathcal{P}(\mathcal{D}_E)$ and $t \ge 0$ , we define the pushforward measure $m_t\,:\!=\, m \circ (\pi_t)^{-1}$ .
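On a time grid, the stopping operator has a very concrete meaning: the path is frozen from time t onwards. The Python sketch below is a discrete analogue under the assumption that the path is stored as a list of values on an increasing grid `times` whose first point is at most `t`; the name `stop_path` is hypothetical.

```python
def stop_path(path, times, t):
    """Discrete analogue of pi_t: keep the path up to time t, then freeze it."""
    # Index of the last grid time not exceeding t (the grid must start before t).
    idx = max(i for i, s in enumerate(times) if s <= t)
    # For grid times after t, the stopped path keeps the value at index idx.
    return [path[min(i, idx)] for i in range(len(path))]


if __name__ == "__main__":
    print(stop_path([0, 1, 4, 9], [0.0, 1.0, 2.0, 3.0], 1.5))
```

If t is not a grid point, the frozen value is the one at the last grid time before t, so this is only an approximation of $\omega_{t\wedge\cdot}$ .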

2. Main results

2.1. McKean–Vlasov branching diffusion

Let us introduce first the McKean–Vlasov branching diffusion process by means of a pathwise description. We consider the coefficient functions

\begin{align*} \big(b,\sigma,\gamma,(p_{\ell})_{\ell\in\mathbb{N}}\big) \colon [0,T]\times\mathcal{C}^d\times\mathcal{P}(\mathcal{D}_E) \longrightarrow \mathbb{R}^d\times\mathbb{R}^{d\times d}\times[0,\bar\gamma]\times[0,1]^{\mathbb{N}},\end{align*}

where $\bar \gamma > 0$ is a fixed constant. Namely, b and $\sigma$ are the drift and diffusion coefficients for the movement of each particle, $\gamma$ is the death rate, and $(p_{\ell})_{\ell \in \mathbb{N}}$ is the probability mass function of the progeny distribution. In particular, $p_{\ell}(\!\cdot\!) \in [0,1]$ for each $\ell \in \mathbb{N}$ , and $\sum_{\ell \in \mathbb{N}} p_{\ell}(\!\cdot\!) = 1$ . Let us also define a partition $(I_{\ell}(\!\cdot\!))_{\ell \in \mathbb{N}}$ of [0,1) by $I_{\ell}(\!\cdot\!) \,:\!=\, \big[\sum_{i=0}^{\ell -1}p_i(\!\cdot\!),\sum_{i=0}^{\ell}p_i(\!\cdot\!)\big)$ for each $\ell\in\mathbb{N}$ .
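The partition $(I_{\ell})_{\ell\in\mathbb{N}}$ is the usual device for sampling the number of offspring from a uniform random variable. As an illustration, the Python sketch below builds the intervals for a fixed, finitely supported probability mass function and looks up the index $\ell$ with $u \in I_{\ell}$ . Freezing the arguments $(t,\mathrm{x},m)$ and truncating to finitely many values are simplifying assumptions, and the names `partition` and `offspring_number` are hypothetical.

```python
from itertools import accumulate


def partition(p):
    """Intervals I_l = [p_0 + ... + p_{l-1}, p_0 + ... + p_l) as (left, right) pairs."""
    right = list(accumulate(p))
    left = [0.0] + right[:-1]
    return list(zip(left, right))


def offspring_number(p, u):
    """Return the index l such that u lies in I_l, for u in [0, 1)."""
    total = 0.0
    for l, p_l in enumerate(p):
        total += p_l
        if u < total:
            return l
    return len(p) - 1  # guards against rounding when u is very close to 1


if __name__ == "__main__":
    p = [0.25, 0.25, 0.5]   # p_0, p_1, p_2
    print(partition(p))
    print(offspring_number(p, 0.6))
```

If u is uniformly distributed on [0, 1), then `offspring_number(p, u)` takes the value $\ell$ with probability $p_{\ell}$ , since $I_{\ell}$ has length $p_{\ell}$ .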

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space with filtration $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$ , equipped with an $\mathcal{F}_0$ -measurable E-valued random variable $\xi$ , and a family of mutually independent, $\mathbb{R}^d$ -valued Brownian motions $(W^k)_{k\in\mathbb{K}}$ and Poisson random measures $(Q^k({\mathrm{d}} s,{\mathrm{d}} z))_{k\in\mathbb{K}}$ on $[0,T] \times [0, \bar \gamma] \times [0,1]$ with Lebesgue intensity measure ${\mathrm{d}} s \times {\mathrm{d}} z$ .

We consider first a branching diffusion process with a fixed environment measure $\mu\in \mathcal{P}(\mathcal{D}_E)$ and initial state $\xi$ . It corresponds to an E-valued process $(Z_t)_{t \in [0,T]}$ given by

(4)

\begin{equation} Z_t \,:\!=\, \sum_{k\in K_t}\delta_{(k,X^k_t)}, \quad t \in [0,T],\end{equation}

where $K_t$ denotes the collection of all labels of particles alive at time $t \in [0,T]$ and $X^k_t$ corresponds to the position of particle $k\in K_t$ . In particular, we have $Z_0 = \xi = \sum_{k \in K_0} \delta_{(k, X^k_0)}$ . Then, for each $k\in K_t$ , the position $X^k_t$ evolves as a diffusion process characterized by the SDE

(5)

\begin{equation} {\mathrm{d}}X^k_t = b\big(t,X^k_{t\wedge\cdot},\mu_t\big)\,{\mathrm{d}} t + \sigma\big(t,X^k_{t\wedge\cdot},\mu_t\big)\,{\mathrm{d}} W^k_t,\end{equation}

where $\mu_t$ is the pushforward measure $\mu_t \,:\!=\, \mu \circ \pi_t^{-1}$ . We also denote by $S_k$ the birth time of particle k. In particular, $S_k = 0$ for each initial particle $k \in K_0$ . Then, each particle k runs a death clock with intensity $\gamma(t, X^k_{t\wedge\cdot}, \mu_t)$ , i.e. its death time $T_k$ is given by

\begin{align*} T_k \,:\!=\, \inf\big\{t>S_k \colon Q^k\big(\{t\}\times\big[0,\gamma\big(t,X^k_{t\wedge\cdot},\mu_t\big)\big] \times[0,1] \big) = 1\big\}.\end{align*}

Let $U_k$ be a random variable satisfying $Q^k \big(\{T_k\} \times [0,\gamma(T_k,X^k_{T_k\wedge\cdot},\mu_{T_k})] \times \{U_k\} \big) =1$ , and note that $U_k$ is uniformly distributed over the interval [0,1]. When $U_k \in I_{\ell}\big(T_k, X^k_{T_k\wedge\cdot}, \mu_{T_k}\big)$ , at time $T_k$ the particle k dies and gives birth to $\ell$ offspring particles $\{k1, \ldots, k \ell \}$ , so that

\begin{equation*} K_{T_k} = (K_{T_k-} \setminus \{k\}) \cup \{k1, \ldots,k \ell\}.\end{equation*}

In particular, the birth time of the offspring particles corresponds to the death time of the parent particle, i.e. $S_{ki}\,:\!=\, T_k$ for $i = 1, \ldots , \ell$ . Further, the offspring particles start from the position of the parent particle, i.e. $X^{ki}_{S_{ki}} = X^k_{T_k-}$ for $i = 1, \ldots, \ell$ . Moreover, for a path-dependent treatment, we define $X^{ki}_t$ as its ancestor position before the birth time, i.e. $X^{ki}_t = X^k_t$ for $t< S_{ki}$ and $i = 1, \ldots , \ell$ .
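The pathwise description above can be mimicked by a simple time-stepping scheme. The Python sketch below is an illustration under strong simplifying assumptions that are not those of the paper: dimension $d=1$ , coefficients depending only on the current time and position (no path dependence and a frozen environment $\mu$ ), an Euler step for (5), and a death clock that is only checked on the time grid. All function and parameter names are hypothetical.

```python
import math
import random


def offspring_number(p, u):
    """Index l with u in I_l, for a finitely supported pmf p and u in [0, 1)."""
    total = 0.0
    for l, p_l in enumerate(p):
        total += p_l
        if u < total:
            return l
    return len(p) - 1


def simulate_branching_diffusion(xi, b, sigma, gamma, pmf, T, n_steps, rng):
    """Return the particles alive at time T as a dict {label tuple: position}."""
    dt = T / n_steps
    alive = dict(xi)
    for step in range(n_steps):
        t = step * dt
        nxt = {}
        for k, x in alive.items():
            # Euler step for the particle position, cf. SDE (5).
            x_new = x + b(t, x) * dt + sigma(t, x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # Death clock with rate gamma, approximated over one time step.
            if rng.random() < 1.0 - math.exp(-gamma(t, x) * dt):
                u = rng.random()  # plays the role of U_k
                ell = offspring_number(pmf(t, x), u)
                for i in range(1, ell + 1):
                    nxt[k + (i,)] = x_new  # children k1, ..., k ell start at the parent's position
            else:
                nxt[k] = x_new
        alive = nxt
    return alive


if __name__ == "__main__":
    rng = random.Random(0)
    out = simulate_branching_diffusion(
        {(1,): 0.0},
        b=lambda t, x: -x,
        sigma=lambda t, x: 1.0,
        gamma=lambda t, x: 1.0,
        pmf=lambda t, x: [0.5, 0.0, 0.5],  # die without offspring, or split in two
        T=1.0, n_steps=100, rng=rng,
    )
    print(len(out), "particles alive at time T")
```

A faithful implementation would draw the death times from the Poisson random measures $Q^k$ and keep the whole ancestral path of each particle, as required by the path-dependent coefficients; the sketch only conveys the order of operations (move, check the clock, sample $\ell$ , relabel).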

Then the McKean–Vlasov branching diffusion corresponds to the situation where the environment measure $\mu$ coincides with the law of the branching diffusion process Z, i.e. $\mu = \mathbb{P}\circ Z^{-1}$ , or equivalently, $\mu_t = \mathbb{P} \circ (Z_{t \wedge \cdot})^{-1}$ for all $t\in[0,T]$ .

Alternatively, let us provide a characterization of the McKean–Vlasov branching diffusion described above through a family of SDEs. Denote by $\mathcal{L}$ the infinitesimal generator of the diffusion $(b,\sigma)$ , i.e., for all $(t, \mathrm{x} ,m)\in[0,T]\times\mathcal{C}^d\times\mathcal{P}(\mathcal{D}_E)$ and $f\in C^{2}_{\mathrm{b}}(\mathbb{R}^{d},\mathbb{R})$ ,

\begin{equation*} \mathcal{L} f(t, \mathrm{x}, m) \,:\!=\, \frac{1}{2}\operatorname{Tr}(\sigma\sigma^{\top}(t,\mathrm{x},m)D^2f(\mathrm{x}_t)) + b(t,\mathrm{x},m)\cdot Df(\mathrm{x}_t).\end{equation*}

Then the McKean–Vlasov branching diffusion with initial condition $\xi$ can be characterized as the solution, for all $f=(f^k)_{k\in\mathbb{K}} \in C^2_{\mathrm{b}} (\mathbb{K} \times \mathbb{R}^d, \mathbb{R})$ , $t\in[0,T]$ , to the SDE

(6)

\begin{align} \langle Z_t, f \rangle & = \langle\xi ,f\rangle + \int_0^t\sum_{k\in K_s}\mathcal{L}f^k\big(s,X^k_{s\wedge\cdot},\mu_s\big)\,{\mathrm{d}} s \nonumber \\[2pt] & \quad + \int_0^t\sum_{k\in K_s}Df^k(X^k_s)\sigma\big(s,X^k_{s\wedge\cdot},\mu_s\big)\,{\mathrm{d}} W^k_s \nonumber \\[2pt] & \quad + \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K_{s-}}\sum_{\ell\geq0}\Bigg(\sum_{i=1}^{\ell}f^{ki}-f^k\Bigg) (X^k_s) \nonumber \\[2pt] & \qquad\qquad\qquad \times \mathbf{1}_{[0,\gamma(s,X^k_{s\wedge\cdot},\mu_s)]\times I_{\ell}(s,X^k_{s\wedge\cdot},\mu_s)}(z)\, Q^k({\mathrm{d}} s,{\mathrm{d}} z) \end{align}

with the condition

(7)

\begin{equation} \mu_t = \mathbb{P} \circ (Z_{t\wedge \cdot})^{-1} \quad \text{for all }t \in [0,T].\end{equation}
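As a quick illustration of how the jump term of (6) encodes the branching mechanism, consider the constant test function $f^k \equiv 1$ for all $k\in\mathbb{K}$ , so that $\langle Z_t, f\rangle = \#K_t$ . The computation below is only a sketch added for orientation; the integrability needed to take expectations is assumed here rather than proved. Since $\mathcal{L}f^k = 0$ and $Df^k = 0$ , only the jump term survives, and $\sum_{i=1}^{\ell}f^{ki} - f^k = \ell - 1$ , so that

\begin{equation*} \#K_t = \#K_0 + \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K_{s-}}\sum_{\ell\geq0}(\ell-1)\, \mathbf{1}_{[0,\gamma(s,X^k_{s\wedge\cdot},\mu_s)]\times I_{\ell}(s,X^k_{s\wedge\cdot},\mu_s)}(z)\, Q^k({\mathrm{d}} s,{\mathrm{d}} z). \end{equation*}

The set $[0,\gamma]\times I_{\ell}$ has Lebesgue measure $\gamma p_{\ell}$ , so compensating $Q^k$ and taking expectations formally gives

\begin{equation*} \mathbb{E}[\#K_t] = \mathbb{E}[\#K_0] + \mathbb{E}\Bigg[\int_0^t\sum_{k\in K_s}\gamma\Bigg(\sum_{\ell\geq0}\ell p_{\ell} - 1\Bigg)\big(s,X^k_{s\wedge\cdot},\mu_s\big)\,{\mathrm{d}} s\Bigg]. \end{equation*}

In words, each death event changes the population size by $\ell - 1$ , and the sign of $\sum_{\ell\geq0}\ell p_{\ell} - 1$ decides whether a particle contributes to growth or decay of the expected population size.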

Remark 1.

(i) In the SDE (6), the random processes $X^k_s$ and the random set $K_s$ act together as part of the E-valued process Z via the relation (4). Conversely, we can represent $X^k_s$ from $Z_s$ by using the relation $X^k_s = \langle Z_s, I_k \rangle$ with $I_k (\tilde{k},x) \,:\!=\, x \mathbf{1}_{\{\tilde{k} = k\}}$ for all $(\tilde{k},x)\in\mathbb{K}\times\mathbb{R}^d$ .
(ii) When the environment measure $\mu\in\mathcal{P}(\mathcal{D}_E)$ is fixed, it becomes a classical branching diffusion process, whose well-posedness is ensured by standard Lipschitz and boundedness conditions; see, e.g., [Reference Claisse4, Proposition 2.1]. The crucial point here is therefore the requirement in (7) that $\mu = \mathcal{L}(Z)$ .
(iii) It might be of interest to consider, as in [Reference Claisse, Ren and Tan6], an interaction term of the form $(\nu_t)_{t\in[0,T]}$ , $\nu_t\in\mathcal{M}(\mathbb{R}^d)$ , where, for all $f\colon\mathbb{R}^d\to \mathbb{R}$ bounded and measurable,
\begin{equation*} \int_{\mathbb{R}^d}{f(x)\,\nu_t({\mathrm{d}} x)} = \mathbb{E}[\langle Z_t,f\rangle] = \mathbb{E}\bigg[\sum_{k\in K_t}{f(X_t^k)}\bigg]. \end{equation*}
This case is clearly encompassed in our setting.
(iv) It might be more natural to consider $\mu_{t-}$ instead of $\mu_t$ in the dynamic described above. However, although the process Z has jumps in E, the mapping $t \mapsto \mu_t= \mathbb{P} \circ (Z_{t\wedge\cdot})^{-1}$ is continuous on [0, T]. Indeed, since the death time $T_k$ of each particle is generated by the Poisson random measure $Q^k$ , we have $\mathbb{P}(T_k = t) = 0$ and thus $\mathbb{P}(Z_t = Z_{t-}) = 1$ for all $t \in [0,T]$ .

2.2. Strong solution and well-posedness

Let us first consider the strong formulation of the McKean–Vlasov branching diffusion.

Definition 1. (Strong solution.) A strong solution to the McKean–Vlasov branching diffusion SDE (6) with (7) in a given probability space $(\Omega, \mathcal{F}, \mathbb{P})$ , with respect to an initial condition $\xi$ and a fixed family of mutually independent Brownian motions $(W^k)_{k\in\mathbb{K}}$ and Poisson random measures $(Q^k)_{k\in\mathbb{K}}$ on $[0,T] \times [0, \bar \gamma] \times [0,1]$ with Lebesgue intensity measure, is an E-valued càdlàg process $Z = (Z_t)_{t \in [0,T]}$ adapted to the (augmented) natural filtration of $(\xi,(W^k,Q^k)_{k\in\mathbb{K}})$ satisfying the branching diffusion SDE (6) together with the McKean–Vlasov condition (7).

Under the following assumptions, we can show the existence and uniqueness of a McKean–Vlasov branching diffusion in the strong sense. Recall that $\mathcal{D}_E$ is equipped with the uniform convergence metric defined in (2) so that the Wasserstein distance $\mathcal{W}_1$ on $\mathcal{P}_1(\mathcal{D}_E)$ is defined with respect to this metric.

Assumption 1.

(i) The coefficient functions $(b, \sigma, \gamma, (p_{\ell})_{\ell \ge 0})$ are progressive in the sense that $\big(b,\sigma,\gamma,(p_{\ell})_{\ell \ge 0}\big)(t,\mathrm{x},m) = \big(b,\sigma,\gamma,(p_{\ell})_{\ell \ge 0}\big)(t,\mathrm{x}_{t \wedge \cdot},m_t)$ for all $(t,\mathrm{x}, m) \in [0,T] \times \mathcal{C}^d \times \mathcal{P}_1(\mathcal{D}_E)$ .
(ii) The coefficient functions $(b, \sigma, \gamma, (p_{\ell})_{\ell \ge 0})$ are uniformly bounded. The same holds for the function $\sum_{ \ell \ge 0} \ell p_{\ell}$ and we write $M\,:\!=\,\big\|\sum_{\ell \ge 0}\ell p_{\ell}\big\|_{\infty} < +\infty$ .

Assumption 2.

(i) The coefficient functions $b,\sigma,\gamma$ are Lipschitz in $(\mathrm{x}, m)$ in the sense that there exists a constant $L>0$ such that
\begin{equation*} \big|(b,\sigma,\gamma)(t,\mathrm{x},m) - (b,\sigma,\gamma)(t,\mathrm{x}',m')\big| \leq L\big(\|\mathrm{x}_{t\wedge\cdot} - \mathrm{x}_{t\wedge\cdot}'\| + \mathcal{W}_{1}(m_t,m'_t)\big) \end{equation*}
for all $(t,\mathrm{x},\mathrm{x}',m,m') \in [0,T]\times\mathcal{C}^d\times\mathcal{C}^d\times \mathcal{P}_1(\mathcal{D}_E)\times\mathcal{P}_1(\mathcal{D}_E)$ .
(ii) There exist positive constants $(C_{\ell})_{\ell \ge 0}$ such that $M' \,:\!=\, \sum_{\ell \in\mathbb{N}} \ell C_{\ell} < \infty$ and
\begin{equation*} \big|p_\ell(t,\mathrm{x},m) - p_\ell(t,\mathrm{x}',m')\big| \leq C_{\ell}\big(\|\mathrm{x}_{t\wedge\cdot} - \mathrm{x}'_{t\wedge\cdot}\| + \mathcal{W}_{1}(m_t,m'_t)\big) \end{equation*}
for all $(t,\mathrm{x},\mathrm{x}',m,m') \in [0,T]\times\mathcal{C}^d\times\mathcal{C}^d\times \mathcal{P}_1(\mathcal{D}_E)\times\mathcal{P}_1(\mathcal{D}_E)$ and $\ell \in \mathbb{N}$ .

Remark 2. Let us provide a simple example of a Lipschitz function on $\mathcal{P}_1(\mathcal{D}_E)$ . Let $f = (f^k)_{k \in \mathbb{K}}\colon \mathbb{K} \times \mathbb{R}^d \to \mathbb{R}$ and assume that there exists $C>0$ such that $|f^k(x)| \le C$ and $|f^k(x)-f^k(y)| \le C|x-y|$ for all $k\in\mathbb{K}$ , $x,y\in\mathbb{R}^d$ . Define $F\colon [0,T]\times \mathcal{P}_1(\mathcal{D}_E) \to \mathbb{R}$ by

\begin{equation*} F(t,m) \,:\!=\, \int_{\mathcal{D}_E} \langle z(t), f\rangle \,m({\mathrm{d}} z). \end{equation*}

Then we can easily check that $| F(t,m^1) - F(t,m^2)| \le C \mathcal{W}_1(m^1_t,m^2_t)$ for all $m^1, m^2 \in \mathcal{P}_1(\mathcal{D}_E)$ . Indeed, given $Z^1_t = \sum_{k\in K^1_t}\delta_{(k,X^{1,k}_t)}$ and $Z^2_t = \sum_{k \in K^2_t} \delta_{(k, X^{2,k}_t)}$ , two arbitrary càdlàg E-valued processes with distributions $m^1$ and $m^2$ respectively, it follows by straightforward computation that

\begin{equation*} |F(t, m^1) - F(t, m^2)| = \Bigg|\mathbb{E}\Bigg[\sum_{k\in K^1_t}f^k(X^{1,k}_t) - \sum_{k\in K^2_t}f^k(X^{2,k}_t)\Bigg]\Bigg| \le C\mathbb{E}[d_E(Z^1_t,Z^2_t)]. \end{equation*}
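To make the last step of Remark 2 explicit, here is a short sketch of how the expectation bound turns into the Wasserstein bound. It takes the estimate in the display above for granted and uses only the definitions of $d$ in (2) and of $\mathcal{W}_1$ . Since $z(t) = (\pi_t z)(t)$ , we have $F(t,m) = F(t,m_t)$ , so the estimate can be applied to processes with laws $m^1_t$ and $m^2_t$ under any coupling $\lambda\in\Lambda(m^1_t,m^2_t)$ :

\begin{equation*} |F(t,m^1) - F(t,m^2)| \le C\int_{\mathcal{D}_E\times\mathcal{D}_E} d_E\big(z^1(t),z^2(t)\big)\,\lambda({\mathrm{d}} z^1,{\mathrm{d}} z^2) \le C\int_{\mathcal{D}_E\times\mathcal{D}_E} d(z^1,z^2)\,\lambda({\mathrm{d}} z^1,{\mathrm{d}} z^2). \end{equation*}

Taking the infimum over all $\lambda\in\Lambda(m^1_t,m^2_t)$ on the right-hand side yields $|F(t,m^1) - F(t,m^2)| \le C\,\mathcal{W}_1(m^1_t,m^2_t)$ .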

Theorem 1. Let Assumptions 1 and 2 hold, and assume further that the initial condition $\xi$ satisfies $\mathbb{E}[\langle\xi,\mathbf{1}\rangle] = \mathbb{E}[\#K_0] < \infty$ . Then there exists a unique strong solution Z to the McKean–Vlasov branching diffusion SDE (6) with (7) in the sense of Definition 1, and it satisfies $\mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T}\langle Z_t,\mathbf{1}\rangle\big] = \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T}\#K_t\big] < \infty$ .

The proof of Theorem 1 relies on a contraction argument; it is postponed to Section 3.2. Additionally, by using the estimates needed for this contraction argument, we can easily establish a companion stability property for McKean–Vlasov branching diffusion with respect to the initial condition and the coefficient functions. See Appendix A for more details.

Remark 3. In contrast to the classical literature on McKean–Vlasov diffusion, where the Lipschitz condition on the coefficients holds with respect to the metric $\mathcal{W}_2$ (see, e.g., [Reference Jourdain, Méléard and Woyczynski20, Reference Sznitman30]), we perform our analysis with the metric $\mathcal{W}_1$ in this setting. We refer to Remark 6 for more details.

2.3. Weak solution and propagation of chaos

As in classical SDE theory, we can consider the notion of weak solution by letting the probability space as well as the Brownian motions and the Poisson random measures be part of the solution. This allows us to establish the existence of McKean–Vlasov branching diffusion under relaxed conditions on the coefficients. Since there is no fixed probability space, we are given a measure $m_0 \in \mathcal{P}(E)$ as an initial condition rather than a random variable $\xi$ with distribution $m_0$ .

Definition 2. A weak solution to the McKean–Vlasov branching diffusion SDE (6) and (7) with initial condition $m_0 \in \mathcal{P}(E)$ is a term $\alpha = (\Omega,\mathcal{F},\mathbb{F},\mathbb{P},Z,\mu,(W^k,Q^k)_{k\in\mathbb{K}})$ satisfying the following conditions:

(i) $(\Omega,\mathcal{F},\mathbb{F},\mathbb{P})$ is a filtered probability space, equipped with a family of mutually independent Brownian motions $(W^k)_{k\in\mathbb{K}}$ and Poisson random measures $(Q^k)_{k\in\mathbb{K}}$ with Lebesgue intensity measure on $[0,T] \times [0,\bar\gamma] \times [0,1]$ .
(ii) Z is an E-valued, $\mathbb{F}$ -adapted càdlàg process such that $\mathbb{P} \circ Z_0^{-1} = m_0$ .
(iii) $\mu$ is a $\mathcal{P}(\mathcal{D}_E)$ -valued random variable independent of $\big(Z_0,(W^k,Q^k)_{k\in\mathbb{K}}\big)$ such that $\mu_t = \mathcal{L}(Z_{t\wedge\cdot}\mid\mu_t)$ , $t \in [0,T]$ .
(iv) The process $Z_t = \sum_{k \in K_t}\delta_{(k, X^k_t)}$ satisfies the following SDE: for all $f \in C^2_{\mathrm{b}} (\mathbb{K} \times \mathbb{R}^d, \mathbb{R})$ , $t\in [0,T]$ ,
(8) \begin{align} \langle Z_t, f \rangle & = \langle Z_0 ,f\rangle + \int_0^t\sum_{k\in K_s}\mathcal{L}f^k\big(s,X^k_{s\wedge\cdot},\mu_s\big)\,{\mathrm{d}} s \nonumber \\[2pt] & \quad + \int_0^t\sum_{k\in K_s}D\,f^k(X^k_s)\sigma\big(s,X^k_{s\wedge\cdot},\mu_s\big)\,{\mathrm{d}} W^k_s \nonumber \\[2pt] & \quad + \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K_{s-}}\sum_{\ell\geq0} \Bigg(\sum_{i=1}^{\ell}{f^{ki}}-f^k\Bigg)(X^k_s) \nonumber \\[2pt] & \qquad\qquad\quad \times \mathbf{1}_{[0,\gamma(s,X^k_{s\wedge\cdot},\mu_s)]\times I_{\ell}(s,X^k_{s\wedge\cdot},\mu_s)}(z)\, Q^k({\mathrm{d}} s,{\mathrm{d}} z). \end{align}

Remark 4. Notice that $\mu$ is independent of $Z_0$ in Definition 2. In particular, $\mu_0 = \mathcal{L}(Z_{0 \wedge \cdot})\in\mathcal{P}(\mathcal{D}_E)$ is deterministic, completely characterized by $m_0\in\mathcal{P}(E)$ .

Under the following assumption, we can show the existence of a weak solution to the McKean–Vlasov branching diffusion SDE. Recall that $\mathcal{D}_E$ is equipped with the classical Skorokhod metric so that the continuity below has to be understood with respect to this metric.

Assumption 3. The coefficient functions $(b,\sigma,\gamma,(p_{\ell})_{\ell \ge 0})$ and $\sum_{\ell\ge0}\ell p_{\ell}$ are continuous in $(\mathrm{x},m)$ .

Theorem 2. Let Assumptions 1 and 3 hold, and let $m_0\in\mathcal{P}_1(E)$ . Then there exists a weak solution to the McKean–Vlasov branching diffusion SDE (6) and (7) in the sense of Definition 2.

Theorem 2 is an immediate consequence of a propagation of chaos property established in Theorem 3. Namely, we consider a large number n of branching diffusion processes interacting through their empirical distribution in the sense of Definition 3, and we show that, when n goes to infinity, it converges to a McKean–Vlasov branching diffusion in the weak sense.

Definition 3. For each $n \ge 1$ , a weak notion of n interacting branching diffusions with initial condition $m_0 \in \mathcal{P}(E)$ is a term $\alpha_n = \big(\Omega^n,\mathcal{F}^n,\mathbb{F}^n,\mathbb{P}^n, (Z^{i})_{i = 1,\ldots,n},(W^{i,k},Q^{i,k})_{k\in\mathbb{K}, i = 1,\ldots,n}\big)$ satisfying the following conditions:

(i) $(\Omega^n,\mathcal{F}^n,\mathbb{F}^n,\mathbb{P}^n)$ is a filtered probability space, equipped with a family of mutually independent Brownian motions $(W^{i,k})_{k\in\mathbb{K}, i = 1, \ldots, n}$ and Poisson random measures $(Q^{i,k})_{k\in\mathbb{K}, i = 1, \ldots, n}$ with Lebesgue intensity measure on $[0,T] \times [0, \bar \gamma] \times [0,1]$ .
(ii) For each $i=1,\ldots,n$ , $Z^{i}$ is an E-valued, $\mathbb{F}^n$ -adapted càdlàg process such that $Z^{1}_0, \ldots, Z^{n}_0$ are independent and identically distributed with distribution $m_0$ .
(iii) For each $i=1,\ldots,n$ , the process $Z^{i}_t = \sum_{k \in K^{i}_t} \delta_{(k, X^{i,k}_t)}$ satisfies the following SDE: for all $f \in C^2_{\mathrm{b}} (\mathbb{K} \times \mathbb{R}^d, \mathbb{R})$ , $t\in[0,T]$ ,
(9) \begin{align} \langle Z^{i}_t, f \rangle & = \langle Z^{i}_0 ,f\rangle + \int_0^t\sum_{k\in K^{i}_s}\mathcal{L}f^k\big(s,X^{i,k}_{s\wedge\cdot},\mu^n_s\big)\, {\mathrm{d}} s \nonumber \\ & \quad + \int_0^t\sum_{k\in K^{i}_s}Df^k(X^{i,k}_s)\sigma\big(s,X^{i,k}_{s\wedge\cdot},\mu^n_s\big)\, {\mathrm{d}} W^{i,k}_s \nonumber \\ & \quad + \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^{i}_{s-}}\sum_{\ell\geq 0} \Bigg(\sum_{j=1}^{\ell}{f^{kj}} - f^k\Bigg)(X^{i,k}_s) \nonumber \\ & \qquad\quad\qquad \times \mathbf{1}_{[0,\gamma(s,X^{i,k}_{s\wedge\cdot},\mu^n_{s-})]\times I_{\ell}(s,X^{i,k}_{s\wedge\cdot},\mu^n_{s-})}(z)\, Q^{i,k}({\mathrm{d}} s,{\mathrm{d}} z), \end{align}
where $\mu^n_t$ corresponds to the empirical measure $\mu^n_t \,:\!=\, (1/n)\sum_{i=1}^n \delta_{Z^{i}_{t \wedge \cdot}}$ .
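To illustrate Definition 3, the following Python sketch simulates n branching diffusions that interact through a functional of their empirical measure. Everything specific in it is an assumption made for illustration: dimension $d=1$ , driftless unit-volatility motion, a death rate equal to the average population size capped at $\bar\gamma$ , offspring distribution $p_0 = p_2 = 1/2$ , and a time grid on which the interaction is frozen during each step. In particular, only a time-marginal functional of $\mu^n$ is used, whereas the paper allows dependence on the whole path-space measure. The name `simulate_interacting_system` is hypothetical.

```python
import math
import random


def simulate_interacting_system(n, T, n_steps, gamma_bar, rng):
    """Return a list of n populations, each a dict {label tuple: position} at time T."""
    dt = T / n_steps
    # Each population starts from a single particle with empty label at position 0.
    systems = [{(): 0.0} for _ in range(n)]
    for _ in range(n_steps):
        # Functional of the empirical measure: average number of particles alive,
        # computed before the step (in the spirit of using the left limit in (9)).
        mean_size = sum(len(z) for z in systems) / n
        rate = min(gamma_bar, mean_size)        # hypothetical crowding-type death rate
        p_die = 1.0 - math.exp(-rate * dt)
        new_systems = []
        for z in systems:
            nxt = {}
            for k, x in z.items():
                x_new = x + math.sqrt(dt) * rng.gauss(0.0, 1.0)
                if rng.random() < p_die:
                    if rng.random() < 0.5:      # two children with probability 1/2, else none
                        nxt[k + (1,)] = x_new
                        nxt[k + (2,)] = x_new
                else:
                    nxt[k] = x_new
            new_systems.append(nxt)
        systems = new_systems
    return systems


if __name__ == "__main__":
    populations = simulate_interacting_system(n=200, T=1.0, n_steps=100,
                                              gamma_bar=2.0, rng=random.Random(0))
    print(sum(len(z) for z in populations) / len(populations))
```

Each population uses its own independent noise, and the only coupling between populations is the common rate computed from the empirical average, which is the mean-field structure that Theorem 3 passes to the limit.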

Theorem 3. Let Assumptions 1 and 3 hold, and let $m_0\in\mathcal{P}_1(E)$ . Then any sequence of distributions $(\mathbb{P}^n \circ (\mu^n)^{-1})_{n \ge 1}$ of n interacting branching diffusions in the sense of Definition 3 admits a converging subsequence in $\mathcal{P}(\mathcal{P}_1(\mathcal{D}_E))$ , and the limit is identified as the distribution $\mathbb{P} \circ \mu^{-1}$ of a McKean–Vlasov branching diffusion in the sense of Definition 2.

The proof of Theorem 3 is postponed to Section 4.3. It relies on the weak convergence of solutions to appropriate martingale problems, equivalent to the notion of weak solutions.

Remark 5. Under Assumptions 1 and 2, there exists a unique strong solution Z and so, by a Yamada–Watanabe-like theorem, weak uniqueness holds and Theorem 3 reduces to $\mu^n \longrightarrow \mathbb{P} \circ Z^{-1}$ in distribution.

3. Strong formulation and well-posedness

We consider in this section the strong formulation of the McKean–Vlasov branching diffusion processes. More precisely, we prove the strong existence and uniqueness result, Theorem 1, in Section 3.2. To this end, we start with a series of preliminary technical lemmas in Section 3.1. Throughout this section, we consider a fixed probability space $(\Omega, \mathcal{F}, \mathbb{P})$ equipped with an initial condition $\xi$ and a family of independent Brownian motions $(W^k)_{k \in \mathbb{K}}$ and Poisson random measures $(Q^k)_{k \in \mathbb{K}}$ on $[0,T] \times [0, \bar \gamma] \times [0,1]$ with Lebesgue intensity measure. We assume that Assumptions 1 and 2 hold, and further that $\mathbb{E}[\langle\xi,\mathbf{1}\rangle] = \mathbb{E}[\# K_0]<+\infty$ .

3.1. Technical lemmas

We start by studying the solution to SDE (6) when the environment measure is fixed. More precisely, assume that we are given a deterministic $\mu \in \mathcal{P}_1(\mathcal{D}_E)$ and consider $Z_t=\sum_{k\in K_t} {\delta_{(k,X^k_t)}}$ satisfying, for all $f \in C^2_{\mathrm{b}} (\mathbb{K} \times \mathbb{R}^d, \mathbb{R})$ , $t\in[0,T]$ ,

(10)

\begin{align} \langle Z_t, f \rangle & = \langle\xi ,f\rangle + \int_0^t\sum_{k\in K_s}\mathcal{L}f^k(s,X^k_{s\wedge\cdot},\mu_s)\,{\mathrm{d}} s \nonumber \\ & \quad + \int_0^t\sum_{k\in K_s}D\,f^k(X^k_s)\sigma(s,X^k_{s\wedge\cdot},\mu_s)\,{\mathrm{d}} W^k_s \nonumber \\ & \quad + \int_{(0,t]\times [0,\bar \gamma] \times [0,1]}\sum_{k\in K_{s-}}\sum_{ \ell \geq 0} \Bigg(\sum_{i=1}^{\ell}{f^{ki}} - f^k\Bigg)(X^k_s) \nonumber \\ & \qquad\qquad\! \times \mathbf{1}_{[0,\gamma(s,X^k_{s\wedge\cdot},\mu_s)]\times I_{\ell}(s,X^k_{s\wedge\cdot},\mu_s)}(z)\, Q^k({\mathrm{d}} s,{\mathrm{d}} z). \end{align}

By a simple extension of [4, Proposition 2.1], existence and uniqueness of the strong solution holds for the above SDE.

Lemma 1. Let $\mu\in\mathcal{P}_1(\mathcal{D}_E)$ . Then there exists a unique E-valued càdlàg process Z adapted to the (augmented) natural filtration of $\big(\xi,(W^k,Q^k)_{k\in\mathbb{K}}\big)$ satisfying the branching diffusion SDE (10). In addition, $\mathbb{E}\big[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq t\leq T}\mathbf{1}_{k\in K_t}\big] \leq \mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T}$ .

Proof. The proof of existence and uniqueness relies on the fact that the Lipschitz assumption on b and $\sigma$ ensures the existence and uniqueness of the strong solution to SDE (5) characterizing the movement of each particle. Then we can construct the process step by step by considering the successive branching times. It remains to rule out explosion: this follows from the boundedness assumption on $\gamma$ and $\sum_{\ell\geq0} \ell p_\ell$ , which ensures that the population size has finite moments (see below) and thus remains finite. We refer to [4, Proposition 2.1] for more details.

Let us turn now to the moment estimate. For $t\in[0,T]$ , denote by $\bar{N}_t\,:\!=\,\sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq s\leq t}\mathbf{1}_{k\in K_s}$ the total number of particles having been alive in the time interval $[0,t]$ , and observe that it can be viewed as a modified branching process where particles give birth to new ones without dying. Then

\begin{equation*} \bar{N}_t = \#K_0 + \int_{(0,t]\times[0,\bar \gamma]\times[0,1]}\sum_{k\in K_{s-}}\sum_{\ell\geq0}\ell \mathbf{1}_{I_\ell(s,X^k_{s\wedge\cdot},\mu_s)\times[0,\gamma(s,X^k_{s\wedge\cdot},\mu_s)]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z). \end{equation*}

We introduce the stopping times $\tau_n\,:\!=\,\inf\{t\geq0\colon\bar{N}_t\geq n\}$ . Then

\begin{align*} \mathbb{E}[\bar{N}_{T\wedge\tau_n}] & = \mathbb{E}[\#K_0] + \mathbb{E}\bigg[\int_0^{T\wedge\tau_n}\sum_{k\in K_t}\sum_{\ell\geq0} \ell p_\ell\big(t,X^k_{t\wedge\cdot},\mu_t\big)\gamma\big(t,X^k_{t\wedge\cdot},\mu_t\big)\,{\mathrm{d}} t \bigg] \\[2pt] & \leq \mathbb{E}[\#K_0] + \bar\gamma M\mathbb{E}\bigg[\int_0^{T\wedge\tau_n}\#{K_t}\,{\mathrm{d}} t\bigg] \leq \mathbb{E}[\#K_0] + \bar\gamma M\mathbb{E}\bigg[\int_0^T\bar{N}_{t\wedge\tau_n}\,{\mathrm{d}} t\bigg]. \end{align*}

By Grönwall’s lemma, we deduce that $\mathbb{E}[\bar{N}_{T\wedge\tau_n}] \leq \mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T}$ . The conclusion follows by letting $n\to\infty$ and applying Fatou’s lemma.
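As a sanity check of the moment estimate, consider the special case (an assumption made here for illustration only) where the branching rate $\gamma$ and the mean offspring number $m=\sum_{\ell\ge0}\ell p_\ell$ are constants. Then $\mathbb{E}[\#K_t]=\mathbb{E}[\#K_0]{\mathrm{e}}^{\gamma(m-1)t}$ and $\mathbb{E}[\bar N_T]=\mathbb{E}[\#K_0]+\gamma m\int_0^T\mathbb{E}[\#K_t]\,{\mathrm{d}} t$ can be computed in closed form and compared with the bound of Lemma 1, taking $\bar\gamma=\gamma$ and $M=m$. The function names below are hypothetical.

```python
import math

def expected_total_progeny(n0, gamma, m, T):
    """E[N_bar_T] for constant branching rate gamma and mean offspring m.

    Uses E[#K_t] = n0 * exp(gamma * (m - 1) * t) and
    E[N_bar_T] = n0 + gamma * m * int_0^T E[#K_t] dt.
    """
    growth = gamma * (m - 1.0)              # growth rate of E[#K_t]
    if abs(growth) < 1e-12:
        integral = T                        # critical case m = 1
    else:
        integral = math.expm1(growth * T) / growth
    return n0 * (1.0 + gamma * m * integral)

def gronwall_bound(n0, gamma_bar, M, T):
    """Right-hand side of the moment estimate in Lemma 1."""
    return n0 * math.exp(gamma_bar * M * T)
```

For instance, with `n0=1`, `gamma=2`, `m=1.5`, `T=1`, the exact value is $1+3({\mathrm{e}}-1)$ while the bound is ${\mathrm{e}}^{3}$, so the estimate holds but is not sharp in this case.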

We now provide a stability result for the branching diffusion solution to SDE (10) with respect to the environment measure. It is the key to the contraction argument used in the proof of Theorem 1 and the main technical difficulty. As a preliminary result, we first establish the following technical lemma.

Lemma 2. Let Assumption 1 hold. For any $(t,\mathrm{x},\mathrm{x}',m,m') \in [0,T]\times\mathcal{C}^d\times\mathcal{C}^d\times\mathcal{P}_1(\mathcal{D}_E)\times\mathcal{P}_1(\mathcal{D}_E)$ , $\int_0^1\sum_{\ell,\ell'\geq0}|\ell-\ell'|\mathbf{1}_{I_\ell(t,\mathrm{x},m)\cap I_{\ell'}(t,\mathrm{x}',m')}(u)\,{\mathrm{d}} u \leq \sum_{\ell\geq0}\ell|p_\ell(t,\mathrm{x},m) - p_\ell(t,\mathrm{x}',m')|$ .

Proof. Fix $(t,\mathrm{x},\mathrm{x}',m,m')\in [0,T]\times\mathcal{C}^d\times\mathcal{C}^d\times \mathcal{P}_1(\mathcal{D}_E)\times\mathcal{P}_1(\mathcal{D}_E)$ and consider the two probability measures $P \,:\!=\,(p_\ell(t,\mathrm{x},m))_{\ell\in\mathbb{N}}$ and $P'\,:\!=\,(p_\ell(t,\mathrm{x}',m'))_{\ell\in\mathbb{N}}$ , which belong to $\mathcal{P}_1(\mathbb{N})$ in view of Assumption 1. Denote the cumulative distribution functions of $P$ and $P'$ by $F$ and $F'$ respectively, and observe that the generalized inverses are given, for all $u\in[0,1]$ , as $F^{-1}(u) = \sum_{\ell\geq0}\ell\mathbf{1}_{I_\ell}(u)$ and $(F')^{-1}(u) = \sum_{\ell\geq0}\ell\mathbf{1}_{I'_{\ell}}(u)$ , where $I_\ell\,:\!=\,I_\ell(t,\mathrm{x},m)$ and $I'_{\ell}\,:\!=\,I_{\ell}(t,\mathrm{x}',m')$ for $\ell\in\mathbb{N}$ . Using the well-known representation of the Wasserstein metric for one-dimensional distributions, we have

\begin{equation*} \mathcal{W}_1(P,P') = \int_0^1|F^{-1}(u)-(F')^{-1}(u)|\,{\mathrm{d}} u = \int_0^1\sum_{\ell,\ell'\geq0}|\ell-\ell'|\mathbf{1}_{I_\ell\cap I'_{\ell'}}(u)\,{\mathrm{d}} u. \end{equation*}

In addition, by a change of variable, we also have that

\begin{align*} \mathcal{W}_1(P,P') = \sum_{j\geq0}|F(j)-F'(j)| & = \sum_{j\geq0}\bigg|\sum_{\ell\leq j}p_\ell(t,\mathrm{x},m) - \sum_{\ell\leq j}p_\ell(t,\mathrm{x}',m')\bigg| \\[2pt] & \leq \sum_{j\geq0}\sum_{\ell>j}|p_\ell(t,\mathrm{x},m)-p_\ell(t,\mathrm{x}',m')| \\[2pt] & = \sum_{\ell\geq 0}\ell|p_\ell(t,\mathrm{x},m)-p_\ell(t,\mathrm{x}',m')|. \end{align*}

The conclusion follows immediately.
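The two representations of $\mathcal{W}_1(P,P')$ used in this proof, and the resulting bound, can be checked numerically on finitely supported distributions. The sketch below assumes that $I_\ell=[p_0+\cdots+p_{\ell-1},\,p_0+\cdots+p_\ell)$, which is consistent with the formula for $F^{-1}$ above, and that both probability vectors are given as lists of the same length (padded with zeros if needed). The function names are hypothetical.

```python
from itertools import accumulate

def intervals(p):
    """I_ell = [p_0+...+p_{ell-1}, p_0+...+p_ell) for a finite pmf p."""
    cdf = [0.0] + list(accumulate(p))
    return [(cdf[l], cdf[l + 1]) for l in range(len(p))]

def w1_quantile(p, q):
    """int_0^1 sum_{l,l'} |l - l'| 1_{I_l cap I'_l'}(u) du, via interval overlaps."""
    total = 0.0
    for l, (a, b) in enumerate(intervals(p)):
        for lp, (c, d) in enumerate(intervals(q)):
            total += abs(l - lp) * max(0.0, min(b, d) - max(a, c))
    return total

def w1_cdf(p, q):
    """sum_j |F(j) - F'(j)|."""
    return sum(abs(f - g) for f, g in zip(accumulate(p), accumulate(q)))

def weighted_l1(p, q):
    """sum_l l * |p_l - q_l|, the right-hand side of Lemma 2."""
    return sum(l * abs(a - b) for l, (a, b) in enumerate(zip(p, q)))
```

With `p = [0.2, 0.5, 0.3]` and `q = [0.5, 0.1, 0.4]`, both `w1_quantile` and `w1_cdf` give 0.4 (up to rounding), while `weighted_l1` gives 0.6, in agreement with the lemma.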

Lemma 3. Let $\mu^1,\mu^2\in\mathcal{P}_1(\mathcal{D}_E)$ . Denote by $Z^i$ the solution to SDE (10) given $\mu^i$ , $Z^i_0=\xi$ for $i=1,2$ respectively. Then there exist constants $c_d > 0$ , $c_w> 0$ depending on L, $\bar \gamma$ , M, and $M'$ such that, for all $T' \in [0,T]$ satisfying $T'+\sqrt{T'}<1/c_d$ ,

\begin{equation*} \mathbb{E}[d_{T'}(Z^1,Z^2)] \leq \frac{c_w(T'+\sqrt{T'})}{1-c_d(T'+\sqrt{T'})} \mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T'}\mathcal{W}_1\big(\mu^1_{T'},\mu^2_{T'}\big). \end{equation*}

Proof. Fix some $T'\in[0,T]$ to be determined later. For $t\in[0,T']$ , consider the representations $Z^1_t = \sum_{k\in K^1_t}\delta_{(k,X^{1,k}_t)}$ and $Z^2_t = \sum_{k\in K^2_t}\delta_{(k,X^{2,k}_t)}$ . Let us write $K^{1\cap2}_t \,:\!=\, K^{1}_t \cap K^{2}_t$ and $K^{1\triangle2}_t \,:\!=\, K^{1}_t \triangle K^{2}_t$ . Recall that

(11)

\begin{equation} d_{T'}(Z^1,Z^2) = \sup\nolimits_{0\le t\le T'}\Bigg\{\#K^{1\triangle 2}_t + \sum_{k\in K^{1\cap2}_t}\big|X^{1,k}_t-X^{2,k}_t\big|\wedge1\Bigg\}. \end{equation}
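To make the distance (11) concrete, here is a small sketch that evaluates the quantity inside the supremum for two labelled configurations, and then takes the maximum over a finite time grid as a stand-in for the supremum. The representation of a configuration as a dictionary from labels (tuples) to positions, the restriction to $d=1$, and the function names are assumptions made for illustration.

```python
def config_distance(z1, z2):
    """#(K1 symmetric-difference K2) + sum over common labels of min(|x1 - x2|, 1)."""
    k1, k2 = set(z1), set(z2)
    distinct = len(k1 ^ k2)                       # labels alive in exactly one tree
    common = sum(min(abs(z1[k] - z2[k]), 1.0) for k in k1 & k2)
    return distinct + common

def path_distance(path1, path2):
    """Maximum of config_distance over a common finite time grid."""
    return max(config_distance(a, b) for a, b in zip(path1, path2))
```

For example, `config_distance({(1,): 0.0, (1, 1): 2.0}, {(1,): 0.25, (1, 2): 5.0})` equals 2.25: two labels are not shared, and the common label `(1,)` contributes 0.25.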

Step 1

We start by estimating the first term in (11), counting the number of particles with distinct labels in $Z^1$ and $Z^2$ . For simplicity, for $i = 1,2$ , $k\in\mathbb{K},$ and $t\in[0,T]$ , we write $\gamma^{i,k}_t \,:\!=\, \gamma(t,X^{i,k}_{t\wedge\cdot},\mu^i_t)$ , $I^{i,k}_{\ell,t} \,:\!=\, I_\ell(t,X^{i,k}_{t\wedge\cdot},\mu^i_t)$ , and $p^{i,k}_{\ell,t}\,:\!=\,p_\ell(t,X^{i,k}_{t\wedge\cdot},\mu^i_t)$ . We begin with the following observation:

(12)

\begin{equation} \#K^{1\triangle 2}_t\leq J^1_t+J^2_t+J^3_t+J^4_t+J^5_t, \end{equation}

where

\begin{align*} J^1_t & = \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^1_{s-}\setminus K^2_{s-}}\sum_{\ell\geq0} \ell\mathbf{1}_{I^{1,k}_{\ell,s}\times[0,\gamma^{1,k}_s]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z), \\[2pt] J^2_t & = \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^2_{s-}\setminus K^1_{s-}}\sum_{\ell\geq0} \ell\mathbf{1}_{I^{2,k}_{\ell,s}\times[0,\gamma^{2,k}_s]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z), \\[2pt] J^3_t & = \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^{1\cap 2}_{s-}}\mathbf{1}_{\gamma^{1,k}_s>\gamma^{2,k}_s} \sum_{\ell\geq0}(\ell+1)\mathbf{1}_{I^{1,k}_{\ell,s}\times (\gamma^{2,k}_s, \gamma^{1,k}_s]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z), \\[2pt] J^4_t & = \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^{1\cap 2}_{s-}}\mathbf{1}_{\gamma^{1,k}_s<\gamma^{2,k}_s} \sum_{\ell\geq0}(\ell+1)\mathbf{1}_{I^{2,k}_{\ell,s}\times (\gamma^{1,k}_s,\gamma^{2,k}_s]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z), \\[2pt] J^5_t & = \int_{(0,t]\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^{1\cap 2}_{s-}}\sum_{\ell,\ell'\geq0}|\ell-\ell'| \mathbf{1}_{I^{1,k}_{\ell,s}\cap I^{2,k}_{\ell',s}\times[0,\gamma^{1,k}_s\wedge\gamma^{2,k}_s]}(z)\,Q^k({\mathrm{d}} s,{\mathrm{d}} z). \end{align*}

To obtain this estimate, we distinguish between particles born from a parent in the common set $K^{1\cap2}_t$ or not. In the latter case, we assume that all newly born particles also lie in the set $K^{1\triangle 2}_t$ . Then $J^1_t$ and $J^2_t$ correspond to the case when parents are from $K^1_t\setminus K^2_t$ and $K^2_t\setminus K^1_t$ respectively. In the former case, when parents are from $K^{1\cap2}_t$ , the terms $J^3_t$ and $J^4_t$ take care of the case when a particle in one tree branches while the other particle sharing the same label does not. The last term $J_t^5$ concerns particles in two trees sharing a common label branching simultaneously, but giving birth to different numbers of progeny.
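The case distinction for a parent in the common set can be sketched as follows: given one atom $z=(r,u)\in[0,\bar\gamma]\times[0,1]$ of the shared Poisson random measure $Q^k$, the function reports which of $J^3$, $J^4$, $J^5$ is affected and by how much. The finite offspring distributions passed as lists, the convention $I_\ell=[p_0+\cdots+p_{\ell-1},\,p_0+\cdots+p_\ell)$, and the function names are assumptions made for this illustration.

```python
from bisect import bisect_right
from itertools import accumulate

def offspring(p, u):
    """Index ell such that u lies in I_ell = [p_0+...+p_{ell-1}, p_0+...+p_ell)."""
    cdf = list(accumulate(p))
    return min(bisect_right(cdf, u), len(p) - 1)

def new_distinct_labels(r, u, gamma1, p1, gamma2, p2):
    """Effect of one atom (r, u) on a particle alive in both trees.

    Returns (term, count): the J term concerned and the jump of its integrand.
    """
    branch1, branch2 = r <= gamma1, r <= gamma2
    if branch1 and branch2:
        # Both branch: children k1..k_ell versus k1..k_ell' differ by |ell - ell'|.
        return "J5", abs(offspring(p1, u) - offspring(p2, u))
    if branch1:
        # Only tree 1 branches: ell children plus the parent itself are unmatched.
        return "J3", offspring(p1, u) + 1
    if branch2:
        return "J4", offspring(p2, u) + 1
    return None, 0
```

For instance, with `p1 = [0.5, 0.0, 0.5]`, `p2 = [0.25, 0.0, 0.75]`, `gamma1 = 1.0`, and `gamma2 = 2.0`, the atom `(0.5, 0.3)` makes both particles branch with 0 and 2 children respectively, while the atom `(1.5, 0.3)` triggers a branching in the second tree only.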

Let us deal with each term in (12) separately. Firstly, we have

\begin{align*} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}J^1_t\big] & = \mathbb{E}\Bigg[\int_{(0,T']\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^1_{t-}\setminus K^2_{t-}}\sum_{\ell\geq 0} \ell\mathbf{1}_{I^{1,k}_{\ell,t}\times[0,\gamma^{1,k}_t]}(z)\,Q^k({\mathrm{d}} t,{\mathrm{d}} z)\Bigg] \\[2pt] & = \mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^1_t\setminus K^2_t}\gamma^{1,k}_t\sum_{\ell\geq 0}\ell p^{1,k}_{\ell,t}\, {\mathrm{d}} t\Bigg] \leq \bar\gamma M\mathbb{E}\bigg[\int_0^{T'}\#(K^1_t\setminus K^2_t)\,{\mathrm{d}} t\bigg]. \end{align*}

Similarly, we have $\mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}J^2_t\big]\leq\bar\gamma M\mathbb{E}\big[ \int_0^{T'}\#(K^2_t\setminus K^1_t)\,{\mathrm{d}} t\big]$ . We deduce that

(13)

\begin{equation} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}\{J^1_t + J^2_t\}\big] \leq \bar\gamma M\mathbb{E}\bigg[\int_0^{T'}\#K^{1\triangle 2}_t\,{\mathrm{d}} t\bigg]. \end{equation}

In addition, using the same approach,

(14)

\begin{equation} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}\{J^3_t+J^4_t\}\big] \leq (M+1)\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\big|\gamma^{1,k}_t-\gamma^{2,k}_t\big|\,{\mathrm{d}} t\Bigg]. \end{equation}

By the Lipschitz condition on $\gamma$ from Assumption 2, it follows that

(15)

\begin{equation} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}\{J^3_t+J^4_t\}\big] \leq L(M+1)\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\big( \big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_t,\mu^2_t)\big)\,{\mathrm{d}} t\Bigg]. \end{equation}

Finally, using Lemma 2, we obtain that

(16)

\begin{align} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}J^5_t\big] & \leq \bar\gamma\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\int_0^1\sum_{\ell,\ell'\geq0}|\ell-\ell'| \mathbf{1}_{I^{1,k}_{\ell,t}\cap I^{2,k}_{\ell',t}}(u)\,{\mathrm{d}} u\,{\mathrm{d}} t\Bigg] \nonumber \\[2pt] & \leq \bar\gamma\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\sum_{\ell\ge 0} \ell\big|p^{1,k}_{\ell,t}-p^{2,k}_{\ell,t}\big|\,{\mathrm{d}} t\Bigg]. \end{align}

Thus, by the Lipschitz assumption on $(p_\ell)_{\ell\in\mathbb{N}}$ from Assumption 2, we deduce as above that

(17)

\begin{equation} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}J^5_t\big] \leq \bar\gamma M'\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t} \big(\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_t,\mu^2_t)\big)\,{\mathrm{d}} t\Bigg]. \end{equation}

We finish this step by combining (12), (13), (15), and (17) to obtain

(18)

\begin{equation} \mathbb{E}\big[\!\sup\nolimits_{0\leq t\leq T'}\#K^{1\triangle 2}_t\big] \leq C\mathbb{E}\Bigg[\int_0^{T'}\Bigg(\#K^{1\triangle 2}_t + \sum_{k\in K^{1\cap2}_t}\big(\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_t,\mu^2_t)\big)\Bigg)\,{\mathrm{d}} t\Bigg], \end{equation}

where the constant $C>0$ depends on $(\bar \gamma,L,M,M')$ .

Step 2

Now we turn to the second term in (11) measuring the distance between particles with common labels in $Z_t^1$ and $Z_t^2$ . For simplicity, for $i = 1,2$ , $k\in\mathbb{K}$ , and $t\in[0,T]$ we write $b^{i,k}_t \,:\!=\, b(t,X^{i,k}_{t\wedge\cdot},\mu^i_t)$ and $\sigma^{i,k}_t \,:\!=\, \sigma(t,X^{i,k}_{t\wedge\cdot},\mu^i_t)$ . We observe first that, for any $k\in K^{1\cap2}_t$ ,

\begin{equation*} X^{1,k}_t-X^{2,k}_t = X^{1,k}_{\bar{S}^k} - X^{2,k}_{\bar{S}^k} + \int_0^t\mathbf{1}_{k\in K^{1\cap2}_s}\big(b^{1,k}_s - b^{2,k}_s\big)\,{\mathrm{d}} s + \int_0^t\mathbf{1}_{k\in K^{1\cap2}_s}\big(\sigma^{1,k}_s - \sigma^{2,k}_s\big)\,{\mathrm{d}} W^k_s, \end{equation*}

where $\bar{S}^k=\max(S^{1,k},S^{2,k})$ is the maximum of both times of birth of particle k, so that $k\in K^{1\cap2}_s$ whenever $s\in[\bar{S}^k,t]$ . It follows that

(19)

\begin{align} \mathbf{1}_{k\in K^{1\cap2}_t}\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| & \le \mathbf{1}_{k\in K^{1\cap2}_t}\big\|X^{1,k}_{\bar{S}^k\wedge\cdot}-X^{2,k}_{\bar{S}^k\wedge\cdot}\big\| + \int_0^t\mathbf{1}_{k\in K^{1\cap2}_s}\big|b^{1,k}_s - b^{2,k}_s\big|\,{\mathrm{d}} s \nonumber \\[2pt] & \quad + \sup\nolimits_{0\le r\le t}\bigg|\int_0^r\mathbf{1}_{k\in K^{1\cap2}_s}\big(\sigma^{1,k}_s - \sigma^{2,k}_s\big)\, {\mathrm{d}} W^k_s\bigg|. \end{align}

Let us deal first with the last two terms on the right-hand side of (19). By using the Lipschitz condition on b and $\sigma$ from Assumption 2, we have, for all $k\in\mathbb{K}$ ,

\begin{equation*} \mathbb{E}\bigg[\int_0^{t}\mathbf{1}_{k\in K^{1\cap2}_s}\big|b^{1,k}_s - b^{2,k}_s\big|\,{\mathrm{d}} s\bigg] \leq L\mathbb{E}\bigg[\int_0^{t}\mathbf{1}_{k\in K^{1\cap2}_s}\big(\big\|X^{1,k}_{s\wedge\cdot}-X^{2,k}_{s\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_s,\mu^2_s)\big)\,{\mathrm{d}} s\bigg], \end{equation*}

as well as

\begin{align*} & \mathbb{E}\bigg[\sup\nolimits_{0\leq r\leq t}\bigg|\int_0^r\mathbf{1}_{k\in K^{1\cap2}_s} \big(\sigma^{1,k}_s-\sigma^{2,k}_s\big)\,{\mathrm{d}} W^k_s\bigg|\bigg] \\[3pt] & \qquad \leq C_1\mathbb{E}\bigg[\bigg(\int_0^{t}\mathbf{1}_{k\in K^{1\cap2}_s} \big|\sigma^{1,k}_s - \sigma^{2,k}_s\big|^2\,{\mathrm{d}} s\bigg)^{{1}/{2}}\bigg] \\[3pt] & \qquad \leq C_1\sqrt{t}\mathbb{E}\big[\!\sup\nolimits_{0\leq s\leq t}\mathbf{1}_{k\in K^{1\cap2}_s} \big|\sigma^{1,k}_s - \sigma^{2,k}_s\big|\big] \\[3pt] & \qquad \leq C_1L\sqrt{t}\mathbb{E}\big[\!\sup\nolimits_{0\leq s\leq t}\mathbf{1}_{k\in K^{1\cap2}_s} \big(\big\|X^{1,k}_{s\wedge\cdot}-X^{2,k}_{s\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_s,\mu^2_s)\big)\big], \end{align*}

where the first inequality follows from the Burkholder–Davis–Gundy inequality. Regarding the first term on the right-hand side of (19), we distinguish between particles whose parents are in the common set or not. In the latter case, we use the fact that $\big\|X^{1,k}_{\bar{S}^k\wedge\cdot}-X^{2,k}_{\bar{S}^k\wedge\cdot}\big\|\le 1$ so that

\begin{align*} \mathbb{E}\Bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\le t\le T'}\mathbf{1}_{k\in K^{1\cap2}_t,k^-\in K^{1\triangle 2}_{\bar{S}^k-}} \big\|X^{1,k}_{\bar{S}^k\wedge\cdot}-X^{2,k}_{\bar{S}^k\wedge\cdot}\big\|\Bigg] & \le \mathbb{E}\Bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\le t\le T'} \mathbf{1}_{k\in K^{1}_t\cup K^{2}_t,k^-\in K^{1\triangle 2}_{\bar{S}^k-}}\Bigg] \\[2pt] & = \mathbb{E}\big[J^1_{T'} + J_{T'}^2\big] \le \bar\gamma M\mathbb{E}\bigg[\int_0^{T'}\#K^{1\triangle 2}_t\,{\mathrm{d}} t\bigg], \end{align*}

where $k^-$ denotes the parent of particle k, and the last inequality follows from (13). Note that $\sum_{k\in\mathbb{K}}\sup\nolimits_{0\le t\le T'}\mathbf{1}_{k\in K^{1}_t\cup K^{2}_t,k^-\in K^{1\triangle 2}_{\bar{S}^k-}}$ corresponds to the total number of particles born from a parent in the distinct set $K^{1\triangle 2}$ before time $T'$ . Alternatively, when the parent belongs to the common set, we have

\begin{align*} & \mathbb{E}\Bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\le t\le T'}\mathbf{1}_{k\in K^{1\cap2}_t,k^-\in K^{1\cap 2}_{\bar{S}^k-}} \big\|X^{1,k}_{\bar{S}^k\wedge\cdot}-X^{2,k}_{\bar{S}^k\wedge\cdot}\big\|\Bigg] \\[2pt] & = \mathbb{E}\Bigg[\int_{(0,T']\times[0,\bar\gamma]\times[0,1]}\sum_{k\in K^{1\cap2}_{t-}}\sum_{\ell,\ell'\geq0} (\ell\wedge \ell')\mathbf{1}_{I^{1,k}_{\ell,t}\cap I^{2,k}_{\ell',t}\times[0,\gamma^{1,k}_t\wedge\gamma^{2,k}_t]}(z) \big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\,Q^k({\mathrm{d}} t,{\mathrm{d}} z)\Bigg] \\[2pt] & \leq \bar\gamma\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\sum_{\ell,\ell'\geq0} \ell\,\big|I^{1,k}_{\ell,t}\cap I^{2,k}_{\ell',t}\big|\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\, {\mathrm{d}} t\Bigg] \\[2pt] & = \bar\gamma\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\sum_{\ell\geq0}\ell p^{1,k}_{\ell,t} \big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\,{\mathrm{d}} t\Bigg] \\[2pt] & \leq \bar\gamma M\mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t} \big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\,{\mathrm{d}} t\Bigg]. \end{align*}

Taking the supremum successively over $t\in[0,T']$ , the sum over $k\in\mathbb{K}$ , and the expectation in (19), we conclude by combining the inequalities above that

(20)

\begin{align} & \mathbb{E}\bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq t\leq T'}\mathbf{1}_{k\in K^{1\cap2}_t} \big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\bigg] \nonumber \\[2pt] & \qquad \le C\sqrt{T'}\mathbb{E}\Bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq t\leq T'}\mathbf{1}_{k\in K^{1\cap2}_t} \big(\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| + \mathcal{W}_1(\mu^1_t,\mu^2_t)\big)\Bigg] \nonumber \\[2pt] & \qquad\quad + C\mathbb{E}\Bigg[\int_0^{T'}\Bigg(\#K^{1\triangle 2}_t + \sum_{k\in K^{1\cap2}_t}\big(\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\| + \mathcal{W}_1\big(\mu^1_t,\mu^2_t\big)\big)\Bigg)\,{\mathrm{d}} t\Bigg], \end{align}

where the constant $C>0$ depends on $(\bar \gamma,L,M)$ .

Step 3

We write $\bar{d}_{T'}(Z^1,Z^2) \,:\!=\, \sup\nolimits_{0\leq t\leq T'}\#K^{1\triangle 2}_t + \sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq t\leq T'}\mathbf{1}_{k\in K^{1\cap2}_t}\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|$ . Observe that

\begin{equation*} \mathbb{E}\Bigg[\int_0^{T'}\Bigg(\#K^{1\triangle 2}_t + \sum_{k\in K^{1\cap2}_t}\big\|X^{1,k}_{t\wedge\cdot}-X^{2,k}_{t\wedge\cdot}\big\|\Bigg)\,{\mathrm{d}} t\Bigg] \le T'\mathbb{E}[\bar{d}_{T'}(Z^1,Z^2)], \end{equation*}

and

\begin{align*} \mathbb{E}\Bigg[\int_0^{T'}\sum_{k\in K^{1\cap2}_t}\mathcal{W}_1\big(\mu^1_t,\mu^2_t\big)\,{\mathrm{d}} t\Bigg] & \le T'\mathbb{E}\Bigg[\sum_{k\in\mathbb{K}}\sup\nolimits_{0\leq t\leq T'}\mathbf{1}_{k\in K^{1\cap2}_t} \mathcal{W}_1\big(\mu^1_{t},\mu^2_{t}\big)\Bigg] \\[2pt] & \le T'\mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T'}\mathcal{W}_1\big(\mu^1_{T'},\mu^2_{T'}\big), \end{align*}

where the last inequality follows from Lemma 1 and $\mathcal{W}_1(\mu^1_{t},\mu^2_{t}) \le \mathcal{W}_1(\mu^1_{T'},\mu^2_{T'})$ for $t\in[0,T']$ . Combining (18) and (20) and using both inequalities above, we can find some positive constants $c_d>0$ and $c_w>0$ depending on $(\bar\gamma,L,M,M')$ such that

\begin{equation*} \mathbb{E}[\bar{d}_{T'}(Z^1,Z^2)] \le c_d(T'+\sqrt{T'})\mathbb{E}[\bar{d}_{T'}(Z^1,Z^2)] + c_w(T'+\sqrt{T'})\mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T'}\mathcal{W}_1(\mu^1_{T'},\mu^2_{T'}). \end{equation*}

Choosing $T'>0$ small enough to have $1-c_d(T'+\sqrt{T'})>0$ , we deduce that

\begin{equation*} \mathbb{E}[\bar{d}_{T'}(Z^1,Z^2)] \leq \frac{c_w(T'+\sqrt{T'})}{1-c_d(T'+\sqrt{T'})} \mathbb{E}[\#K_0]{\mathrm{e}}^{\bar\gamma M T'}\mathcal{W}_1\big(\mu^1_{T'},\mu^2_{T'}\big). \end{equation*}

The conclusion follows immediately by observing that $d_{T'}(Z^1,Z^2)\le \bar{d}_{T'}(Z^1,Z^2)$ .
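The smallness condition $T'+\sqrt{T'}<1/c_d$ can be made explicit: writing $r=\sqrt{T'}$, it reads $r^2+r<1/c_d$, i.e. $r<\big(\sqrt{1+4/c_d}-1\big)/2$. The sketch below computes this threshold and the constant multiplying $\mathcal{W}_1(\mu^1_{T'},\mu^2_{T'})$ in Lemma 3. Since the lemma only asserts the existence of $c_d$ and $c_w$, any numerical values passed to these hypothetical helper functions are placeholders.

```python
import math

def horizon_threshold(c_d):
    """Value s* with s* + sqrt(s*) = 1/c_d; Lemma 3 requires T' < s*."""
    a = 1.0 / c_d
    root = (math.sqrt(1.0 + 4.0 * a) - 1.0) / 2.0   # positive root of r^2 + r = a
    return root * root

def lemma3_factor(T_prime, c_d, c_w, mean_k0, gamma_bar, M):
    """Constant in front of W_1(mu^1_{T'}, mu^2_{T'}) in the bound of Lemma 3."""
    s = T_prime + math.sqrt(T_prime)
    if c_d * s >= 1.0:
        raise ValueError("T' + sqrt(T') must be strictly below 1/c_d")
    return c_w * s / (1.0 - c_d * s) * mean_k0 * math.exp(gamma_bar * M * T_prime)
```

The factor vanishes at $T'=0$ and blows up as $T'$ approaches the threshold, which is why the horizon has to be chosen small.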

3.2. Proof of Theorem 1

Proof of Theorem 1. We begin by outlining the strategy of the proof. We first establish the existence and uniqueness of the solution to SDE (6) with (7) on a small interval $[0,T']$ by constructing a contraction mapping. We then shift the time window to $[T',2T']$ , and so forth, to establish global well-posedness.