1. Introduction and main results
1.1. Introduction
The Hawkes process
$(Z_t)_{t\ge0}$
is a random point process, first introduced in 1971 [Reference Hawkes19], that describes temporal stochastic self-exciting phenomena. The evolution of the process is influenced by the timing of past events. It is determined by its intensity process
$(\lambda_t)_{t\ge0}$
, which is of the form
$\lambda_t=\mu+\int_0^t\varphi(t-s)\,{\textrm{d}} Z_s$
, where
$\mu>0$
is interpreted as the ‘background rate’ or ‘exogenous rate’ of the process, and
$\varphi$
is the regression kernel, which is the intensity function of a nonhomogeneous Poisson process, and quantifies the influence of past events of the process on the arrival of future events. For more introductory material on Hawkes processes, see [Reference Daley and Vere-Jones9, Reference Daley and Vere-Jones10].
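To make the definition of the intensity concrete, the following Python sketch simulates a linear Hawkes process by thinning. It is an illustration only: the exponential kernel $\varphi(u)=\alpha\beta{\textrm{e}}^{-\beta u}$, the function name `simulate_hawkes_exp`, and all parameter values are assumptions made for this example and are not taken from the references.

```python
import math
import random


def simulate_hawkes_exp(mu, alpha, beta, horizon, rng):
    """Event times on [0, horizon] of a linear Hawkes process with
    intensity mu + sum over past events t_i of alpha*beta*exp(-beta*(t - t_i)).

    Assumed kernel (illustration only): phi(u) = alpha*beta*exp(-beta*u),
    whose L^1 norm is alpha.
    """
    events = []
    t = 0.0
    excitation = 0.0  # current value of the self-exciting part of the intensity
    while True:
        # Between events the intensity only decays, so its current value
        # is a valid upper bound until the next accepted event.
        bound = mu + excitation
        wait = rng.expovariate(bound)
        t += wait
        if t > horizon:
            return events
        excitation *= math.exp(-beta * wait)  # decay up to the candidate time
        if rng.random() * bound <= mu + excitation:
            events.append(t)            # candidate accepted as an event
            excitation += alpha * beta  # each event adds phi(0) to the intensity


if __name__ == "__main__":
    sample = simulate_hawkes_exp(1.0, 0.5, 2.0, 200.0, random.Random(0))
    print(len(sample), "events on [0, 200]")
```

In this parametrization $\alpha=\int_0^\infty\varphi(u)\,{\textrm{d}} u$, so $\alpha<1$ gives a subcritical process whose long-run event rate is close to $\mu/(1-\alpha)$.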
The Hawkes process is not only an interesting subject in mathematics but has also been widely applied in various fields to model both natural and social phenomena, such as earthquakes and their aftershocks in seismology [Reference Hawkes19, Reference Hawkes and Oakes21], brain activity in neuroscience [Reference Hansen, Reynaud-Bouret and Rivoirard18], risk estimation, transaction times, and midquote changes in mathematical finance [Reference Bacry and Muzy2, Reference Bacry and Muzy3, Reference Hawkes20], and social network interactions [Reference Crane and Sornette8, Reference Rizoi, Mishra, Kong, Carman and Xie29]. There is a substantial body of work on Hawkes processes and their applications in various disciplines; the list given here is not exhaustive, and we refer the reader to these works and the references therein for more details.
From a mathematical point of view, the large-time behavior of the Hawkes process has been investigated extensively, especially the law of large numbers and the central limit theorem, which have been established for different classes of Hawkes processes (linear, nonlinear, subcritical, critical, and supercritical) under the fixed-rate assumption
$\int_0^\infty\varphi\,{\textrm{d}} t<\infty$
[Reference Bacry, Delattre, Hoffmann and Muzy1, Reference Cattiaux, Colombani and Costa6, Reference Karabash and Zhu27, Reference Zhu31, Reference Zhu33]. The large-deviation behavior for both linear and nonlinear Hawkes processes was established in [Reference Bordenave and Torrisi4, Reference Gao and Zhu15, Reference Gao and Zhu16, Reference Zhu32, Reference Zhu34, Reference Zhu35] (which is not an exhaustive list of references).
Recently, the limit theorem for multivariate marked Hawkes processes was established [Reference Xu30]. Meanwhile, functional central limit theorems for subcritical and critical Hawkes processes and convergence for the heavy-tailed Hawkes processes were investigated in [Reference Horst and Xu22, Reference Horst, Xu and Zhang23], respectively. In particular, for the linear regime, because of its tractability (especially the immigration–birth representation), the limit behaviors have been well understood and widely applied in practice. We also restrict our consideration to the linear case. The rate
$\int_0^\infty\varphi\,{\textrm{d}} t<1$
refers to the subcritical case, and
$\int_0^\infty\varphi\,{\textrm{d}} t=1$
to the critical case; otherwise, we are dealing with the supercritical case. In finance, this fixed rate is the so-called branching ratio, which is interpreted as the average proportion of endogenously triggered events. In practice, however, this ratio cannot always be taken as fixed: it may vary over time under the influence of many real-world factors. This is consistent with statistical estimation results, which often seem to show that only nearly unstable Hawkes processes are able to fit the data properly; see the introduction of [Reference Jaisson and Rosenbaum26]. For the nearly unstable processes, first introduced in [Reference Jaisson and Rosenbaum26], the
$L^1$
norm of their kernel is close to 1. In [Reference Jaisson and Rosenbaum26] a time-dependent kernel is assumed with its
$L^1$
norm less than 1 and tending to 1 as time goes to infinity, which is a near-instability condition, i.e.
$\int_0^\infty\varphi^T\,{\textrm{d}} t=a_T<1$
and
$\lim_{T\to\infty}a_T=1$
, where
$(a_T)_{T\ge0}$
is a sequence of positive numbers such that, for all T,
$a_T<1$
. Together with the assumption
$\int_0^\infty t\varphi^T(t)\,{\textrm{d}} t=m<\infty$
, [Reference Jaisson and Rosenbaum26] then shows that these nearly unstable Hawkes processes, under suitable scaling, converge deterministically if
$T(1-a_T)\to+\infty$
as
$T\to\infty$
, i.e. (law of large numbers)
and tend to an integrated Cox–Ingersoll–Ross (CIR) process if
$T(1-a_T)\to\lambda>0$
as
$T\to\infty$
, i.e. (central limit theorem)
for the Skorohod topology, where
$(X_t)_{t\in[0,1]}$
is a CIR process, originally introduced in [Reference Cox, Ingersoll and Ross7] and used classically to model stochastic volatilities in finance.
1.2. Motivation and comparison
As noted above, the great bulk of the literature related to the Hawkes process and its applications to geophysics and finance focuses on the subcritical regime, whereas the supercritical regime is little studied but is more germane to epidemics, such as the spread of COVID-19 modeled in [Reference Garetto, Leonardi and Torrisi17], and to social networks [Reference Crane and Sornette8, Reference Rizoi, Mishra, Kong, Carman and Xie29].
In this paper, we consider supercritical nearly unstable Hawkes processes, i.e. the kernels of the Hawkes processes depending on T, with
$L^1$
norm strictly larger than 1 and close to 1 as
$T\to\infty$
. When the norm of the kernel approaches 1 slowly relative to the growth of T (see the assumption H(
$\infty$
)), we prove that the renormalized Hawkes process is asymptotically deterministic in an $L^2$ sense; see Theorem 1. When the norm of the kernel approaches 1 at a suitable speed (see the assumption H(
$\Lambda$
)), we prove that the limit behavior of our sequence of Hawkes processes, with a suitable scale of
$1/T^2$
, is the integral of a CIR-like diffusion process, which is the strong solution of a Brownian-driven stochastic differential equation (SDE); see Theorem 2. To do so, we adopt an approach different from that of [Reference Jaisson and Rosenbaum26], which obtained the limiting behavior of a sequence of Hawkes processes with a scale of
$({1-a_T})/{T}$
with
$0<a_T<1$
. In [Reference Jaisson and Rosenbaum26], the limiting behavior of the intensity of the processes was studied. Roughly speaking, for
$t\in[0,1]$
the authors rescaled the intensity
$\lambda^T_{Tt}$
as
$C_t^T=\lambda^T_{Tt}(1-a_T)$
, which can be written as a stochastic integral equation, then proved the tightness of
$C_t^T$
and passed to the limit in the coefficients of the equation to obtain the limiting law. In our context, by contrast, instead of considering the intensity, we deal directly with the Hawkes processes, whose limit is the same as that of their cumulated intensity, following the spirit of [Reference Zhu33, Section 5.3], which dealt with the limit behavior of
$Z_{Tt}^T/T^2$
for the critical case, i.e.
$a_T\equiv1$
for any T. There, it was proved that the sequence of
$Z_{Tt}^T/T^2$
converges in law, for the Skorohod topology, to an integrated squared Bessel process, i.e.
$\int_0^t X_s\,{\textrm{d}} s$
, with
$X_t$
satisfying
In our Theorem 2, we extend the above result to the nearly unstable supercritical case, i.e.
$a_T>1$
and
$\lim_{T\to\infty}a_T=1$
, and we prove that the limit of
$Z_{Tt}^T/T^2$
is an integrated CIR-like process for the Skorohod topology, i.e.
$\int_0^t X_s\,{\textrm{d}} s$
, with
$X_t$
satisfying
Moreover, the supercritical case was also investigated in [Reference Zhu33, Section 5.4] by reducing it to the critical case using the Malthusian parameter
$\theta$
, a device often used when dealing with supercritical branching processes and supercritical Hawkes processes. There it was proved that the intensity
$\lambda_t$
grows exponentially in t, and
$Z_t/{\textrm{e}}^{\theta t}$
converges almost surely to a process given by
$\mu/(\theta^2\bar{m})+W/(\theta\bar{m})$
, with
$W=\int_0^\infty{\textrm{e}}^{-\theta t}\,{\textrm{d}} M_t$
. Finally, we also mention the mean field limit for multivariate Hawkes processes studied in [Reference Delattre and Fournier12, Reference Liu28] when the dimension goes to infinity.
1.3. Setting
We consider a measurable function
$\varphi\colon[0,\infty)\to [0,\infty)$
and a Poisson measure
$(\Pi({\textrm{d}} t,{\textrm{d}} z))$
on
$[0,\infty)\times [0,\infty)$
with intensity
${\textrm{d}} t\,{\textrm{d}} z$
. The sequence
$\{a_T\}_{T\ge 1}$
indexed by T is positive and satisfies
$\lim_{T\to \infty} a_T=1$
. We set
$\varphi^T=a_T\varphi$
. We consider the following system indexed by T: for all
$t\geq 0$
,
In this paper,
$\int_{0}^{t}$
means
$\int_{[0,t]}$
, and
$\int_{0}^{t-}$
means
$\int_{[0,t)}$
. The solution
$(Z_{t}^T)_{t\ge 0}$
is a counting process. By [Reference Delattre and Fournier12, Proposition 1] (see also [Reference Brémaud and Massoulié5, Reference Delattre, Fournier and Hoffmann13]), the system (1) has a unique
$(\mathcal{F}^T_{t})_{t\ge 0}$
-adapted càdlàg solution, where
$\mathcal{F}^T_{t}=\sigma(\Pi(A)\colon A\in\mathcal{B}([0,t]\times [0,\infty)))$
, as long as
$\varphi^T$
is locally integrable.
1.4. An illustrative example in epidemiology
We first recall the well-known branching representation of Hawkes processes in terms of Galton–Watson trees given in [Reference Hawkes and Oakes21]. That is, in a population process, immigrants arrive according to a Poisson process with ‘exogenous rate’
$\mu$
. Then each immigrant gives birth to children according to an ‘endogenous rate’ given by
$\varphi$
, and each child gives birth to grandchildren according to the same ‘endogenous rate’.
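The branching representation can be turned directly into a simulation scheme. The Python sketch below is an illustration under stated assumptions, not the construction used in the references: the offspring kernel is taken to be $a\beta{\textrm{e}}^{-\beta u}$ (so each individual has a Poisson number of children with mean $a$, born after independent exponential delays), and the helper names `poisson` and `simulate_by_branching` are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson sample by Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate_by_branching(mu, a, beta, horizon, rng):
    """Event times on [0, horizon] built generation by generation.

    Immigrants arrive as a Poisson process with rate mu; every individual
    (immigrant or descendant) has Poisson(a) children, each born after an
    independent Exp(beta) delay. Births after the horizon are discarded.
    """
    pending = []
    t = rng.expovariate(mu)
    while t <= horizon:          # immigrants
        pending.append(t)
        t += rng.expovariate(mu)
    events = []
    while pending:               # descendants of every individual
        parent = pending.pop()
        events.append(parent)
        for _ in range(poisson(a, rng)):
            child = parent + rng.expovariate(beta)
            if child <= horizon:
                pending.append(child)
    return sorted(events)


if __name__ == "__main__":
    rng = random.Random(1)
    print(len(simulate_by_branching(1.0, 1.2, 1.0, 5.0, rng)), "events on [0, 5]")
```

With $a>1$ the population is supercritical, but on a finite horizon the number of events remains finite, which is why the loop terminates.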
Now let us consider the epidemiological example. Let the process
$(Z_{t}^{T})_{t\ge 0}$
represent the number of infected individuals over the observation period [0, T]. This process accounts for two main mechanisms of infection: immigration and transmission from already infected individuals:
• Immigration of infected individuals: New infections enter the population independently of current local transmissions, modeled by a homogeneous Poisson process with rate $\mu$. This represents external sources of infection, such as travelers or new arrivals carrying the disease.
• Local transmission dynamics: Once an individual is infected, they can further spread the infection to others. This local transmission is captured by an inhomogeneous Poisson process, where the intensity is given by the kernel function $a_T\varphi(t-s)$. Here,
  • t denotes the current time;
  • s is the time at which an individual was infected;
  • $\varphi$ is a kernel function, typically assumed to be decreasing. This assumption reflects the reality that the infectivity of an individual often decreases over time due to factors such as recovery or isolation.
The parameter
$a_T$
represents the level of social interference, which encompasses various control measures such as medical interventions, government policies, and vaccination efforts. Notably,
$a_T$
is a function of time T, and it typically decreases as T increases, reflecting the effectiveness of these interventions in reducing the spread of infection over time. So, from this point of view,
$a_T>1$
is quite consistent with reality.
1.5. Assumptions
We always assume
Also, for
$\lambda> 0$
, we assume either
The assumption
$\int_0^\infty s\varphi(s)\,{\textrm{d}} s = m > 0$
is crucial in our context and enables us to approximate
$\Psi^T$
by an exponential function in Lemma 1.
1.6. Notation
We first recall the convolution of two functions
$f,g\colon[0,\infty)\to\mathbb{R}$
, which (if it exists) is defined by
$(f*g)(t)=\int_0^tf(t-s)g(s)\,{\textrm{d}} s$
.
$(\varphi)^{* n}$
represents the nth convolution product of the function
$\varphi$
, and
$(\varphi)^{*1}=\varphi$
. Since
$\int_0^\infty\varphi(s)\,{\textrm{d}} s = 1$
, it’s not hard to observe that
$\int_0^\infty (\varphi^T)^{* n}(s)\,{\textrm{d}} s=(a_T)^n$
. We adopt the conventions
$\varphi^{*0}(s)\,{\textrm{d}} s=\delta_0({\textrm{d}} s)$
and
$\varphi^{*0}(t-s)\,{\textrm{d}} s=\delta_t(\,{\textrm{d}} s)$
, and the convention
$\varphi^{* n}(s)=0$
for
$s<0$
. We also recall the
$L^1$
norm of a function
$\varphi$
, denoted by
$\|\varphi\|_1= \int_0^\infty|\varphi(s)|\,{\textrm{d}} s$
.
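For completeness, the identity for the mass of the convolution powers can be checked as follows. This is the standard Fubini computation for nonnegative functions, written out here as a reminder rather than taken from the references.

```latex
\begin{align*}
\int_0^\infty (f*g)(t)\,{\textrm{d}} t
 & = \int_0^\infty\int_0^t f(t-s)g(s)\,{\textrm{d}} s\,{\textrm{d}} t
   = \int_0^\infty g(s)\int_s^\infty f(t-s)\,{\textrm{d}} t\,{\textrm{d}} s
   = \|f\|_1\|g\|_1 \qquad (f,g\ge 0), \\
\int_0^\infty (\varphi^T)^{*n}(s)\,{\textrm{d}} s
 & = \bigg(\int_0^\infty \varphi^T(s)\,{\textrm{d}} s\bigg)^n = (a_T)^n
   \qquad \text{(by induction on $n$).}
\end{align*}
```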
Since
$ 1<a_T=\int_0^\infty\varphi^T(s)\,{\textrm{d}} s<+\infty$
, we can choose a unique positive sequence
$\{b_T\}_{T\ge 1}$
such that
$\int_0^\infty{\textrm{e}}^{-b_T s}\varphi^T(s)\,{\textrm{d}} s={1}/{a_T}$
;
$b_T$
is sometimes referred to as the Malthusian parameter in the literature for fixed T [Reference Zhu33]. We define the following functions on
$\mathbb{R}^+$
:
Note that
$\tilde \Psi^T(t)$
is well defined on
$\mathbb{R}^+$
since
$\| \tilde \varphi^T\|_1=1/a_T<1$
. A direct computation implies that
We also define, for
$t\ge0$
,
which is also well defined. For
$n\ge 1$
, it’s not hard to find that
${\textrm{e}}^{-b_T t}(\varphi^T)^{*n}(t)=({\textrm{e}}^{-b_T t}\varphi^T)^{*n}(t)=(\tilde \varphi^T)^{*n}(t)$
. Thus, we have
We recall
$(\lambda^T_t)_{t\ge0}$
defined in (1), and introduce the compensated Poisson measure
$\tilde\Pi({\textrm{d}} s, {\textrm{d}} z)=\Pi({\textrm{d}} s, {\textrm{d}} z)-{\textrm{d}} s\,{\textrm{d}} z$
associated with the Poisson measure
$\Pi$
. For
$t\ge0$
, set
which is a martingale process related to
$Z^T_t$
. We then readily find that
$M_t^T=Z_{t}^{T}-\int_0^t \lambda^T_s\,{\textrm{d}} s$
. According to [Reference Delattre and Fournier12, Remark 10] (see also [Reference Jacod and Shiryaev25, Chapter 1, Section 4e] for definitions and properties of pure jump martingales and of their quadratic variations) we then have
$\mathbb{E}[M^{T}_s M^{T}_t]= \mathbb{E}[Z^{T}_{s\land t}]$
and
$[M^T,M^T]_t=Z_t^T$
.
1.7. Main results
We first give the law of large numbers in the following sense.
Theorem 1. Under the assumptions (
$\Lambda$
) and H(
$\infty$
), the sequence of Hawkes processes is asymptotically deterministic, in the sense that the following convergence in
$L^2$
holds as
$T\to \infty$
:
Next, we introduce our second main result related to the central limit theorem.
Theorem 2. Under the assumptions (
$\Lambda$
) and H(
$\Lambda$
), for the Skorohod topology,
where
$(X_t)_{t\in[0,1]}$
is the unique strong solution to the SDE
where
$(B_t)_{t\ge0}$
is a Brownian motion.
1.8. Quantitative analysis in epidemiology
As stated in Section 1.4, where an informal application of our result is given, the process
$(Z_{t}^{T})_{t\ge 0}$
represents the number of infected individuals over the observation period [0, T]. In this subsection, we will show how we apply Theorem 2 to quantify the number of infected individuals
$(Z_{t}^{T})_{t\ge 0}$
on [0, T]. Throughout the subsection, we fix the positive constants
$\mu$
,
$\lambda$
, and m, and assume
$n\;:\!=\;4\mu m^2\in \mathbb{N}$
. The process
$(X_t)_{t\ge 0}$
solving (7) introduced in Theorem 2 admits the following distributional representation. There exist n independent and identically distributed (i.i.d.) Gaussian processes
$\{(Y_t^i)_{t\in [0,1]}\}_{i=1,\ldots,n}$
such that
and each component
$Y^{i}$
satisfies the linear SDE
where
$\{W^i\}_{i=1}^{n}$
are independent standard Brownian motions. For
$t\in[0,1]$
, the distribution function of the random variable
$\int_{0}^{t}(Y^1_s)^{2}\,{\textrm{d}} s$
is denoted by
$F_{t}(x)=\mathbb{P}\big(\int_{0}^{t}(Y^1_s)^{2}\,{\textrm{d}} s\le x\big)$
for all
$x\in\mathbb{R}$
, and the corresponding n-fold convolution is denoted by
$F_t^{*n}$
, i.e. the distribution function of
$\sum_{i=1}^{n}\int_{0}^{t}(Y^{i}_{s})^{2}\,{\textrm{d}} s$
. Then, by Theorem 2, we have the following quantitative estimate.
Proposition 1. Assume that there exists some
$n\in \mathbb{N}$
such that
$4\mu m^2=n$
. Then, for any positive constants
$0<c\le C$
,
where
$(Z^{T}_{Tt})_{t\ge 0}$
is defined in Theorem 2, and
$F_{t}$
is the distribution function of the random variable
$\int_{0}^{t}(Y^1_s)^{2}\,{\textrm{d}} s$
, where
$Y^1$
is defined in (8).
Proof. Recall that
$n=4\mu m^2\in \mathbb{N}$
, and that
$\{(Y_t^i)_{t\in [0,1]}\}_{i=1,\ldots,n}$
are n i.i.d. Gaussian processes solving (8). Set
$\tilde X_t\;:\!=\; \sum_{i=1}^n (Y^i_t)^2$
. A direct application of Itô’s formula together with the relation
$n=4\mu m^{2}$
yields
\begin{align*} \textrm{d}\tilde X_t = \frac{2}{m}\sum_{i=1}^n Y^i_t\,{\textrm{d}} Y^i_t + \frac{n}{4m^2}\,{\textrm{d}} t & = \frac{1}{m}\Bigg(\lambda\sum_{i=1}^n(Y^i_t)^2+\mu\Bigg)\,{\textrm{d}} t + \sum_{i=1}^n\frac{Y_t^i}{m}\,{\textrm{d}} W_t^i \\ & = \frac{1}{m}(\lambda\tilde X_t+\mu)\,{\textrm{d}} t + \frac{\sqrt{\tilde X_t}}{m}\,{\textrm{d}}\tilde W_t, \end{align*}
where
$\tilde W_t=\sum_{i=1}^n\int_0^t(Y_s^i/\sqrt{\tilde X_s})\,{\textrm{d}} W_s^i$
is a martingale. Moreover, it is easy to check that
$[\tilde W, \tilde W]_t=t$
. By Lévy’s characterization theorem, we conclude that
$(\tilde W_t)_{t\in [0,1]}$
is a standard Brownian motion. Hence,
$(\tilde X_t)_{t\in [0,1]} \stackrel{(\textrm{d})}= (X_t)_{t\in [0,1]}$
with common initial condition 0. In particular,
$\int^t_0X_s\,{\textrm{d}} s \stackrel{(\textrm{d})}= \int^t_0\tilde X_s\,{\textrm{d}} s$
. Applying Theorem 2, we therefore obtain the stated limit.
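As an informal numerical illustration, the following Python sketch simulates the diffusion in the form that appears at the end of the computation above, ${\textrm{d}} X_t=(1/m)(\lambda X_t+\mu)\,{\textrm{d}} t+(\sqrt{X_t}/m)\,{\textrm{d}} B_t$ with $X_0=0$, by an Euler–Maruyama scheme, and estimates the mean of $\int_0^1X_s\,{\textrm{d}} s$. The scheme, the truncation of negative values inside the square root, the function names, and the parameter values are choices made for this sketch only. The reference value is obtained from the mean $x(t)=\mathbb{E}[X_t]$, which solves the linear equation $x'=(\lambda x+\mu)/m$ with $x(0)=0$.

```python
import math
import random


def integrated_path(mu, lam, m, n_steps, rng):
    """One Euler-Maruyama path on [0, 1]; returns a left Riemann sum of X."""
    dt = 1.0 / n_steps
    x, integral = 0.0, 0.0
    for _ in range(n_steps):
        integral += x * dt
        noise = rng.gauss(0.0, math.sqrt(dt))
        # max(x, 0) guards the square root if a discrete step overshoots below 0
        x += (lam * x + mu) / m * dt + math.sqrt(max(x, 0.0)) / m * noise
    return integral


def mean_integral(mu, lam, m, n_paths=2000, n_steps=200, seed=1):
    """Monte Carlo estimate of the mean of the integral of X over [0, 1]."""
    rng = random.Random(seed)
    total = sum(integrated_path(mu, lam, m, n_steps, rng) for _ in range(n_paths))
    return total / n_paths


def mean_integral_exact(mu, lam, m):
    """Integral over [0, 1] of x(t) = (mu/lam) * (exp(lam*t/m) - 1)."""
    return mu * m * (math.exp(lam / m) - 1.0 - lam / m) / lam ** 2


if __name__ == "__main__":
    print(mean_integral(1.0, 1.0, 1.0), mean_integral_exact(1.0, 1.0, 1.0))
```

The two printed numbers should agree up to Monte Carlo and discretization error; the sketch says nothing about the full law of the integral, only about its mean.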
Remark 1. The processes
$\{(Y_t^i)_{t\in [0,1]}\}_{i=1,\ldots,n}$
here are in fact n i.i.d. Ornstein–Uhlenbeck processes. The distribution function
$F_t$
of
$\int_0^t(Y^1_s)^2\,{\textrm{d}} s$
, or, more generally, the distribution function of the integral of the squared general Ornstein–Uhlenbeck process, has been very well studied; see, e.g., [Reference Dankel11], where the Laplace transform of the probability density of the integral of the squared Ornstein–Uhlenbeck process was established explicitly. The distributions of X and of its time integral have also been studied in depth, although the general case (without
$4\mu m^{2}\in\mathbb{N}$
) quickly becomes complicated (see, e.g., [Reference Filipović14, Chapter 10]).
Remark 2. (Implications for epidemic size.) The assumption
$4\mu m^{2}\in\mathbb{N}$
entails essentially no loss of generality for estimation purposes: when estimating the number of infected individuals, we can always bracket the true value by choosing
$\mu$
slightly larger or smaller so that the assumption holds. In addition, we consider this assumption not only because the general case is more complicated, but also because it enables us to give a concise and clear estimation of the quantity. Indeed, we can solve
$Y^i_t$
as
and, by a direct computation or the law of large numbers, we deduce that
where
$\chi_n = n^{-1}\sum_{i=1}^n\big(\int_0^1(Y_s^i)^2\,{\textrm{d}} s-\mathbb{E}\big[\int_0^1(Y_s^i)^2\,{\textrm{d}} s\big]\big)$
with
$\mathbb{E}[(\chi_n)^2]^{1/2}\le C/\sqrt{n}$
because the
$(Y^i)_{i=1,\ldots,n}$
are i.i.d. In typical applications we take large
$n=4\mu m^{2}\gg 1$
, where
$\mu$
is the immigration rate. Consequently,
\begin{align*} \lim_{T\to\infty}\frac{Z_{T}^T}{T^2} \stackrel{(\textrm{d})}= \int_0^1 X_s\,{\textrm{d}} s \stackrel{(\textrm{d})}= \int_0^1\tilde X_s\,{\textrm{d}} s = \sum_{i=1}^n\int_0^1(Y_s^i)^2\,{\textrm{d}} s & \approx n\mathbb{E}\bigg[\int_0^1(Y_s^1)^2\,{\textrm{d}} s\bigg] \\ & = \frac{m\mu({\textrm{e}}^{{\lambda}/{m}} - 1 - {\lambda}/{m})}{\lambda^2}, \end{align*}
where ‘
$a_n\approx b_n$
’ means
$\mathbb{E}[|a_n-b_n|]/n\to 0$
as
$n\to\infty$
. Since the function
$({\textrm{e}}^x-x-1)/x^2$
grows exponentially in x, it’s clear that reducing the local transmission rate
$\lambda$
is markedly more effective than lowering the immigration rate
$\mu$
. The kernel parameter m, being a property of the pathogen, is typically fixed in the absence of mutation.
In the following, C stands for a positive constant whose value may change from line to line. When necessary, we will indicate as subscripts the parameters it depends on.
2. Analysis of the convolution function
The aim of this section is to provide some auxiliary results on the kernel function
$\varphi$
. We are going to illustrate the long-term behavior of the function
$\Psi^T$
. To do this, we first recall the following proposition.
Proposition 2. ([Reference Jaisson and Rosenbaum26, Proposition 2.2].) Under assumption (
$\Lambda$
), consider a sequence
$\{c_T\}_{T\ge 1}$
with
$0< c_T<1$
and
$\lim_{T\to\infty}c_T=1$
, and a random variable
$X^T\;:\!=\; ({1}/{T})\sum_{i=1}^{I^T} X_i$
, where
$(X_i)$
are i.i.d. random variables with density
$\varphi$
and
$I^T$
is a geometric random variable independent of
$(X_i)$
with parameter
$1-c_T$
. If
$\lim_{T\to\infty}T(1-c_T)=\lambda$
, then
$X^T\stackrel{(\textrm{d})}\to X$
as
$T\to\infty$
, where X admits an exponential distribution with parameter
$\lambda/m$
.
Writing
$\hat\Psi^T(t)\;:\!=\; \sum_{k=1}^\infty (c_T)^k(\varphi)^{*k}(t)$
, let
$\rho^T$
be the density of
$X^T$
; then the proof of Proposition 2 in [Reference Jaisson and Rosenbaum26] gives
which implies directly that
for
$x\ge 0$
in the weak sense.
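Proposition 2 is easy to probe numerically. The Python sketch below draws the scaled geometric sum $X^T$ under two assumptions that are made only for this illustration: the density $\varphi$ is exponential with mean $m$, and the geometric variable takes values in $\{1,2,\ldots\}$ with $\mathbb{P}(I^T=k)=(c_T)^{k-1}(1-c_T)$. The function name `scaled_geometric_sum` is hypothetical.

```python
import math
import random


def scaled_geometric_sum(T, c_T, m, rng):
    """One draw of (1/T) * (X_1 + ... + X_I), with X_i ~ Exp(mean m) and
    P(I = k) = c_T**(k-1) * (1 - c_T) for k >= 1 (assumed convention)."""
    total = rng.expovariate(1.0 / m)      # I >= 1, so there is always one term
    while rng.random() < c_T:             # continue with probability c_T
        total += rng.expovariate(1.0 / m)
    return total / T


if __name__ == "__main__":
    rng = random.Random(0)
    T, lam, m = 100.0, 2.0, 1.5
    c_T = 1.0 - lam / T                   # so that T * (1 - c_T) = lam
    draws = [scaled_geometric_sum(T, c_T, m, rng) for _ in range(4000)]
    print("sample mean:", sum(draws) / len(draws), "limit mean:", m / lam)
```

An exponential law with parameter $\lambda/m$ has mean $m/\lambda$ and satisfies $\mathbb{P}(X>m/\lambda)={\textrm{e}}^{-1}$, which gives two simple quantities to compare against the sample.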
In the next lemma, we see that
$\Psi^T$
defined with
$a_T>1$
converges in the same sense as
$\hat \Psi^T$
in (9).
Lemma 1. Under assumptions (
$\Lambda$
) and H(
$\Lambda$
), and recalling
$\Psi^T$
defined in (4), we have the following convergence in the weak sense for
$t\ge 0$
:
$\lim_{T\to\infty}\Psi^T(Tt)=({1}/{m}){\textrm{e}}^{{\lambda t}/{m}}$
.
Proof. First, recall that the positive sequence
$\{b_T\}_{T\ge 1}$
satisfies
and define
$m_T \;:\!=\; \int_0^\infty{\textrm{e}}^{-b_Ts}s\varphi^T(s)\,{\textrm{d}} s$
. Since
$\lim_{T\to\infty} a_T=1$
, we must have
$\lim_{T\to\infty} b_T=0$
. Hence, by the dominated convergence theorem, we have
$\lim_{T\to\infty} m_T=m$
, and then
$\lim_{T\to\infty}(a_T-({1}/{a_T}))/b_T=m$
. Indeed,
\begin{align*} \lim_{T\to\infty}\frac{(a_T-({1}/{a_T}))}{b_T} & = \lim_{T\to\infty}a_T\int_0^\infty\frac{(1 - {\textrm{e}}^{-b_T s})\varphi(s)}{b_T}\,{\textrm{d}} s \\ & = \int_0^\infty\lim_{T\to\infty}\frac{(1 - {\textrm{e}}^{-b_T s})\varphi(s)}{b_T}\,{\textrm{d}} s = \int_0^\infty s\varphi(s)\,{\textrm{d}} s = m. \end{align*}
Since
$a_T-1/a_T=(a_T-1)(a_T+1)/a_T$ and $\lim_{T\to\infty} T(a_T-1)=\lambda$
, we thus conclude that
$\lim_{T\to\infty} Tb_T=2\lambda/m$
. Note that
$\int_0^\infty s\tilde \varphi^T(s)\,{\textrm{d}} s=m_T$
, and recalling (2) and (9), we have
$\lim_{T\to\infty}\tilde\Psi^T(Tt)=({1}/{m}){\textrm{e}}^{-{\lambda t}/{m}}$
. Consequently, according to (4),
which completes the proof.
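The two limits used in this proof can be checked numerically in a simple special case. The Python sketch below assumes, purely for illustration, the exponential kernel $\varphi(s)={\textrm{e}}^{-s/m}/m$ and $a_T=1+\lambda/T$; for this kernel the Laplace transform is $1/(1+bm)$ and $\sum_{n\ge1}(\varphi^T)^{*n}(t)=(a_T/m){\textrm{e}}^{(a_T-1)t/m}$. The names `solve_b`, `exp_kernel_laplace`, and `psi_exp` are hypothetical.

```python
import math


def solve_b(a, laplace, iterations=200):
    """Bisection for the b > 0 solving a * laplace(b) = 1 / a.

    Assumes a > 1 and that laplace is continuous, decreasing, with
    laplace(0) = 1 (the Laplace transform of a probability density).
    """
    target = 1.0 / a
    lo, hi = 0.0, 1.0
    while a * laplace(hi) > target:   # enlarge the bracket if needed
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if a * laplace(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def exp_kernel_laplace(m):
    """Laplace transform of the assumed density exp(-s/m)/m."""
    return lambda b: 1.0 / (1.0 + b * m)


def psi_exp(T, t, lam, m):
    """Psi^T(T*t) for the assumed exponential kernel and a_T = 1 + lam/T."""
    a = 1.0 + lam / T
    return (a / m) * math.exp((a - 1.0) * T * t / m)


if __name__ == "__main__":
    lam, m = 3.0, 2.0
    for T in (1.0e2, 1.0e4, 1.0e6):
        b = solve_b(1.0 + lam / T, exp_kernel_laplace(m))
        print(T, T * b, psi_exp(T, 0.7, lam, m))
    print("limits:", 2.0 * lam / m, math.exp(lam * 0.7 / m) / m)
```

As $T$ grows, the printed values of $Tb_T$ and $\Psi^T(Tt)$ should approach $2\lambda/m$ and $(1/m){\textrm{e}}^{\lambda t/m}$; for a general kernel only `laplace` needs to be replaced.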
We now recall a classical lemma that we will need.
Lemma 2. ([Reference Bacry, Delattre, Hoffmann and Muzy1, Lemma 3], [Reference Delattre and Fournier12, Lemma 8].) Let A be a constant. Consider
$f,g\colon[0,\infty)\to\mathbb{R}$
locally bounded, and assume that
$\varphi\colon[0,\infty)\to[0,\infty)$
is locally integrable. If
$f_t = g_t + \int_0^t\varphi(t-s)Af_s\,{\textrm{d}} s$
for all
$t\ge0$
, then
$f_t = \sum_{n\ge0}\int_0^t\varphi^{*n}(t-s)A^ng_s\,{\textrm{d}} s$
.
We are now ready to estimate the expectation of the Hawkes process, which is crucial to proving the tightness.
Proposition 3. Consider the solution
$(Z_t^T)_{t\ge0}$
to (1).
(i) Under the assumptions ($\Lambda$) and H($\Lambda$), for $0\le t\le 1$,
\begin{equation*} \lim_{T\to\infty}\frac{\mathbb{E}\big[Z_{Tt}^{T}\big]}{T^2} = \frac{\mu{\textrm{e}}^{{\lambda t}/{m}}}{m}\int_0^t v{\textrm{e}}^{-{\lambda v}/{m}}\,{\textrm{d}} v = -\frac{\mu t}{\lambda} + \frac{m\mu}{\lambda^2}({\textrm{e}}^{{\lambda t}/{m}} - 1). \end{equation*}
In particular, $\mathbb{E}\big[Z_{Tt}^{T}\big]\le CT^2$ for some constant $C>0$.
(ii) Under assumptions ($\Lambda$) and H($\infty$),
\begin{equation*} \sup_{t\in[0,1]}\frac{a_T-1}{T{\textrm{e}}^{b_TTt}}\mathbb{E}\big[Z_{Tt}^{T}\big] \le \mu a_T. \end{equation*}
In particular,
\begin{equation*} \sup_{t\in[0,T]}\frac{a_T-1}{T{\textrm{e}}^{b_Tt}}\mathbb{E}\big[Z_{t}^{T}\big] \le \mu a_T. \end{equation*}
Proof. The proof is inspired by [Reference Delattre and Fournier12, Lemma 11], which deals with multivariate Hawkes processes. We first rewrite the Hawkes process
$(Z_t^T)_{t\ge0}$
. Starting from (1) and using (6), it’s not difficult to find that
Using [Reference Delattre, Fournier and Hoffmann13, Lemma 22], we have
$\int_0^t\int_0^s\varphi^T(s-v)\,{\textrm{d}} Z_v^T\,{\textrm{d}} s = \int_0^t\varphi^T(t-s)Z_s^T\,{\textrm{d}} s$
. Accordingly,
Since
$(M_t^T)_{t\ge0}$
is a martingale, we see that
$\mathbb{E}[M_t^T]=0$
for all
$t\ge0$
. Hence, for
$t\ge0$
,
Applying Lemma 2 and noting (5) and that
$\varphi^{*0}(t-s)\,{\textrm{d}} s=\delta_t({\textrm{d}} s)$
, we thus conclude that
Now, we obtain
Following from Lemma 1, we have
It’s thus clear that
$\lim_{T\to\infty}\mathbb{E}\big[Z_{Tt}^{T}\big]/T^2 \le \mu{\textrm{e}}^{{\lambda}/{m}}/{m}$
. Whence, there exists some constant
$C>0$
such that
$\mathbb{E}\big[Z_{Tt}^{T}\big]/T^2\le C$
, which completes the proof of point (i).
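For the reader's convenience, the elementary integral behind the closed form of the limit in point (i) can be written out explicitly. The following is a routine integration by parts, with the shorthand $\alpha=\lambda/m$ introduced only for this computation.

```latex
\begin{align*}
\int_0^t v{\textrm{e}}^{-\alpha v}\,{\textrm{d}} v
 & = -\frac{t{\textrm{e}}^{-\alpha t}}{\alpha} + \frac{1-{\textrm{e}}^{-\alpha t}}{\alpha^2}, \\
\frac{\mu{\textrm{e}}^{\alpha t}}{m}\int_0^t v{\textrm{e}}^{-\alpha v}\,{\textrm{d}} v
 & = \frac{\mu}{m}\bigg(-\frac{t}{\alpha} + \frac{{\textrm{e}}^{\alpha t}-1}{\alpha^2}\bigg)
   = -\frac{\mu t}{\lambda} + \frac{m\mu}{\lambda^2}\big({\textrm{e}}^{\lambda t/m}-1\big).
\end{align*}
```

At $t=1$ the right-hand side equals $(m\mu/\lambda^2)({\textrm{e}}^{\lambda/m}-1-\lambda/m)$.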
For point (ii), the second inequality is nothing but the first one since
$t\in[0,1]$
. We thus only need to prove the first inequality. Let’s first recall from (2) that
$\tilde\Psi^T(t)=\sum_{n\ge 1}(\tilde\varphi^T)^{*n}(t)$
and from (3) that
$\int_{0}^{\infty}\tilde\Psi^T(t)\,{\textrm{d}} t = {1}/({a_T-1})$
. Using (12) again, replacing t by Tt for
$0\le t\le 1$
,
\begin{align*} \frac{a_T-1}{T}\mathbb{E}\big[Z_{Tt}^{T}\big] & = \frac{\mu(a_T-1)}{T}\bigg[Tt + \int_{0}^{Tt}s{\textrm{e}}^{b_T(Tt-s)}{\textrm{e}}^{-b_T(Tt-s)}\Psi^T(Tt-s)\,{\textrm{d}} s\bigg] \\ & \le \frac{\mu(a_T-1)}{T}\bigg[T + T{\textrm{e}}^{b_TTt}\int_{0}^{\infty}\tilde\Psi^T(s)\,{\textrm{d}} s\bigg] \qquad \text{(using $0\le t\le 1$ and (12))} \\ & \le T\frac{\mu(a_T-1){\textrm{e}}^{b_TTt}}{T}\bigg(1+\frac{1}{a_T-1}\bigg) = \mu a_T{\textrm{e}}^{b_TTt}, \end{align*}
which concludes the proof.
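The renewal-type equation for the mean that underlies this proof, $\mathbb{E}[Z^T_t]=\mu t+\int_0^t\varphi^T(t-s)\mathbb{E}[Z^T_s]\,{\textrm{d}} s$, together with the series solution of Lemma 2, can be checked numerically in a special case. The Python sketch below assumes the kernel $\varphi^T(u)=a\beta{\textrm{e}}^{-\beta u}$ with $a\neq1$, for which $\sum_{n\ge1}(\varphi^T)^{*n}(u)=a\beta{\textrm{e}}^{(a-1)\beta u}$; the trapezoidal discretization and the names `solve_renewal` and `mean_closed_form` are choices made for this illustration only.

```python
import math


def solve_renewal(mu, a, beta, t_max, n_steps):
    """Trapezoidal solution of f(t) = mu*t + int_0^t phi(t-s) f(s) ds
    on a uniform grid, with the assumed kernel phi(u) = a*beta*exp(-beta*u)."""
    h = t_max / n_steps
    kernel = [a * beta * math.exp(-beta * i * h) for i in range(n_steps + 1)]
    f = [0.0] * (n_steps + 1)             # f(0) = 0
    for i in range(1, n_steps + 1):
        acc = 0.5 * kernel[i] * f[0]      # endpoint s = 0 (half weight)
        for j in range(1, i):
            acc += kernel[i - j] * f[j]   # interior nodes
        # the endpoint s = t_i involves the unknown f[i]; solve for it
        f[i] = (mu * i * h + h * acc) / (1.0 - 0.5 * h * kernel[0])
    return f


def mean_closed_form(mu, a, beta, t):
    """mu*t + mu * int_0^t s * Psi(t-s) ds with Psi(u) = a*beta*exp((a-1)*beta*u)."""
    k = (a - 1.0) * beta
    return mu * t + mu * a * beta * (math.exp(k * t) - 1.0 - k * t) / k ** 2


if __name__ == "__main__":
    grid = solve_renewal(1.0, 1.05, 1.0, 2.0, 400)
    print(grid[-1], mean_closed_form(1.0, 1.05, 1.0, 2.0))
```

The numerical solution and the closed form obtained from the series of Lemma 2 should agree up to the discretization error of the trapezoidal rule.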
3. Proof of the asymptotic behavior of the process
In this section, we prove the law of large numbers and the central limit theorem for the process.
Proof of Theorem 1. Noting from (10), (11), and (5) that
$\Psi^T(t)=\sum_{n\ge 1}(\varphi^T)^{*n}(t)$
and applying Lemma 2, we have
since
$\mathbb{E}[M^{T}_s M^{T}_t] = \mathbb{E}[Z^{T}_{s\land t}]$
. Then, it is easy to conclude from Proposition 3(ii) that, for any
$0\le s,t\le T$
,
Also, noting from (4) that
$\Psi^T(t) = {\textrm{e}}^{b_T t}\tilde\Psi^T(t)$
, we observe that, for any
$t\in[0,1]$
,
\begin{align*} & \mathbb{E}\bigg[\bigg(\int_{0}^{Tt}\Psi^{T}(Tt-s)M^T_{s}\,{\textrm{d}} s\bigg)^2\bigg] \\ & \qquad\qquad = \mathbb{E}\bigg[\int_{0}^{Tt}\int_{0}^{Tt}\Psi^{T}(Tt-s)\Psi^{T}(Tt-u)M^T_{s}M^T_{u}\,{\textrm{d}} s\,{\textrm{d}} u\bigg] \\ & \qquad\qquad = \int_{0}^{Tt}\int_{0}^{Tt}\Psi^{T}(Tt-s)\Psi^{T}(Tt-u)\mathbb{E}[M^T_{s}M^T_{u}]\,{\textrm{d}} s\,{\textrm{d}} u \\ & \qquad\qquad = \int_{0}^{Tt}\int_{0}^{Tt}\Psi^{T}(Tt-s)\Psi^{T}(Tt-u)\mathbb{E}[Z^{T}_{s\land u}]\,{\textrm{d}} s\,{\textrm{d}} u \\ & \qquad\qquad = {\textrm{e}}^{2b_TTt}\int_{0}^{Tt}\int_{0}^{Tt}\tilde\Psi^{T}(Tt-s)\tilde\Psi^{T}(Tt-u){\textrm{e}}^{-b_T(s+u)} \mathbb{E}[Z^{T}_{s\land u}]\,{\textrm{d}} s\,{\textrm{d}} u \\ & \qquad\qquad \le {\textrm{e}}^{2b_TTt}\int_{0}^{Tt}\int_{0}^{Tt}\tilde\Psi^{T}(Tt-s)\tilde\Psi^{T}(Tt-u){\textrm{e}}^{-b_T(s\land u)} \mathbb{E}[Z^{T}_{s\land u}]\,{\textrm{d}} s\,{\textrm{d}} u \\ & \qquad\qquad \le \frac{\mu Ta_T}{a_T-1}{\textrm{e}}^{2b_TTt}\bigg(\int_{0}^{\infty}\tilde\Psi^{T}(s)\,{\textrm{d}} s\bigg)^2 = \mu Ta_T{\textrm{e}}^{2b_TTt}(a_T-1)^{-3} . \end{align*}
Whence, letting
$T\to\infty$
,
where C is a positive constant.
Next, we prove our second main result related to the central limit theorem. The idea of the proof follows from [Reference Zhu33, Section 5.3].
Proof of Theorem 2. We divide the proof into several steps.
Step 1
We first rewrite the cumulated intensity of the Hawkes process. Recalling
$M_t^T = Z_{t}^{T} - \int_0^t\lambda^T_s\,{\textrm{d}} s$
and
$\lambda^T_s$
introduced in (1), we have
\begin{align*} \int^{Tt}_0\lambda_{s}^{T}\,{\textrm{d}} s & = \mu Tt + \int_{0}^{Tt}\int_{0}^{s}\varphi^T(s-m)\,{\textrm{d}} Z_{m}^{T}\,{\textrm{d}} s \\ & = \mu Tt + \int_{0}^{Tt}\int_{0}^{s}\varphi^T(s-m)\,{\textrm{d}} M_{m}^{T}\,{\textrm{d}} s + \int_{0}^{Tt}\int_{0}^{s}\varphi^T(s-m)\lambda_{m}^{T}\,{\textrm{d}} m\,{\textrm{d}} s. \end{align*}
Rearranging, we then get
By Fubini’s theorem and then changing the variable to
$u=s-m$
, we have
Define
$\Phi^T(t) \;:\!=\; \int_0^t\varphi^T(s)\,\textrm{d}s$
for
$t\ge 0$
. Then, we have
$\Phi^T(t)=0$
for
$t<0$
. Thus, the equality in (13) can be rewritten equivalently as
By a change of variables to
$u=Ts$
, we get
Setting
and dividing by T on both sides, we can write
\begin{equation} \int^{t}_0T(1-\Phi^T(T(t-s)))\frac{\lambda_{Ts}^{T}}{T}\,{\textrm{d}} s = \mu t + \int_{0}^{t}\Phi^T(T(t-s))\sqrt{\frac{\lambda_{Ts}^{T}}{T}}\,{\textrm{d}} B^T_{s}. \end{equation}
Step 2
In this step, we verify that, for the Skorohod topology,
where
$(B_t)_{t\in [0,1]}$
is a Brownian motion. To prove this, by [Reference Jacod and Shiryaev25, Theorem VIII-3.11], it suffices to verify, as
$T\to \infty$
,
-
(i) the quadratic variation of
$(B^T_t)_{t\in[0,1]}$
, i.e.
$[B^T,B^T]_t \to t$
in probability for all
$t\in [0,1]$
fixed; -
(ii)
$\sup_{t \in [0,1]}|B^T_{t}-B^T_{t-}| \to 0$
in probability.
It is not difficult to check (ii): for
$t\in[0,1]$
, we use that the jumps of
$M^T$
are always equal to 1 (because the jumps of
$M^T$
are counted by
$Z^T$
, which are all of size 1), and that
$\lambda_{t}^{T}$
is always at least
$\mu>0$
. Hence, we have
$\sup_{t \in [0,1]}|B^T_{t}-B^T_{t-}| \le {1}/{\sqrt{\mu T}}$
, which tends to 0 as
$T\to \infty$
.
Concerning (i), recall that the quadratic variation
$[M^T, M^T]_t=Z^T_t$
for
$t\in[0,1]$
. Then for any
$t\in[0,1]$
, we have
Hence, to get (i), it suffices to prove that
In fact, the Burkholder–Davis–Gundy inequality implies that, for any
$t\in[0,1]$
,
\begin{align*} \mathbb{E}\bigg[\bigg(\frac{1}{T}\int^{tT}_0\frac{{\textrm{d}} M^T_s}{\lambda^T_s}\bigg)^2\bigg] & \le \frac{C}{T^2}\mathbb{E}\bigg[\int^{tT}_0\frac{{\textrm{d}} Z^T_s}{(\lambda^T_s)^2}\bigg] \\ & = \frac{C}{T^2}\mathbb{E}\bigg[\int^{tT}_0\frac{{\textrm{d}} M^T_s}{(\lambda^T_s)^2} + \int^{tT}_0\frac{\lambda^T_s\,{\textrm{d}} s}{(\lambda^T_s)^2}\bigg] \\ & \le \frac{C}{T^2}\mathbb{E}\bigg[\int^{tT}_0\frac{{\textrm{d}} M^T_s}{(\lambda^T_s)^2}\bigg] + \frac{CtT}{\mu T^2} \le \frac{C}{\mu T} . \end{align*}
The last inequality follows from
Now letting
$T\to\infty$
, we complete this step.
Step 3
In the third step, we verify that the sequence
$({1}/{T})\int^{t}_0\lambda_{Ts}^{T}\,{\textrm{d}} s$
is tight in D[0,1] equipped with the Skorohod topology. It is obvious that
$({1}/{T})\int^{t}_0\lambda_{Ts}^{T}\,{\textrm{d}} s$
is continuous for
$t\in[0,1]$
. Recalling
$Z^T$
and
$\lambda^T$
introduced in (1), we find that
Whence, again using Proposition 3(i) for any fixed
$x\in[0,1]$
, we conclude that there exists a constant
$C>0$
such that, for any T and any
$t\in[0,1]$
,
By Prokhorov’s theorem, there exists some increasing sequence
$\{T_i\}_{i\ge 1}$
satisfying
$\lim_{i\to\infty}T_i=\infty$
and a continuous process
$\phi(t)$
such that
$\lim_{i\to\infty}({1}/{T_i})\int^{t}_0\lambda_{T_is}^{T_i}\,{\textrm{d}} s=\phi(t)$
.
Step 4
In the last step, we prove the result. For any positive smooth function
$K(\cdot)$
supported on
$\mathbb{R}^+$
, taking the convolution of both sides of (14) with K, we get
\begin{align} & \int^{t}_0\int_{0}^{s}K(t-s)T(1-\Phi^T(T(s-s')))\frac{\lambda_{Ts'}^{T}}{T}\,{\textrm{d}} s'\,{\textrm{d}} s \\ & \qquad\qquad = \mu\int^t_0K(t-s)s\,{\textrm{d}} s + \int^{t}_0\int_{0}^{s}K(t-s)\Phi^T(T(s-s'))\sqrt{\frac{\lambda_{Ts'}^{T}}{T}}\,{\textrm{d}} B^T_{s'}\,{\textrm{d}} s . \end{align}
For (15), using Fubini’s theorem and the substitution
$s=k+s'$
, we find that it is equal to
\begin{align*} & \int^{t}_0\int^{t}_{s'}K(t-s)T(1-\Phi^T(T(s-s')))\frac{\lambda_{Ts'}^{T}}{T}\,{\textrm{d}} s\,{\textrm{d}} s' \\ & \qquad\quad = \int^{t}_0\int^{t-s'}_0K(t-s'-k)T(1-\Phi^T(Tk))\frac{\lambda_{Ts'}^{T}}{T}\,{\textrm{d}} k\,{\textrm{d}} s' \\ & \qquad\ \stackrel{k'=Tk}{=} \int^{t}_0\int^{T(t-s')}_0 K\bigg(t-s'-\frac{k'}{T}\bigg)\bigg(1-\Phi^T(k')\bigg) \frac{\lambda_{Ts'}^{T}}{T}\,{\textrm{d}} k'\,{\textrm{d}} s'. \end{align*}
Recalling
$\Phi^T(t)\;:\!=\;\int_0^t\varphi^T(s)\,{\textrm{d}} s$
and using Fubini’s theorem again, we can easily find that
Hence, by the dominated convergence theorem, we have
\begin{align*} & \lim_{T\to\infty}\int^{T(t-s')}_0 K\bigg(t-s'-\frac{k'}{T}\bigg)(1-\Phi^T(k'))\,{\textrm{d}} k' \\ & = \lim_{T\to\infty}\bigg[\int^{T(t-s')}_0 K\bigg(t-s'-\frac{k'}{T}\bigg)\bigg(a_T-\Phi^T(k')\bigg)\,{\textrm{d}} k' + \int^{t-s'}_0 K(t-s'-k)T(1-a_T)\,{\textrm{d}} k\bigg] \\ & = K(t-s')\int_0^\infty\varphi(k')\,{\textrm{d}} k' - \lambda\int^{t-s'}_0 K(t-s'-k)\,{\textrm{d}} k \\ & = mK(t-s') - \lambda\int_0^{t-s'}K(u)\,{\textrm{d}} u. \end{align*}
Recall that
$({1}/{T_i})\int^{t}_0\lambda_{T_is'}^{T_i}\,{\textrm{d}} s' \to \phi_t$
as
$i\to\infty$
. Integrating by parts and noting that
$\phi_0=0$
, we have
$\int^{t}_0\big(\int_0^{t-s}K(u)\,{\textrm{d}} u\big)\,{\textrm{d}}\phi_s = \int^{t}_0\phi_s K(t-s)\,{\textrm{d}} s$
. Hence, taking the limit, (15) turns into
\begin{multline*} \lim_{i\to\infty}\int^{t}_0\int^{t-s'}_0K(t-s'-k)T_i(1-\Phi^{T_i}(T_ik))\frac{\lambda_{T_is'}^{T_i}}{T_i}\, {\textrm{d}} k\,{\textrm{d}} s' \\ = m\int_0^t K(t-s)\,{\textrm{d}} \phi_s - \lambda\int^{t}_0 K(t-s)\phi_s\,{\textrm{d}} s. \end{multline*}
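For completeness, the integration-by-parts identity used in this limit can be checked directly. The notation
$G(x)\;:\!=\;\int_0^xK(u)\,{\textrm{d}} u$
is introduced here only for this check; it satisfies
$G(0)=0$
and
$\partial_sG(t-s)=-K(t-s)$
. Since
$\phi$
is a limit of nondecreasing processes, it is of finite variation, so that
\begin{align*} \int^{t}_0G(t-s)\,{\textrm{d}}\phi_s & = \big[G(t-s)\phi_s\big]_{s=0}^{s=t} + \int^{t}_0\phi_sK(t-s)\,{\textrm{d}} s \\ & = G(0)\phi_t - G(t)\phi_0 + \int^{t}_0\phi_sK(t-s)\,{\textrm{d}} s = \int^{t}_0\phi_sK(t-s)\,{\textrm{d}} s . \end{align*}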
Next, we analyze (16). We observe that
since
$\lim_{T\to\infty}\Phi^T(T(s-s'))=1$
when
$s\ge s'$
. Hence,
\begin{equation*} \lim_{i\to\infty}\int^{t}_0\int_{0}^{s}K(t-s)\Phi^{T_i}(T_i(s-s'))\sqrt{\frac{\lambda_{T_is'}^{T_i}}{T_i}}\, {\textrm{d}} B^{T_i}_{s'}\,{\textrm{d}} s = \int_0^t\bigg(\int_0^{s}\sqrt{\frac{{\textrm{d}}\phi_{s'}}{{\textrm{d}} s'}}\,{\textrm{d}} B_{s'}\bigg)K(t-s)\,{\textrm{d}} s. \end{equation*}
Since (15) = (16), we thus conclude that, for any smooth function K,
\begin{multline*} m\int_0^t K(t-s)\,{\textrm{d}} \phi_s - \lambda\int^{t}_0 K(t-s)\phi_s\,{\textrm{d}} s \\ = \mu\int^t_0K(t-s)s\,{\textrm{d}} s + \int_0^t\bigg(\int_0^{s}\sqrt{\frac{{\textrm{d}}\phi_{s'}}{{\textrm{d}} s'}}\,{\textrm{d}} B_{s'}\bigg) K(t-s)\,{\textrm{d}} s, \end{multline*}
which implies that
$\phi_t$
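satisfies
\begin{equation*} m\frac{{\textrm{d}}\phi_{t}}{{\textrm{d}} t} - \lambda\phi_t = \mu t + \int_0^{t}\sqrt{\frac{{\textrm{d}}\phi_{s}}{{\textrm{d}} s}}\,{\textrm{d}} B_{s} . \end{equation*}
Let
$X_t={{\textrm{d}}\phi_{t}}/{{\textrm{d}} t}$
; then
\begin{equation} X_t = \int_0^t\bigg(\frac{\mu}{m}+\frac{\lambda}{m}X_s\bigg)\,{\textrm{d}} s + \frac{1}{m}\int_0^t\sqrt{X_s}\,{\textrm{d}} B_s . \end{equation}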
Moreover, since
we find that the difference,
is a martingale. Thus, for any
$\varepsilon>0$
,
$$ \mathbb{P}\Bigg(\sup_{t\in[0,1]}\bigg|\frac{M^T_{Tt}}{T^2}\bigg| \ge \varepsilon\Bigg) \le \frac{C}{T^4}\mathbb{E}\bigg[\int_0^T\lambda_{s}^T\,{\textrm{d}} s\bigg] \to 0 $$
as
$T\to\infty$
. In fact, using Doob’s martingale inequality,
\begin{equation*} \mathbb{E}\Bigg[\sup_{t\in[0,1]}\bigg|\frac{M^T_{Tt}}{T^2}\bigg|^2\Bigg] \le \frac{4}{T^4}\mathbb{E}[(M_T^T)^2] = \frac{4}{T^4}\mathbb{E}[Z_T^T] = \frac{4}{T^4}\mathbb{E}\bigg[\int_0^T\lambda_{s}^T\,{\textrm{d}} s\bigg]. \end{equation*}
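To make the constant in the probability bound explicit, one can combine Chebyshev’s inequality with the Doob estimate. The value
$C=4/\varepsilon^2$
below is our reading of the argument rather than a constant stated in the text:
\begin{equation*} \mathbb{P}\Bigg(\sup_{t\in[0,1]}\bigg|\frac{M^T_{Tt}}{T^2}\bigg| \ge \varepsilon\Bigg) \le \frac{1}{\varepsilon^2}\mathbb{E}\Bigg[\sup_{t\in[0,1]}\bigg|\frac{M^T_{Tt}}{T^2}\bigg|^2\Bigg] \le \frac{4}{\varepsilon^2T^4}\mathbb{E}\bigg[\int_0^T\lambda_{s}^T\,{\textrm{d}} s\bigg]. \end{equation*}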
Consequently, for the Skorohod topology,
Finally, we can readily find that the SDE (17) admits a unique strong solution by using [Reference Ikeda and Watanabe24, Theorem 3.2]. Indeed, the drift coefficient
$b(x)\;:\!=\;(\mu/{m})+(\lambda x/{m})$
is Lipschitz continuous and the diffusion coefficient
$\sigma(x)\;:\!=\;({1}/{m})\sqrt{x}$
is
$\frac12$
-Hölder continuous on
$[0,\infty)$
. Together with the previous steps, this concludes the proof.
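As an informal illustration of the limiting dynamics, the following minimal sketch simulates a path of an SDE with the drift
$b(x)=(\mu/m)+(\lambda x/m)$
and diffusion
$\sigma(x)=(1/m)\sqrt{x}$
given above. It is not part of the proof. The Euler–Maruyama scheme with truncation at zero, the function name simulate_limit_sde, the initial value 0, the step count, and the parameter values are all assumptions made for illustration only.

```python
import math
import random


def simulate_limit_sde(mu, lam, m, t_max=1.0, n_steps=1000, x0=0.0, seed=0):
    """Euler-Maruyama sketch for dX = (mu/m + lam*X/m) dt + (1/m) sqrt(X) dB.

    The state is truncated at zero inside the coefficients so that the
    square root stays well defined when a discretised step goes negative.
    This truncation is a numerical convenience, not part of the SDE.
    """
    rng = random.Random(seed)
    dt = t_max / n_steps
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        x_pos = max(x, 0.0)                    # truncated state
        drift = (mu + lam * x_pos) / m         # b(x) = mu/m + lam*x/m
        diffusion = math.sqrt(x_pos) / m       # sigma(x) = sqrt(x)/m
        noise = math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x + drift * dt + diffusion * noise)
    return path


# Hypothetical parameter values, chosen only to produce a sample path.
sample_path = simulate_limit_sde(mu=1.0, lam=1.0, m=1.0)
```

The returned list has n_steps + 1 entries on a uniform grid of [0, t_max]; a cumulative Riemann sum of the path gives a discrete approximation of
$\phi_t=\int_0^tX_s\,{\textrm{d}} s$
.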
Acknowledgements
We are very grateful to Sylvain Delattre for fruitful discussions. The authors also thank the anonymous referee for their very careful reading and helpful comments, which greatly improved the quality of the manuscript.
Funding information
LX is supported by the National Natural Science Foundation of China (12101028). AZ is partially supported by the Beijing Natural Science Foundation (1242009), the National Key R&D Program of China (2024YFA1015300), the National Natural Science Foundation of China (11801536), the China Scholarship Council (202506020208), and the Fundamental Research Funds for the Central Universities (Beihang QYJC Fund and the S & T Innovation Team Support Program – Young Scientist Team).
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.


