(Empirical) Gramian-based dimension reduction for stochastic differential equations driven by fractional Brownian motion

Nahid Jamshidi; Martin Redmann

doi:10.1017/jpr.2025.10039

(Empirical) Gramian-based dimension reduction for stochastic differential equations driven by fractional Brownian motion

Part of: Stochastic processes Probabilistic methods, simulation and stochastic differential equations Stochastic systems and control

Published online by Cambridge University Press: 07 November 2025

Nahid Jamshidi and

Martin Redmann

Show author details

Nahid Jamshidi*: Affiliation:
Martin Luther University Halle-Wittenberg, Institute of Mathematics
Martin Redmann*: Affiliation:
University of Rostock, Institute of Mathematics
*: *Postal address: Theodor-Lieser-Str. 5, 06120 Halle (Saale), Germany. Email address: Nahid.Jamshidi@mathematik.uni-halle.de
**Postal address: Ulmenstr. 69, 18057 Rostock, Germany. Email address: martin.redmann@uni-rostock.de

Article contents

Abstract
Introduction
Fractional Brownian motion and Young/Stratonovich integration
Setting and (projection-based) reduced system
Fundamental solutions and Gramians
Model reduction of Young/Stratonovich differential equations
Numerical results
Funding information
Competing interests
References

Rights & Permissions

Abstract

In this paper we investigate large-scale linear systems driven by a fractional Brownian motion (fBm) with Hurst parameter $H\in [1/2, 1)$. We interpret these equations either in the sense of Young ($H>1/2$) or Stratonovich ($H=1/2$). In particular, fractional Young differential equations are well suited to modeling real-world phenomena as they capture memory effects, unlike other frameworks. Although it is very complex to solve them in high dimensions, model reduction schemes for Young or Stratonovich settings have not yet been much studied. To address this gap, we analyze important features of fundamental solutions associated with the underlying systems. We prove a weak type of semigroup property which is the foundation of studying system Gramians. From the Gramians introduced, a dominant subspace can be identified, which is shown in this paper as well. The difficulty for fractional drivers with $H>1/2$ is that there is no link between the corresponding Gramians and algebraic equations, making the computation very difficult. Therefore we further propose empirical Gramians that can be learned from simulation data. Subsequently, we introduce projection-based reduced-order models using the dominant subspace information. We point out that such projections are not always optimal for Stratonovich equations, as stability might not be preserved and since the error might be larger than expected. Therefore an improved reduced-order model is proposed for $H=1/2$. We validate our techniques conducting numerical experiments on some large-scale stochastic differential equations driven by fBm resulting from spatial discretizations of fractional stochastic PDEs. Overall, our study provides useful insights into the applicability and effectiveness of reduced-order methods for stochastic systems with fractional noise, which can potentially aid in the development of more efficient computational strategies for practical applications.

Keywords

Model order reduction stochastic differential equations Petrov–Galerkin projections non-Markovian processes

MSC classification

Primary: 60G22: Fractional processes, including fractional Brownian motion

Secondary: 65C30: Stochastic differential and integral equations 93E03: Stochastic systems, general

Information

Type: Original Article
Information: Journal of Applied Probability , First View , pp. 1 - 32

DOI: https://doi.org/10.1017/jpr.2025.10039 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Model order reduction (MOR) is an important tool when it comes to solving high-dimensional dynamical systems. MOR is, for instance, exploited in the optimal control context, where many system evaluations are required, and is successfully used in various other applications. There has been an enormous interest in these techniques for deterministic equations. Let us refer to [Reference Antoulas3] and [Reference Benner, Cohen, Ohlberger and Willcox5], where an overview on different approaches is given and further references can be found. MOR for Itô stochastic differential equations is also very natural, thinking of computationally very involved techniques such as Monte Carlo methods. There has been vast progress in the development of MOR schemes in the Itô setting. Let us refer to [Reference Benner and Redmann7], [Reference Redmann and Freitag30], and [Reference Tyranowski36] in order to point out three different approaches in this context.

In this paper we focus on MOR for stochastic systems driven by fractional Brownian motions and with non-zero initial data. The fractional Brownian motion is an excellent candidate for simulating various phenomena in practice due to its self-similarity and long-range dependence. The significance of long-range dependence is underscored by the extensive volume of literature that features this concept in their titles. Such publications span diverse fields including finance [Reference Lo21], econometrics [Reference Robinson34], internet modeling [Reference Karagiannis, Molle and Faloutsos17], hydrology [Reference Painter28], climate studies [Reference Varotsos and Kirk-Davidoff37], linguistics [Reference Alvarez-Lacalle, Dorow, Eckmann and Moses2], DNA sequencing [Reference Karmeshu and Krishnamachari18], and physics [Reference Grecksch and Lisei13]. However, when $H\neq {{1}/{2}}$ , the process $W^H$ is neither a (semi)martingale nor a Markov process. These are the main obstacles when MOR techniques are designed for such systems. The dimension reduction we focus on is conducted by identifying the dominant subspaces using quadratic forms of the solution of the stochastic equation. These matrices are called Gramians. By characterizing the relevance of different state directions using Gramians, less important information can be removed to achieve the desired reduced-order model (ROM). Our work considers various types of Gramians depending on the availability in the different settings. Exact Gramians on compact intervals [0, T] as well as on infinite time horizons are studied. These have previously been used in deterministic frameworks [Reference Gawronski and Juang12, Reference Moore24] or Itô stochastic differential equations [Reference Becker and Hartmann4, Reference Benner and Damm6, Reference Benner and Redmann7, Reference Redmann and Jamshidi31]. There are basically no results on Gramian-based MOR for systems driven by fractional Brownian motion, apart from the heuristic work in [Reference Jamshidi and Redmann16]. In that work, only the special case of Galerkin projection techniques is considered without addressing any kind of theoretical questions. Our motivation is to provide a comprehensive Petrov–Galerkin projection-based MOR framework for systems driven by fractional Brownian motion. At the same time, the contribution of this paper is a detailed theoretical treatment in contrast to [Reference Jamshidi and Redmann16]. Given the Young case of $H>1/2$ , the fractional driver does not have independent increments making it hard to extend the concept of Gramians to this setting. One of our contributions is the analysis of fundamental solutions of Young differential equations. We prove a weak form of semigroup property in Lemma 4.1 which is the basis for a proper definition of Gramians for $H>1/2$ and new even if $H=1/2$ . This lemma is the key for the entire theory and opens up opportunities to study MOR for equations with drivers having only stationary increments. Lemma 4.1, for example, is exploited to show that certain eigenspaces of the proposed Gramians are associated with dominant subspaces of the system and therefore confirms that the choice of Gramians is meaningful. This is another important theoretical contribution of our paper. However, this approach is still very challenging from the computational point of view for fractional Brownian motions with $H>1/2$ . This is due to a missing link between the exact Gramians proposed here and Lyapunov equations, a connection that is the foundation of the previous theory of MOR with $H=1/2$ (Itô). The link to matrix equations only exists when $H=1/2$ , because the increments of the driver are independent in that case. Therefore empirical Gramians based on simulation data are introduced in particular for $H>1/2$ . Computing this approximation of the exact Gramians is still challenging yet vital since they are needed for deriving the ROM. We further point out how exact Gramians can be computed for Stratonovich stochastic differential equations. Here the equivalence to Itô equations is exploited. Although we show that these Gramians identify redundant information in Stratonovich settings, MOR turns out to be not as natural as in the Itô case. In fact, we illustrate that projection-based dimension reduction for Stratonovich equations leads to ROMs that lack important properties. For instance, stability might not be preserved in the ROM and the error does not solely depend on the truncated eigenvalues of the Gramians. This indicates that there are situations in which the projection-based ROM performs poorly. For that reason, we propose a modification of the ROM having all these nice properties known for Itô equations (stability and meaningful error bounds). The reason why the case $H=1/2$ is so special is that we need an extension of classical integration. However, such an extension is not unique, which raises the question: Which setting is better (or best) in the MOR context? One of our findings is that Stratonovich equations should not be projected directly, but rather should be transformed into an Itô equation first before applying MOR.

The paper is structured as follows. In Section 2 we provide a quick introduction to fractional Brownian motions as well as the associated integration. In particular, we define Young’s integral ( $H>1/2$ ) and the integrals in the sense of Itô/Stratonovich ( $H=1/2$ ). In Section 3 we briefly discuss the setting and the general reduced system structure by projection. We give an initial insight into how projection-based reduced systems need to be modified in order to ensure a better approximation quality in the Stratonovich setting. Section 4 contains a study of properties of the fundamental solution to the underlying stochastic system, which is vital for any kind of theoretical concerns and new for each choice of the Hurst index H. A weak type of semigroup property leads to a natural notion of Gramians. We show that these Gramians indeed characterize dominant subspaces of the system and are hence the basis of our later dimension reduction. This subspace identification is an essential theoretical contribution as it explains the role of the Gramian introduced here. Since exact Gramians are not available for each H, we discuss several modifications and approximations in this context. Strategies on how to compute Gramians for Stratonovich equations are delivered as well. In Section 5 we describe the concept of balancing for all variations of Gramians that we have proposed. A truncation procedure then yields a ROM. We further prove that the truncation method is not optimal in the Stratonovich case (no stability preservation and a potentially large error) and suggest an alternative that is based on transformation into the equivalent Itô framework. This is another key contribution of this work. Finally, in Section 6, we utilize the methods described in the previous sections, along with the POD method detailed in Section 6.2, to solve stochastic heat and wave equations with fractional noises. This section presents the results of our simulations and demonstrates the effectiveness of the proposed methods in solving these equations with various noise cases.

2. Fractional Brownian motion and Young/Stratonovich integration

Below, it is assumed that all stochastic processes occurring in this paper are defined on the filtered probability space $ (\Omega, \mathcal{F},(\mathcal{F}_t)_{t\in[0,T]},\mathbb{P})$ , where $(\mathcal{F}_t)_{t\in[0,T]}$ is the natural filtration of the process driving the differential equations studied in this paper. Our main focus is on the fractional Brownian motion (fBm) $W^H(t)$ , $t\geq 0$ , with Hurst parameter $H\in(0,1)$ , that will later take the role of the driving process. It is Gaussian with mean zero and covariance function given by

(2.1)

\begin{equation}\mathbb{E}[W^H(t)W^H(s)]=\dfrac{1}{2}\bigl(s^{2H}+t^{2H}-|t-s|^{2H}\bigr).\end{equation}

The fBm was initially proposed by Kolmogorov, and it was later investigated by Mandelbrot and Van Ness, who developed a stochastic integral representation of fBm using a standard Brownian motion. Additionally, Hurst used statistical analysis of annual Nile river runoffs to create the Hurst index, which is a resulting parameter.

The fBm exhibits self-similarity, which means that the probability distributions of the processes $a^{-H}W^H(at)$ , $t\geq 0$ , and $W^H(t)$ , $t\geq 0$ , are the same for any constant $a>0$ , which is a direct consequence of the covariance function. The increments of the fBm are stationary and, if $H=1/2$ , they are also independent.

The Hölder continuity of the fBm trajectories can be calculated using the modulus of continuity described in [Reference Garsia, Rodemich and Rumsey11]. To be more precise, we find a non-negative random variable $G_{\epsilon,T}$ for all $\epsilon>0$ and $T>0$ , so that

\begin{align*}|W^H(t)-W^H(s)|\leq G_{\epsilon,T}|t-s|^{H-\epsilon},\end{align*}

almost surely, for all $s,t \in [0,T]$ . Therefore the Hurst parameter H not only accounts for the correlation of the increments but also characterizes the regularity of the sample paths. In other words, the trajectories are Hölder-continuous with parameter arbitrary close to H.

In the following, we will always consider $H\geq 1/2$ and briefly recall the corresponding integration theory. In order to cover the ‘smooth case’ $H>1/2$ , the integral defined by Young [Reference Young39] in 1936 is considered. This scenario covers integrands and integrators with a certain Hölder regularity.

Definition 2.1. Let $ C^{\alpha} $ be the set of Hölder-continuous functions defined on [0, T], with index $ 0<\alpha \leq 1 $ . Suppose that $ f \in C^{\alpha} $ and $ g \in C^{\beta}$ , where $ \alpha + \beta > 1 $ . Given a sequence $(t_i^n )^{k_n}_{i=0}$ of partitions of [0, T], with

\begin{align*}\lim_{n\to\infty}\max_{i=0}^{k_n-1}\bigl\{t_{i+1}^n-t_{i}^n\bigr\}=0,\end{align*}

the Young integral $ \int_0^T f(s) \,{\mathrm{d}} g(s) $ is then defined by

\begin{align*} \int_0^T f(s) \,{\mathrm{d}} g(s)\,:\!=\, \lim_{n\rightarrow \infty} \sum_{i=0}^{k_n-1}f(t_i^n)\bigl[g\bigl(t_{i+1}^n\bigr)-g(t^n_i)\bigr].\end{align*}

As the paths of $W^H$ are a.s. Hölder-continuous with $\alpha=H-\epsilon$ , we define $\int_0^T Y(s)\circ {\mathrm{d}} W^H(s)$ for processesYwith $(H-\epsilon)$ -Hölder-continuous trajectories pathwise in the sense of Young if $H>1/2$ . More generally, letYbe a process that has $\beta$ -Hölder-continuous trajectories with $\beta > 1 - H$ and $H\in(0,1)$ . In this case, the Young integral $\int_0^t Y(s) \circ {\mathrm{d}} W^H(s)$ is also well-defined. However, we are only interested in the case of differential equations driven by fBm, where integrals occur withYhaving the same regularity as $W^H$ .

$H=1/2$ represents the boundary case, in which Young integration no longer works. For that reason, the probabilistic approach of Stratonovich is chosen in the following.

Definition 2.2. Let $H=1/2$ and $(t_i^n )^{k_n}_{i=0}$ a partition as in Definition 2.1 Given a continuous semimartingale Y, we set

\begin{equation*}\int_0^T Y(s)\circ {\mathrm{d}} W^H(s) \,:\!=\, \int_0^T Y(s) \,{\mathrm{d}} W^H(s) + \dfrac{1}{2}[Y, W^H]_T,\end{equation*}

where the first term is the Itô integral

\begin{align*}\int_0^T Y(s) \,{\mathrm{d}} W^H(s)\,:\!=\, \mathbb P-\lim_{n\rightarrow \infty} \sum_{i=0}^{k_n-1}Y(t_i^n)\bigl[W^H\bigl(t_{i+1}^n\bigr)-W^H(t^n_i)\bigr]\end{align*}

and

\begin{align*}[Y, W^H]_T\,:\!=\, \mathbb P-\lim_{n\rightarrow \infty} \sum_{i=0}^{k_n-1}\bigl[Y\bigl(t_{i+1}^n\bigr)-Y(t_i^n)\bigr]\bigl[W^H\bigl(t_{i+1}^n\bigr)-W^H(t^n_i)\bigr]\end{align*}

is the quadratic covariation. The expression ‘ $\mathbb P-\lim$ ’ indicates the limit in probability.

Let us refer, for example, to [Reference Kloeden and Platen20] and [Reference Øksendal27] for more details concerning stochastic calculus given $H=1/2$ . The Stratonovich integral can be viewed as the natural extension of Young, since the Stratonovich setting still ensures having a ‘classical’ chain rule. Moreover, $W^H$ , $H=1/2$ , can be approximated by ‘smooth’ processes $W^{H, \epsilon}$ with bounded variation paths when Stratonovich stochastic differential equations are considered; for example, $W^{H, \epsilon}$ can be piecewise linear (Wong–Zakai [Reference Shmatkov35, Reference Wong and Zakai38]). Due to these connections and in order to distinguish from the Itô setting, we use the circle notation $\circ \,{\mathrm{d}} W^H$ for both the Young and Stratonovich cases. It is worth mentioning that the lack of a martingale property makes the analysis of such integrals particularly challenging, and might require advanced mathematical techniques such as Malliavin calculus; see e.g. [Reference Alòs and Nualart1]. Nevertheless, Young and Stratonovich differential equations driven by an fBm have important applications in various fields. The Stratonovich integral, in particular, is widely employed in the physical sciences and engineering due to its compatibility with the traditional chain rule, which simplifies its interpretation in these disciplines. This form of integration is crucial in feedback systems and scenarios where the noise exhibits multiplicative characteristics. Applications of Stratonovich calculus include modeling thermal fluctuations and statistical mechanics [Reference Gardiner10], addressing non-linear noise effects in control theory [Reference Mao22], and studying population dynamics influenced by environmental variability [Reference Moss and McClintock25].

3. Setting and (projection-based) reduced system

We consider a Young/Stratonovich stochastic differential equation controlled by u satisfying

\begin{align*}\|u\|_{L^2_T}^2\,:\!=\, \mathbb E\int_0^T \|u(t)\|_2^2 \,{\mathrm{d}} t < \infty.\end{align*}

Given $H=1/2$ , we further assume that u is an $(\mathcal{F}_t)_{t\in[0,T]}$ -adapted process, since this property is required for the state processes in the context of stochastic integration. Otherwise, the control is general, meaning that it can also be non-Markovian. Now we study the following dynamics:

(3.1)

\begin{equation}\begin{split}{\mathrm{d}} x(t) &=[Ax(t)+Bu(t)]\,{\mathrm{d}} t+\sum_{i=1}^qN _i x(t)\circ{\mathrm{d}} W_{i}^H(t), \quad x(0)=x_0=X_0z,\\y(t)&=Cx(t),\quad t\in[0,T],\end{split}\end{equation}

where $ A,N_i\in \mathbb{R}^{n\times n} $ , $ B\in\mathbb{R}^{n\times m} $ , $ C\in \mathbb{R}^{p\times n} $ , $X_0\in \mathbb{R}^{n\times v} $ , $z\in\mathbb{R}^v$ , and $T>0$ is the terminal time. $W_1^H,\ldots,W_q^H $ are independent fBm with Hurst index $H\in[1/2,1)$ . System (3.1) is defined as an integral equation using Definitions 2.1 ( $H>1/2$ ) and 2.2 ( $H=1/2$ ) to make sense of $\int_0^t N_ix(s)\circ{\mathrm{d}} W_i^H(s).$

For the later dimension reduction procedure, it can be beneficial to rewrite the Stratonovich setting in the Itô form. Given $H=1/2$ , the state equation in (3.1) is equivalent to the Itô equation

(3.2)

\begin{equation}{\mathrm{d}} x(t) =\Biggl[\Biggl(A+\dfrac{1}{2}\sum_{i=1}^q N_i^2\Biggr)x(t)+Bu(t)\Biggr]\,{\mathrm{d}} t+\sum_{i=1}^q N _i x(t)\,{\mathrm{d}} W_{i}^H(t),\end{equation}

exploiting the fact that the quadratic covariation process is

\begin{align*}\sum_{i=1}^q\int_0^t N_i^2 x(s)\,{\mathrm{d}} s,\quad t\in [0, T].\end{align*}

The goal of this paper is to find a system of reduced order. This can be done using projection methods, in which a subspace spanned by the columns of $ V\in \mathbb{R}^{n\times r} $ is identified, so that $x\approx Vx_r$ . Here the particular techniques studied in this paper aim for a strong approximation in the $L^2_T$ sense. Inserting this into (3.1) yields

(3.3)

\begin{equation}Vx_r(t) = X_0z+ \int_0^t [AV x_r(s)+Bu(s)]\,{\mathrm{d}} s+\sum_{i=1}^q \int_0^t N_iV x_r(s)\circ{\mathrm{d}} W^H_i(s)+e(t),\quad y_r(t) =CVx_r(t),\end{equation}

We enforce the error e(t) to be orthogonal to some space $\operatorname{im}(W)$ characterized by a full-rank matrix $W\in\mathbb{R}^{n\times r}$ , that is, the Petrov–Galerkin condition $W^\top e(t)=0$ holds. In addition, let us only focus on methods, where we additionally have $W^\top V=I$ . Now, multiplying (3.3) by $W^\top$ from the left yields

(3.4)

\begin{equation}\begin{split}{\mathrm{d}} x_r(t) &= [A_r x_r(t)+B_r u(t)]\,{\mathrm{d}} t+\sum_{i=1}^q N_{i,r} x_r(t)\circ{\mathrm{d}} W_{i}^H(t), \quad x_r(0)=x_{0,r}= X_{0, r} z,\\y_r(t)&=C_r x_r(t),\quad t\in[0,T],\end{split}\end{equation}

where $X_{0, r}= W^\top X_0$ and

\begin{align*}A_r=W^\top AV,\quad B_r=W^\top B,\quad N_{i,r}=W^\top N_iV,\quad C_r=CV.\end{align*}

The step from (3.1) to (3.4) can also be interpreted as a Petrov–Galerkin approximation using the projection $V W^\top$ onto $\operatorname{im}(V)$ . If $W=V$ has orthonormal columns, we obtain a Galerkin approximation. On the other hand, we want to point out that reduced-order systems can also be of a different form when $H=1/2$ . Inserting $x\approx Vx_r$ into (3.2) instead of (3.1) and conducting the same Petrov–Galerkin procedure, we obtain a reduced Itô system with drift coefficient $A_r+ \frac{1}{2}\sum_{i=1}^q W^\top N_i^2 V$ . Transforming this back into a Stratonovich equation yields

(3.5)

\begin{equation}{\mathrm{d}} \bar x_r(t) =\Biggl[\Biggl(A_r+\dfrac{1}{2}\sum_{i=1}^q \bigl(W^\top N_i^2 V- N_{i,r}^2\bigr)\Biggr)\bar x_r(t)+B_r u(t)\Biggr]\,{\mathrm{d}} t+\sum_{i=1}^q N_{i,r} \bar x_r(t)\circ{\mathrm{d}} W_{i}^H(t),\end{equation}

which is clearly different from the state equation in (3.4). This is due to the Itô–Stratonovich correction not being a linear transformation. Another contribution of this paper is to analyze whether $x_r$ or $\bar x_r$ performs better for $H=1/2$ . This question arises due to the fact that as soon as classical integration (Young/Stieltjes sense) no longer works (e.g. $H\leq 1/2$ ), potential extensions of Young/Stieltjes integrals are no longer unique. Therefore it is interesting to ask for an optimal setting in which the dimension reduction is conducted. So, it can make sense to consider reduced systems (3.5) rather than the direct and maybe more intuitive choice (3.4).

4. Fundamental solutions and Gramians

4.1. Fundamental solutions and their properties

Before we are able to compute suitable reduced systems, we require fundamental solutions $\Phi$ . These $\Phi$ will later lead to the concept of Gramians that identify dominant subspaces. The fundamental solution associated with (3.1) is a two-parameter matrix-valued stochastic process $\Phi$ solving

(4.1)

\begin{equation}\Phi(t,s)=I+\int_s^tA\Phi(\tau,s) \,{\mathrm{d}} \tau+\sum_{i=1}^q\int_s^tN_i\Phi(\tau,s)\circ{\mathrm{d}} W^H_i(\tau)\end{equation}

for $ t\geq s\geq 0 $ . For simplicity, we set $\Phi(t)\,:\!=\, \Phi(t,0)$ , meaning that we omit the second argument if it is zero. We can separate the variables, since we have $ \Phi(t,s)=\Phi(t)\Phi(s)^{-1} $ for $ t\geq s\geq 0 $ . This is due to the fact that $ \Phi(t)\Phi^{-1}(s) $ fulfills equation (4.1). Now we derive the solution of the state equation (3.1) in the following proposition, which is a known result based on the product rule.

Proposition 4.1. The solution of the state equation (3.1) for $ H\in[1/2, 1)$ is given by

(4.2)

\begin{equation}x(t)=\Phi(t)x_0+\int_0^t \Phi(t, s) B u(s)\,{\mathrm{d}} s,\quad t\in [0, T].\end{equation}

Proof. Defining

\begin{align*}k(t)=x_0+\int_0^t \Phi(s)^{-1} B u(s)\,{\mathrm{d}} s,\end{align*}

the result is an immediate consequence of applying the classical product rule (available in the Young/Stratonovich case) to $ \Phi(t)k(t), t\in [0, T]$ . It follows that

\begin{align*}\Phi(t)k(t)&=x_0+\int_0^t \Phi(s)\,{\mathrm{d}} k(s)+\int_0^t ({\mathrm{d}}\Phi(s)) k(s)\\&=x_0+\int_0^t Bu(s)\,{\mathrm{d}} s+ \int_0^tA\Phi(s)k(s)\,{\mathrm{d}} s+ \sum_{i=1}^q\int_0^t N_i\Phi(s)k(s)\circ {\mathrm{d}} W^H_i(s),\end{align*}

meaning that $\Phi(t)k(t)$ , $t\in [0, T]$ , is the solution to (3.1). The result follows by

\begin{equation*}\Phi(t,s)=\Phi(t)\Phi^{-1}(s).\end{equation*}

The fundamental solution lacks the strong semigroup feature of the deterministic case $ (N_i = 0)$ . This means that $ \Phi(t,s) = \Phi(t-s) $ does not hold $\mathbb P$ -almost surely, as the trajectories of $W^H$ on $ [0,t-s] $ and [s, t] are distinct. In the following lemma we can demonstrate that the semigroup property holds in distribution exploiting the stationary increments of $W^H$ . In fact, this lemma is the key to studying MOR for stochastic systems driven by processes with stationary (not necessarily independent) increments and can therefore be applied to settings beyond the case studied in this work.

Lemma 4.1. It holds that the fundamental solution of (3.1) satisfies

\begin{align*}\Phi(t,s)\overset{d}{=} \Phi(t-s), \quad t\geq s\geq 0.\end{align*}

Proof. We consider $\Phi(\cdot)$ on the interval $[0, t-s]$ and $\Phi(\cdot,s)$ on [s, t]. Introducing the step size $\Delta t = {{(t-s)}/{N}}$ , we find the partitions $t_k= k \Delta t$ and $t_k^{(s)}=s+t_k$ , $k\in\{0, 1, \ldots, N\}$ , of $[0, t-s]$ and [s, t]. We introduce the Euler discretization of (4.1) as follows:

(4.3)

\begin{equation}\begin{split}\Phi_{k+1}&=\Phi_k+A\Phi_k\Delta t+\sum_{j=1}^q N_j\Phi_k\Delta W^{H}_{j, k},\\\Phi_{k+1}^{(s)}&=\Phi_k^{(s)}+A\Phi_k^{(s)}\Delta t+\sum_{j=1}^q N_j\Phi_k^{(s)}\Delta W^{H, (s)}_{j, k},\end{split}\end{equation}

where we define

\begin{align*}\Delta W^{H}_{j,k}=W^H_{j}(t_{k+1})-W^H_{j}(t_k)\quad\text{and}\quad\Delta W^{H, (s)}_{j, k}=W^H_{j}\bigl(t_{k+1}^{(s)}\bigr)-W^H_{j}\bigl(t_k^{(s)}\bigr).\end{align*}

According to [Reference Mishura23] and [Reference Neuenkirch and Nourdin26], the Euler scheme converges $\mathbb P$ -almost surely for $H>1/2$ , yielding in particular convergence in distribution, that is,

(4.4)

\begin{equation}\Phi_N\xrightarrow{d} \Phi(t-s) ,\quad \Phi_N^{(s)}\xrightarrow{ d } \Phi(t,s),\end{equation}

as $N\rightarrow \infty$ . The Euler method does not converge almost surely in the Stratonovich setting. However, for $H=1/2$ , we can rewrite (4.1) as the Itô equation

\begin{align*}\Phi(t,s)=I+\int_s^t\Biggl(A+\dfrac{1}{2}\sum_{i=1}^q N_i^2\Biggr)\Phi(\tau,s)\,{\mathrm{d}}\tau+\sum_{i=1}^q\int_s^tN_i\Phi(\tau,s)\,{\mathrm{d}} W^H_i(\tau).\end{align*}

This equation can be discretized by a scheme as in (4.3) (Euler–Maruyama). The corresponding convergence is in $L^1(\Omega, \mathcal F, \mathbb P)$ [Reference Kloeden and Platen20], so we have (4.4) for $H=1/2$ as well. By a simple calculation we can get from (4.3) that

\begin{align*}\Phi_{N}&=\prod_{k=0}^{N-1}\Biggl( I+A\Delta t+\sum_{j=1}^qN_j\Delta W^{H}_{j, k}\Biggr)\,=\!:\, F(Z),\\\Phi_{N}^{(s)}&=\prod_{k=0}^{N-1}\Biggl(I+A\Delta t+\sum_{j=1}^q N_j\Delta W^{H, (s)}_{j, k}\Biggr)=F(Z^{(s)}),\end{align*}

where

\begin{align*}Z\,:\!=\, \bigl(\Delta W^{H}_{j, k}\bigr)\quad\text{and}\quad Z^{(s)}\,:\!=\, \bigl(\Delta W^{H, (s)}_{j, k}\bigr),\quad j=1, \ldots, q, \ k=0, \ldots, N-1,\end{align*}

are Gaussian vectors with mean zero. Notice that the function F is just slightly different for $H=1/2$ , that is, A is replaced by $A+\frac{1}{2}\sum_{i=1}^q N_i^2$ . It remains to show that the covariance matrices of Z and $Z^{(s)}$ coincide, leading to $\Phi_N(t,s)\overset{d}{=} \Phi_N(t-s)$ . Subsequently the result follows by (4.4). Using the independence of $W^H_i$ and $W^H_j$ for $i\neq j$ , the non-zero entries of the covariances of Z and $Z^{(s)}$ are

\begin{align*}\mathbb{E}\bigl[\Delta W^{H}_{j, k} \Delta W^{H}_{j, \ell}\bigr]\quad\text{and}\quad \mathbb{E}\bigl[\Delta W^{H, (s)}_{j, k} \Delta W^{H, (s)}_{j, \ell}\bigr],\quad k, \ell = 0, 1, \ldots, N-1,\end{align*}

respectively. These expressions are the same, since, exploiting (2.1), we see that

\begin{align*}\mathbb{E}\bigl[\Delta W^{H, (s)}_{j, k} \Delta W^{H, (s)}_{j, \ell}\bigr]&=\mathbb{E}\bigl[\bigl(W^{H}_{j}(s+t_{k+1})-W^{H}_{j}(s+t_k)\bigr)\bigl(W^{H}_{j}(s+t_{\ell+1})-W^{H}_{j}(s+t_\ell)\bigr)\bigr]\\&=\dfrac{1}{2}\bigl(\vert t_{k+1}-t_\ell \vert^{2 H}+\vert t_{k}-t_{\ell+1} \vert^{2 H}-\vert t_{k+1}-t_{\ell+1} \vert^{2 H}-\vert t_{k}-t_\ell\vert^{2 H}\bigr)\end{align*}

is independent of s. This concludes the proof.

Let us mention that the result of Lemma 4.1 is new even for the well-studied case of $H=1/2$ . In fact, we later exploit the fact that Lemma 4.1 yields

(4.5)

\begin{equation}\mathbb E[\Phi(t,s)M\Phi(t,s)^\top]=\mathbb E[\Phi(t-s)M\Phi(t-s)^\top]\end{equation}

for a matrix M of suitable dimension. Property (4.5) had already been proved for $H=1/2$ using relations to matrix ODEs [Reference Benner and Redmann7]. This is not possible for general H, so the much stronger result of Lemma 4.1 is required.

4.2. Exact and empirical Gramians

4.2.1. Exact Gramians and dominant subspaces

Similar to the approach presented in the second POD-based method outlined in the reference [Reference Jamshidi and Redmann16], our methodology involves partitioning the primary system described in equation (3.1) into distinct subsystems in the following manner:

(4.6)

\begin{align}{\mathrm{d}} x_u(t)&=[Ax_u(t)+Bu(t)]\,{\mathrm{d}} t+\sum_{i=1}^qN_i x_u(t) \circ {\mathrm{d}} W_i^H(t), \quad x_u(0)=0,\ y_u(t)=Cx_u(t),\end{align}

(4.7)

\begin{align}{\mathrm{d}} x_{x_0}(t)&= Ax_{x_0}(t)\,{\mathrm{d}} t+\sum_{i=1}^q N_i x_{x_0}(t)\circ {\mathrm{d}} W_i^H(t), \quad x_{x_0}(0)=x_0=X_0z, \ y_{x_0}(t)=Cx_{x_0}(t).\end{align}

Proposition 4.1 shows that we have the representations $x_{x_0}(t)=\Phi(t)x_0$ and

\begin{align*}x_u(t)=\int_0^t \Phi(t, s) B u(s)\,{\mathrm{d}} s,\end{align*}

so $y(t)=y_{x_0}(t)+ y_u(t)$ follows. Lemma 4.1 is now vital for a suitable definition of Gramians. Due to the weak semigroup property of the fundamental solution in Lemma 4.1, it turns out that

(4.8)

\begin{equation}P_{u, T}\,:\!=\, \mathbb{E}\biggl[ \int_0^T\Phi(s)BB^\top\Phi(s)^\top\,{\mathrm{d}} s \biggr],\quad P_{x_0, T}\,:\!=\, \mathbb{E}\biggl[ \int_0^T\Phi(s)X_0X_0^\top\Phi(s)^\top\,{\mathrm{d}} s \biggr]\end{equation}

are the right notion of Gramians for (4.6) and (4.7). With (4.8) we then define a Gramian $P_T\,:\!=\, P_{u, T}+P_{x_0, T}$ for the original state equation (3.1). For the output equation in (3.1), a Gramian can be introduced directly by

\begin{equation*}Q_{T}\,:\!=\, \mathbb E\int_0^{T} \Phi(s)^\top C^\top C \Phi(s) \,{\mathrm{d}} s.\end{equation*}

The following proposition contains estimates that tell us in which sense the above Gramians characterize dominant subspaces of the system. It heavily relies on Lemma 4.1, indicating the importance of this key lemma in the theory of dimension reduction for stochastic systems driven by fBm. In particular, the next proposition addresses the theoretical shortcomings of [Reference Jamshidi and Redmann16], where model reduction methods have been used without proving the relation between Gramians and dominant subspaces of the system. In the following, we prove that the proposed Gramians can be used to identify unimportant information in the system.

Proposition 4.2. Given $v \in \mathbb R^n$ , an initial state of the form $x_0=X_0 z$ , a parameter $p=1, 2$ and a control $u\in L^2_T$ that is assumed to be deterministic if $p=2$ . Then we have

(4.9)

\begin{equation}\int_0^T\mathbb{E}\langle x_{x_0}(t), v\rangle^2_2\,{\mathrm{d}} t \leq v^\top P_{x_0, T} v \|z\|_2^2, \quad \sup\nolimits_{t\in[0,T]}\mathbb{ E}\vert\langle x_u(t), v\rangle_2\vert^p \leq (v^\top P_{u, T} v)^{{{p}/{2}}} \|u\|_{L^2_T}^p\end{equation}

for $p=1, 2$ . Consequently we have

(4.10)

\begin{equation}\int_0^T\mathbb{E}\vert\langle x(t), v\rangle_2\vert \,{\mathrm{d}} t \leq \sqrt{2} \sqrt{v^\top P_{T} v} \max\bigl\{\sqrt{T}\|z\|_2,T \|u\|_{L^2_T}\bigr\}\end{equation}

for all $u\in L^2_T$ and

(4.11)

\begin{equation}\int_0^T\mathbb{E}\vert\langle x(t), v\rangle_2\vert^2 \,{\mathrm{d}} t \leq 2 v^\top P_{T} v \max\Bigl\{\|z\|_2^2,T \|u\|_{L^2_T}^2\Bigr\}\end{equation}

if u is also deterministic. Moreover, we have

(4.12)

\begin{equation}\int_0^T \mathbb{E}\|C\Phi(t)v\|_2^2\,{\mathrm{d}} t=v^\top Q_T v.\end{equation}

Proof. The first relation is a simple consequence of the inequality of Cauchy–Schwarz and the representation of $x_{x_0}$ in Proposition 4.1. Thus

\begin{align*}\int_0^T\mathbb{E}\langle x_{x_0}(t), v\rangle^2_2\,{\mathrm{d}} t&=\mathbb{E}\int_0^T\langle \Phi(t)X_0z, v\rangle^2_2\,{\mathrm{d}} t\\[4pt]& = \mathbb{E}\int_0^T\langle z, X_0^\top\Phi(t)^\top v\rangle_2^2\,{\mathrm{d}} t\\[4pt]&\leq \|z\|^2_2 \mathbb E\int_0^T\| X_0^\top\Phi(t)^\top v\|^2_2\,{\mathrm{d}} t\\[4pt]&=v^\top P_{x_0, T} v \|z\|^2_2.\end{align*}

Utilizing equation (4.2) and the Cauchy–Schwarz inequality once more, we have

\begin{align*}\mathbb{E}\vert\langle x_u(t), v\rangle_2\vert^p&=\mathbb{E}\biggl\vert\biggl\langle \int_0^t\Phi(t,s)Bu(s)\,{\mathrm{d}} s, v\biggr\rangle_2\biggr\vert^p\\[4pt]&\leq\mathbb{E}\biggl[\biggl(\int_0^t\vert\langle \Phi(t,s)Bu(s), v\rangle_2\vert\,{\mathrm{d}} s\biggr)^p\biggr]\\[4pt]&= \mathbb{E}\biggl[\biggl(\int_0^t\vert\langle u(s), B^\top\Phi(t,s)^\top v\rangle_2\vert\,{\mathrm{d}} s\biggr)^p\biggr]\\[4pt]&\leq \mathbb{E}\biggl[\biggl(\int_0^t\|u(s)\|_2 \| B^\top\Phi(t,s)^\top v\|_2\,{\mathrm{d}} s\biggr)^p\biggr] \\[4pt]&\leq \biggl(v^\top\mathbb{E}\int_0^t\Phi(t,s)BB^\top\Phi(t,s)^\top\,{\mathrm{d}} s\, v\biggr)^{{{p}/{2}}} \,\|u\|_{L^2_t}^p.\end{align*}

Based on Lemma 4.1, we obtain

\begin{align*}\mathbb E[\Phi(t,s)BB^\top\Phi(t,s)^\top]=\mathbb E[\Phi(t-s)BB^\top\Phi(t-s)^\top].\end{align*}

Hence

\begin{equation*}\mathbb{E}\vert\langle x_u(t), v\rangle_2\vert^p \leq \biggl(v^\top\mathbb{E}\int_0^t\Phi(t-s)BB^\top\Phi(t-s)^\top\,{\mathrm{d}} s\, v\biggr)^{{{p}/{2}}} \,\|u\|_{L^2_t}^p\leq (v^\top P_{u, T}\, v)^{{{p}/{2}}} \,\|u\|_{L^2_T}^p\end{equation*}

by variable substitution and the increasing nature of $P_{u, T}$ and $\|u\|_{L^2_T}$ in T. This shows the second part of (4.9). Exploiting Proposition 4.1, we know that $x= x_{x_0} + x_u$ . Therefore we have

\begin{align*}\int_0^T\mathbb{E}\langle x(t), v\rangle^2_2\,{\mathrm{d}} t &\leq 2\biggl(\int_0^T\mathbb{E}\langle x_{x_0}(t), v\rangle^2_2\,{\mathrm{d}} t+\int_0^T\mathbb{E}\langle x_{u}(t), v\rangle^2_2\,{\mathrm{d}} t\biggr)\\&\leq 2\biggl(\int_0^T\mathbb{E}\langle x_{x_0}(t), v\rangle^2_2\,{\mathrm{d}} t+T\sup_{t\in[0,T]}\mathbb{E}\langle x_{u}(t), v\rangle^2_2\biggr)\end{align*}

by the linearity of the inner product in the first argument. Applying (4.9) with $p=2$ to this inequality yields (4.11), using $P_T=P_{x_0, T}+ P_{u, T}$ . On the other hand, we obtain

\begin{align*}\int_0^T\mathbb{E}\vert\langle x(t), v\rangle_2\vert\,{black}{\mathrm{d}} t &\leq \int_0^T\mathbb{E}\vert\langle x_{x_0}(t), v\rangle_2\vert\,{\mathrm{d}} t+\int_0^T\mathbb{E}\vert\langle x_{u}(t), v\rangle_2\vert\,{\mathrm{d}} t\\&\leq \sqrt{T}\sqrt{\int_0^T\mathbb{E}\langle x_{x_0}(t), v\rangle^2_2\,{\mathrm{d}} t}+T\sup_{t\in[0,T]}\mathbb{E}\vert\langle x_{u}(t), v\rangle_2\vert.\end{align*}

Applying (4.9) with $p=1$ to this inequality yields (4.10) using

\begin{align*}\sqrt{v^\top P_{x_0, T} v}+\sqrt{v^\top P_{u, T} v}\leq \sqrt{2}\sqrt{v^\top P_{T} v}.\end{align*}

By the definitions of $Q_T$ and the Euclidean norm, (4.12) immediately follows, so this proof is concluded.

Remark 4.1. If the limits $P_{x_0}=\lim_{T\rightarrow \infty} P_{x_0, T}$ , $P_u=\lim_{T\rightarrow \infty} P_{u, T}$ , $P=\lim_{T\rightarrow \infty} P_T$ and $Q=\lim_{T\rightarrow \infty} Q_T$ exist, the Gramians in Proposition 4.2 can be replaced by their limit, as we have $v^\top P_T v \leq v^\top P v$ , $v^\top Q_T v \leq v^\top Q v$ etc. for all $v\in \mathbb R^n$ .

The following remark explains the role of the results in Proposition 4.2 in more detail. In fact, Proposition 4.2 delivers the theoretical motivation for the dimension reduction procedure studied in this work, a motivation that is missing in the special case considered in [Reference Jamshidi and Redmann16].

Remark 4.2. We can read Proposition 4.2. as follows. If v is an eigenvector of $P_{x_0, T}$ and $P_{u, T}$ , respectively, associated with a small eigenvalue, then $x_{x_0}$ and $x_u$ are small in the direction of v. Such state directions can therefore be neglected. The same interpretation holds for x using (4.10) or (4.11) when v is a respective eigenvector of $P_T$ . Now, given $t_0\in [0, T)$ , we expand the state variable as

\begin{align*}x(t_0)=\sum_{k=1}^n\langle x(t_0),q_k\rangle_2q_k,\end{align*}

where $ (q_k)_{k=1,\ldots,n} $ represents an orthonormal set of eigenvectors of $ Q_T $ . We aim to answer the question of which directions $q_k$ in $x(t_0)$ barely contribute to y on $[t_0, T]$ . We can represent the state in (3.1) by

(4.13)

\begin{equation}x(t) =x(t_0)+\int_{t_0}^t [Ax(s)+Bu(s)]\,{\mathrm{d}} s+\sum_{i=1}^q \int_{t_0}^t N _i x(s)\circ\,{\mathrm{d}} W_{i}^H(s), \quad t\in[t_0, T].\end{equation}

and introduce $\tilde x$ as the solution of (4.13) when replacing $x(t_0)$ by $\tilde x(t_0)\,:\!=\, x(t_0)-\langle x(t_0),q_k\rangle_2q_k$ , i.e. the direction $q_k$ is neglected. The process $x-\tilde x$ then solves (4.13) with $u\equiv 0$ starting in $\langle x(t_0),q_k\rangle_2q_k$ at $t_0$ . Therefore the difference in the associated outputs is

\begin{align*}y(t)-C \tilde x(t)=\langle x(t_0),q_k\rangle_2C\Phi(t, t_0)q_k,\quad t\in [t_0, T],\end{align*}

using the solution representation in (4.2). For that reason, we focus solely on the term $C\Phi(t, t_0)q_k$ and observe that

\begin{align*}\int_{t_0}^T \mathbb{E}\|C\Phi(t, t_0)q_k\|_2^2\,{\mathrm{d}} t= \int_{t_0}^T \mathbb{E}\|C\Phi(t-t_0)q_k\|_2^2\,{\mathrm{d}} t\leq \int_{0}^T \mathbb{E}\|C\Phi(t)q_k\|_2^2\,{\mathrm{d}} t\end{align*}

using Lemma 4.1. Identity (4.12) therefore tells us that the direction $v=q_k$ in $x(t_0)$ has a low impact on y(t), $t\in[t_0,T]$ , if the corresponding eigenvalue is small. Such $q_k$ can be removed from the each state $x(t_0)$ without causing a large error in between the exact output y and its approximation $C \tilde x$ .

4.2.2. Approximation and computation of Gramians

In theory, Proposition 4.2 together with Remark 4.2 is the key when aiming to identify dominant subspaces of (3.1) that lead to ROMs. However, for practical purposes, strategies to compute the associated Gramians are vital.

Empirical Gramians for $H\geq 1/2$ . The Gramians that we defined above are generally hard to compute. In fact, there is no known link between these Gramians and algebraic Lyapunov equations or matrix differential equations when $H>1/2$ . For that reason, we suggest an empirical approach in which approximate Gramians based on sampling are calculated. In particular, we consider a discretization of the integral representations by a Monte Carlo method. Let us introduce a equidistant time grid $0=s_0 < s_1 < \dots < s_{\mathcal N}=T$ and let $\mathcal N_s$ further be the number of Monte Carlo samples. Given that $\mathcal N$ and $\mathcal N_s$ are sufficiently large, we obtain

(4.14)

\begin{equation}\begin{split}{P}_{u, T}\approx\bar {P}_{u, T}&=\dfrac{T}{\mathcal N \cdot \mathcal N_s}\sum_{i=1}^{\mathcal N}\sum_{j=1}^{\mathcal N_s}\Phi(s_i,\omega_j)B B^\top \Phi(s_i,\omega_j)^\top,\\{P}_{x_0, T}\approx\bar{P}_{x_0, T}&=\dfrac{T}{\mathcal N \cdot \mathcal N_s}\sum_{i=1}^{\mathcal N}\sum_{j=1}^{\mathcal N_s} \Phi(s_i,\omega_j)X_0 X_0^\top \Phi(s_i,\omega_j)^\top,\end{split}\end{equation}

where $\omega_j\in \Omega$ . Now the advantage is that $\Phi(\cdot)B$ and $\Phi(\cdot) X_0$ are easy to sample as they are the solutions of the control independent variable $x_{x_0}$ in (4.7) with initial states $x_0\mapsto B$ and $x_0\mapsto X_0$ , respectively. This is particularly feasible if B and $X_0$ only have a few columns. Based on (4.14), we can then define $\bar P_T\,:\!=\, \bar {P}_{x_0, T}+\bar {P}_{u, T}$ approximating $P_T$ . Here the goal is to choose $\mathcal N$ and $\mathcal N_s$ so that the estimates in Proposition 4.2 still hold (approximately) ensuring the dominant subspace characterization by the empirical Gramians. Notice that if the limits of the Gramians as $T\rightarrow \infty$ are considered, then the terminal time needs to be chosen sufficiently large. In fact, it is also possible to write down the empirical version of $Q_T$ , which is

\begin{equation*} \bar {Q}_{T}=\dfrac{T}{\mathcal N \cdot \mathcal N_s}\sum_{i=1}^{\mathcal N}\sum_{j=1}^{\mathcal N_s}\Phi(s_i,\omega_j)^\top C^\top C \Phi(s_i,\omega_j).\end{equation*}

However, this object is computationally much more involved. This is because $C \Phi(\cdot)$ is not a solution to an equation like (4.7) that can be sampled easily when only a few initial states are of interest. In fact, we might have to sample from (4.1) to determine $\bar {Q}_{T}$ . This is equivalent to computing samples of $x_{x_0}$ in (4.7) for n different initial states, i.e. $x_0\mapsto I$ . The issue is that n is very large, whereas the number of columns of B and $X_0$ is generally low. This leaves open the question of whether $\bar Q_T$ is numerically tractable.

Exact computation of Gramians for $H=1/2$ . Let us briefly discuss that the computation of $P_T$ , $Q_T$ or their limits as $T\rightarrow \infty$ is easier when we are in the Stratonovich setting of $H=1/2$ . Once more let us point out the relation between Itô and Stratonovich differential equations. So the fundamental solution of the state equation in (3.1) defined in (4.1) is also the fundamental solution of (3.2), i.e. it satisfies

\begin{align*}\Phi(t)=I+\int_0^t A_N\Phi(s)\,{\mathrm{d}} s+\sum_{i=1}^q\int_0^tN_i\Phi(s)\,{\mathrm{d}} W^H_i(s),\end{align*}

where $A_N\,:\!=\, A+\frac{1}{2}\sum_{i=1}^q N_i^2$ . Now, defining the linear operators $\mathcal{L}_{A_N}(X) =A_NX + XA_N^\top $ and

\begin{align*}\Pi(X) =\sum_{i=1}^qN_iXN_i^\top,\end{align*}

it is a well-known fact (a consequence of Itô’s product rule in [Reference Øksendal27]) that $Z(t)= \mathbb E[\Phi(t) M \Phi(t)^\top]$ solves

(4.15)

\begin{equation}\dfrac{{\mathrm{d}}}{{\mathrm{d}} t} Z(t)=\mathcal L_{A_N}[Z(t)]+\Pi[Z(t)], \quad Z(0) =M,\ t\geq 0,\end{equation}

where M is a matrix of suitable dimension. We refer, for example, to [Reference Redmann and Jamshidi31] for more details. Setting $M=B B^\top+X_0 X_0^\top$ and integrating (4.15) yields

(4.16)

\begin{equation}Z(T)-B B^\top- X_0 X_0^\top=\mathcal L_{A_N}[P_T]+\Pi[P_T]\end{equation}

using

\begin{align*}P_T=\mathbb{E}\biggl[ \int_0^T\Phi(s)\bigl(B B^\top+X_0X_0^\top\bigr)\Phi(s)^\top\,{\mathrm{d}} s \biggr].\end{align*}

If system (3.1) is mean-square asymptotically stable, that is, $\mathbb E \|\Phi(t)\|^2$ decays exponentially to zero, then we also find $-B B^\top- X_0 X_0^\top=\mathcal L_{A_N}[P]+\Pi[P]$ for the limit P of $P_T$ .

There is still a small gap in the theory left in [Reference Redmann and Jamshidi31, Proposition 2.2] on how to compute $Q_T$ in the case of $H=1/2$ . Therefore the following proposition was stated under the additional assumption that $C^\top C$ is contained in the eigenspace of $\mathcal L_{A_N}^*+ \Pi^*$ , where

\begin{align*}\mathcal L_{A_N}^*(X)= A_N^\top X+X A_N, \quad \Pi^*(X)= \sum_{i=1}^q N_i^\top X N_i.\end{align*}

We prove this result in full generality below.

Proposition 4.3. In the Stratonovich setting of $H=1/2$ , the function $Z_*(t)=\mathbb{E}[\Phi(t)^\top C^\top C\Phi(t)]$ solves

(4.17)

\begin{equation}\dfrac{{\mathrm{d}}}{{\mathrm{d}} t} Z_*(t)=\mathcal L_{A_N}^*[Z_*(t)]+\Pi^*[Z_*(t)], \quad Z_*(0) =C^\top C,\ t\geq 0.\end{equation}

Proof. Let us vectorize the matrix differential equation (4.15) leading to

\begin{align*}\frac{{\mathrm{d}}}{{\mathrm{d}} t} \operatorname{vec}[Z(t)] = \mathcal K \operatorname{vec}[Z(t)],\quad \operatorname{vec}[Z(0)]=\operatorname{vec}[M],\end{align*}

where

\begin{equation*}\mathcal{K} = A_N\otimes I+I\otimes A_N+\sum_{i=1}^q N_i\otimes N_i\end{equation*}

with $\otimes$ representing the Kronecker product between two matrices and $\operatorname{vec}[\cdot]$ being the vectorization operator. Therefore we know that

\begin{align*}{\mathrm{e}}^{\mathcal K t} \operatorname{vec}[M]=\operatorname{vec}[Z(t)]=\operatorname{vec}[ \mathbb E[\Phi(t) M \Phi(t)^\top ]]=\mathbb E[\Phi(t) \otimes \Phi(t) ]\operatorname{vec}[M],\end{align*}

again exploiting the relation between the vectorization and the Kronecker product. Since this identity is true for all matrices M, we have $\mathbb E[\Phi(t) \otimes \Phi(t) ]={\mathrm{e}}^{\mathcal K t}$ . This is now applied to

\begin{align*}\operatorname{vec}[Z_*(t)]= \operatorname{vec} [\mathbb E[\Phi(t)^\top C^\top C \Phi(t)]]=\mathbb E[\Phi(t)^\top \otimes \Phi(t)^\top ]\operatorname{vec}[C^\top C]= {\mathrm{e}}^{\mathcal K^\top t}\operatorname{vec}[C^\top C],\end{align*}

since $\mathbb E[\Phi(t)^\top \otimes \Phi(t)^\top ]=(\mathbb E[\Phi(t)\otimes \Phi(t)])^\top$ . Therefore we obtain

\begin{align*}\frac{{\mathrm{d}}}{{\mathrm{d}} t} \operatorname{vec}[Z_*(t)] = \mathcal K^\top \operatorname{vec}[Z_*(t)],\quad \operatorname{vec}[Z_*(0)]=\operatorname{vec}[C^\top C].\end{align*}

Devectorizing this equation and exploiting the fact that $\mathcal K^\top$ is the matrix representation of $\mathcal L_{A_N}^*+\Pi^*$ leads to the claim of this proposition.

Integrating (4.17) and using

\begin{align*}Q_T=\mathbb E\biggl[\int_0^T\Phi(t)^\top C^\top C \Phi(t)\,{\mathrm{d}} t\biggr]\end{align*}

leads to

(4.18)

\begin{equation}Z_*(T)-C^\top C=\mathcal L_{A_N}^*[Q_T]+\Pi^*[Q_T].\end{equation}

Once more, mean-square asymptotic stability yields the well-known relation $-C^\top C=\mathcal L_{A_N}^*[Q]+\Pi^*[Q]$ by taking the limit as $T\rightarrow \infty$ in (4.18). Although we found algebraic equation (4.16) and (4.18) from which $P_T$ and $Q_T$ could be computed, it is still very challenging to solve these equations. This is mainly due to the unknowns Z(T) and $Z_*(T)$ . In fact, in [Reference Redmann and Jamshidi31] we suggest sampling and variance reduction-based strategies to solve (4.16) and (4.18). We refer to this paper for more details.

5. Model reduction of Young/Stratonovich differential equations

In this section we introduce ROMs that are based on the (empirical) Gramians of Section 4.2 as they (approximately) identify the dominant subspaces of (3.1). In order to accomplish this, we discuss state space transformations first that diagonalize these Gramians. This diagonalization allows us to assign unimportant direction in the dynamics to certain state components according to Proposition 4.2. Subsequently, the issue is split up into two parts. A truncation procedure is briefly explained for the general case of $H\in[1/2, 1)$ , in which unimportant state variables are removed. This strategy is associated with (Petrov–) Galerkin schemes sketched in Section 3. Later, we focus on the case of $H=1/2$ and point out an alternative ansatz that is supposed to perform better than the previously discussed projection method. Let us notice once more that since a fractional Brownian motion with $H>1/2$ does not have independent increments, no Lyapunov equations associated with the Gramians can be derived. Therefore we frequently refer to the empirical versions of these Gramians and the corresponding reduced dimension techniques.

5.1. State space transformation and balancing

We introduce a new variable $\tilde x(t) = S x(t)$ , where S is a regular matrix. This can be interpreted as a coordinate transform that is chosen in order to diagonalize the Gramians of Section 4.2. This transformation is the basis for the dimension reduction discussed in Sections 5.2 and 5.3. Now, inserting $x(t) = S^{-1} \tilde x(t)$ into (3.1), we obtain

(5.1)

\begin{equation}\begin{split}{\mathrm{d}}\tilde x(t) &=[\tilde A\tilde x(t)+\tilde Bu(t)]\,{\mathrm{d}} t+\sum_{i=1}^q \tilde N _i \tilde x(t)\circ{\mathrm{d}} W_{i}^H(t), \quad \tilde x(0)=\tilde x_0=\tilde X_0z,\\y(t)&=\tilde C\tilde x(t),\quad t\in[0,T],\end{split}\end{equation}

where $\tilde A=S{A}S^{-1}$ , $\tilde B= SB$ , $\tilde N_i=S{N_i}S^{-1}$ , $\tilde X_0= SX_0$ , and $\tilde C= C S^{-1}$ . As we can observe from (5.1), the output remains unchanged under the transformation. However, the fundamental solution of the state equation in (5.1) is

(5.2)

\begin{equation}\tilde \Phi(t) = S \Phi(t) S^{-1}.\end{equation}

This is obtained by multiplying (4.1) by S from the left and $S^{-1}$ from the right. Relation (5.2) immediately transfers to the Gramians, which are

(5.3)

\begin{align}\tilde P_T&\,:\!=\, \mathbb E \int_0^T \tilde \Phi(s)\bigl(\tilde B\tilde B^\top+\tilde X_0\tilde X_0^\top\bigr) \tilde\Phi(s)^\top \,{\mathrm{d}} s = S P_T S^\top ,\end{align}

(5.4)

\begin{align}\tilde Q_T&\,:\!=\, \mathbb E \int_0^T \tilde \Phi(s)^\top\tilde C^\top\tilde C \tilde\Phi(s) \,{\mathrm{d}} s = S^{-\top} Q_T S^{-1}.\end{align}

Exploiting (5.2) again, the same relations as in (5.3) and (5.4) hold true if $P_T$ and $Q_T$ are replaced by their limits P, Q or their empirical versions $\bar P_T, \bar Q_T$ . In the next definition, different diagonalizing transformations S are introduced.

Definition 5.1.

(i) Let the state space transformation S be given by the eigenvalue decomposition $P_T= S^\top \Sigma S$ , where $\Sigma$ is the diagonal matrix of eigenvalues of $P_T$ . Then the procedure is called $P_T$ -balancing.
(ii) Let $P_T$ and $Q_T$ be positive definite matrices. If S is of the form $ S=\Sigma^{{{1}/{2}}} U^\top L_P^{-1}$ with the factorization $P_T=L_PL_P^\top$ and the spectral decomposition $L_P^\top Q_T L_P=U\Sigma^2 U^\top$ , where $\Sigma^2$ is the diagonal matrix of eigenvalues of $P_TQ_T$ . Then the transformation is called $P_T$ / $Q_T$ -balancing.
(iii) Replacing $P_T$ and $Q_T$ with their limits (as $T\rightarrow \infty$ ) in (i) and (ii), then the schemes are called P-balancing or P/Q-balancing, respectively, where in these cases $\Sigma$ is either the matrix of eigenvalues of P or $\Sigma^2$ contains the eigenvalues of PQ.
(iv) Using the empirical versions of $P_T$ and $Q_T$ instead, the methods in (i) and (ii) are called $\bar P_T$ -balancing and $\bar P_T$ / $\bar Q_T$ -balancing. Here $\Sigma$ can be viewed as a random diagonal matrix of the respective eigenvalues.

Notice that balancing based on Gramians $P_T$ , P, or $\bar P_T$ refers to the aim of an approximation of the full state x instead of y. Diagonalizing only one Gramian is, of course, also computationally cheaper but certainly leads to a worse approximation of y if information in $Q_T$ , Q, or $\bar Q_T$ is not involved in the model reduction procedure. It is not difficult to check that the transformations introduced in Definition 5.1 diagonalize the underlying Gramians. Nevertheless, we formulate the following proposition.

Proposition 5.1.

• Using the matrix S in Definition 5.1(i), we find that the state variable Gramian of system (5.1) is $\tilde P_T= \Sigma$ .
• If instead S is of the form given in Definition 5.1(ii), we have $\tilde P_T=\tilde Q_T= \Sigma$ .
• The same type of diagonalization is established if the underlying Gramians are either P, Q or $\bar P_T, \bar Q_T$ .

Proof. The result follows by inserting the respective S into (5.3) and (5.4). Since these relations also hold true for the pairs P, Q and $\bar P_T, \bar Q_T$ , the same argument applies in these cases as well.

Having diagonal Gramians $\Sigma$ , Proposition 4.2 (choose v to be the ith unit vector in $\mathbb R^n$ ) together with Remark 4.2 tells us that we can neglect state components in (5.1) that correspond to small diagonal entries $\sigma_i$ of $\Sigma$ . These have to be truncated to obtain a reduced system.

5.2. Reduced-order models based on projection

In that spirit, we decompose the diagonal Gramian based on one of the balancing procedures in Definition 5.1. We write

(5.5)

\begin{equation}\Sigma= \begin{bmatrix}{\Sigma}_{1}& \\&{\Sigma}_{2}\end{bmatrix}\!,\end{equation}

where $\Sigma_1\in\mathbb R^{r\times r}$ contains the r large diagonal entries of $\Sigma$ and $\Sigma_2$ the remaining small ones. We further partition the balanced coefficient of (5.1) as follows:

(5.6)

\begin{equation}\tilde A=\bigl[\!\begin{smallmatrix}{A}_{11}&{A}_{12}\\ {A}_{21}&{A}_{22}\end{smallmatrix}\!\bigr],\quad \tilde{B} =\bigl[\!\begin{smallmatrix}{B}_1\\ {B}_2\end{smallmatrix}\!\bigr],\quad \tilde N_i=\bigl[\!\begin{smallmatrix}{N}_{i, 11}&{N}_{i, 12}\\ {N}_{i, 21}&{N}_{i, 22}\end{smallmatrix}\!\bigr],\quad \tilde{X}_0 =\bigl[\!\begin{smallmatrix}{X}_{0, 1}\\ {X}_{0, 2}\end{smallmatrix}\!\bigr],\quad \tilde{C} =[\!\begin{smallmatrix}{C}_1 & {C}_2\end{smallmatrix}\!].\end{equation}

The balanced state of (5.1) is decomposed as $\tilde x=\bigl[\!\begin{smallmatrix}{x}_1\\ x_2 \end{smallmatrix}\!\bigr]$ , where $x_1$ and $x_2$ are associated with $\Sigma_1$ and $\Sigma_2$ , respectively. Now, exploiting the insights of Proposition 4.2, $x_2$ barely contributes to (5.1). We remove the equation for $x_2$ from the dynamics and set it equal to zero in the remaining parts. This yields a reduced system

(5.7)

\begin{equation}\begin{split}{\mathrm{d}} x_r(t) &=[A_{11}x_r(t)+ B_1 u(t)]\,{\mathrm{d}} t+\sum_{i=1}^q N_{i, 11} x_r(t)\circ{\mathrm{d}} W_{i}^H(t), \quad x_r(0)=x_{0, r}= X_{0, 1}z,\\y_r(t)&=C_1x_r(t),\quad t\in[0,T],\end{split}\end{equation}

which is of the form (3.4). If balancing according to Definition 5.1 is used, then V are the first r columns of $S^{-1}$ , whereas W represents the first r columns of $S^\top$ . Notice that if solely $P_T$ , P, or $\bar P_T$ are diagonalized (instead of a pair of Gramians), we have $S^{-1}=S^\top$ and hence $W=V$ .

5.3. An alternative approach for the Stratonovich setting ( $H=1/2$ )

5.3.1. The alternative

As sketched in Section 3, the truncation/projection procedure is not unique for $H=1/2$ , meaning that (3.5) can be considered instead of (5.7) (being of the form (3.4)). Such a reduced system is obtained if we rewrite the state of (5.1) as a solution to an Itô equation, meaning that $\tilde A$ becomes $\tilde A_N = \tilde A + \frac{1}{2}\sum_{i=1}^q \tilde N_i^2$ in the Itô setting. Now, removing $x_2$ from this system as we explained in Section 5.2, we obtain a reduced Itô system

(5.8)

\begin{equation}\begin{split}{\mathrm{d}} x_r(t) &=[A_{N, 11}x_r(t)+ B_1 u(t)]\,{\mathrm{d}} t+\sum_{i=1}^q N_{i, 11} x_r(t)\,{\mathrm{d}} W_{i}^H(t), \quad x_r(0)=x_{0, r}= X_{0, 1}z,\\y_r(t)&=C_1x_r(t),\quad t\in[0,T],\end{split}\end{equation}

where

\begin{align*}A_{N, 11}= A_{11}+\frac{1}{2}\sum_{i=1}^q \bigl(N_{i, 11}^2+N_{i, 12}N_{i, 21}\bigr)\end{align*}

is the left upper $r\times r$ block of $\tilde A_N$ . In Stratonovich form, the system is

(5.9)

\begin{equation}\begin{split}{\mathrm{d}} x_r(t) &=\biggl[\biggl(A_{11}+\dfrac{1}{2}\sum_{i=1}^q N_{i, 12}N_{i, 21}\biggr)x_r(t)+ B_1 u(t)\biggr]\,{\mathrm{d}} t\\&\quad +\sum_{i=1}^q N_{i, 11} x_r(t)\circ{\mathrm{d}} W_{i}^H(t),\quad x_r(0)=x_{0, r}= X_{0, 1}z,\\y_r(t)&=C_1x_r(t),\quad t\in[0,T],\end{split}\end{equation}

which has a state equation of the structure given in (3.5).

5.3.2. Comparison of (5.7) and (5.9) for $H=1/2$

Let us continue setting $H=1/2$ . Moreover, we assume $x_0=0$ in this subsection for simplicity. We only focus on P-balancing along with P/Q-balancing (explained in Definition 5.1(iii)) in order to emphasize our arguments. In addition, we always suppose that P and Q are positive definite. Let us point out that relations between (3.1) and (5.9) are well studied due to the model reduction theory of Itô equations exploiting the fact that these Stratonovich equations are equivalent to (3.2) and (5.8). In fact, the (uncontrolled) state equation is mean-square asymptotically stable ( $\mathbb E \|\Phi(t)\|^2\rightarrow 0$ as $t\rightarrow \infty$ ) if and only if the same is true for (3.2). This type of stability is well investigated in Itô settings; see e.g. [Reference Damm9, Reference Khasminskii19]. It is again equivalent to the existence of a positive definite matrix X, so the operator $\mathcal L_{A_N}+\Pi$ evaluated at X is a negative definite matrix, that is,

(5.10)

\begin{equation}\mathcal L_{A_N}[X]+\Pi[X] <0.\end{equation}

Now, applying P/Q-balancing to (3.1) under the assumptions we made in this subsection, the reduced system (5.9) preserves this property, that is, there exists a positive definite matrix $X_r$ , so that

(5.11)

\begin{equation}A_{N, 11} X_r+ X_r A_{N, 11}^\top +\sum_{i=1}^q N_{i, 11} X_r N_{i, 11}^\top <0. \end{equation}

This result was established in [Reference Benner, Damm, Redmann and Rodriguez Cruz8] given $\sigma_r\neq \sigma_{r+1}$ , where $\sigma_i$ is the ith diagonal entry of $\Sigma$ . If P-balancing is used instead, (5.11) basically holds as well [Reference Redmann and Pontes Duff32]. However, generally a further Galerkin projection of the reduced system (not causing an error) is required in order to ensure stability preservation. We illustrate with the following example that stability is not necessarily preserved in (5.7) given the Stratonovich case.

Example 5.1. Let us fix $x_0=0$ , $q=1$ and consider (3.1) with

\begin{equation*}A= \begin{bmatrix} {-\frac{13}{8}}&\,\,{\frac{5}{4}}\\[6pt]{-\frac{5}{4}}&\,\,{-2}\end{bmatrix},\quad {B}=C^\top = \begin{bmatrix} 1\\[6pt] 0\end{bmatrix} ,\quad N_1= \begin{bmatrix} \frac{3}{2}&\,\, -1\\[6pt]1&\,\,1\end{bmatrix}\end{equation*}

and hence $A_N=\bigl[\!\begin{smallmatrix}{-}1& 0\\ 0 & -2\end{smallmatrix}\!\bigr]$ . This system is asymptotically mean-square stable, since (5.10) is satisfied. We apply P/Q-balancing in order to compute ROMs (5.7) and (5.9) for $r=1$ and $H=1/2$ . Now, we find that $2A_{N, 11}+N_{1, 11}^2=-0.85926<0$ , which is equivalent to (5.11) in the scalar case. On the other hand, (5.7) is not stable, because

\begin{align*}2\bigl(A_{11}+0.5N_{1, 11}^2\bigr)+N_{1, 11}^2= 0.13825>0.\end{align*}

Since the fundamental solution of (5.7) is $\Phi(t)={\mathrm{e}}^{A_{11}t+N_{1, 11} W^H_1(t)}$ , we obtain

\begin{align*}\mathbb E[\Phi(t)^2]={\mathrm{e}}^{2(A_{11}+N_{1, 11}^2)t}={\mathrm{e}}^{0.13825 t}.\end{align*}

This illustrates that (5.7) grows exponentially, whereas the original system decays exponentially. Hence, (5.7) provides a bad approximation quality, especially on large time scales.

Example 5.1 shows us that we cannot generally expect a good approximation of (3.1) by (5.7) in the Stratonovich setting as the asymptotic behavior can be contrary. This is an important theoretical finding as it indicates that direct dimension reduction in the Stratonovich framework can lead to bad approximations.

We emphasize this argument further by looking at the error of the approximations if the full model (3.1) and the reduced system (5.7) have the same asymptotic behavior. First, let us note the following. If (3.1) is mean-square asymptotically stable, then applying P- or P/Q-balancing to this equation ensures the existence of a matrix $\mathcal W$ (depending on the method), so that

(5.12)

\begin{equation}\sup_{t\in [0, T]}\mathbb E \left\|y(t) - y_r(t) \right\|_2 \leq (\operatorname{tr}(\Sigma_2\mathcal W))^{{{1}/{2}}} \left\| u\right\|_{L^2_T},\end{equation}

where $y_r$ is the output of (5.9). This was proved in [Reference Benner and Redmann7] and [Reference Redmann and Pontes Duff32]. Notice that $\mathcal W$ is independent of the diagonalized Gramian $\Sigma$ and $\Sigma_2$ contains the truncated eigenvalues only; see (5.5). It is important to mention that Redmann and Pontes Duff [Reference Redmann and Pontes Duff32] just looked at the P-balancing case if $C=I$ but (5.12) holds for general C, too. Let us now look at ROM (5.7) and check for a bound like (5.12). First of all, we need to assume stability preservation in (5.7) for the existence of a bound. This preservation is not naturally given according to Example 5.1, in contrast to (5.9).

Theorem 5.1. Let us consider the Stratonovich setting of $H=1/2$ . Let system (3.1) with output y and $x_0=0$ be mean-square asymptotically stable. Moreover, suppose that (5.7) with output $y_r$ and $x_{0, r} =0$ preserves this stability. If (5.7) is based on either P-balancing or P/Q-balancing according to Definition 5.1(iii), we have

(5.13)

\begin{equation}\sup_{t\in [0, T]}\mathbb E \|y(t) - y_r(t) \|_2 \leq\bigl(\operatorname{tr}\bigl(\Sigma_1\bigl(\hat Q_1^\top-Q_r\bigr)\Delta_{N, 11}\bigr)+ \operatorname{tr}(\Sigma_2\mathcal W)\bigr)^{{{1}/{2}}} \| u\|_{L^2_T},\end{equation}

where

\begin{align*}\mathcal W\,:\!=\, C_2^\top C_2+2{A}^\top_{N, 12} \hat Q_2 + \sum_{i=1}^{q} {N}^\top_{i, 12} \bigl(2\hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]- Q_r {N}_{i, 12}\bigr).\end{align*}

The above matrices result from the partition (5.6) of the balanced realization (5.1) of (3.1) and

\begin{align*}\tilde A_N=\bigl[\!\begin{smallmatrix}{A}_{N, 11}&{A}_{N, 12}\\ {A}_{N, 21}&{A}_{N, 22}\end{smallmatrix}\!\bigr],\end{align*}

where $\tilde A_N = \tilde A + \frac{1}{2}\sum_{i=1}^q \tilde N_i^2$ . Moreover, we set $\Delta_{N, 11}=\sum_{i=1}^q N_{i, 12}N_{i, 21}$ and assume that $\hat Q=[\!\begin{smallmatrix}{\hat Q_1}& {\hat Q_2}\end{smallmatrix}\!]$ and $Q_r$ are the unique solutions to

(5.14)

\begin{align}\quad\quad\,\,\,\qquad\qquad\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top \hat Q+ \hat Q \tilde A_N +\sum_{i=1}^{q} {N}^\top_{i, 11} \hat Q \tilde N_{i} &= -C_1^\top \tilde C, \end{align}

(5.15)

\begin{align} \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top Q_r+ Q_r \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)+\sum_{i=1}^{q} N_{i, 11}^\top Q_r N_{i, 11}&=-C_1^\top C_1.\end{align}

The bound in (5.13) further involves the matrix $\Sigma=\bigl[\!\begin{smallmatrix}{\Sigma}_{1}& \\ &{\Sigma}_{2}\end{smallmatrix}\!\bigr]$ of either eigenvalues of P (P-balancing) or square roots of eigenvalues of PQ (P/Q-balancing). In particular, $\Sigma_2$ represents the truncated eigenvalues of the system.

Proof. We have to compare the outputs of (5.1) and (5.7). This is the same as calculating the error between the corresponding Itô versions of these systems. In the Itô equation of (5.1), $\tilde A$ is replaced by $\tilde A_{N}$ and the Itô form of (5.7) involves

\begin{align*}A_{11}+\frac{1}{2}\sum_{i=1}^q N_{i, 11}^2=A_{N, 11}-\frac{1}{2}\Delta_{N, 11}\end{align*}

instead of $A_{11}$ . Since either P-balancing or P/Q-balancing is used, we know that at least one of the Gramians is diagonal, i.e. $P=\Sigma$ (see Proposition 5.1). Since we are in the case of $H=1/2$ , we also know the relation to Lyapunov equations by Section 4.2.2, so we obtain

(5.16)

\begin{equation}\tilde A_N \Sigma + \Sigma \tilde A_N^\top+ \sum_{i=1}^{q} \tilde N_{i} \Sigma \tilde N_{i}^\top = -\tilde B \tilde B^\top.\end{equation}

In the Itô setting, an error bound has been established in [Reference Benner and Redmann7]. Applying this result yields

(5.17)

\begin{equation}\sup_{t\in [0, T]}\mathbb E \|y(t) - y_r(t) \|_2 \leq \bigl(\operatorname{tr}\bigl(\tilde C \Sigma \tilde C^\top\bigr) + \operatorname{tr}\bigl(C_1 P_r C_1^\top\bigr) - 2 \operatorname{tr}\bigl(\tilde C \hat P C_1^\top\bigr)\bigr)^{{{1}/{2}}} \| u\|_{L^2_T}.\end{equation}

The reduced system Gramian $P_r$ as well as the mixed Gramian $\hat P$ exist due to the assumption that stability is preserved in the reduced system. They can be defined as the unique solutions of

(5.18)

\begin{align}\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)P_r+ P_r \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top+\sum_{i=1}^{q} N_{i, 11} P_r N_{i, 11}^\top &=-B_1B_1^\top,\end{align}

(5.19)

\begin{align}\tilde A_N \hat P + \hat P \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top+\sum_{i=1}^{q} \tilde N_{i} \hat P {N}_{i, 11}^\top &= -\tilde B B_1^\top.\end{align}

Using the partitions of $\tilde A_N$ and the other matrices in (5.6), we evaluate the first r columns of (5.16) to obtain

(5.20)

\begin{align}- \tilde B B_1^\top &= \tilde A_N\bigl[\!\begin{smallmatrix}{\Sigma}_{1}\\ 0 \end{smallmatrix}\!\bigr]+ \Sigma\Bigl[\!\begin{smallmatrix}{A}^\top_{N, 11}\\{A}^\top_{N, 12}\end{smallmatrix}\!\Bigr]+ \sum_{i=1}^{q} \tilde N_{i} \Sigma\Bigl[\!\begin{smallmatrix}{N}^\top_{i, 11}\\{N}^\top_{i, 12}\end{smallmatrix}\!\Bigr]\\ \nonumber&=\bigl[\!\begin{smallmatrix}{A}_{N, 11}\\ {A}_{N, 21}\end{smallmatrix}\!\bigr]\Sigma_{1} +\Bigl[\!\begin{smallmatrix}{\Sigma}_1 {A}^\top_{N, 11}\\ \Sigma_2 {A}^\top_{N, 12}\end{smallmatrix}\!\Bigr]+ \sum_{i=1}^{q} \bigl(\bigl[\!\begin{smallmatrix}{N}_{i, 11}\\ {N}_{i, 21}\end{smallmatrix}\!\bigr]\Sigma_1 {N}^\top_{i, 11} +\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]\Sigma_2 {N}^\top_{i, 12}\bigr).\end{align}

Using the properties of the trace, we find the relation $\operatorname{tr}(\tilde C \hat P C_1^\top) = \operatorname{tr}(\hat Q \tilde B B_1^\top)$ between the mixed Gramians satisfying (5.14) and (5.19). In more detail, one can find this relation by inserting (5.19) into $\operatorname{tr}(\hat Q \tilde B B_1^\top)$ and exploiting the fact that two matrices can be switched in the trace of a product of both without changing the result. We insert (5.20) into this relation, giving us

\begin{align*}-\operatorname{tr}(\tilde C \hat P C_1^\top) &= \operatorname{tr}\Biggl(\hat Q\Biggl[\bigl[\!\begin{smallmatrix}{A}_{N, 11}\\ {A}_{N, 21}\end{smallmatrix}\!\bigr]\Sigma_{1} +\Bigl[\!\begin{smallmatrix}{\Sigma}_1 {A}^\top_{N, 11}\\ \Sigma_2 {A}^\top_{N, 12}\end{smallmatrix}\!\Bigr]+ \sum_{i=1}^{q} \bigl(\bigl[\!\begin{smallmatrix}{N}_{i, 11}\\ {N}_{i, 21}\end{smallmatrix}\!\bigr]\Sigma_1 {N}^\top_{i, 11} +\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]\Sigma_2 {N}^\top_{i, 12}\bigr)\Biggr]\Biggr) \\&= \operatorname{tr}\Biggl(\Sigma_1\Biggl[\hat Q\bigl[\!\begin{smallmatrix}{A}_{N, 11}\\ {A}_{N, 21}\end{smallmatrix}\!\bigr]+\biggl({A}_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top \hat Q_1 + \sum_{i=1}^{q} {N}^\top_{i, 11} \hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 11}\\ {N}_{i, 21}\end{smallmatrix}\!\bigr]\Biggr]\Biggr) \\&\quad + \dfrac{1}{2} \operatorname{tr}\bigl(\Sigma_1\Delta_{N, 11}^\top \hat Q_1 \bigr)+\operatorname{tr}\Biggl(\Sigma_2\Biggl[{A}^\top_{N, 12} \hat Q_2 + \sum_{i=1}^{q} {N}^\top_{i, 12} \hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]\Biggr]\Biggr).\end{align*}

The first r columns of (5.14) give us

\begin{align*}\hat Q\bigl[\!\begin{smallmatrix}{A}_{N, 11}\\ {A}_{N, 21}\end{smallmatrix}\!\bigr]+\biggl({A}_{N, 11}-\frac{1}{2}\Delta_{N, 11}\biggr)^\top \hat Q_1 + \sum_{i=1}^{q} {N}^\top_{i, 11} \hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 11}\\ {N}_{i, 21}\end{smallmatrix}\!\bigr]= -C_1^\top C_1\end{align*}

and hence

\begin{equation*}-\operatorname{tr}\bigl(\tilde C \hat P C_1^\top\bigr) = -\operatorname{tr}\bigl(C_1\Sigma_1C_1^\top\bigr)+ \frac{1}{2} \operatorname{tr}\bigl(\Sigma_1\Delta_{N, 11}^\top \hat Q_1 \bigr)+\operatorname{tr}\Biggl(\Sigma_2\Biggl[{A}^\top_{N, 12} \hat Q_2 + \sum_{i=1}^{q} {N}^\top_{i, 12} \hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]\Biggr]\Biggr).\end{equation*}

We exploit this for the bound in (5.17) and further find that

\begin{align*}\operatorname{tr}(\tilde C \Sigma \tilde C^\top) =\operatorname{tr}\bigl(C_1 \Sigma_1 C_1^\top\bigr) +\operatorname{tr}\bigl(C_2 \Sigma_2 C_2^\top\bigr) .\end{align*}

Thus we have

(5.21)

\begin{align}&\operatorname{tr}(\tilde C \Sigma \tilde C^\top) + \operatorname{tr}\bigl(C_1 P_r C_1^\top\bigr) - 2 \operatorname{tr}\bigl(\tilde C \hat P C_1^\top\bigr)\nonumber\\ \nonumber&\quad = \operatorname{tr}\bigl(C_1(P_r-\Sigma_1)C_1^\top\bigr)+\operatorname{tr}\left(\Sigma_1\Delta_{N, 11}^\top \hat Q_1 \right)\\&\quad\quad +\operatorname{tr}\Biggl(\Sigma_2\Biggl[C_2^\top C_2+2{A}^\top_{N, 12} \hat Q_2 + 2\sum_{i=1}^{q} {N}^\top_{i, 12} \hat Q\bigl[\!\begin{smallmatrix}{N}_{i, 12}\\ {N}_{i, 22}\end{smallmatrix}\!\bigr]\Biggr]\Biggr).\end{align}

Now we analyze $P_r-\Sigma_1$ . The left upper $r\times r$ block of (5.16) fulfills

\begin{align*}& \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)\Sigma_1+ \Sigma_1 \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top+\sum_{i=1}^{q} N_{i, 11} \Sigma_1 N_{i, 11}^\top \\&\quad =-B_1B_1^\top-\sum_{i=1}^q N_{i, 12}\Sigma_2 N_{i, 12}^\top-\dfrac{1}{2}\Delta_{N, 11}\Sigma_1- \Sigma_1\dfrac{1}{2}\Delta_{N, 11}^\top.\end{align*}

Comparing this with (5.18) yields

\begin{align*}&\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)(P_r-\Sigma_1)+ (P_r-\Sigma_1) \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top+\sum_{i=1}^{q} N_{i, 11} (P_r-\Sigma_1) N_{i, 11}^\top \\&\quad =\sum_{i=1}^q N_{i, 12}\Sigma_2 N_{i, 12}^\top+\dfrac{1}{2}\Delta_{N, 11}\Sigma_1 + \Sigma_1\dfrac{1}{2}\Delta_{N, 11}^\top.\end{align*}

Therefore, using (5.15), we obtain

\begin{align*}\operatorname{tr}\bigl(C_1(P_r-\Sigma_1)C_1^\top\bigr)&=\operatorname{tr}\bigl((P_r-\Sigma_1)C_1^\top C_1\bigr)\\& =-\operatorname{tr}\Biggl((P_r-\Sigma_1)\biggl[\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top Q_r+ Q_r \biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)\\&\quad \hspace{8mm}+\sum_{i=1}^{q} N_{i, 11}^\top Q_r N_{i, 11}\biggr]\Biggr)\\& =-\operatorname{tr}\Biggl(\biggl[\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)(P_r-\Sigma_1)+(P_r-\Sigma_1)\biggl(A_{N, 11}-\dfrac{1}{2}\Delta_{N, 11}\biggr)^\top\\&\quad\hspace{8mm} +\sum_{i=1}^{q} N_{i, 11}(P_r-\Sigma_1) N_{i, 11}^\top\biggr] Q_r \Biggr)\\& =-\operatorname{tr}\Biggl(\Biggl[\sum_{i=1}^q N_{i, 12}\Sigma_2 N_{i, 12}^\top+\Delta_{N, 11}\Sigma_1\Biggr] Q_r \Big)\\& =-\operatorname{tr}\Biggl(\Biggl[\Sigma_2\sum_{i=1}^q N_{i, 12}^\top Q_r N_{i, 12}+\Sigma_1 Q_r \Delta_{N, 11}\Biggr]\Biggr).\end{align*}

Inserting this into (5.21) concludes the proof.

Even if stability is preserved in (5.7), we cannot ensure a small error if we only know that $\Sigma_2$ has small diagonal entries. This is the main conclusion from Theorem 5.1, as the bound depends on a matrix $\Sigma_1$ with potentially very large diagonal entries reflecting the dominant eigenvalues associated with the key modes of the system. If stability is not preserved in (5.7), the bound in Theorem 5.1 will even be infinite. This is an indicator that there are cases in which (5.7) might perform poorly. The correction term $\frac{1}{2}\Delta_{N, 11}=\frac{1}{2}\sum_{i=1}^q N_{i, 12}N_{i, 21}$ in (5.9) ensures that the expression in (5.13) that depends on $\Delta_{N, 11}$ is canceled out. This leads to the bound in (5.12). Further notice that our considerations are not restricted to finite intervals [0, T]. We can replace [0, T] with $[0, \infty)$ in both (5.12) and (5.13).

At this point, let us also refer to the error analysis for $P_T$ / $Q_T$ -balancing for $H=1/2$ in the Itô setting [Reference Redmann and Jamshidi31]. In this reference, one can find an error bound in the same norm. However, it is proved differently as it contains terms depending on the terminal time T, which are not present when P/Q-balancing is used. Let us conclude this paper by conducting several numerical experiments.

6. Numerical results

In this section, the reduced-order techniques that are based on balancing and lead to a system like that in (5.7) or (5.9) are applied to two examples. In detail, stochastic heat and wave equations driven by fractional Brownian motions with different Hurst parameters H are considered and formally discretized in space. This discretization yields a system of the form (3.1), which we reduce concerning the state space dimension. Before we provide details of the model reduction procedure, let us briefly describe the time discretization that is required here as well. We use an implicit scheme because spatial discretizations of the underlying stochastic partial differential equations are stiff.

6.1. Time integration

The stochastic differential equations (3.1), (5.7), and (5.9) can be numerically solved by employing a variety of general-purpose stochastic numerical schemes (see e.g. [Reference Hu, Liu and Nualart15], [Reference Kloeden and Platen20], and [Reference Neuenkirch and Nourdin26], and the references therein). Encountered frequently in practice, stiff differential equations present a difficult challenge for numerical simulation in both deterministic and stochastic systems. Implicit methods are generally found to be more effective than explicit methods for solving stiff problems. The goal of this work is to exploit an implicit numerical method that is well-suited to addressing stiff stochastic differential equations. The stochastic implicit midpoint method will be the subject of our attention throughout the entire numerical section. We refer to [Reference Hong, Huang and Wang14] ( $H>1/2$ ) and [Reference Redmann and Riedel33] ( $H= 1/2$ ) for more detailed consideration of Runge–Kutta methods based on increments of the driver. In particular, the stochastic implicit midpoint method is a Runge–Kutta scheme with Butcher tableau

It therefore takes the form

(6.1)

\begin{equation}x_{k+1}=x_k+\biggl[A\biggl(\dfrac{x_k+x_{k+1}}{2}\biggr)+Bu\biggl(t_k+\dfrac{\Delta t}{2}\biggr)\biggr]\Delta t+\sum_{i=1}^q N_i\biggl(\dfrac{x_k+x_{k+1}}{2}\biggr)\Delta W^H_{i, k}\end{equation}

when applying it to (3.1), where $\Delta t$ denotes the time step related to equidistant grid points $t_k$ . Moreover, we define $ \Delta W^H_{i, k} = W^H_i(t_k+1)-W^H_i(t_k)$ . The midpoint method converges with almost sure/ $L^p$ -rate (arbitrary close to) $2H-1/2$ for $H\in[1/2, 1)$ . Before proceeding to the numerical experiments, let us briefly sketch another MOR scheme that we use as a reference method within the numerics.

6.2. POD-based method

The proper orthogonal decomposition (POD) method is a data-driven strategy for the reduction of large-scale models that is based on the singular value decomposition (SVD) of snapshot matrices. However, POD techniques for stochastic differential equations driven by fBm have not yet been studied. For the convenience of the readers, a brief explanation of the POD method is provided here. For a more detailed discussion, let us refer to [Reference Jamshidi and Redmann16].

The idea is to sample the solution for fixed u and $x_0$ to obtain matrices

\begin{equation*}Z_j=[x(t_1, \omega_j), x(t_2, \omega_j), \ldots, x(t_{\mathcal N}, \omega_j)], \quad \text{for} \ \omega_j \in \Omega, \ j=1, \ldots, \mathcal N_s,\end{equation*}

where $\mathcal N, \mathcal N_s>0$ are the number of considered time points and samples. We introduce a data matrix $Z\,:\!=\, [X_0,Z_1,Z_2,\ldots,Z_{\mathcal N_s}]$ and calculate its SVD:

\begin{align*}Z=\begin{pmatrix} V \quad \star \end{pmatrix}\begin{pmatrix} \Sigma_Z & \\ & \star\end{pmatrix} \begin{pmatrix} U^\top \\ \star\end{pmatrix}\!.\end{align*}

The dominant subspace is identified by considering only singular vectors associated with the singular values in $\Sigma_Z$ above a certain threshold. We end up with a POD-based reduced system (3.4), where the projection matrix $V=W$ consists of vectors associated with large singular values of the snapshot matrix. Instead of using POD for (3.1) directly, we can also apply it to subsystems (4.6) and (4.7). Subsequently, we find an approximation for (3.1) by the sum of the reduced subsystems.

6.3. Dimension reduction for a stochastic heat equation

We begin with a modified version of an example studied in [Reference Benner and Redmann7]. In particular, we do not study an Itô equation driven by Brownian motion. Instead, we consider the following Young/Stratonovich stochastic partial differential equation driven by a (scalar) fractional Brownian motion $W^H$ with Hurst parameter $ H\in[1/2,1)$ :

(6.2)

\begin{equation}\begin{split}\dfrac{\partial X(t,\zeta)}{\partial t}&=a \Delta X(t,\zeta)+1_{[{{\pi}/{4}},{{3\pi}/{4}}]^2}(\zeta)u(t)\\&\quad +\gamma {\mathrm{e}}^{-|\zeta_1-{{\pi}/{2}}|-\zeta_2}X(t,\zeta)\circ \dfrac{\partial W^H(t)}{\partial t}, \quad t\in[0,1],\ \zeta\in[0,\pi]^2,\\X(t,\zeta)&=0,\quad t\in[0,1],\ \zeta\in\partial[0,\pi]^2,\quad \text{and} \quad X(0,\zeta)=b \cos(\zeta),\end{split}\end{equation}

where $ a ,b>0 $ , $ \gamma\in\mathbb{R} $ and a single input meaning that $m=1$ . The solution space for a mild solution is assumed to be $ \mathcal H=L^2([0,\pi]^2), $ exploiting the fact that the Dirichlet Laplacian generates a $C_0$ -semigroup. The following average temperature is assumed to be the quantity of interest:

\begin{align*}Y(t)=\dfrac{4}{3\pi^2}\int_{[0,\pi]^2\setminus[{{\pi}/{4}},{{3\pi}/{4}}]^2}X(t,\zeta)\,{\mathrm{d}} \zeta.\end{align*}

Based on the eigenfunctions of the Laplacian, a spectral Galerkin scheme analogous to the method proposed and explained in [Reference Benner and Redmann7] is applied to (6.2). Roughly speaking, such a discretization is based on an orthogonal projection onto the subspace spanned by the dominant eigenvector of the Laplacian. This results in system (3.1) of order n with scalar control and a fixed initial state $x_0$ . The detailed structure of the matrices A, B, $N_1$ , and C can be found in [Reference Benner and Redmann7]. In the following, we fix $ a=0.2 $ , $ b=1$ and set $n = 1024$ . We investigate two cases. These are $H=0.5$ and $H=0.75$ . In the following, we explain the particular dimension reduction techniques for each scenario.

Case $\mathbf{H=0.75}$ . We pointed out in Section 4.2.2 that Gramians $P_T$ and $Q_T$ (or their limits P and Q) are hard to compute for $H>1/2$ , since a link between these matrices and ordinary differential or algebraic equations is unknown. Therefore we solely consider empirical Gramians discussed in Section 4.2.2 for $H=0.75$ . In fact, $\bar P_T$ is available by sampling the solution of (4.7), whereas $\bar Q_T$ seems computationally much more involved. For that reason, we apply $\bar P_T$ -balancing (see Definition 5.1(iv)) to system (3.1) obtained from the above heat equation. This results in (5.1), which is truncated in order to find the reduced equation (5.7). Two other related approaches are conducted in this section as well.

• We apply the same $\bar P_T$ -balancing procedure to subsystems (4.6) and (4.7), that is, $\bar {P}_{u, T}$ -balancing is used for (4.6) and $\bar{P}_{x_0, T}$ -balancing for (4.7); compare with (4.14). The sum of the resulting reduced-order systems is then used to approximate (3.1). We refer to this second ansatz as splitting-based $\bar P_T$ -balancing.
• Another empirical dimension reduction technique, discussed in Section 6.2, is POD. In this method, the solution space of (3.1) is learned using samples which are potentially based on various initial states ${x_0}$ and controls $u$ . Note that the snapshot matrices are computed from a small set of ${x_0}$ and $u$ to provide a POD-based reduced system (3.4) that performs well for a larger number of ${x_0}$ and $u$ . In this approach we apply the POD scheme to the subsystems (4.6) and (4.7), and approximate (3.1) by the sum of the reduced subsystems. We refer to this method as splitting-based POD.

Case $\mathbf{H=0.5}$ . Similar techniques are exploited for the Stratonovich setting. However, we have the advantage that $P_T$ and $Q_T$ can be computed from matrix equations; see (4.16) and (4.18). Still these equations are difficult to solve. Therefore we use the sampling and variance reduction based schemes proposed in [Reference Redmann and Jamshidi31] in order to solve them. Due to the availability of both Gramians, we apply $P_T$ / $Q_T$ -balancing (see Definition 5.1(ii)) instead of the procedure based on diagonalizing $\bar P_T$ . However, we truncate differently, that is, the reduced system (5.9) is used instead due to the drawbacks of (5.7) pointed out in Section 5.3.2 when $H=0.5$ . The splitting-based $P_T$ / $Q_T$ -balancing is defined the same way. It is the technique where ${P}_{u, T}$ / $Q_T$ -balancing is conducted for (4.6) and ${P}_{x_0, T}$ / $Q_T$ -balancing is exploited for (4.7) to obtain reduced systems of the form (5.9) for each subsystem. Again we use a splitting-based POD scheme according to [Reference Jamshidi and Redmann16] for $H=0.5$ .

For the discretization in time, the stochastic midpoint method (6.1) is employed here, where the number of time steps is $\mathcal N = 100$ . Moreover, all empirical objects are calculated based on $\mathcal N_s= 10^3$ samples. The error between the reduced systems and the original model is computed for the control

\begin{align*}u(t)=\sqrt{\frac{2}{\pi}}\sin(t) ,\end{align*}

where the reduction error is measured by the quantity

\begin{align*} \mathcal{R}_{ E}=\dfrac{\sup_{t\in[0,1]}\mathbb{E}\|y(t)-y_r(t)\|_2}{\sup_{t\in[0,1]}\mathbb{E}\|y(t)\|_2} .\end{align*}

In the case of $H=0.5$ , Figure 1 illustrates that splitting-based $P_T$ / $Q_T$ -balancing (Gramian 2) and $P_T$ / $Q_T$ -balancing (Gramian 1) generate very similar results. Both techniques produce notably better outcomes compared to the splitting-based POD method. The worst case errors of the plot are also stated in the associated Table 1.

Figure 1. $\mathcal{R}_E $ for three approaches with Hurst parameters $ H=0.5$ .

Table 1. $\mathcal{R}_E $ for $ r\in\{2,4,8,16\} $ and $ H=0.5 $ .

On the other hand, the Young setting in which we have $H=0.75$ presents a different scenario. Figure 2 demonstrates that splitting-based POD exhibits better performance compared to splitting-based $\bar{P}_T$ -balancing (Gramian 2) and the usual $\bar{P}_T$ -balancing (Gramian 1), except when the reduced dimension is 16. Surprisingly, for $ r=16 $ , the Gramian 2 method yields better results compared to POD. It is worth noting that both empirical Gramian methods provide similar outcomes, which is an indicator for a nearly identical reduction potential for both subsystems (4.6) and (4.7). Note that the error of the plot can be found in Table 2.

Figure 2. $\mathcal{R}_E $ for three approaches with Hurst parameters $ H=0.75 $ .

For both $H=0.5$ and $H=0.75$ an enormous reduction potential can be observed, meaning that small dimensions r lead to accurate approximations. According to Remark 4.2, this is known a priori by the strong decay of certain eigenvalues associated with the system Gramians, since small eigenvalues indicate variables of low relevance. Given $H=0.75$ , Figure 3 shows the eigenvalues of $\bar P_T$ (Gramian 1), the sum eigenvalues of $\bar P_{u, T}$ and $\bar P_{x_0, T}$ (Gramian 2) as well as the sum of the singular values corresponding to the POD snapshot matrices of subsystems (4.6) and (4.7). Similar types of algebraic values are considered for $H=0.5$ in Figure 4. Here, square roots of eigenvalues of $P_T Q_T$ (Gramian 1) or the sum of square roots of eigenvalues of $P_{u, T} Q_T$ and $P_{x_0, T} Q_T$ (Gramian 2) are depicted. The large number of small eigenvalues (or singular values) explains why small errors could be achieved in our simulations.

Table 2. $\mathcal{R}_E $ for $ r\in\{2,4,8,16\} $ and $ H=0.75 $ .

6.4. Dimension reduction for a stochastic wave equation

We consider the following controlled stochastic partial differential equation, which is a modification of the example studied in [Reference Redmann and Benner29]. In detail, we consider fractional drivers $W^H$ with $H\in [0.5, 1)$ in a Young/Stratonovich setting instead of Itô differential equations driven by Brownian motion. For $ t\in[0,1]$ and $\zeta\in[0,\pi] $ ,

Figure 3. First 50 POD singular values or eigenvalues associated with $\bar P_T$ for $H=0.75$ .

Figure 4. First 50 POD singular values or eigenvalues associated with $P_T$ / $Q_T$ for $H=0.5$ .

Figure 5. $\mathcal{R}_E $ for three approaches with Hurst parameters $ H=0.75$ .

(6.3)

\begin{equation}\begin{split}&\dfrac{\partial^2 X(t,\zeta)}{\partial t^2}+a\dfrac{\partial X(t,\zeta)}{\partial t} =\dfrac{\partial^2}{\partial \zeta^2} X(t,\zeta)+ {\mathrm{e}}^{-|\zeta-{{\pi}/{2}}|}u(t)+2 {\mathrm{e}}^{-|\zeta-{{\pi}/{2}}|}X(t,\zeta)\circ \dfrac{\partial W^H(t)}{\partial t},\\&X(t,0)=0=X(t,\pi),\quad t\in[0,1],\quad X(0,\zeta)\equiv 0, \quad \dfrac{\partial}{\partial t} X(t, \zeta)\big\vert_{t=0}=b \cos(\zeta)\end{split}\end{equation}

is investigated and the output equation is

\begin{align*}Y(t)=\dfrac{1}{2\epsilon}\left( \int_{{{\pi}/{2}}-\epsilon}^{{{\pi}/{2}}+\epsilon}X(t,\zeta)\,{\mathrm{d}} \zeta\quad \int_{{{\pi}/{2}}-\epsilon}^{{{\pi}/{2}}+\epsilon}\dfrac{\partial }{\partial t}X(t,\zeta)\,{\mathrm{d}} \zeta\right)^\top,\end{align*}

so that both the position and velocity of the middle of the string are observed. Moreover, $ a,b>0 $ and $ \epsilon>0 $ . Again the solution of (6.3) should be in the mild sense (after transformation into a first-order equation), where $X(t,\cdot)\in H_0^1([0, \pi])$ and $\frac{\partial }{\partial t}X(t,\cdot)\in L^2([0, \pi])$ . Formally discretizing (6.3) as in [Reference Redmann and Benner29], the spectral Galerkin-based system is given by a model of the form (3.1) with $q=1$ . We refer to [Reference Redmann and Benner29] for details of the matrices of this system. In our simulations, we assume $ b=1 $ and $ a=2 $ . Further, the sizes of spatial and time discretization are $n=1000$ and $\mathcal N=100$ , respectively. In this example we consider the same scenario as we did in the first example (6.2), which means that we calculate a splitting-based POD ROM using snapshots of subsystems (4.6) and (4.7) for some $x_0$ , controls u, and a low number of samples $\mathcal N_s$ . Moreover, (splitting-based) $\bar P_T$ -based balancing is applied to the wave equation given $H=0.75$ . If $H=0.5$ , empirical Gramians are replaced by exact pairs of Gramians, meaning that (splitting-based) $P_T$ / $Q_T$ -based balancing is exploited. The results are shown in Figures 5 and 6 for

\begin{align*}u(t)=\sqrt{\frac{2}{\pi}}\sin(t).\end{align*}

Figure 6. $\mathcal{R}_E $ for three approaches with Hurst parameters $ H=0.5 $ .

Table 3. $\mathcal{R}_E $ for $ r\in\{4,8,16,32\} $ and $ H=0.75 $ .

Table 4. $\mathcal{R}_E $ for $ r\in\{4,8,16,32\} $ and $ H=0.5 $ .

Based on our observations, we find that the splitting-based $P_T$ / $Q_T$ -based balancing method (Gramian 2) outperforms the $P_T$ / $Q_T$ -based balancing method (Gramian 1) for both cases when $H=0.75$ and $H=0.5$ . Additionally, the splitting-based POD performs best for $H=0.75$ and worst for $H=0.5$ . The results are again presented in Tables 3 and 4, where the exact numbers are shown.

Interestingly, for both the heat and the wave equation, splitting-based POD performs best in the Young setting ( $H=0.75$ ) but worst in the Stratonovich case ( $H=0.5$ ).

Analogous to Figures 3 and 4, Figures 7 and 8 illustrate the eigenvalues of approximated or exact Gramians as well as the sum of singular values corresponding to the POD snapshot matrices.

Figure 7. First 50 POD singular values or eigenvalues associated with $\bar P_T$ for $H=0.75$ .

Figure 8. First 50 POD singular values or eigenvalues associated with $P_T$ / $Q_T$ for $H=0.5$ .

Funding information

MR is supported by the DFG via the individual grant ‘Low-order approximations for large-scale problems arising in the context of high-dimensional PDEs and spatially discretized SPDEs’, project number 499366908.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Alòs, E. and Nualart, D. (2003). Stochastic integration with respect to the fractional Brownian motion. Stoch. Stoch. Rep. 75, 129–152.10.1080/1045112031000078917CrossRef Google Scholar

Alvarez-Lacalle, E., Dorow, B., Eckmann, J.-P. and Moses, E. (2006). Hierarchical structures induce long-range dynamical correlations in written texts. Proc. Nat. Acad. Sci. USA 103, 7956–7961.10.1073/pnas.0510673103CrossRef Google Scholar

Antoulas, A. C. (2005). Approximation of Large-Scale Dynamical Systems , Vol. 6 of Adv. Des. Control. SIAM, Philadelphia, PA.Google Scholar

Becker, S. and Hartmann, C. (2019). Infinite-dimensional bilinear and stochastic balanced truncation with error bounds. Math. Control. Signals Syst. 31, 1–37.10.1007/s00498-019-0234-8CrossRef Google Scholar

Benner, P., Cohen, A., Ohlberger, M. and Willcox, K. (eds) (2017). Model Reduction and Approximation: Theory and Algorithms. SIAM, Philadelphia, PA.10.1137/1.9781611974829CrossRef Google Scholar

Benner, P. and Damm, T. (2011). Lyapunov equations, energy functionals, and model order reduction of bilinear and stochastic systems. SIAM J. Control Optim. 49, 686–711.10.1137/09075041XCrossRef Google Scholar

Benner, P. and Redmann, M. (2015). Model reduction for stochastic systems. Stoch. Partial Differ. Equ. Anal. Comput. 3, 291–338.Google Scholar

Benner, P., Damm, T., Redmann, M. and Rodriguez Cruz, Y. R. (2016). Positive operators and stable truncation. Linear Algebra Appl 498, 74–87.10.1016/j.laa.2014.12.005CrossRef Google Scholar

Damm, T. (2004). Rational Matrix Equations in Stochastic Control (Lecture Notes in Control and Information Sciences 297). Springer, Berlin.Google Scholar

Gardiner, C. (2009). Stochastic Methods, Vol. 4. Springer, Berlin.Google Scholar

Garsia, M., Rodemich, A. E. and Rumsey, H. Jr (1971). A real variable lemma and the continuity of paths of some Gaussian processes. Indiana Univ. Math. J. 20, 565–578.10.1512/iumj.1971.20.20046CrossRef Google Scholar

Gawronski, W. and Juang, J. (1990). Model reduction in limited time and frequency intervals. Internat. J. Systems Sci. 21, 349–376.10.1080/00207729008910366CrossRef Google Scholar

Grecksch, W. and Lisei, H. (eds) (2020). Infinite Dimensional and Finite Dimensional Stochastic Equations and Applications in Physics. World Scientific, Hackensack, NJ.10.1142/11538CrossRef Google Scholar

Hong, J., Huang, C. and Wang, X. (2021). Optimal rate of convergence for two classes of schemes to stochastic differential equations driven by fractional Brownian motions. IMA J. Numer. Anal. 41, 1608–1638.10.1093/imanum/draa019CrossRef Google Scholar

Hu, Y., Liu, Y. and Nualart, D. (2016). Rate of convergence and asymptotic error distribution of Euler approximation schemes for fractional diffusions. Ann. Appl. Prob. 26, 1147–1207.10.1214/15-AAP1114CrossRef Google Scholar

Jamshidi, N. and Redmann, M. (2023). Sampling-based model order reduction for stochastic differential equations driven by fractional Brownian motion. Proc. Appl. Math. Mech. 23, e202200109.10.1002/pamm.202200109CrossRef Google Scholar

Karagiannis, T., Molle, M. and Faloutsos, M. (2004). Long-range dependence ten years of internet traffic modeling. IEEE Internet Comput. 8, 57–64.10.1109/MIC.2004.46CrossRef Google Scholar

Karmeshu, and Krishnamachari, A. (2004). Sequence variability and long-range dependence in DNA: An information theoretic perspective. In International Conference on Neural Information Processing, pp. 1354–1361. Springer.10.1007/978-3-540-30499-9_210CrossRef Google Scholar

Khasminskii, R. Z. (1980). Stochastic Stability of Differential Equations (Mechanics: Analysis 7). Sijthoff & Noordhoff.Google Scholar

Kloeden, P. E. and Platen, E. (1999). Numerical Solution of Stochastic Differential Equations. Springer, Berlin.Google Scholar

Lo, A. W. (1997). Fat tails, long memory, and the stock market since the 1960’s. Economic Notes 26, 219–252.Google Scholar

Mao, X. (2007). Stochastic Differential Equations and Applications. Elsevier.Google Scholar

Mishura, Y. S. (2008). Stochastic Calculus for Fractional Brownian Motion and Related Processes. Springer, Berlin.10.1007/978-3-540-75873-0CrossRef Google Scholar

Moore, B. C. (1981). Principal component analysis in linear systems: Controllability, observability, and model reduction. IEEE Trans. Autom. Control AC-26, 17–32,10.1109/TAC.1981.1102568CrossRef Google Scholar

Moss, F. and McClintock, P. V. E. (1989). Noise in Nonlinear Dynamical Systems, Vol. 2. Cambridge University Press.Google Scholar

Neuenkirch, A. and Nourdin, I. (2007). Exact rate of convergence of some approximation schemes associated to SDEs driven by a fractional Brownian motion. J. Theoret. Prob. 20, 871–899.10.1007/s10959-007-0083-0CrossRef Google Scholar

Øksendal, B. (2013). Stochastic Differential Equations: An Introduction with Applications. Springer, Berlin.Google Scholar

Painter, S. (1998). Long-range dependence in the subsurface: Empirical evidence and simulation methods. Invited paper at the American Geophysical Union 1998 Fall Meeting.Google Scholar

Redmann, M. and Benner, P. (2015). Approximation and model order reduction for second order systems with Lévy-noise. AIMS Proceedings 2015, 945–953.Google Scholar

Redmann, M. and Freitag, M. A. (2021). Optimization based model order reduction for stochastic systems. Appl. Math. Comput. 398, 125783.Google Scholar

Redmann, M. and Jamshidi, N. (2022). Gramian-based model reduction for unstable stochastic systems. Control Signals Syst. 34, 855–881.10.1007/s00498-022-00328-zCrossRef Google Scholar

Redmann, M. and Pontes Duff, I. (2022). Full state approximation by Galerkin projection reduced order models for stochastic and bilinear systems. Appl. Math. Comput. 420, 126561.Google Scholar

Redmann, M. and Riedel, S. (2022). Runge–Kutta methods for rough differential equations. J. Stoch. Anal. 3, 6.Google Scholar

Robinson, P. M. (2003). Time Series with Long Memory (Advanced Texts in Econometrics). Oxford University Press.10.1093/oso/9780199257294.001.0001CrossRef Google Scholar

Shmatkov, A. (2005). Rate of convergence of Wong–Zakai approximations for SDEs and SPDEs. PhD thesis, The University of Edinburgh.Google Scholar

Tyranowski, T. M. (2022). Data-driven structure-preserving model reduction for stochastic Hamiltonian systems. Available at https://arxiv.org/abs/2201.13391.Google Scholar

Varotsos, C. and Kirk-Davidoff, D. (2006). Long-memory processes in global ozone and temperature variations. Atmos. Chem. Phys. Discuss. 6, 4325–4340.Google Scholar

Wong, E. and Zakai, M. (1965). On the relation between ordinary and stochastic differential equations. Internat. J. Engineering Sci. 3, 213–229.Google Scholar

Young, L. C. (1936). An inequality of Hölder type, connected with Stieltjes integration. Acta Math. 67, 251–282.10.1007/BF02401743CrossRef Google Scholar