Residual Dynamic Mode Decomposition: Robust and verified Koopmanism

Dynamic Mode Decomposition (DMD) describes complex dynamic processes through a hierarchy of simpler coherent features. DMD is regularly used to understand the fundamental characteristics of turbulence and is closely related to Koopman operators. However, verifying the decomposition, equivalently the computed spectral features of Koopman operators, remains a major challenge due to the infinite-dimensional nature of Koopman operators. Challenges include spurious (unphysical) modes and dealing with continuous spectra, both of which occur regularly in turbulent flows. Residual Dynamic Mode Decomposition (ResDMD), introduced by Colbrook & Townsend (2021), overcomes some of these challenges through the data-driven computation of residuals associated with the full infinite-dimensional Koopman operator. ResDMD computes spectra and pseudospectra of general Koopman operators with error control, and computes smoothed approximations of spectral measures (including continuous spectra) with explicit high-order convergence theorems. ResDMD thus provides robust and verified Koopmanism. We implement ResDMD and demonstrate its application in a variety of fluid dynamic situations, at varying Reynolds numbers, arising from both numerical and experimental data. Examples include: vortex shedding behind a cylinder; hot-wire data acquired in a turbulent boundary layer; particle image velocimetry data focusing on a wall-jet flow; and acoustic pressure signals of laser-induced plasma. We present some advantages of ResDMD, namely the ability to verifiably resolve non-linear, transient modes and to calculate spectra with reduced broadening effects. We also discuss how a new modal ordering based on residuals enables greater accuracy with a smaller dictionary than the traditional modulus ordering. This paves the way for greater dynamic compression of large datasets without sacrificing accuracy.

Herein we specifically consider dynamical systems whose state x evolves over a state-space Ω ⊆ R^d in discrete time-steps according to a function F : Ω → Ω. That is,

x_{n+1} = F(x_n), n = 0, 1, 2, . . . , (1.1)

for an initial state x_0. Such a dynamical system forms a trajectory of iterates x_0, x_1, x_2, . . . in Ω. We consider the discrete-time setting since we are interested in analysing data collected from experiments and numerical simulations, where a continuum of data cannot practically be obtained. The Koopman operator for such dynamical systems acts on functions g : Ω → C, also known as 'observables' because they indirectly measure the state of the dynamical system. For example, any measurement taken experimentally is an example of such an observable. The Koopman operator K is defined by the composition formula

[K g](x) = g(F(x)), g ∈ D(K), (1.2)

where D(K) is a suitable domain of observables. For a fixed x, [K g](x) measures the state of the dynamical system after one time-step. On the other hand, for a fixed g, (1.2) defines a linear composition operator acting on g. K is therefore a linear operator, regardless of whether the dynamics are linear or non-linear. The Koopman operator transforms the non-linear dynamics in the state variable x into equivalent linear dynamics in the observables g. However, there is a price to pay for this linearisation: K acts on an infinite-dimensional space. The behaviour of the dynamical system (1.1) is determined by the spectral information of K. In the data-driven setting of this paper, we do not assume knowledge of the function F in (1.1). Instead, the name of the game is to recover spectral properties of K from measured trajectory data. The leading numerical algorithm used to approximate Koopman operators is dynamic mode decomposition (DMD) (Tu et al. 2014; Kutz et al. 2016a). First introduced by Schmid (2009, 2010) in the fluids community and connected to the Koopman operator in Rowley et al. (2009), there are now numerous extensions and variants of DMD (Chen et al. 2012; Wynn et al. 2013; Jovanović et al.
2014; Williams et al. 2015a; Proctor et al. 2016; Kutz et al. 2016b; Noack et al. 2016; Loiseau & Brunton 2018; Korda & Mezić 2018; Klus et al. 2020; Deem et al. 2020; Herrmann et al. 2021; Wang & Shoele 2021; Baddoo et al. 2021, 2022). The reader is encouraged to consult Schmid (2022) for a recent overview of DMD-type methods, and Taira et al. (2017, 2020) and Towne et al. (2018) for an overview of modal-decomposition techniques in fluid flows. Koopman modes are projections of the physical field onto eigenfunctions of the Koopman operator. For fluid flows, Koopman modes provide collective motions of the fluid that occur at the same spatial frequency, growth, or decay rate, according to an (approximate) eigenvalue of the Koopman operator. DMD breaks a high-dimensional spatiotemporal signal into a triplet of Koopman modes, scalar amplitudes, and purely temporal signals. This decomposition allows the user to describe complex flow patterns by a hierarchy of simpler processes. When linearly superimposed, these simpler processes approximately recover the full flow.
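To make the composition formula concrete, the following minimal Python sketch (our illustration, not code from the paper; the logistic map stands in for a hypothetical F) checks that K acts linearly on observables even though the dynamics are non-linear:

```python
import numpy as np

# Hypothetical non-linear dynamics F on Omega = [0, 1]: the logistic map.
F = lambda x: 4.0 * x * (1.0 - x)

# The Koopman operator acts on observables g by composition: [K g](x) = g(F(x)).
def koopman(g):
    return lambda x: g(F(x))

# K is linear in the observable even though F is non-linear:
# K(a*g1 + b*g2) = a*(K g1) + b*(K g2) pointwise.
g1 = lambda x: np.sin(x)
g2 = lambda x: x**2
a, b = 2.0, -0.5

x = 0.3
lhs = koopman(lambda s: a * g1(s) + b * g2(s))(x)
rhs = a * koopman(g1)(x) + b * koopman(g2)(x)
print(abs(lhs - rhs))  # difference is at the level of rounding error
```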
DMD is undoubtedly a powerful data-driven algorithm for the approximation of Koopman operators that has led to rapid progress over the past decade (Grilli et al. 2012; Chai et al. 2015; Brunton et al. 2016b; Hua et al. 2016; Priebe et al. 2016; Pasquariello et al. 2017; Brunton & Kutz 2019; Page & Kerswell 2018, 2019, 2020; Korda et al. 2020). However, the infinite-dimensional nature of K leads to several fundamental challenges. The spectral information of infinite-dimensional operators can be far more complicated, and also far more challenging to compute, than that of a finite matrix (Colbrook 2020; Ben-Artzi et al. 2020). For example, K can have both discrete and continuous spectra (Mezić 2005). Recently, Colbrook & Townsend (2021) introduced residual DMD (ResDMD), aimed at tackling some of the challenges associated with infinite dimensions. More specifically, ResDMD addresses the following challenges.
• Spectral pollution (spurious modes). A well-known difficulty of computing spectra of infinite-dimensional operators is spectral pollution, where discretisations to a finite matrix cause spurious eigenvalues to appear that have nothing to do with the operator (Lewin & Séré 2010; Colbrook et al. 2019). DMD-type methods typically suffer from spectral pollution and can produce modes with no physical relevance to the system being investigated (Budišić et al. 2012; Williams et al. 2015a). Heuristics, such as comparing different discretisation sizes, are commonly used to reduce a user's concern. In some cases, it is possible to approximate the spectral information of a Koopman operator without spectral pollution. For example, when working in a finite-dimensional invariant subspace of known dimension, the Hankel-DMD algorithm computes the corresponding eigenvalues for ergodic systems (Arbabi & Mezic 2017). Instead, it is highly desirable to have a principled way of detecting spectral pollution with as few assumptions as possible.
By computing residuals associated with the spectrum with error control, ResDMD allows the computation of spectra of general Koopman operators with rigorous convergence guarantees and without spectral pollution. Moreover, ResDMD provides a verification of computations by computing pseudospectra with error control. ResDMD thus allows data-driven study of dynamical systems with error control.
• Invariant subspaces. A finite-dimensional invariant subspace of K is a space of observables G = span{g_1, . . . , g_k} such that K g ∈ G for all g ∈ G. A common, but often incorrect (e.g., when the system is mixing), assumption in the literature is that K has a non-trivial finite-dimensional invariant subspace. Even if there is such a subspace, it can be challenging to compute, or may not capture all of the dynamics of interest. Often, one must settle for approximate invariant subspaces, and DMD-type methods can result in closure issues (Brunton et al. 2016b). By computing upper bounds on residuals, ResDMD provides a way of measuring how invariant a candidate subspace is. It is important to stress that ResDMD computes residuals associated with the underlying infinite-dimensional operator K, in contrast to earlier work that computes residuals associated with finite DMD discretisation matrices (which can never be used as error bounds for spectral information of K and suffer from issues such as spectral pollution) (Drmac et al. 2018). Residuals computed by ResDMD also allow a natural ranking or ordering of approximated eigenvalues and Koopman modes, which can be exploited for efficient compression of the system (see Section 8).
• Continuous spectra and chaos. K can have a continuous spectrum, which is a generic feature of chaotic or turbulent flow (Basley et al. 2011; Arbabi & Mezić 2017). A formidable challenge is how to deal with the continuous spectrum (Colbrook 2021; Colbrook et al. 2021; Mezić 2013), since discretising to a finite-dimensional operator destroys its presence. Most existing nonparametric approaches for computing continuous spectra of K are restricted to ergodic systems (Giannakis 2019; Arbabi & Mezić 2017; Korda et al. 2020; Das et al. 2021), as this allows relevant integrals to be computed using long-time averages. Many of these methods are related to techniques in signal processing, but can be challenging to apply in the presence of noise or if a pair of eigenvalues are close together, and often rely on heuristic parameter choices or clean-up procedures. ResDMD (Colbrook & Townsend 2021) provides a general computational framework to deal with continuous spectra through smoothed approximations of spectral measures, leading to explicit and rigorous high-order convergence. The methods deal jointly with both discrete and continuous spectra, do not assume ergodicity, and can be applied to data from either short or long trajectories.
• Non-linearity and high-dimensional state-space. For many fluid flows, e.g., turbulent phenomena, the corresponding dynamical system is strongly non-linear and has a very large state-space dimension. ResDMD can be combined with learned dictionaries to tackle this issue. A key advantage compared to traditional learning methods is that one still has convergence theory and can perform a-posteriori verification that the learned dictionary is suitable. In this paper, we demonstrate this for two choices of dictionary: kernelized extended dynamic mode decomposition (kEDMD) (Williams et al. 2015b) and (rank-reduced) DMD.
In this paper, we take the framework of ResDMD and demonstrate its use in a wide range of fluid dynamic situations, at varying Reynolds numbers, arising from both numerical and experimental data. We discuss how new ResDMD methods can be reliably used for turbulent flows such as aerodynamic boundary layers and we include, for the first time, a rigorous link between the spectral measure and the turbulent power spectra. We illustrate this link explicitly for experimental measurements of boundary layer turbulence. We also discuss how a new modal ordering based on the residual permits greater accuracy with a smaller DMD dictionary than when using traditional modal modulus ordering. This paves the way for greater dynamic compression of large data sets without sacrificing accuracy.
The paper is outlined as follows. In Section 2, we recap extended dynamic mode decomposition (EDMD), for which DMD is a special case, and interpret the algorithm as a Galerkin method. ResDMD is then introduced in Section 3, where we recall and expand upon the results of Colbrook & Townsend (2021). We stress that ResDMD does not make any assumptions on the Koopman operator or dynamical system. In Section 4, we recall one of the algorithms of Colbrook & Townsend (2021) for computing spectral measures (dealing with continuous spectra) under the assumption that the dynamical system is measure-preserving, and make a new link between an algorithm for spectral measures of Koopman operators and the power spectra of signals. We then validate and apply our methods to four different flow cases. We treat flow past a cylinder (numerical data) in Section 5, turbulent boundary layer flow (hot-wire experimental data) in Section 6, wall-jet boundary layer flow (PIV experimental data) in Section 7, and laser-induced plasma (experimental data collected with a microphone) in Section 8. In each case, only the flow or pressure fields are used to extract relevant dynamical information. We end with a discussion and conclusions in Section 9. Code for our methods can be found at https://github.com/MColbrook/Residual-Dynamic-Mode-Decomposition and we provide a diagrammatic chart for implementation in Fig. 1.

Recalling dynamic mode decomposition
We recall the definition of EDMD from Williams et al. (2015a) and its interpretation as a Galerkin method. As a special case, DMD can be interpreted as producing a Galerkin approximation of the Koopman operator using a dictionary of linear functions. We assume that the dynamics are governed by a general dynamical system (1.1) and that we have access to discrete time snapshots of this system, i.e., a finite set of pairs of measurements

{x^{(m)}, y^{(m)}}_{m=1}^{M}, where y^{(m)} = F(x^{(m)}), (2.1)

and the operator F evolves the system along one discrete time unit. For example, these snapshots could be measurements of unsteady velocities across discrete spatial grid points taken via Particle Image Velocimetry (PIV). Suitable data could be collected from one long time trajectory, corresponding to x^{(m)} = F^{m−1}(x_0), or from multiple shorter trajectories.
We work in the Hilbert space L²(Ω, ω) of observables for a positive measure ω on Ω. We do not assume that this measure is invariant, and the most common choice of ω is the standard Lebesgue measure. This choice is natural for Hamiltonian systems, for which the Koopman operator is unitary on L²(Ω, ω). For other systems, we can select ω according to the region where we wish to study the dynamics, such as a Gaussian measure. To be precise, we consider K : D(K) → L²(Ω, ω), where D(K) ⊆ L²(Ω, ω) is a suitable domain of observables. In this section and Section 3, we do not make any assumptions on K.

Extended dynamic mode decomposition
Given a dictionary {ψ_1, . . . , ψ_N} ⊂ D(K) ⊆ L²(Ω, ω) of observables, EDMD constructs a matrix K ∈ C^{N×N} from the snapshot data (2.1) that approximates K on the finite-dimensional subspace V_N = span{ψ_1, . . . , ψ_N}. The choice of dictionary is up to the user, with some common hand-crafted choices given in Williams et al. (2015a, Table 1). When the state-space dimension is large, as in this paper, it is beneficial to use a data-driven choice of dictionary. We discuss this in Section 3.3, where we present DMD (Kutz et al. 2016a) and kEDMD (Williams et al. 2015b) in a unified framework.
Assuming we have chosen our dictionary of observables, we define the vector-valued observable or "quasimatrix"

Ψ(x) = [ψ_1(x) · · · ψ_N(x)] ∈ C^{1×N}.

Any observable g ∈ V_N can then be written as g(x) = Σ_{j=1}^{N} ψ_j(x) g_j = Ψ(x) g for some vector of constant coefficients g ∈ C^N. It follows from (1.2) that

[K g](x) = Ψ(F(x)) g.

Typically, the subspace V_N generated by the dictionary is not an invariant subspace of K, and hence there is no choice of matrix K ∈ C^{N×N} that makes the error Ψ(F(x)) g − Ψ(x) (K g) zero for all choices of g ∈ C^N and x ∈ Ω. Instead, it is natural to select K as a solution of

min over K of ∫_Ω ‖Ψ(F(x)) − Ψ(x) K‖² dω(x). (2.3)

Here, ‖·‖ denotes the standard Euclidean norm of a vector. Given a finite amount of snapshot data, we cannot directly evaluate the integral in (2.3). Instead, we approximate it via a quadrature rule by treating the data points {x^{(m)}}_{m=1}^{M} as quadrature nodes with weights {w_m}_{m=1}^{M}. Note that in the original definition of EDMD, ω is a probability measure and the quadrature weights are w_m = 1/M. We consider the more general case. The discretised version of (2.3) is the following weighted least-squares problem:

min over K of Σ_{m=1}^{M} w_m ‖Ψ(y^{(m)}) − Ψ(x^{(m)}) K‖². (2.4)

Define the following two matrices, whose m-th rows are the dictionary evaluated at the m-th snapshot pair,

Ψ_X = [Ψ(x^{(1)}); . . . ; Ψ(x^{(M)})] ∈ C^{M×N}, Ψ_Y = [Ψ(y^{(1)}); . . . ; Ψ(y^{(M)})] ∈ C^{M×N}, (2.5)

and let W = diag(w_1, . . . , w_M) be the diagonal weight matrix of the quadrature rule. A solution to (2.4) can then be written down explicitly as

K = (√W Ψ_X)† (√W Ψ_Y),

where '†' denotes the pseudoinverse. In some applications, the matrix Ψ_X^* W Ψ_X may be ill-conditioned, so it is common to consider truncated singular value decompositions or other forms of regularisation.
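As an illustrative sketch (a hypothetical one-dimensional map and a small monomial dictionary, chosen only for this example and not taken from the paper), the weighted least-squares solution K = (√W Ψ_X)†(√W Ψ_Y) can be assembled directly from snapshot data:

```python
import numpy as np

rng = np.random.default_rng(0)
F = lambda x: np.sin(2.0 * x)           # hypothetical 1D dynamics

M, N = 500, 5                           # number of snapshots, dictionary size
psi = [lambda x, j=j: x**j for j in range(N)]   # monomial dictionary psi_j(x) = x^j

X = rng.uniform(-1.0, 1.0, M)           # random sampling, weights w_m = 1/M
Y = F(X)
w = np.full(M, 1.0 / M)

# Psi_X[m, j] = psi_j(x^(m)),  Psi_Y[m, j] = psi_j(y^(m))
PsiX = np.column_stack([p(X) for p in psi])
PsiY = np.column_stack([p(Y) for p in psi])
W = np.diag(w)

# EDMD matrix: K = (sqrt(W) Psi_X)^+ (sqrt(W) Psi_Y)
sqW = np.sqrt(W)
K = np.linalg.pinv(sqW @ PsiX) @ (sqW @ PsiY)
print(K.shape)
```

Since the constant function lies in the dictionary and K1 = 1, the first column of K reproduces the first coordinate vector, a quick sanity check on the construction.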

Quadrature and Galerkin methods
We can view EDMD as a Galerkin method. Note that

(Ψ_X^* W Ψ_Y)_{jk} = Σ_{m=1}^{M} w_m ψ_j(x^{(m)})^* ψ_k(y^{(m)}) = Σ_{m=1}^{M} w_m ψ_j(x^{(m)})^* [K ψ_k](x^{(m)}).

If the quadrature converges, it follows that

lim_{M→∞} (Ψ_X^* W Ψ_Y)_{jk} = ⟨K ψ_k, ψ_j⟩, lim_{M→∞} (Ψ_X^* W Ψ_X)_{jk} = ⟨ψ_k, ψ_j⟩, (2.6)

where ⟨·, ·⟩ is the inner product associated with the Hilbert space L²(Ω, ω). Let P_N denote the orthogonal projection onto V_N. As M → ∞, the above convergence means that K approaches a matrix representation of P_N K P_N^*. Thus, EDMD can be viewed as a Galerkin method in the large data limit M → ∞. It also follows that the EDMD eigenvalues approach the spectrum of P_N K P_N^* as M → ∞. Thus, approximating the spectrum σ(K) of K by the eigenvalues of K is closely related to the so-called finite section method (Böttcher & Silbermann 1983). Since the finite section method can suffer from spectral pollution (spurious modes), spectral pollution is also a concern for EDMD (Williams et al. 2015a). It is important to have an independent way to measure the accuracy of the candidate eigenvalue–eigenvector pairs, which is what we propose in our ResDMD method presented in Section 3.
2.2.1. Convergence of the quadrature rule
There are typically three scenarios for which the convergence in (2.6) holds:
(i) Random sampling: In the initial definition of EDMD, ω is a probability measure and {x^{(m)}}_{m=1}^{M} are drawn independently according to ω, with quadrature weights w_m = 1/M. The strong law of large numbers shows that (2.6) holds with probability one (Klus et al. 2016, Section 3.4), provided that ω is not supported on the zero level set of a non-trivial linear combination of the dictionary (Korda & Mezić 2018, Section 4). Convergence is typically at a Monte Carlo rate of O(M^{−1/2}) (Caflisch 1998).
(ii) High-order quadrature: If the dictionary and F are sufficiently regular and we are free to choose the {x^{(m)}}_{m=1}^{M}, then it is beneficial to select {x^{(m)}}_{m=1}^{M} as an M-point quadrature rule with weights {w_m}_{m=1}^{M}. This can lead to much faster convergence rates in (2.6) (Colbrook & Townsend 2021), but can be difficult if the state-space dimension is large.
(iii) Ergodic sampling: If the dynamical system is ergodic, we can sample along a single long trajectory, x^{(m)} = F^{m−1}(x^{(1)}), with weights w_m = 1/M; Birkhoff's ergodic theorem then ensures the convergence in (2.6).
From an experimental point of view, an example of random sampling could be snapshots {x^{(m)}} observed with a sampling rate that is lower than the characteristic time period of the system of interest. An example of ergodic sampling could be a time-resolved particle image velocimetry dataset of a flow field over a long time period. The examples in this paper use (i) random sampling and (iii) ergodic sampling, which are typical for experimental data collection in practice as they arise from long time trajectory measurements. Williams et al. (2015a) observed that if we choose a dictionary of observables of the form ψ_j(x) = e_j^* x for j = 1, . . . , N, then the matrix K = (√W Ψ_X)† (√W Ψ_Y) with w_m = 1/M is the transpose of the usual DMD matrix.

Relationship with DMD
Thus, DMD can be interpreted as producing a Galerkin approximation of the Koopman operator using the set of linear monomials as basis functions. EDMD can be thought of as an extension that allows non-linear functions in the dictionary.
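This equivalence is easy to verify numerically. The sketch below (a hypothetical linear system, not an example from the paper) builds the EDMD matrix for the linear dictionary ψ_j(x) = e_j^* x and compares it with the transpose of the standard DMD matrix Y X†:

```python
import numpy as np

rng = np.random.default_rng(1)
d, M = 3, 50
A_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.2],
                   [0.0, 0.0, 0.7]])    # hypothetical linear dynamics

X = rng.standard_normal((d, M))         # states as columns
Y = A_true @ X

# Standard DMD matrix: A = Y X^+ (snapshots as columns)
A_dmd = Y @ np.linalg.pinv(X)

# EDMD with the linear dictionary psi_j(x) = e_j^* x and weights w_m = 1/M:
# rows of Psi_X and Psi_Y are the (transposed) states, so PsiX = X^T, PsiY = Y^T.
PsiX, PsiY = X.T, Y.T
sqW = np.eye(M) / np.sqrt(M)            # sqrt(W) with w_m = 1/M
K = np.linalg.pinv(sqW @ PsiX) @ (sqW @ PsiY)

print(np.allclose(K, A_dmd.T))          # K is the transpose of the DMD matrix
```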

Koopman mode decomposition
Given an observable h : Ω → C, we approximate the orthogonal projection P_N h via the weighted least-squares fit

P_N h ≈ Ψ c_h, c_h = (Ψ_X^* W Ψ_X)† Ψ_X^* W [h(x^{(1)}), . . . , h(x^{(M)})]^⊤.

Assuming convergence of the quadrature rule, the right-hand side converges to the projection P_N h in L²(Ω, ω) as M → ∞. Assuming that K V = V Λ for a diagonal matrix Λ of eigenvalues and an invertible matrix V of eigenvectors, we obtain

[K^n h](x) ≈ Ψ(x) V Λ^n V^{−1} c_h. (2.7)
Figure 2: The basic idea of ResDMD. The matrices Ψ_X and Ψ_Y are defined in (2.5) and correspond to the dictionary evaluated at the snapshot data. The matrix W = diag(w_1, . . . , w_M) is a diagonal weight matrix.
As a special case, taking h to range over the components of the state and vectorising, we have the Koopman mode decomposition

x_n ≈ Ψ(x_0) V Λ^n [V^{−1} C], (2.8)

where C collects the projection coefficients of the state components. The j-th Koopman mode corresponds to the j-th row of the matrix in square brackets, and Ψ V is a quasimatrix of approximate Koopman eigenfunctions.
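The decomposition can be sketched for a toy linear system (our illustration, not the paper's code; for the linear dictionary the approximate eigenfunctions Ψ(x)V are simply x^⊤V, and the modes are rows of V^{−1} up to scaling):

```python
import numpy as np

rng = np.random.default_rng(2)
d, M = 2, 40
A = np.array([[0.95, 0.2],
              [0.0, 0.5]])              # hypothetical linear dynamics

X = rng.standard_normal((d, M))
Y = A @ X

# Galerkin/EDMD matrix for the linear dictionary (transpose of the DMD matrix)
K = np.linalg.pinv(X.T) @ Y.T
lam, V = np.linalg.eig(K)               # K V = V Lambda

# Approximate Koopman eigenfunctions at a state x: Psi(x) V = x^T V.
# Koopman modes for the full-state observable are rows of V^{-1} (up to scaling),
# giving x_n ~ sum_j (Psi(x_0) V)_j lam_j^n (row j of V^{-1}).
x0 = np.array([1.0, -1.0])
n = 5
xn_koopman = np.real((x0 @ V) * lam**n @ np.linalg.inv(V))
xn_true = np.linalg.matrix_power(A, n) @ x0
print(np.allclose(xn_koopman, xn_true))
```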

Residual dynamic mode decomposition (ResDMD)
We now present ResDMD for computing spectral properties of Koopman operators, which in turn allows us to analyse fluid flow structures such as turbulence. ResDMD, first introduced in Colbrook & Townsend (2021), uses an additional matrix constructed from the measured data {x^{(m)}, y^{(m)}}_{m=1}^{M}. The key difference between ResDMD and other DMD-type algorithms is that we construct Galerkin approximations not only of K, but also of K^*K. This difference allows us to have rigorous convergence guarantees for ResDMD and to obtain error guarantees on the approximation. In other words, we can tell a-posteriori which parts of the computed spectra and Koopman modes are reliable, thus rectifying issues such as spectral pollution that arise in previous DMD-type methods.
3.1. ResDMD and a new data-matrix
Whilst EDMD obtains eigenvalue–eigenvector pairs for an approximation of the Koopman operator, it cannot verify the accuracy of the computed pairs. By rigorously rejecting inaccurate predictions (spurious modes), we can confidently identify true physical turbulent flow structures. This rigorous measure of accuracy is the linchpin of our new ResDMD method and is shown pictorially in Fig. 2.
To measure the accuracy of a candidate eigenvalue–eigenvector pair (λ, g), we approximate residuals. For λ ∈ C and g = Ψ g ∈ V_N, the squared relative residual is

res(λ, g)² = ⟨(K − λ)g, (K − λ)g⟩ / ⟨g, g⟩. (3.1)

Algorithm 1: ResDMD for computing eigenpairs without spectral pollution. The corresponding Koopman mode decomposition is given in (3.4).
We approximate this residual using the same quadrature rule as in Section 2.1:

res(λ, g)² ≈ [g^* (Ψ_Y^* W Ψ_Y − λ^* (Ψ_X^* W Ψ_Y) − λ (Ψ_Y^* W Ψ_X) + |λ|² Ψ_X^* W Ψ_X) g] / [g^* (Ψ_X^* W Ψ_X) g]. (3.2)

As well as the matrices Ψ_X^* W Ψ_X and Ψ_X^* W Ψ_Y found in EDMD, (3.2) contains the additional matrix Ψ_Y^* W Ψ_Y, whose entries converge to ⟨K ψ_k, K ψ_j⟩ as M → ∞. Hence, Ψ_Y^* W Ψ_Y formally corresponds to an approximation of K^*K. We also have, assuming convergence of the quadrature rule,

lim_{M→∞} res_M(λ, g) = res(λ, g), (3.3)

where res_M(λ, g) denotes the square root of the data-driven quantity in (3.2). In Colbrook & Townsend (2021), it was shown that the quantity res(λ, g) can be used to rigorously compute spectra and pseudospectra of K. Next, we summarise some of the algorithms and results of Colbrook & Townsend (2021).

3.2. ResDMD for computing spectra and pseudospectra
3.2.1. Avoiding spurious eigenvalues
Algorithm 1 uses the residual defined in (3.2) to avoid spectral pollution (spurious modes). As is usually done, we assume that K is diagonalisable. We first find the eigenvalues and eigenvectors of K, i.e., we solve

(Ψ_X^* W Ψ_X)† (Ψ_X^* W Ψ_Y) g = λ g.

One can solve this eigenproblem directly, but it is often numerically more stable to solve the generalised eigenproblem (Ψ_X^* W Ψ_Y) g = λ (Ψ_X^* W Ψ_X) g. Afterwards, to avoid spectral pollution, we discard eigenpairs with a relative residual larger than a specified accuracy goal ε > 0.
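A minimal sketch of this eigensolve-then-discard procedure, under the assumption of a hypothetical one-dimensional map whose monomial dictionary happens to be invariant (so all four eigenpairs should survive the clean-up), is:

```python
import numpy as np

rng = np.random.default_rng(4)
F = lambda x: 0.5 * x                        # hypothetical 1D dynamics
M, N = 400, 4
psi = [lambda x, j=j: x**j for j in range(1, N + 1)]   # psi_j(x) = x^j, an invariant set

X = rng.uniform(-1.0, 1.0, M)
Y = F(X)
W = np.eye(M) / M

PsiX = np.column_stack([p(X) for p in psi])
PsiY = np.column_stack([p(Y) for p in psi])

G  = PsiX.conj().T @ W @ PsiX                # Galerkin matrices
A  = PsiX.conj().T @ W @ PsiY
G2 = PsiY.conj().T @ W @ PsiY

# EDMD eigenpairs: (Psi_X^* W Psi_X)^+ (Psi_X^* W Psi_Y) g = lam g
lams, Vecs = np.linalg.eig(np.linalg.pinv(G) @ A)

# ResDMD clean-up: keep only eigenpairs whose relative residual (3.2) is below eps
eps = 1e-4
kept = []
for lam, g in zip(lams, Vecs.T):
    num = g.conj() @ (G2 - np.conj(lam) * A - lam * A.conj().T
                      + abs(lam)**2 * G) @ g
    r = np.sqrt(max(np.real(num / (g.conj() @ G @ g)), 0.0))
    if r < eps:
        kept.append(lam)

print(sorted(np.real(kept)))   # eigenvalues 2^{-j} of the invariant dictionary survive
```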
The procedure is a simple modification of EDMD, as the only difference is a clean-up step where spurious eigenpairs are discarded based on their residual. This clean-up avoids spectral pollution and also removes eigenpairs that are inaccurate because of numerical errors associated with non-normal operators, up to the relative tolerance ε. The following result (Colbrook & Townsend 2021, Theorem 4.1) makes this precise.
Theorem 3.1. Let K be the associated Koopman operator of (1.1) from which snapshot data is collected, and let Λ denote the eigenvalues in the output of Algorithm 1. Then, assuming convergence of the quadrature rule in Section 2.1, in the large data limit M → ∞ every element of Λ lies within the ε-pseudospectrum of K.
We can also use Algorithm 1 to clean up the Koopman mode decomposition in (2.8).
Algorithm 2: ResDMD for estimating ε-pseudospectra.

Computing the full spectrum and pseudospectra
In the large data limit, Theorem 3.1 tells us that Algorithm 1 computes eigenvalues inside the ε-pseudospectrum of K and hence avoids spectral pollution and returns reasonable eigenvalues. Recall that the ε-pseudospectrum of K is (Roch & Silbermann 1996; Trefethen & Embree 2005)

σ_ε(K) = cl( { λ ∈ C : ‖(K − λ)^{−1}‖^{−1} < ε } ),

where cl denotes the closure of the set, and that lim_{ε↓0} σ_ε(K) = σ(K). Despite Theorem 3.1, Algorithm 1 may not approximate the whole ε-pseudospectrum of K, even as N → ∞ and M → ∞. This is because the eigenvalues of P_N K P_N^* may not approximate the whole spectrum of K. To overcome this issue, we further include Algorithm 2, which computes practical approximations of ε-pseudospectra with rigorous convergence guarantees. Assuming convergence of the quadrature rule in Section 2.1, in the limit M → ∞ the key quantity computed by Algorithm 2, the minimal residual over the subspace, is an upper bound for ‖(K − λ)^{−1}‖^{−1}, and the output of Algorithm 2 is guaranteed to be inside the ε-pseudospectrum of K. As N → ∞ and the grid of points is refined, Algorithm 2 converges to the pseudospectrum uniformly on compact subsets of C (Colbrook & Townsend 2021, Theorems B.1 and B.2). Strictly speaking, we converge to the approximate point pseudospectrum; a more complicated algorithm leads to computation of the full pseudospectrum, see Colbrook & Townsend (2021, Appendix B). For brevity, we have not included a statement of these results. We can then compute the spectrum by taking ε ↓ 0. Algorithm 2 also computes observables g with res(λ, g) < ε, known as ε-approximate eigenfunctions.
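The key quantity can be sketched as follows (our illustration, not the paper's implementation): for each sample point z in the complex plane, minimising the data-driven residual (3.2) over all g reduces to a generalised Hermitian eigenvalue problem, solved below via a Cholesky factorisation of Ψ_X^* W Ψ_X:

```python
import numpy as np

rng = np.random.default_rng(5)
F = lambda x: 0.5 * x                        # hypothetical 1D dynamics
M, N = 400, 4
psi = [lambda x, j=j: x**j for j in range(1, N + 1)]
X = rng.uniform(-1.0, 1.0, M)
Y = F(X)
W = np.eye(M) / M
PsiX = np.column_stack([p(X) for p in psi])
PsiY = np.column_stack([p(Y) for p in psi])
G, A, G2 = PsiX.T @ W @ PsiX, PsiX.T @ W @ PsiY, PsiY.T @ W @ PsiY

L = np.linalg.cholesky(G)
Linv = np.linalg.inv(L)

def tau(z):
    """Smallest data-driven relative residual over all g in the subspace;
    in the large-data limit this upper-bounds membership of the pseudospectrum."""
    H = G2 - np.conj(z) * A - z * A.conj().T + abs(z)**2 * G
    Ht = Linv @ H @ Linv.conj().T            # reduce generalised problem to standard form
    mu = np.linalg.eigvalsh(Ht).min()
    return np.sqrt(max(mu, 0.0))

# tau is ~0 at a genuine eigenvalue (0.5 here) and bounded away from 0 elsewhere
print(tau(0.5), tau(0.9))
```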

Choice of dictionary
When the state-space dimension is large, it can be impractical to store or form the matrix K, since the initial dictionary size N is very large. We consider two common methods to overcome this issue:
(i) DMD: In this case, the dictionary consists of the linear monomials over Ω (see Section 2.2.2). It is standard to form a low-rank approximation via a truncated singular value decomposition,

√W Ψ_X ≈ U Σ V^*. (3.6)

Here, Σ ∈ C^{r×r} is diagonal with strictly positive diagonal entries, and U ∈ C^{M×r} and V ∈ C^{N×r} have U^*U = V^*V = I. We then form the r × r matrix

K = (√W Ψ_X V)† (√W Ψ_Y V).

Note that, to fit into our Galerkin framework, this matrix is the transpose of the DMD matrix that is commonly computed in the literature.
(ii) kEDMD: In this case, a kernel function S implicitly defines a large dictionary, and the kernel trick is used to compute the relevant matrices without ever forming the dictionary explicitly (Williams et al. 2015b); see Algorithm 4.
In either of these two cases, the approximation of K is equivalent to using a new dictionary with feature map Ψ(x) ∈ C^{1×r}. In the case of DMD, we found it beneficial to use the mathematically equivalent choice Ψ(x) Σ^{−1}, which is numerically better conditioned. To see why, note that √W Ψ_X V Σ^{−1} ≈ U, which has orthonormal columns.
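A sketch of this construction (hypothetical random data; the rank r and dynamics are chosen only for illustration, and the conditioning claim is checked by verifying orthonormal columns):

```python
import numpy as np

rng = np.random.default_rng(6)
d, M, r = 50, 200, 10                   # large state dimension, rank-r truncation

A = 0.9 * np.eye(d) + 0.05 * rng.standard_normal((d, d)) / np.sqrt(d)
X = rng.standard_normal((d, M))         # snapshots as columns
Y = A @ X

w = np.full(M, 1.0 / M)
PsiX, PsiY = X.T, Y.T                   # linear dictionary: rows are states
sqW = np.sqrt(np.diag(w))

# Truncated SVD of the weighted data matrix: sqrt(W) Psi_X ~ U S V^*
U_, s, Vh = np.linalg.svd(sqW @ PsiX, full_matrices=False)
U, S, V = U_[:, :r], np.diag(s[:r]), Vh[:r, :].conj().T

# New r-dimensional dictionary with the better-conditioned feature map
# Psi(x) Sigma^{-1}; then sqrt(W) Psi_X V Sigma^{-1} ~ U has orthonormal columns.
PsiX_r = PsiX @ V @ np.linalg.inv(S)
PsiY_r = PsiY @ V @ np.linalg.inv(S)

K = np.linalg.pinv(sqW @ PsiX_r) @ (sqW @ PsiY_r)   # r x r Galerkin matrix
print(K.shape)
```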

The problem of vanishing residual estimates
Suppose that √W Ψ_X has full row rank, so that (√W Ψ_X)(√W Ψ_X)† = I, and that g ∈ C^N is an eigenvector of K with eigenvalue λ. This means that

√W Ψ_Y g = (√W Ψ_X) K g = λ √W Ψ_X g.

It follows that the data-driven residual in (3.2) satisfies res(λ, g) = 0, regardless of the true residual. Similarly, if N is too large relative to M, res(λ, g) will be a bad approximation of the true residual.
In other words, the regime N ∼ M prevents the large data convergence (M → ∞) of the quadrature rule and of (3.3), which hold for a fixed dictionary and hence a fixed dictionary size. In turn, this prevents us from being able to apply the results of Section 3.2. We next discuss how to overcome this issue by using two sets of snapshot data; these could arise from two independent tests of the same system, or by partitioning the measured data into two groups.

Using two subsets of the snapshot data
A simple remedy to avoid the problem in Section 3.3.1 is to consider two sets of snapshot data. We consider an initial set {x̃^{(m)}, ỹ^{(m)}}_{m=1}^{M′}, which we use to form our dictionary.
Algorithm 3: ResDMD with DMD-selected observables.
3: Apply Algorithms 1 and 2 with the matrices in (3.9).
Output: Spectral properties of the Koopman operator according to Algorithms 1 and 2.
with kernel S to compute the matrices K, and Σ using the kernel trick.
We then apply ResDMD to the computed dictionary with a second set of snapshot data {x̂^{(m)}, ŷ^{(m)}}_{m=1}^{M}, allowing us to prove convergence as M → ∞. Exactly how to acquire a second set of snapshot data depends on the problem and method of data collection. Given snapshot data with random and independent {x^{(m)}}, one can simply split the snapshot data into two parts. For initial conditions that are distributed according to a high-order quadrature rule, if one already has access to snapshots then one must typically go back to the original dynamical system and request further snapshots. For ergodic sampling along a trajectory, we can let {x̃^{(m)}, ỹ^{(m)}}_{m=1}^{M′} correspond to the initial M′ + 1 points of the trajectory and let {x̂^{(m)}, ŷ^{(m)}}_{m=1}^{M} correspond to the subsequent points of the trajectory. In the case of DMD, the two-stage process is summarised in Algorithm 3. Often a suitable choice of r can be obtained by studying the decay of the singular values of the data matrix.
In the case of kEDMD, we follow Colbrook & Townsend (2021), and the two-stage process is summarised in Algorithm 4. The choice of kernel S determines the dictionary, and the best choice depends on the application. In the following experiments, we use the Laplacian kernel

S(x, y) = exp(−γ ‖x − y‖),

where γ is the reciprocal of the average ℓ²-norm of the snapshot data after it is shifted to have mean zero.
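For concreteness, the kernel and the scaling γ described above can be sketched as follows (hypothetical random snapshot data, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
d, M = 20, 100
X = rng.standard_normal((d, M))        # snapshot states as columns

# gamma: reciprocal of the average l2-norm of the mean-centred snapshots
Xc = X - X.mean(axis=1, keepdims=True)
gamma = 1.0 / np.mean(np.linalg.norm(Xc, axis=0))

def S(x, y):
    """Laplacian kernel S(x, y) = exp(-gamma * ||x - y||)."""
    return np.exp(-gamma * np.linalg.norm(x - y))

# Gram matrix used by the kernel trick (kEDMD never forms the dictionary explicitly)
Gram = np.array([[S(X[:, i], X[:, j]) for j in range(M)] for i in range(M)])
print(Gram.shape)   # symmetric, with unit diagonal
```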
We can now apply the theory of Section 3.2 in the limit M → ∞. It is well-known that the eigenvalues computed by DMD and kEDMD may suffer from spectral pollution. However, and crucially in our setting, we do not directly use these methods to compute spectral properties of K. Instead, we only use them to select a reasonable dictionary of size r, after which our rigorous ResDMD algorithms can be used. Moreover, we use {x̂^{(m)}, ŷ^{(m)}}_{m=1}^{M} to check the quality of the constructed dictionary. By studying the residuals and using the error control in ResDMD, we can tell a-posteriori whether the dictionary is satisfactory and whether r is sufficiently large. Finally, it is worth pointing out that the above choices of dictionaries are certainly not the only choices. ResDMD can be applied to any suitable choice. For example, one could use other data-driven methods such as diffusion kernels (Giannakis et al. 2018) or trained neural networks (Li et al. 2017; Murata et al. 2020).

Spectral measures of measure-preserving systems
Many physical systems of interest described by (1.1) are measure-preserving (preserve volume). Examples include Hamiltonian flows (Arnold 1989), geodesic flows on Riemannian manifolds (Dubrovin et al. 1991, Chapter 5), Bernoulli schemes in probability theory (Shields 1973), and ergodic systems (Walters 2000). The Koopman operator K associated with a measure-preserving dynamical system is an isometry, i.e., ‖K g‖ = ‖g‖ for all observables g ∈ D(K) = L²(Ω, ω). In this case, spectral measures provide a way of including continuous spectra in the Koopman mode decomposition. This is beneficial in the case of turbulent flows, where a-priori knowledge of the spectrum (e.g., whether it contains a continuous part) may be unavailable. The methods described in this section allow us to compute continuous spectra.
For completeness, we provide a mathematical description of spectral measures in Appendix A. The reader should think of these measures as supplying a diagonalisation of K, and hence of the dynamical system. This is made precise in Section 4.1 through the Koopman mode decomposition. Another interpretation, discussed in Section 4.2, is that the Fourier coefficients of spectral measures are exactly the forward-time dynamical autocorrelations. In Section 4.3, we discuss one of the methods of computing spectral measures from Colbrook & Townsend (2021), which makes use of autocorrelations. Finally, we make a connection between spectral measures and power spectra in Section 4.4, which relates directly to experimentally measuring turbulent flows.

Spectral measures and Koopman mode decompositions
Given an observable g ∈ L²(Ω, ω) of interest, the spectral measure ξ_g of K with respect to g is a measure defined on the periodic interval [−π, π]_per. If g is normalised so that ‖g‖ = 1, then ξ_g is a probability measure; otherwise, ξ_g is a positive measure of total mass ‖g‖². Lebesgue's decomposition (Stein & Shakarchi 2009) allows us to split ξ_g into discrete and continuous parts:

dξ_g(θ) = Σ_{λ = e^{iθ_0} ∈ σ_p(K)} ‖P_λ g‖² δ_{θ_0}(θ) + ρ_g(θ) dθ + dξ_g^{(sc)}(θ). (4.1)

Throughout this paper, we use the term "discrete spectra" to mean the eigenvalues of K, also known as the point spectrum σ_p(K). This also includes eigenvalues embedded in the continuous spectrum, in contrast to the usual definition of discrete spectrum. The discrete (or atomic) part of ξ_g is a sum of Dirac delta distributions, supported on σ_p(K), the set of eigenvalues of K. The coefficient of each δ in the sum is ⟨P_λ g, g⟩ = ‖P_λ g‖², where P_λ is the orthogonal spectral projection associated with the eigenvalue λ. The continuous part of ξ_g consists of a part that is absolutely continuous with density ρ_g ∈ L¹([−π, π]_per), and a singular continuous component ξ_g^{(sc)}. The decomposition in (4.1) provides important information on the evolution of dynamical systems. For example, suppose that there is no singular continuous spectrum. Then any g ∈ L²(Ω, ω) can be written as

g = Σ_{λ ∈ σ_p(K)} c_λ φ_λ + ∫_{[−π,π]_per} φ_{θ,g} dθ, (4.2)

where the φ_λ are the eigenfunctions of K and the c_λ are expansion coefficients. One should think of {φ_{θ,g}} as a "continuously parametrised" collection of eigenfunctions. Using (4.2), one obtains the Koopman mode decomposition (Mezić 2005)

g(x_n) = [K^n g](x_0) = Σ_{λ ∈ σ_p(K)} c_λ λ^n φ_λ(x_0) + ∫_{[−π,π]_per} e^{inθ} φ_{θ,g}(x_0) dθ. (4.3)

For characterisations of the dynamical system in terms of these decompositions, see, for example, Halmos (2017); Mezić (2013); Zaslavsky (2002). Generally speaking, the eigenvalues correspond to isolated frequencies of oscillation present in the fluid flow and the growth rates of stable and unstable modes, whilst the continuous spectrum corresponds to chaotic motion. Computing the measures ξ_g provides us with a diagonalisation of the non-linear dynamical system in (1.1).

4.2. Spectral measures and autocorrelations
The Fourier coefficients of ν_g are given by the autocorrelations in (4.5): ⟨g, K^n g⟩ = ∫_{[−π,π]_per} e^{inθ} dν_g(θ) for n ∈ Z. From (4.5), we see that ν̂_g(−n) is the complex conjugate of ν̂_g(n) for n ∈ Z. In particular, ν_g is completely determined by the forward-time dynamical autocorrelations ⟨g, K^n g⟩ with n ≥ 0. Equivalently, the spectral measure of K with respect to g ∈ L²(Ω, ω) is a signature for the forward-time dynamics of (1.1). This connection allows us to interpret spectral measures as an infinite-dimensional version of power spectra in Section 4.4.

4.3. Computing spectral measures
To approximate spectral measures, we make use of the relation (4.5) between the Fourier coefficients of ν_g and the autocorrelations ⟨g, K^n g⟩. There are typically three ways to compute the autocorrelations, corresponding to the three scenarios discussed in Section 2.2.1: (i) Random sampling: the autocorrelations can be approximated as

(ii) High-order quadrature: the autocorrelations can be approximated as (iii) Ergodic sampling: the autocorrelations can be approximated as The first two methods require snapshot data from many initial conditions. Ergodic sampling only requires a single trajectory, and the ergodic averages in (4.6) can be re-written in terms of discrete (non-periodic) convolutions. Thus, all of the averages are simultaneously and rapidly computed using the FFT.
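The FFT-based ergodic averaging can be sketched as follows. This is a minimal illustration in our own notation, not the authors' code; the circle-rotation demo is an assumed toy system whose autocorrelations are known exactly.

```python
import numpy as np

def ergodic_autocorrelations(g_vals, n_ac):
    """Estimate the autocorrelations <g, K^n g>, n = 0..n_ac, from a
    single ergodic trajectory of observable values g(x_0), ..., g(x_{M-1}).
    A zero-padded FFT correlation computes all lags in one pass."""
    M = len(g_vals)
    nfft = 1 << int(np.ceil(np.log2(2 * M)))   # pad to avoid wrap-around
    G = np.fft.fft(g_vals, nfft)
    lagged_sums = np.fft.ifft(G * np.conj(G))[: n_ac + 1]
    counts = M - np.arange(n_ac + 1)           # samples entering each lag
    return lagged_sums / counts

# demo: circle rotation x -> x + alpha (mod 2*pi) with g(x) = e^{ix},
# for which <g, K^n g> = e^{i n alpha} exactly
alpha = 0.3
M = 4096
x = (np.arange(M) * alpha) % (2 * np.pi)
g = np.exp(1j * x)
a = ergodic_autocorrelations(g, 10)
```

All n_ac + 1 lags come from a single pair of FFTs, which is what makes the ergodic route cheap even for long experimental records.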
We suppose now that one has already computed the autocorrelations ⟨g, K^n g⟩ for 0 ≤ n ≤ N_ac. In practice, given a fixed data set of snapshots, we choose a suitable N_ac by checking for convergence of the autocorrelations through comparison with smaller values of n. We would like to recover an approximation of ν_g from the computed autocorrelations. Because of (4.4), the task is similar to recovering a measure from a finite number of its Fourier coefficients. Algorithm 5 does this by damping the coefficients with a filter function and summing the resulting smoothed Fourier series. The resulting approximation ν_{g,N_ac} has the following properties:
• under suitable smoothness assumptions, ν_{g,N_ac}(θ) approximates the spectral density ρ_g(θ) in (4.1) with an algebraic rate (up to logarithmic factors in N_ac) determined by the order of the filter;
• a suitable rescaling of ν_{g,N_ac}({θ_0}) approximates the weight of any atom (eigenvalue) of ν_g at θ_0;
• ν_{g,N_ac} converges weakly to ν_g, as in (4.8).
This last property is known as weak convergence. Explicit statements of these results and proofs can be found in Colbrook & Townsend (2021, Section 3). In general, one should think of ν_{g,N_ac} as a smooth function that approximates the spectral measure, with a frequency smoothing scale of O(N_ac^(−1)). One can also compute spectral measures without the autocorrelations. Colbrook & Townsend (2021) place Algorithm 5 in the wider framework of convolution kernels, and one can develop high-order methods based on rational kernels that approximate spectral measures.
Algorithm 5: Approximating spectral measures from autocorrelations. Input: trajectory data, a filter φ, and an observable g ∈ L²(Ω, ω).
4.4. Spectral measures and power spectra
Consider the power spectral density in (4.9), where Φ_Δt is the forward-time propagator for a timestep Δt. We apply windowing and multiply the integrand in (4.9) by the filter function φ(t/T). We then discretise the resulting integral using the trapezoidal rule (noting that the endpoint contributions vanish) with step size Δt = T/N_ac to obtain the approximation. It follows that ν_{g,N_ac}(2πΔt f) can be understood as a discretised version of the power spectrum over the time window [−T, T], up to the factor 2πΔt. Taking the limit N_ac → ∞ with 2πΔt f = θ fixed, we see that ν_g is an appropriate limit of the power spectrum with time resolution Δt. There are two key benefits of using ν_g.
First, we do not explicitly periodically extend the signal to compute autocorrelations, and thus avoid the problem of spectral broadening; instead, we window in the frequency domain. Second, we have a rigorous convergence theory as N_ac → ∞. We compare spectral measures to power spectra in Section 6.
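The filtered Fourier-sum construction behind Algorithm 5 can be sketched as follows. The notation, normalisation and the smoothing convention φ(|n|/N_ac) are our assumptions; the rigorous version is in Colbrook & Townsend (2021).

```python
import numpy as np

def smoothed_spectral_density(a, theta, phi):
    """Evaluate a filtered Fourier-sum approximation of the spectral
    density at angles `theta`. `a[n]` approximates <g, K^n g> for
    n = 0..N, and the negative lags are taken as complex conjugates."""
    N = len(a) - 1
    n = np.arange(-N, N + 1)
    coeffs = np.concatenate([np.conj(a[:0:-1]), a]) * phi(np.abs(n) / N)
    # nu_{g,N}(theta) = (1/2pi) sum_{|n|<=N} phi(|n|/N) a_n e^{-i n theta}
    return np.real(np.exp(-1j * np.outer(theta, n)) @ coeffs) / (2 * np.pi)

# demo: circle rotation x -> x + alpha with g = e^{ix}; the spectral
# measure is a single atom at theta = alpha, so the smoothed density
# should peak there and integrate to ||g||^2 = 1
alpha = 0.3
N = 100
a = np.exp(1j * alpha * np.arange(N + 1))      # <g, K^n g> = e^{i n alpha}
theta = np.linspace(-np.pi, np.pi, 2001)
rho = smoothed_spectral_density(a, theta, lambda x: 1 - np.abs(x))  # hat filter
peak = theta[np.argmax(rho)]
```

With the hat filter this sum is a Fejér-type kernel: positive, peaked at the atom, and with smoothing scale shrinking like 1/N.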

5. Example I: Flow past a circular cylinder
We first verify our method by considering the classic example of low Reynolds number flow past a circular cylinder. Owing to its simplicity and its relevance in engineering, this is one of the most studied examples in modal analysis. The post-transient dynamics correspond to periodic oscillations on a stable limit cycle, and the Koopman operator of the flow has a pure point spectrum with a lattice structure (Bagheri 2013).

5.1. Computational setup
The flow around a circular cylinder of diameter D is obtained using an incompressible, two-dimensional lattice-Boltzmann computational fluid dynamics (CFD) flow solver. The solver uses the D2Q9 lattice model along with the BGKW collision model to calculate the velocity field. The temporal resolution (timestep) of the flow simulations is such that the Courant–Friedrichs–Lewy number is unity on the uniform numerical grid. However, storing every timestep results in a very finely resolved flow field and a large volume of data. An initial down-sampling study revealed that storing only 12 snapshots of flow-field data per period of vortex shedding still enables us to use our analysis tools without affecting the DMD results. The computational domain is 18D long and 5D high. The cylinder is positioned 2D downstream of the inlet at the mid-height of the domain. The cylinder walls and the side walls are defined as bounce-back no-slip walls, and a parabolic velocity profile is imposed at the inlet of the domain such that Re = 100. The outlet is defined as a non-reflecting outflow. For a detailed description of the solver, we refer the reader to Józsa et al.

5.2. Results
We collect snapshot data of the velocity field along a single trajectory in the post-transient regime of the flow and split the data into two data sets according to Section 3.3.2. The first set, {(x̃^(m), ỹ^(m))}, corresponds to M′ = 500 snapshots. We then collect M = 1000 further snapshots, {(x̂^(m), ŷ^(m))}, where x̂^(1) corresponds to approximately 40 periods of vortex shedding later than x̃^(M′). We use Algorithms 3 and 4, which we refer to as a linear dictionary (obtained using DMD) and a non-linear dictionary (obtained using kEDMD), respectively. For the linear dictionary we use N = 200 functions; for the non-linear dictionary, we use N = 400 functions. We use a larger number of functions for the non-linear dictionary since we found this choice of dictionary to be better conditioned than the linear dictionary. Similar results and the same conclusions as ours are obtained using different choices of M′, M and N. Fig. 3 shows pseudospectral contours computed using Algorithm 2, i.e., level sets of the minimal residual in (3.5). The circular contours show excellent agreement with the distance to the spectrum, which is the unit circle in this example. The spectrum corresponds to the closure of the set of eigenvalues, which wrap around the circle. The eigenvalues of the finite N × N Galerkin matrix K in each case are shown as red dots. The linear dictionary demonstrates spectral pollution, i.e., "spurious modes", which are easily detected by ResDMD (e.g., Algorithm 1). Fig. 4 shows the convergence of the minimal residual at λ = −0.5 (see (3.5)) as we vary M and N. As expected, we see convergence as M → ∞. Moreover, the minimal residual at λ = −0.5 decreases and converges monotonically down to ‖(K + 0.5)^(−1)‖^(−1) = 0.5 as N increases. Fig. 5 shows the eigenvalues of the finite N × N Galerkin matrix K as a phase–residual plot. Some of the eigenvalues computed using the linear dictionary have very small relative residuals of approximately 10^(−8). Due to the squaring involved in computing the relative residual in (3.2), these small residuals are effectively at the level of machine precision. The linear dictionary also exhibits severe spectral pollution. In contrast, the non-linear dictionary captures the lattice structure of the eigenvalues much better. We have labelled the different branches of the phase as the eigenvalues (powers of the fundamental eigenvalues) wrap around the unit circle.
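A minimal sketch of how residuals can be attached to EDMD eigenpairs in the spirit of ResDMD is given below. The matrix formula is our paraphrase of the residual used in the paper; exact conventions in (3.2) and (3.5) may differ, and the rotation demo is an assumed toy system.

```python
import numpy as np

def resdmd_residuals(PsiX, PsiY, W=None):
    """EDMD Galerkin matrix plus residuals for each computed eigenpair.
    PsiX/PsiY hold the dictionary evaluated at the snapshots x_m and
    y_m = F(x_m) (rows = snapshots). Residual formula (our paraphrase):
    res(lam, g)^2 = g*(L - lam A* - conj(lam) A + |lam|^2 G)g / (g* G g)."""
    M = PsiX.shape[0]
    if W is None:
        W = np.full(M, 1.0 / M)                   # uniform quadrature weights
    G = PsiX.conj().T @ (W[:, None] * PsiX)
    A = PsiX.conj().T @ (W[:, None] * PsiY)
    L = PsiY.conj().T @ (W[:, None] * PsiY)
    K = np.linalg.solve(G, A)                     # EDMD matrix
    lam, V = np.linalg.eig(K)
    res = np.empty(len(lam))
    for j in range(len(lam)):
        l, gj = lam[j], V[:, j]
        num = gj.conj() @ (L - l * A.conj().T - np.conj(l) * A
                           + abs(l) ** 2 * G) @ gj
        res[j] = np.sqrt(max(num.real, 0.0) / (gj.conj() @ G @ gj).real)
    return lam, res

# demo: planar rotation y = R x with the coordinate dictionary; the true
# eigenvalues e^{+-i theta} lie on the unit circle and the residuals of
# the computed eigenpairs vanish to rounding error
theta = 0.4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = np.random.default_rng(0).standard_normal((500, 2))
Y = X @ R.T
lam, res = resdmd_residuals(X, Y)
```

Spurious eigenvalues of a poorly chosen dictionary would show up here as eigenpairs with O(1) residuals, which is exactly the detection mechanism described above.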
To investigate these points further, Fig. 6 plots the errors of the eigenvalues, the minimal residuals, and the errors of the associated eigenspaces. For each choice of dictionary, we first compute a reference eigenvalue λ_1 ≈ 0.9676 + 0.2525i corresponding to the first mode, and the corresponding eigenfunction φ_1. For each computed eigenvalue λ, we compare λ with λ_1. The computed residual satisfies res(λ, φ) ≥ |λ − λ_1|, confirming that Algorithm 1 provides error bounds on the computed spectra. We evaluate φ_1 and φ at the points {x̂^(m)}, and the "eigenspace error" corresponds to the subspace angle between the linear spans of the resulting vectors (Colbrook & Hansen 2019, Eq. (1.4)). The linear dictionary does an excellent job of capturing the lower spectral content, up to about the 35th mode, but is unable to capture the strongly non-linear eigenfunctions. In contrast, the non-linear dictionary does a much better job of capturing the higher-order spectral information. For this problem, only a few Koopman modes are needed to reconstruct the flow. However, for some problems, having non-linear functions in the dictionary is essential to capture the dynamics (e.g., see Brunton et al. (2016b), and our examples in Sections 7 and 8). One is completely free to choose the dictionary used in ResDMD. For example, one could use a mixture of the DMD- and kEDMD-computed dictionaries, or other methods entirely. Fig. 7 and Fig. 8 show examples of Koopman modes (see (2.8)) for one component of the velocity, computed using the linear and non-linear dictionaries, respectively. To compare the two dictionaries, each Koopman eigenfunction was normalised so that the vector of its weighted values at the snapshots has norm 1. Modes 1 and 2 show excellent agreement between the two choices of dictionary, and can also be compared to Taira et al. (2020, Figure 3). However, for the higher mode (Mode 20), the two results bear little similarity.
The Koopman modes correspond to the vector-valued expansion coefficients of the state in the approximate eigenfunctions, as opposed to the eigenfunctions themselves. Thus, the difference indicates that for high-order modes, non-linearity becomes important in this expansion. The residual measure in the ResDMD algorithm assures us that the modes arising from the non-linear dictionary are physical features of the flow and not spurious. ResDMD therefore brings certainty to the high-order modal analysis of this classic example, a certainty that can be lacking in prior DMD or EDMD approaches.

6. Example II: Embedded shear layer in a turbulent boundary layer flow
We now use our method to analyse a high Reynolds number turbulent flow with fine temporal resolution. We consider the turbulence structure within a turbulent boundary layer over a flat plate, both with and without an embedded shear profile. The shear profile is achieved by injecting a steady flow through a section of the plate. Turbulent boundary layer flow is a challenging test case for DMD algorithms, particularly when assessing potential non-linear flow behaviours and the wide range of length scales present in a high Reynolds number flow. The embedded shear layer, on the other hand, is anticipated to have a set of characteristic length scales with a broadband energy distribution.
6.1. Experimental setup
We consider two cases of turbulent boundary layer flow: a canonical (baseline) case and one with an embedded shear layer. Both cases are generated experimentally and detailed in Szőke et al. (2020). For clarity, we briefly recall the key features and illustrate the experimental setup in Fig. 9. The baseline case consists of a canonical, zero-pressure-gradient turbulent boundary layer (TBL) passing over a flat plate (no flow control turned on). The friction Reynolds number is Re_τ = 1400. In the experiments, the development of a shear layer is triggered using perpendicular flow injection (active flow control) through a finely perforated sheet (see Fig. 9). The velocity field is sensed at several positions downstream of the flow control area by traversing a hot-wire probe across the entire boundary layer. The data gathered using the hot-wire sensor provides a fine temporal description of the flow. For a more detailed description of the experimental setup and the flow field, see Szőke et al. (2020). A wide range of downstream positions and flow control severity values were considered in the original study, with the purpose of assessing the effects of flow control on trailing-edge noise. Here, the same data is used, but we focus our attention on one streamwise position (labelled BL3 in Fig. 9) and consider a flow injection case where a shear layer develops as a result of perpendicular flow injection.

6.2. Results
The fine temporal resolution of the flow field enables us to assess the spectral properties of the flow. We use Algorithm 5 to calculate the spectral measures of the flow and compare our results to the power spectral density (PSD) obtained using Welch's estimate, as described in Szőke et al. (2020). Note that for this example, DMD-type methods cannot robustly approximate the Koopman operator of the underlying dynamical system since data is only collected along a single line. However, Algorithm 5 can still be used to compute the spectral measures of the Koopman operator. As a first numerical test, we compute spectral measures of the data collected at y/δ_0 = 0.1. We then integrate this spectral measure against the test function h(θ) = exp(sin(θ)). Rather than being a physical example, this is chosen simply to demonstrate the convergence of our method. We consider the integral computed using the power spectral density (4.9) with a window size of N_ac for direct comparison, and Algorithm 5 for the choices of filter φ_hat(x) = 1 − |x|, φ_cos(x) = (1 + cos(πx))/2, φ_four(x) = 1 − |x|⁴(−20|x|³ + 70|x|² − 84|x| + 35) and φ_bump(x) = exp(−(2/(1 − |x|)) exp(−c/|x|⁴)) (c ≈ 0.109550455106347). These filters are first, second, fourth and infinite order, respectively. Fig. 10 shows the results, where we see the expected rates of convergence of Algorithm 5. In contrast, the PSD approximation initially appears to converge at a second-order rate, and then stagnates. Fig. 11 compares the spectral measure, as found using Algorithm 5, to the power spectral density (4.9) obtained from direct processing of the experimental data, where we have used a window size of N_ac for direct comparison. To compare directly with the PSD, we use Algorithm 5 with the second-order filter φ_cos. A range of different vertical locations y/δ_0 is considered, where δ_0 is the boundary layer thickness of the baseline case (i.e., with no flow injection).
Whilst the high-frequency behaviour is almost identical between the two methods, at low frequencies (< 10 Hz) the spectral measure returns values approximately 1 dB greater than the PSD processing, because the conventional PSD calculation suffers broadening at low frequencies. This is confirmed in Fig. 12, which shows the low-frequency values for N_ac = 4000 and various choices of filter function. In general, higher-order filters lead to a sharper peak at low frequencies. As we are assured that the spectral measure rigorously converges, this new method provides the more accurate measure of the power at low frequencies as N_ac → ∞.
Figure 10: Relative error in integration against the test function h(θ) = exp(sin(θ)) for data collected at y/δ_0 = 0.1. Algorithm 5 converges at the expected rates for various filter functions. In contrast, the PSD approximation appears to initially converge at a second-order rate, but then stagnates.
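The four filter functions used in this example (first, second, fourth and infinite order) can be written down directly. The exact formulas are our reconstruction, and the clipping in the bump filter, which guards against division by zero at the endpoints, is our addition.

```python
import numpy as np

# Reconstructed filter functions on [-1, 1]; each satisfies
# phi(0) = 1 (no damping of the zeroth coefficient) and phi(1) = 0.
def phi_hat(x):                                  # first order
    return 1 - np.abs(x)

def phi_cos(x):                                  # second order (Hann-type)
    return 0.5 * (1 + np.cos(np.pi * x))

def phi_four(x):                                 # fourth order
    a = np.abs(x)
    return 1 - a**4 * (-20 * a**3 + 70 * a**2 - 84 * a + 35)

def phi_bump(x, c=0.109550455106347):            # infinite order
    a = np.clip(np.abs(x), 1e-12, 1 - 1e-12)     # avoid 0/0 at the endpoints
    return np.exp(-(2 / (1 - a)) * np.exp(-c / a**4))
```

Higher-order filters damp the trailing autocorrelations more smoothly, which is what produces the faster convergence rates seen in Fig. 10.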

7. Example III: Wall-jet boundary layer flow
As a further example of turbulent boundary layer flow, we now consider a wall-jet boundary layer (Gersten 2015; George et al. 2000; Kleinfelter et al. 2019). Whilst hot-wire probes enable a very fine temporal resolution of the flow field, they are usually restricted to a single point or line in space, and thus preclude the use of many DMD-type methods. On the other hand, time-resolved (TR) particle image velocimetry (PIV) offers both a spatial and a temporal description of the flow. For this example, we assess the performance of the ResDMD algorithm on a set of TR-PIV data. We consider the boundary layer generated by a thin jet (h = 12.7 mm) injecting air onto a smooth flat wall. As in Example II, this case is challenging for regular DMD approaches due to the multiple turbulent scales expected within the boundary layer. This section demonstrates the use of ResDMD for a high Reynolds number, turbulent, complex flow field.
7.1. Experimental setup
Experiments using TR-PIV are performed at the Wall Jet Wind Tunnel of Virginia Tech, as shown schematically in Fig. 13. For a detailed description of the facility, we refer to Kleinfelter et al. (2019). A two-dimensional, two-component TR-PIV system is used to capture the wall-jet flow, and the streamwise origin of the field-of-view (FOV) is x̂ = 1282.7 mm downstream of the wall-jet nozzle. We use a jet velocity of U_j = 50 m/s, corresponding to a jet Reynolds number of Re_j = U_j h/ν = 63.5 × 10³. The length and height of the FOV are approximately 75 mm × 45 mm, and the spatial resolution of the velocity vector field is 0.25 mm. The high-speed cameras are operated in double-frame mode at a rate of 12,000 frame pairs per second, resulting in a fine temporal resolution of 0.083 ms.
Figure 14: Left: forecast of total kinetic energy (normalised by the time average of the kinetic energy), averaged over the 12000 initial conditions. Values closer to 1 correspond to better predictions. Right: pseudospectral contours computed using Algorithm 2 for the wall-jet example, using a non-linear dictionary. The eigenvalues of the finite Galerkin matrix K are shown as red dots. The shape of the contours reflects the transient modes. The blue curve corresponds to a fit of the form |λ| = exp(−c|θ|) to these contours and the boundary of the eigenvalues, and represents successive powers of modes.
The wall-jet flow consists of rather large, energetic flow structures. While the peak in the velocity profile is ≈ 18 mm from the wall in our case, the overall thickness of the wall-jet flow is on the order of 200 mm. Clearly, the PIV experiments must compromise between good spatial resolution and capturing the entire flow field. In our case, the FOV was not tall enough to capture the entire wall-jet flow field. For this reason, the standard DMD algorithm under-predicts the energies corresponding to the shear-layer portion of the wall-jet flow, as the corresponding length scales fall outside the limits of the FOV.

7.2. Results
We collect snapshot data of the velocity field from two separate realisations of the experiment. We use the first experiment to generate data {(x̃^(m), ỹ^(m))} with M′ = 2000 snapshots, corresponding to 121 boundary layer turnover times. This data is used to select our dictionary of functions. We use the second experiment to generate data {(x̂^(m), ŷ^(m))} with M = 12000 snapshots (a single trajectory of one second of physical flow time and 728 boundary layer turnover times), which we use to generate the ResDMD matrices, as outlined in Section 3.3.2. To demonstrate the need for non-linear functions in our dictionary, we compute the Koopman mode decomposition of the total kinetic energy of the domain using (2.7). Using this decomposition, we compute forecasts of the total energy from a given initial condition of the system. Fig. 14 (left) shows the results, where we average over the 12000 initial conditions in the data set and normalise by the true time-averaged kinetic energy. We use Algorithms 3 and 4 with N = 2000, which we refer to as a linear dictionary and a non-linear dictionary, respectively. The importance of including non-linear functions in the dictionary is clear, and corresponds to a much better approximation of the spectral content of K near 0. For the rest of this section, we therefore only use the non-linear dictionary. Fig. 14 (right) shows pseudospectral contours computed using Algorithm 2, i.e., level sets of the minimal residual in (3.5). Unlike the example in Section 5, the contours are not circular. Instead, they appear to be centered around a curve of the form |λ| = exp(−c|θ|) (shown in blue in the plot), corresponding to successive powers of transient modes. This is reflected in the eigenvalues of the finite N × N Galerkin matrix K, shown as red dots, some of which correspond to spectral pollution.
The eigenvalues of non-normal matrices can be severely unstable to perturbations, particularly for large N, so we checked the computation of the eigenvalues of K by comparing with extended-precision arithmetic; we estimate a bound of approximately 10^(−10) on the error in Fig. 14 (right).
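The kinetic-energy forecasts above use the Koopman mode decomposition. A toy sketch, under an assumed column convention ψ(x_{n+1}) = K ψ(x_n) for the vector of dictionary values, is:

```python
import numpy as np

def kmd_forecast(K, psi0, steps):
    """Forecast via the Koopman mode decomposition: diagonalise the
    Galerkin matrix K and evolve each eigen-coefficient by lam^n.
    Column n of the output is the forecast of the dictionary values
    at step n."""
    lam, V = np.linalg.eig(K)
    c = np.linalg.solve(V, psi0)                  # coefficients in eigenbasis
    n = np.arange(steps + 1)
    return V @ (c[:, None] * lam[:, None] ** n)

# demo: dictionary = coordinates for the planar rotation x -> R x,
# so K = R and the forecast must reproduce the exact iterates
theta = 0.4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
x0 = np.array([1.0, 0.5])
pred = kmd_forecast(R, x0, 5)
```

Truncating the eigendecomposition to a subset of modes (e.g., those passing a residual test) turns this same expansion into the compressed forecasts discussed later.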
To investigate the Koopman modes, we compute the ResDMD Koopman mode decomposition corresponding to (3.4) with the error tolerance ε = 0.5 to remove the most severe spectral pollution. The total number of modes used is 656. Fig. 15 illustrates a range of Koopman modes that are long-lasting (left-hand column) and transient (right-hand column). Owing to the residual measures, we are able to accurately select physical transient modes. Within each figure, the arrows indicate the unsteady fluid structure (computed from the Koopman modes of the velocity fields), with the magnitude of each arrow indicating the local flow speed, and the colourbar denotes the Koopman mode of the velocity magnitude. The corresponding approximate eigenvalue and residual bound are provided for each mode.
The modes in the left column of Fig. 15 illustrate the range of rolling eddies within the boundary layer, with the smaller structures containing less energy than the largest structures. Interestingly, the third mode in the left column resembles the shape of ejection-like motions within the boundary layer flow (y/δ < 1), while larger-scale structures above the boundary layer (y/δ > 1) are also visible. This may be interpreted as a non-linear interaction in the turbulent flow field, which is efficiently captured using the ResDMD algorithm. The transient modes in the right column of Fig. 15 show a richer structure. Based on our analysis, we interpret these modes as transient, short-lived behaviour of turbulence. The uppermost panel may be seen as the shear layer travelling over the boundary layer (y/δ > 1), with the following panel potentially showing the breakdown of this transient structure into smaller structures. The third panel may be seen as an interaction between an ejection-type vortex and the shear layer; note the ejection-like shape of the negative contours below y/δ = 1.5, with a height-invariant positive island of contour at y/δ ≈ 1.75. Finally, the bottom-most panel could be seen as a flow uplift out of the boundary layer and further turbulent streaks with counter-rotating properties.
Using the autocorrelation functions of both the streamwise (u) and wall-normal (v) velocity components and Algorithm 5, we obtain the streamwise wavenumber spectra of the velocity fluctuations, shown in the top row of Fig. 16; the corresponding streamwise-averaged spectral measures are also shown in Fig. 16. When observing either the wavenumber spectra or the spectral measures of the streamwise velocity component, we see consistent levels of spectra across the different boundary layer heights, which is due to the dominance of the strong shear layer in the wall-jet flow. For the vertical velocity spectra, as one might expect, more energy is observed in both the wavenumber spectra and the spectral measures with increasing y/δ.

8. Example IV: Acoustic signature of laser-induced plasma
So far, our examples of ResDMD have focused on fluid flows. However, the capability of ResDMD to capture non-linear mechanics can be applied more broadly. Moreover, the computation of residuals allows an efficient compression of the Koopman mode decomposition through (3.4). As our final example, we demonstrate the use of ResDMD on an acoustic example where the sound source of interest exhibits highly non-linear properties.

8.1. Experimental setup
We investigate a near-ideal acoustic monopole source that is generated using the laser-optical setup illustrated in Fig. 17. When a high-energy laser beam is focused to a point, the air ionises and plasma is generated due to the extremely high electromagnetic energy density (on the order of 10¹² W/m²). As a result of the sudden deposition of energy, the volume of air undergoes a rapid expansion that generates a shockwave. The initial propagation characteristics can be modelled using von Neumann's point strong explosion theory (Von Neumann 1963), which was originally developed for nuclear explosion modelling. For our ResDMD analysis, we use laser-induced plasma (LIP) sound signature data measured using a 1/8-inch Bruel & Kjaer (B&K) type 4138 microphone operated using a B&K Nexus module (Szőke et al. 2022). The data from the microphone is acquired using an NI-6358 module at a sampling rate of 1.25 MS/s. With this apparatus, we can resolve the high-frequency nature of the LIP up to 100 kHz. For a detailed description of the laser-optical setup, experimental apparatus, and data processing procedures, see Szőke et al. (2022); Szőke & Devenport (2021).
The important acoustic characteristic of the LIP is that it has a short initial period of supersonic propagation, shown in the Schlieren images taken over a 15 μs window in Fig. 18. When observed from the far field, this initial supersonic propagation is seen as a non-linear characteristic, even though the wavespeed is supersonic only within a short radius of the source, namely until about 3–4 mm from the optical focal point. During the experiments, 65 realisations of LIP are captured using microphones. Each realisation of LIP is then gated in time such that only the direct propagation path of the LIP remains in the signal (Szőke & Devenport 2021). We use this gated data for our ResDMD analysis.

8.2. Results
For a positive integer d, we take the state at time n to be the vector of time-delayed pressures (p_n, p_{n+1}, ..., p_{n+d−1}), where p is the acoustic pressure. This corresponds to time-delay embedding, which is a popular method for DMD-type algorithms (Arbabi & Mezic 2017; Kamb et al. 2020; Pan & Duraisamy 2020). There is a further interpretation of d when we make future-state predictions using the Koopman mode decomposition: the value of d corresponds to the initial time interval that we use to make future-state predictions. This is shown as vertical dashed lines in the plots below. We split the data into three parts. The first 10 realisations of LIP correspond to {(x̃^(m), ỹ^(m))} and are used to train the dictionary. The next 50 realisations correspond to {(x̂^(m), ŷ^(m))} and are used to construct the ResDMD matrices. The final 5 realisations are used to test the resulting Koopman mode decomposition. We consider two choices of dictionary. The first is a linear dictionary computed using Algorithm 3. The second is the union of the linear dictionary and the dictionary computed using Algorithm 4 with N = 200. We refer to this combined dictionary as the non-linear dictionary. Fig. 19 (left) shows the results of the Koopman mode decomposition (2.8), applied to the first realisation of the experiment in the test set, with d = 10. Namely, we approximate the state as in (8.1). In particular, we test the Koopman mode decomposition on unseen data corresponding to the test set. The values to the left of the vertical dashed line correspond to the initial state. It is clear that the non-linear dictionary does a much better job of representing the non-linear behaviour of the system. While the linear dictionary can predict the positive pressure peak, it fails to predict both the magnitude and shape of the negative peak, and it also fails to capture the smaller, high-frequency oscillations following the first two large oscillations. These discrepancies between the linear and non-linear dictionary-based results also pinpoint where the non-linearity in the signal lies.
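The delay-embedding construction used for this example can be sketched as follows (names and array layout are ours):

```python
import numpy as np

def delay_embed(p, d):
    """Build delay-embedded snapshot pairs from a scalar signal p:
    the state at time n is (p_n, ..., p_{n+d-1}), and the target is
    the state one time-step later (a Hankel-style stacking)."""
    M = len(p) - d
    X = np.stack([p[m:m + d] for m in range(M)])
    Y = np.stack([p[m + 1:m + d + 1] for m in range(M)])
    return X, Y

p = np.arange(10.0)
X, Y = delay_embed(p, 3)   # X[0] = [0, 1, 2], Y[0] = [1, 2, 3]
```

Each row of X is one delay-embedded state and the matching row of Y is its image one step later, which is exactly the snapshot-pair format consumed by the DMD-type algorithms above.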
In other words, the non-linear signature of the pressure wave lies in the negative portion of the wave. Fig. 19 (right) plots the relative mean squared error, averaged over the test set, for different values of d. The non-linear dictionary achieves an average relative L² error of around 6% for d = 15. Fig. 20 (left) shows the corresponding pseudospectral contours, computed using Algorithm 2 with d = 10. We can use ResDMD to compress the representation of the dynamics by ordering the Koopman eigenvalues, eigenfunctions and modes according to their relative residual (defined in (3.2)). For a prescribed number of modes, we can alter the tolerance in (3.4) to produce a Koopman mode decomposition using the eigenfunctions with the smallest residuals. In Fig. 20 (right), we compare this to a compression based on the modulus of the eigenvalues, using 40 modes in each expansion. It is clear that ordering the eigenvalues by their residuals gives a much better compression of the dynamics. To investigate this further, Fig. 21 shows the error curves of the two different compressions for various dictionary sizes and choices of d. This suggests that ResDMD may be effective in the construction of reduced-order models.
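The two orderings can be contrasted on hypothetical numbers (illustrative values only, not data from the experiment): a mode with large modulus but a large residual is spurious, and residual ordering excludes it while modulus ordering keeps it.

```python
import numpy as np

# Hypothetical eigenvalues and ResDMD residuals: mode 2 has the largest
# modulus but an O(1) residual, i.e. it is spurious.
lam = np.array([0.99 * np.exp(0.3j), 0.95 * np.exp(0.9j), 1.02 + 0.0j])
res = np.array([1e-3, 5e-3, 0.4])

k = 2
keep_by_residual = np.argsort(res)[:k]           # verified modes first
keep_by_modulus = np.argsort(-np.abs(lam))[:k]   # traditional ordering
```

Truncating the Koopman mode decomposition to `keep_by_residual` is the compression strategy described above: the retained expansion is built only from eigenpairs whose residuals certify them.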

Declaration of interests.
The authors report no conflict of interest.

Appendix A. Spectral measures of Koopman operators
Any normal matrix A ∈ C^(N×N), i.e., one satisfying A*A = AA*, has an orthonormal basis of eigenvectors v_1, ..., v_N for C^N, leading to the decomposition (A 1). Two issues arise when generalising (A 1) to the infinite-dimensional operator K. First, K need not be unitary; however, it can be extended to a unitary operator K′ on a larger Hilbert space H. Even though such an extension is not unique, it allows us to understand the spectral information of K. The spectral measures discussed in this paper are canonical and do not depend on the choice of extension. Second, if K′ has non-empty continuous spectrum, then the eigenvectors of K′ do not form a basis for H or diagonalise K′. Instead, the projections v_j v_j* in (A 1) can be replaced by a projection-valued measure E supported on the spectrum of K′ (Reed & Simon 1980, Thm. VIII.6). K′ is unitary, and hence σ(K′) lies on the unit circle T. The measure E assigns an orthogonal projector to each Borel measurable subset of T, as in (A 2). Analogous to (A 1), E decomposes H and diagonalises the operator K′. For example, if S ⊂ T contains only discrete eigenvalues of K′ and no other types of spectra, then E(S) is simply the spectral projector onto the invariant subspace spanned by the corresponding eigenfunctions. More generally, E decomposes elements of H along the discrete and continuous spectrum of K′ (see (4.1)).
Given an observable g ∈ L²(Ω, ω) ⊂ H of interest, the spectral measure of K′ with respect to g is a scalar measure defined as ν_g(S) := ⟨E(S)g, g⟩, where S ⊂ T is a Borel measurable set. It is convenient to equivalently consider the corresponding measure defined on the periodic interval [−π, π]_per after the change of variables λ = exp(iθ).
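A finite-dimensional illustration of these definitions can be built with a unitary matrix of known eigendecomposition (all constructions here are our own toy example): the spectral measure with respect to g is then a sum of atoms of weight |⟨v_j, g⟩|² at the eigenvalue angles, and its moments reproduce ⟨g, U^n g⟩ as in (4.5).

```python
import numpy as np

# build a unitary U = Q diag(e^{i theta_j}) Q* with known spectrum
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthonormal eigenvectors
thetas = np.array([0.5, 1.2, -2.0, 3.0])
U = Q @ np.diag(np.exp(1j * thetas)) @ Q.conj().T

g = rng.standard_normal(4)
g = g / np.linalg.norm(g)

# spectral measure of U w.r.t. g: atoms of weight |<v_j, g>|^2 at theta_j;
# since ||g|| = 1 the weights sum to 1 (a probability measure)
w = np.abs(Q.conj().T @ g) ** 2

# moment check: <g, U^n g> = sum_j w_j e^{i n theta_j}
n = 3
lhs = g.conj() @ np.linalg.matrix_power(U, n) @ g
rhs = np.sum(w * np.exp(1j * n * thetas))
```

In the infinite-dimensional setting the finitely many atoms are replaced by the projection-valued measure E, but the two identities checked here (total mass and moments) carry over verbatim.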