Cluster-based network model of an incompressible mixing layer

We propose an automatable data-driven methodology for robust nonlinear reduced-order modeling from time-resolved snapshot data. In the kinematical coarse-graining, the snapshots are clustered into few centroids representable for the whole ensemble. The dynamics is conceptualized as a directed network where the centroids represent nodes and the directed edges denote possible finite-time transitions. The transition probabilities and times are inferred from the snapshot data. The resulting cluster-based network model constitutes a deterministic-stochastic grey-box model resolving the coherent-structure evolution. This model is motivated by limit-cycle dynamics, illustrated for the chaotic Lorenz attractor and successfully demonstrated for the laminar two-dimensional mixing layer featuring Kelvin-Helmholtz vortices and vortex pairing. Cluster-based network modeling opens a promising new avenue with unique advantages over other model-order reductions based on clustering or proper orthogonal decomposition.


Introduction
We propose a cluster-based network model (CNM) from time-resolved snapshot data exemplified for a laminar mixing layer. The goal is purely data-driven reduced-order modeling trading the physical insights from first principles, e.g., the Galerkin method (see, e.g. Holmes et al. 2012), with simplicity, robustness and closeness to the original data.
The mixing layer is an archetypical flow configuration associated with many academic and industrial applications. The flow is discussed virtually in any textbook of fluid mechanics. In the early stage, the laminar mixing layer gives rise to periodic, spatially growing Kelvin-Helmholtz vortices as described in stability theory (Michalke 1964), by vortex models (Hama 1962) or by a Proper Orthogonal Decomposition (POD) Galerkin model . At later stages, multiple vortex pairings induce the inverse cascade to lower wavenumbers and frequencies (Coats 1997). In addition, three-dimensional instabilities enrich the coherent structures by rib vortices and spanwise waviness (see, Thereafter, each snapshot has a cluster-affiliation k(t m ) being the index of the closest centroid.
Cluster-based Markov models (CMM) describe the evolution of the population of these clusters. The solution of CMM quickly converges against the asymptotic probability distribution. The proposed cluster-based network model (CNM) resolves the dynamic transitions between the clusters by a deterministic-stochastic network.

Cluster-based modeling
In this section, we propose a novel cluster-based reduced-order model (ROM) for the coherent structure dynamics starting at the time-resolved snapshots. In § 2.1 and § 2.2 clustering and cluster-based Markov models (CMM) are recapitulated. Section 2.3 proposes a novel data-driven dynamic network resolving the transition dynamics between the clusters. In § 2.4, the time-discrete cluster-based ROM is enhanced for a continuoustime velocity prediction. The model validation includes the autocorrelation function of the flow as discussed in § 2.5. Figure 1 previews the methodology and will be explained later in the section. The relative advantages of CMM and CNM are illustrated for the Lorenz attractor in § 2.6.

Clustering as coarse-graining
We consider velocity fields in a steady domain Ω which may be obtained from experiments or from numerical simulations. Starting point is an ensemble of M statistically representative, time-resolved snapshots as employed for Cluster-based Reduced-Order Modeling (CROM) , Dynamic Mode Decomposition (DMD) Schmid 2010) or Proper Orthogonal Decomposition (POD) (see, e.g., Holmes et al. 2012). The velocity field is equidistantly sampled with time step ∆t, i.e. the m-th instant reads t m = m∆t. The corresponding snapshot velocity field is denoted by u m (x) := u (x, t m ), m = 1, . . . , M .
In CROM, the M snapshots u m (x) are coarse-grained into K clusters represented by the centroids c k (x), k = 1, . . . , K using the unsupervised k-means++ algorithm (Steinhaus 1956;MacQueen 1967;Lloyd 1982). The corresponding cluster-affiliation function maps a velocity field u to the index of the closest centroid, k(u) = arg min i u − c i Ω . (2.1)

This function defines cluster regions as Voronoi cells around the centroids
This function can also be employed to map a snapshot index m to the representative cluster index k(m) := k (u m ). Alternatively, the characteristic function describes if the mth snapshot is affiliated with the lth centroid. The latter two quantities are equivalent. The performance of a set of centroids {c k } K k=1 with respect to a given set of snapshots {u m } M m=1 is measured by the average variance of the snapshots with respect to their closest centroid. The corresponding inner-cluster variance reads The optimal centroids {c k } K k=1 minimize this inner-cluster variance, The argument is indeterminate with respect to a re-ordering. For CROM, we chose the first cluster as the one with the highest population, i.e. the largest number of associated snapshots. The (k + 1)th cluster, k > 1, has the largest transition probability from the kth one. The optimization problem (2.5) is solved by the k-means++ algorithm. The K centroids are initialized randomly and then iterated until convergence is reached or when the variance J is small enough. k-means++ repeats the clustering process 30 times and take the best set of centroids.
The number of snapshots n k in cluster k is given by The centroids are the mean velocity field of all snapshots in the corresponding cluster.
In other words, In the following centroid visualizations, we accentuate the vortical structures by displaying the fluctuations c k − u around the snapshot mean u and not the full velocity field c k .

Cluster-based Markov model (CMM)
We briefly recapitulate CMM by Kaiser et al. (2014) as our benchmark cluster-based reduced-order model. In CMM, the state variable is the cluster population p = (p 1 , . . . , p K ) T , where p i represents the probability to be in cluster i and the superscript T denotes the transpose. The transition between clusters in a given time step ∆t c is described by the transition matrix P = (P ij ) ∈ R K×K . The superscript 'c' refers to cluster-based model. Here, P ij is the transition probability to move from cluster j to cluster i. Let p l be the probability vector at time t l = l∆t c , then the change in one time step is described by With increasing iterations, the iteration (2.8) converges to the asymptotic probability p ∞ := lim l→∞ p l . In a typical case, (2.8) has a single fixed point p ∞ . For completeness, a continuous form of Markov models with new transition matrix P c is mentioned: dp dt = P c p. (2.9) From the time-continuous form (2.9), the time-discrete one (2.8) can be derived. The opposite is not generally true. In the following, no continuous Markov models are employed.
A CMM of the time-resolved snapshots starts with cluster affiliation (2.1) which can also be considered function of time k(t). We refer to the original paper for the determination of P ij from k(t). The time step ∆t c is a critical design parameter for CMM. A good choice is a value where the transition from one cluster to the next is likely. If the time step is too small, the Markov model idles many times in each cluster for a stochastic number of times before transitioning to the next cluster. The model-based transition time may thus significantly deviate from the deterministic data-driven trajectories through the clusters. If the time step is too large, one may miss intermediate clusters. We choose ∆t c = T /10, where T is the characteristic period of the flow. On a circular limit cycle with uniform rotation, this value is optimal for K = 10 clusters, enforcing the transition from one cluster to the next in each time step.
In figure 2 (left column), the effect of the suboptimal time step is illustrated for the CMM of a uniform rotation u 1 = cos(2πt), u 2 = sin(2πt). Here, 4 clusters and a time step ∆t c = 1/16 are chosen. The probability to stay in the cluster during one time step is P 11 = P 22 = P 33 = P 44 = 3/4 and the transition probability to the next counterclockwise neighbour is P 14 = P 21 = P 32 = P 43 = 1/4. Thus, the probability to stay in one cluster for l steps exponentially decays, P l 11 . In contrast, the uniform rotation commands that the state is exactly three time steps in one cluster before it leaves in the fourth step. This example motivates the proposed cluster-based reduced-order model, foreshaddowed in figure 2 (right column) and explained in the following section.

Cluster-based network model (CNM)
For CMM, the time step ∆t c is, as mentioned, an important design parameter. This design parameter can be avoided by the new proposed Cluster-based Network Model (CNM). The key idea is to abandon the 'stroboscopic' view of CMM and focus on nontrivial transitions from cluster j to cluster i. These transitions are characterized by two parameters: the probability Q ij and a time-scale T ij . Evidently, no time-step is needed for the description and the assumption of a constant transition time is found to be much more aligned with shear flow modes than assuming an exponential decay of residence time. Moreover, it could be relaxed by assuming a probability distribution of transition times.
In the following, the transition probability and transition time are inferred from the cluster affiliation function k(t). The continuous form is convenient for discussion. The time discrete affiliation function k(m) can be made continuous by taking the cluster of the snapshot which is closest in time, The nth transition time t n of the cluster affiliation is recursively defined as the first    discontinuity of k(t) for t > t n−1 . Here, t 0 = −∞. The transition time t n satisfies for any sufficiently small positive ε. For t ∈ (t n , t n+1 ), the data-based trajectory is assumed to stay in cluster k at the averaged time (t n+1 + t n )/2 (see figure 3). The residence time in this cluster is defined by (2.11) Let j and i be the indices of the clusters after t n and t n+1 respectively. Then the transition time from j to i is defined as half of the residence time of both clusters, This definition may appear arbitrary but is the least-biased guess consistent with the available data. The sum of all residence times from a given data set add up to the total investigated time period T 0 . The direct transition probability Q ij and transition time T ij can be inferred from the data. Then, where n ij is the number of transitions from c j to c i and n j the number of transitions departing from c j regardless of the destination point, (2.14) We emphasize that n ii = 0 for i = 1, . . . , K by very definition of a direct transition. The direct transition matrix (DTM) Q = (Q ij ) ∈ R K×K lumps these probabilities into a single entity. Similarly, the direct transition time T ij from cluster j to cluster i is taken to be the average from all values. This average is symbolically denoted by These values are lumped into the matrix T = (T ij ) ∈ R K×K . It should be noted that a given trajectory may pass repeatedly through the same clusters (Voronoi cells) with different residence and transition times. With enough data this variability may be incorporated into the model. Our goal is to compare the Markov model with the most simple network model where constant (averaged) transition times are assumed.
CNM predicts the asymptotic cluster probability p ∞ i in cluster i. Let [0, T 0 ] be a sufficiently long time horizon simulated by the model. Then, the probability to stay in cluster i is the cumulative residence time normalized by the simulation time, We return to introductory example depicted in figure 2. The CNM is seen to accurately describe the uniform rotation (subfigure f) and correctly yields the asymptotic cluster probability p ∞ i = 1/4, i = 1, . . . , 4. In contrast, the prediction horizon of the CMM is limited to roughly one period. After this time, the initial condition is forgotten and the asymptotic distribution is reached-rendering CMM unsuitable for dynamic prediction. However, CMM predicts the asymptotic state faster than the CNM. For this particular example, the CMM could be made equivalent to the CNM by choosing ∆t c = 1/4. However, the Markov model will inevitably diffuse the state with a range of clustertransition times, e.g. for non-uniform rotation or more complex dynamics.

Velocity fields associated with the cluster-based reduced-order models
The CMM describes the cluster popoulation at discrete times t = l∆t c . In the following, this population is considered to be continuous in time, e.g. by using linear or higher-order interpolation. The corresponding velocity field u(x, t) at time t is defined as the expectation value, The CNM is based on centroid visits at discrete times. The clusters k 0 , k 1 , k 2 , . . . are visited at times consistent with the direct transition matrix (Q ij ) and the transition times T ij . A uniform motion is assumed between these visits. In other words, for t ∈ [t n , t n+1 ] the velocity field reads (2.20) We note that a smoother motion may be achieved with splines. The actual flow computations are based on a lossless proper orthogonal decomposition (POD), as elaborated in the appendix A. The interpolations are performed with the mode amplitudes a = (a 1 , . . . , a N ) before transcribed into velocity fields via the POD expansion.

Validation of the cluster-based reduced-order models
Following Protas et al. (2015), the cluster-based model is validated based on the computed and predicted autocorrelation function of the velocity field. The unbiased autocorrelation function reads (2.21) This function reveals the turbulent fluctuation level R(0) and the frequency spectrum. Moreover, the problem of comparing two trajectories with finite dynamic prediction horizons due to increasing phase mismatch is avoided (Pastoor et al. 2005).
In case of the CNM, the modeled autocorrelation functionR is based on the modelled velocity field (2.20). In case of the CMM, the time integration quickly leads to the average flow and is not indicative for the range of possible initial conditions. Hence, K trajectories are considered starting with p k = 1 for each cluster k, or, equivalently, p(t = 0) = (δ 1k , . . . , δ Kk ). These cluster-specific autocorrelation functions are weighted with the cluster probability p ∞ ( 2.22) 2.6. Lorenz system as an illustrating example Following the original CMM paper by Kaiser et al. (2014), the CNM is illustrated for the celebrated Lorenz (1963) system, arguably the first demonstration of chaotic dynamics in low-dimensional dynamics. The Lorenz system is a three-dimensional autonomous system of nonlinear ordinary differential equations. The derivation was inspired by a Galerkin model of Rayleigh-Benard convection, but typical selected parameters clearly exceed the range of model validity Sparrow (1982). The system features non-periodic, deterministic, dissipative dynamics associated with exponential divergence and convergence to a fractal strange attractor. The three coupled nonlinear differential equations read with the system parameters σ = 10, b = 8/3 and r = 28. For these parameters, there are three unstable fixed points at (0, 0, 0) and (± √ 72, ± √ 72, 27), denoted by F + and F − , respectively. The attractor of Lorenz system resembles two butterfly wing around F + and F − in phase space. The trajectory typically oscillates for several periods with increasing amplitude around a fixed point (F + or F − ) before it moves to the other wing. The number of revolutions made on either side varies unpredictably from one cycle to the next.
The Lorenz equations (2.23) are solved employing an explicit fourth-order Runge-Kutta scheme with an initial condition on the attractor. The time-resolved snapshots data x(t m ) with x = [x, y, z] are collected at a sampling time step ∆t = 0.005 corresponding roughly one thousands of a typical oscillation period. The k-meams ++ algorithm partitions M = 1, 000, 000 snapshots into K = 10 clusters. Figure 4(a) displays a phase portrait of the corresponding clusters. The snapshots associated with one cluster are highlighted by the same color. The 10 clusters feature three different subsets: the transition clusters k = 1, 2 between two butterfly wings, the F − wing related cluster k = 3, 4, 5, 6 and clusters k = 7, 8, 9, 10 associated with the F + wing. The wing-related cluster groups represent approximately 90 • phase bins and don't resolve the amplitude. Evidently, the 10 clusters are a coarse representations of the state.
In the following, the dynamics are resolved by the network model of § 2.3. The 10 centroids are considered as nodes in the network. The transition between these centroids define directed edges characterised by direct transition matrix Q and the flight times T . The connectivity is described by the adjacency matrix H(Q) where H denotes the Heaviside function: non-vanishing elements of Q are replaced by unity (Newman 2010). Figure 5, displays the DTM Q (subfigure (a)) and associated transition time matrix T (subfigure (b)). The matrices reveal three distinct cluster groups consistent with the phase diagram of figure 4. Clusters 1 and 2 allow transitions to 3 and 10, i.e. the F − and F + wing, respectively, and have been called flipper clusters by Kaiser et al. (2014). The cluster transition sequence 3 → 4 → 5 → 6 → 1 → 2 → 3 is the dominant cyclic group associated with the F − wing. Another cyclic groups skips cluster 2: 3 → 4 → 5 → 6 → 1 → 3. A cyclic group through the F + wing reads 10 → 9 → 8 → 7 → 1 → 10.  can vary by a large factor and can thus give rise to significant systematic errors. A more accurate transition model may, for instance, include earlier transitions for a more realistic representation of the trajectory. Intriguingly, the CMM has an error of only 0.5% which one order of magnitude lower. Due to the stroboscopic monitoring of the CMM states, no estimates of the transition times are required and one source of systematic errors is excluded by construction. A distinguishing feature of the CNM is the resolution of the temporal dynamics illustrated in figure 7. The evolution of the model-based trajectory is hardly distinguishable from the one obtained by numerical integration. Smoothness of the CNM trajectory has been achieved by splines connecting the states been two consecutive centroid visits. Yet, the oscillatory amplitude growth in both wings cannot be resolved with this low cluster-based resolution.
Finally, the autocorrelation of the simulaton (black solid curve) and the CNM (red dashed curve) is presented for aggregate comparison in figure 8. CNM roughly reproduces the fast oscillatory decay of the autocorrelation function in the first five periods.

Flow Configuration
In this section, the numerical simulation of the incompressible mixing layer is described. The flow configuration of the mixing layer and the employed direct Navier-Stokes solver is described in § 3.1. In § 3.2, the dominant flow features of the mixing layer are presented.

Mixing layer configuration and direct numerical simulation
The two-dimensional imcompressible mixing layer with velocity ratio of 3:1 is considered as the test plant in this paper. The velocity ratio is a common choice in the literature (Comte et al. 1998;Noack et al. 2005;Kaiser et al. 2014). The low-and high-speed streams have velocities U 1 and U 2 , respectively. The convection velocity U c of coherent structure is well approximated by the average velocity (Monkewitz 1988): (3.1) The initial vorticity thickness is denoted by δ 0 . The Newtonian fluid is characterized by the density ρ and kinematic viscosity ν. The flow characteristics are described by the Reynolds number based on the convection velocity Re = U c δ 0 /ν and velocity ratio. In the sequel, all quantities are assumed to be non-dimensionalized with the initial vorticity thickness δ 0 , the low-speed velocity U 1 and the density ρ. the y-axis points in the direction of the high-speed stream. The velocity components in x-and y-direction are denoted by u and v respectively. An in-house direct numerical simulation solver was employed to simulate the incompressible mixing layer. This solver is based on the Finite-Element Method (FEM) with third-order Taylor-Hood elements with implicit third-order time integration. The solver has been used for numerous configurations, like the cylinder wake , the mixing layer (Shaqarin et al. 2018), the fluidic pinball , to name only a few.

Flow features
The incompressible mixing layer exhibits two typical behaviours. First, the initial dynamics is characterized by the roll-up of vorticity originating from the Kelvin-Helmholtz (K-H) instability (see left in figure 10(a)). Second, these vortices pair further downstream as can bee seen at the outlet region of figure 10(b). This vortex pairing contributes to the mixing layer growth. The location of vortex pairing may change significantly in time. Upstream (downstream) vortex pairing is associated with high (low) fluctuation energy.
The time-averaged velocity field in figure 9 shows the mixing layer growth. The velocity thickness is visualized by red solid line and is defined as the distance between transverse locations where the mean streamwise velocity was equal to U 1 − 0.1∆U and U 2 + 0.1∆U . The mixing layer thickness increases significantly between x = 30 and x = 60. Here, vortex pairing leads to this thickness increase.
The temporal dynamics may be inferred from the evolution of the fluctuation energy in figure 10. The fluctuations indicate narrow bandwith oscillatory behaviour. More refined insights may be gained from the correlation function between the flows at time t and t , (3.4) Figure

Cluster-based reduced-order modeling of the mixing layer
In this section, the cluster-based models are applied to a two-dimensional incompressible mixing layer with Kelvin-Helmholtz vortices undergoing vortex pairing. First, the snapshots of incompressible mixing layer are coarse-grained into centroids in

Clustering
Both reduced-order models are based on the direct numerical simulation of the twodimensional incompressible mixing layer described in § 3. M = 10, 000 velocity field snapshots of the post-transient phase are sampled with a time step ∆t = 0.04.
The computational load of clustering is significantly reduced by an effectively lossless POD compression detailed in Appendix A. In fact, all operations are performed on the POD amplitude vector a = [a 1 , a 2 , ..., a N ] T instead of the snapshots.
The M snapshots are clustered with the k-means++ algorithm into K = 10 centroids. This number is small enough to allow for the physical interpretation of all centroids and all transitions but large enough for a meaningful reduced-order model. Figure 12 illustrates the transverse velocity fluctuation of the centroids. The first 7 centroids show the streamwise convection of Kelvin-Helmholtz (K-H) vortices, while the next 3 centroids resolve a vortex pairing (VP) event. In centroid 8, two vortices merge at the beginning of the vortex chain. In the following 2 centroids, the merging is completed and leads to a large vortex. Note that the VP centroids k = 8, 9, 10 have pronounced vortices at similar position as the KH centroids k = 4, 5, 6, respectively.
The centroids represent characteristic stages in the mixing layer dynamics as can be elucidated in a proximity map. This map reflects the configuration matrix D = (D ij ) ∈ R K×K comprising the distance between between two centroids: matrix D in a two-dimension feature space γ ∈ R 2 optimally preserving the relative distances. The proximity map employs Classical Multidimensional Scaling (CMDS) (Mardia et al. 1979). Figure 13(a) displays centroids close to a circle which is characteristic for vortex shedding.

Markov model
The temporal mixing-layer evolution is characterized by the cluster transition matrix P illustrated in figure 13(b). P ij represents the probability of moving from cluster j to i in one forward time step. Here, we choose a time step ∆t c = T /10 = 1 where the T = 10 is the dominant period of the evolved mixing layer. The cluster transition matrix reveals two cyclic groups. The first group 1 → 2 → 3 → 4 → 5 → 6 → 7 → 1 is consistent with the convection process of the K-H vortex shedding observed in the centroid visualization. This periodic process corresponds to a nearly uniform clockwise rotation in the proximity map. The second cyclic group 8 → 9 → 10 → 7 → 1 → 2 → 8 comprises VP centroids k = 8, 9, 10 and shares two centroids with the K-H regime. These dynamics also lead to a nearly uniform clockwise rotation in the feature space. There are also transitions from the VP to K-H regime, e.g. 8 → 4, 9 → 5, 10 → 6 and 10 → 7 and in the opposite direction. All these transitions are between similar centroids of both groups.
The evolution of the cluster population vector p l at t = l∆t c is investigated by iterating equation (2.8). Figure 14 compares the probability distribution of DNS data and the model-based asymptotic vector p ∞ . The agreement is astonishingly good for such a loworder model. The probability vector converges quickly to a unique, stationary probability distribution near t = 20.
In figure 15, the dynamics of CMM is illustrated for the first cluster probability p 1 and the first POD mode amplitude a 1 inferred from the flow state (2.18). Starting point is direct numerical simulation starting at t = 0 close to the first cluster c 1 which corresponds to the probability vector p = (1, 0, 0, 0, 0, 0, 0, 0, 0, 0) T . The probability and POD mode amplitude of CMM show a convergence after around l = 35 iterations or, equivalently t ≈ 35. The solid horizon line denotes q 1 , i.e. the population of the first cluster from DNS data. The POD mode amplitude a 1 performs three strongly damped oscillations before vanishing. Figure 15c shows a oscillating quickly decaying autocorrelation function of CMM which is consistent with the observations for a 1 and p 1 . In contrast, the autocorrelation function associated with the DNS keeps oscillating around with an amplitude around 50% of the average fluctuation level. This level indicates that half of the fluctuation energy resides in repeating oscillatory flow structures while the other half is of non-repeating stochastic nature.

Network model
In this section, a Cluster-based Network Model (CNM) is developed using the same snapshot data and same centroids. Starting point for the dynamic network is the cluster affiliation function k(t). Following § 2.3, the direct cluster transition matrix Q with associated average transition times T are derived. Figure 16 illustrates both matrices. These matrices have the almost same structure as the Markov model except for the diagonal

2b)
H being again the Heaviside function. Vanishing diagonal elements (4.2a) arise from the requirement of non-trivial transitions. Theoretically, the trajectory may terminate in a cluster, like in a stable fixed point of a linear dynamical system. This case is not compatible with the goal to model a well-resolved non-trivial attractor and shall be ignored in this study. Equation (4.2b) requires a sufficiently small time-step of the CMM. Otherwise, the stroboscopic view on the trajectory may miss a crossing of an intermediate cluster. This happens with the transition from 1 → 2 → 3 in one CMM time step ∆t c . Hence, P 31 = 0 while Q 31 = 0. However, this is a rare event as indicated by the small value of Q 31 . An inspection of T reveals that the transition time between K-H and VP centroids is relatively small. This is consistent with the closeness of the corresponding centroids in the proximity map ( figure 13(a)). An exception is the transition between K-H centroid 2 to VP centroid 8 which are well separated in the proximity map. Intriguingly, the transitions within the K-H and VP regime are also strongly correlated with the distances depicted in the proximity map. For instance, the smallest (largest) inner-regime transition from centroid 6 to 7 (8 to 9) is associated with a small (large) distance in the proximity map. In the following, the temporal dynamics of CNM is investigated based on the identified centroids c k , the description of their connectivity DTM Q and their flight times T . Like a POD model, CNM is a grey-box model resolving the temporal dynamics and the associated coherent structures. We choose cluster k = 1 as initial condition for DNS and for the CNM and integrate over l = 20, 000 transitions. In figure 17, the asymptotic cluster population p ∞ from equation (2.16) is compared with q from the DNS.  Like for the Lorenz system, the temporal evolution is smoothed by a spline and does not use the non-smooth uniform motion between two consecutive centroid visits. Figure 19 compares the autocorrelation function of the CNM and the DNS. Intriguingly, the model-based fluctuation level at vanishing time delay is significantly lower than the DNS value but becomes similar to oscillation level for arbitrary larger time horizon. The difference between DNS fluctuation level exhibited at vanishing time delay and the CNM-based one is the unresolved inner-cluster variance. The asymptotic fluctuation level represents coherent structures which are resolved by the chosen centroids and serve as coarse-grained recurrence points of the DNS. The good reproduction autocorrelation function is a posteriori justification for the chosen cluster number.

Conclusions
In the present study, we propose a new data-driven methodology for nonlinear dynamical systems. We trade compatibility with first principles, like with a POD-based Galerkin model, with simplicity and robustness of the modeling. Point of departure is the cluster-based Markov model ) for time-resolved snapshot data. The snapshots are coarse-grained into few representative centroids. The temporal evolution of the state is conceptualized as straight constant velocity movement from one centroid to the next. The average flight time and the transition probabilities are inferred from the data. Thus, the dynamics is modeled by a deterministic-stochastic network model with the centroids as nodes, the straight trajectory segments as edges, the transition time as parameters of the edges and the transition probability characterizing the nodes.
The resulting cluster-based network model (CNM) has several desirable features: (1) The methodology is simple and automatable. (2) The off-line computational load is only slightly larger than a snapshot-based proper-orthogonal decomposition (POD). After the computation of the POD, the clustering and network model requires a tiny fraction of the computational operation. (3) The CNM has the same recurrence properties as the original data: If one cluster is visited multiple times in the data, it will also be a recurrence point of the CNM. (4) Long-term integration will never lead to a divergence-unlike POD models. (5) The framework is very flexible allowing, for instance, to incorporate multiple operating conditions. The simplicity and robustness has a price. On the kinematic side, the vanilla version of CNM does not have the possibility to extrapolate the data, e.g., resolve oscillations at higher amplitudes not contained in the data. On the dynamic side, we lose the relationship to first principles: The network model is purely inferred from the snapshot data, without links to the Navier-Stokes equations. Subsequent generalizations need to overcome these restrictions.
CNM is applied to the Lorenz attractor. A k-means++ algorithm yields 10 centroids from a long time-resolved solution. 4 centroids represent each ear of the attractor and 2 the switching area. Despite the coarseness of the presentation, the temporal dynamics mimics well the oscillations in each ear and the switching between both ears. The agreement is mirrored by the similarity between the autocorrelation functions of the simulation and the CNM. Statistically, the cluster population is predicted with acceptable accuracy. The CNM dramatically outperforms the cluster-based Markov models (CMM)  in terms of predicting the temporal evolution. In contrast, CMM is more accurate for the cluster population. The error source of the CNM can be traced back to the chosen simple model of transition times.
The main demonstration of CNM is performed for a laminar two-dimensional mixing layer featuring Kelvin-Helmholtz (K-H) vortices and occasional vortex pairing. Again, the snapshots are coarse-grained into 10 centroids. One group of centroids can be associated to K-H vortices and a second group to vortex pairing-similar to Kaiser et al. (2014). The CNM resolves well the temporal evolution of vortex formation and pairing, the fluctuation level, the autocorrelation function, and the cluster population.
CNM are found to have a distinct advantage over the departure point, CMM, namely the much longer prediction horizon as evidenced for the autocorrelation function. POD and DMD models may describe the same flow with similar number of modes (Protas et al. 2015). We emphasize that the construction of the CNM could be fully automated in a software package. In contrast, data-driven nonlinear Galerkin models may be designed as insightful least-order representations with interpretable modes. Moreover, the Galerkin dynamics may reveal the interplay between linear and nonlinear terms, as beautifully displayed in mean-field theory (Stuart 1971), self-consistent models (Mantič-Lugo et al. 2014), resolvent operator approaches (Gomez et al. 2016), finite-time thermodynamics (Noack et al. 2008) and criteria for boundedness (Schlegel & Noack 2015). Yet, a functional model requires the careful choice of flow data, potentially shift and other nonstandard modes, subscale closure models and calibration techniques. Thus, cluster-based and POD based models have different niche applications.
CNM opens a novel automatable avenue for nonlinear dynamical modeling and provides a framework for estimation and model-based control (Fernex et al. 2019). The authors actively pursue this direction. number of repetitions then the total number of integrals is L × I × K × M . Typical values are K ∼ 10, I ∼ 10K and L ∼ 100.
The computational load can be significantly reduced by pre-processing the snapshot data with a lossless POD. The most expensive step of a typical snapshot POD is the computation of the correlation matrix with M × (M + 1)/2 area/volume integrals. Thus, the integral for the distance between two velocity fields transforms into the Euclidean norm with (M − 1)-dimensional vectors of POD mode amplitudes. The computational saving reads M × (M + 1)/2 With typical values, the savings are one or two orders of magnitudes. For completeness and self-consistency, the snapshot POD algorithm is described. POD is performed with the whole computational domain Ω. The inner product between two velocity fields v(x), w(x) in the square-integrable Hilbert space L 2 (Ω) reads The corresponding norm is given by v The distance D between two velocity fields is based on this norm, The inner product (A 2) uniquely defines the snapshot POD (see, e.g., Holmes et al. 2012). The mth snapshot is represented by where u 0 denotes the mean flow, u i the ith POD mode and a m i the POD mode amplitude corresponding to the mth snapshot. It may be noted that the maximal number of POD modes is M − 1, e.g., two snapshots define a one-dimensional line, not a plane. Let c i u i be two velocity field representations, e.g., a snapshot and a centroid. Then, their distance is given by Evidently, (A 6) is much quicker to compute than (A 4) assuming the typical case that the number of grid points is much larger than the number of snapshots.