Physics-agnostic and Physics-infused machine learning for thin films flows: modeling, and predictions from small data

Numerical simulations of multiphase flows are crucial in numerous engineering applications, but are often limited by the computationally demanding solution of the Navier-Stokes (NS) equations. Here, we present a data-driven workflow where a handful of detailed NS simulation data are leveraged into a reduced-order model for a prototypical vertically falling liquid film. We develop a physics-agnostic model for the film thickness, achieving a far better agreement with the NS solutions than the asymptotic Kuramoto-Sivashinsky (KS) equation. We also develop two variants of physics-infused models providing a form of calibration of a low-fidelity model (i.e. the KS) against a few high-fidelity NS data. Finally, predictive models for missing data are developed, for either the amplitude, or the full-field velocity and even the flow parameter from partial information. This is achieved with the so-called "Gappy Diffusion Maps", which we compare favorably to its linear counterpart, Gappy POD.


Introduction
The study of multiphase flows is often limited by the computational effort involved in solving the Navier-Stokes equations [1]. One such example, the flow of thin films of liquid on inclined planes, has fascinated researchers not only because of the wide range of industrial applications but also because of the interesting dynamics of the liquid-air interface [2]. The Navier-Stokes (NS) equations accurately describe the fluid motion and also the evolution of the surface but suffer from high computational cost [3,4]. To this end, significant effort has led to several approximate interface evolution equations that are much simpler to solve but are nevertheless valid under specific assumptions and limitations. Beyond their limits of validity, it is often found that they yield nonphysical solutions, or even blow up [2], posing significant restrictions to their applicability.
To substantially advance Computational Fluid Dynamics and break new barriers in flow control, uncertainty quantification and shape optimization, it is crucial to develop novel, robust and efficient data-driven/data-assisted models that combine physical and mathematical insight with machine learning strategies. This work presents a methodology for deriving data-driven partial differential equations (PDEs) for the film amplitude, based on a collection of NS simulation data, that are not subject to restrictions and assumptions for the flow and hence are more general. Our work falls in the category of dynamical system identification [5][6][7][8]. Recently increased interest in PDE identification has led to the development of alternative algorithmic tools, such as sparse identification of nonlinear dynamical systems using dictionaries [9,10], PDE-net [11], physics-informed neural networks [12], and others [13][14][15]. Our algorithmic approach can be implemented on data from detailed PDE simulations [16], agent-based modeling [17,18] or Lattice Boltzmann simulations [19,20], among others. Extensions of PDE identification including gray-box or closure identification (such as those explored in our work) have been studied in the context of various applications [16,17,21-26]. In the relevant literature, the Kuramoto-Sivashinsky (KS) equation, selected in this work as a low-fidelity counterpart of the NS equations, has served as a benchmark case study, due to its wealth of dynamic responses and highly nonlinear nature [5,14,27-29].
The results of the learned PDE are compared to the ground-truth NS results and also the results of the KS equation. It is expected that past a certain limit, the KS will perform poorly and produce nonphysical solutions. Yet it is still useful in the context of learning an accurate amplitude PDE, as will be shown, in two different ways. In the first, the "Gray box" model approach, an additive correction of the KS can be learned from NS data as a sort of calibration of the low-fidelity model against high-fidelity data. In this context, the data-driven model provides a measure of the discrepancy between the approximate equation and the ground truth, and serves to inform as to the actual limits of applicability of the KS in terms of the flow parameter, here the Reynolds number. In the second approach, which we call the "functional correction Gray Box model", certain observations of the KS formula, such as the value of its right-hand-side, its derivatives or even values in specific nearby time-instances or nearby points in space, are used as inputs to the learned model.
In addition, reduced representations of the NS data, including full velocity fields and fluid film height, are further exploited for out-of-sample predictions from partial data at the small data limit. Nonlinear manifold learning, specifically Diffusion Maps, and linear methods, i.e. Proper Orthogonal Decomposition (POD), are initially implemented in order to derive a low-order description of the high-dimensional data-set. It is then shown that efficient interpolation in the reduced space can help recover entire sets of data from partial information. Specifically, it will be demonstrated that full velocity profiles, parameter values and film height measurements can be predicted given a handful of values for the film height at specific locations. The advantages of "Gappy" Diffusion Maps over its linear counterpart, Gappy POD, are discussed in relation to the parsimony of the description of the manifold that contains the data and to the location of the known measurements.

Bifurcation diagram: NS vs KS
The NS data necessary for learning the amplitude equations are derived by solving the time-dependent equations, as presented in Methods, in an Eulerian frame. We consider the flow over a vertical plane, schematically presented in Fig. 1, i.e. θ = 90°, with a domain of dimensionless length l = L/H_N = 95, where L denotes the dimensional length of the domain. The value of the Weber number used in the simulations is W = 278. We start our simulations from an initial condition corresponding to a flat film perturbed by a sinusoidal perturbation with amplitude 3% of the dimensionless Nusselt film height, H = 1. The height of the film, h(x, t), is collected at each time-step until a steady travelling wave is formed.
The single-equation surrogate of the amplitude selected in this study is the Kuramoto-Sivashinsky equation, which, for flow over a vertical plane, takes the form of Eq. 1. The KS can be derived from the NS under certain assumptions, which are summarized for completeness in Methods. It is useful to briefly discuss the limitations of the KS equation before proceeding with the presentation of our results. Even though it is well known that the KS equation is valid for Reynolds number values of O(1), to the best of our knowledge there is no direct comparison in the literature of the KS to the NS results. To clearly present the limitations of the KS equation, we plot in Fig. 2 the norm of the amplitude distribution, η, with respect to both the Reynolds number and the KS parameter α.
Both equations predict very similar results for 1 ≤ R ≤ 3.4 (or 4 ≤ α ≤ 13.43). In this range of parameter values, the solution of the KS is a stationary wave (in a co-moving frame with speed c=-3), whereas the NS, solved in an Eulerian frame, evolves into a travelling wave with a steady and unchanged shape and speed c=3. Past that point (R=3.4 and α = 13.43), the speed of the wave becomes larger than 3 and hence the solution of the KS is also travelling (with speed c-3). Up to approximately R=4.3 (α = 16.5), the KS solution gradually deviates from the NS solution. For higher values of α, i.e. α > 16.998, the solution of the KS is a so-called "pulsing" wave, as described in detail in [30]. The pulsing waves oscillate between two waveforms that are π-periodic in space and are π/2 shifts of each other. Such a solution has not been reported for the NS equation.

Black box model: Learning the PDE
Here our goal is to use data from the NS simulations to learn a PDE of the general form of Eq. 2, in which the time derivative of the amplitude is given by a function f of the amplitude and its spatial derivatives. The function f is approximated by a fully connected neural network. The inputs to the neural network are the amplitude and spatial derivatives of the amplitude, as extracted from the NS simulations. Specifically, the NS model is implemented for 20 parameter values, and snapshots, i.e. time-instances of the film surface evolution, are collected in equally sized time-steps (dt = 1 is the dimensionless time unit). The spatial derivatives, up to 4th order, are computed using Fourier transforms at each point in space and time. The time derivative of the amplitude can be extracted directly from the NS code (although it can also be easily computed, e.g. with finite differences).
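As an illustration of how the network inputs could be assembled, the following minimal sketch computes spatial derivatives up to 4th order with Fourier transforms on a periodic, uniformly sampled domain. The function name `spectral_derivatives`, the grid size and the sinusoidal test snapshot are assumptions made for this example only; they are not taken from the NS code used in this work.

```python
import numpy as np

def spectral_derivatives(eta, length, max_order=4):
    """Return [d^k eta / dx^k for k = 1..max_order] on a periodic, uniform grid."""
    n = eta.size
    # Angular wavenumbers of a periodic domain of the given length.
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
    eta_hat = np.fft.fft(eta)
    # Differentiation of order m is multiplication by (i k)^m in Fourier space.
    return [np.real(np.fft.ifft((1j * k) ** m * eta_hat))
            for m in range(1, max_order + 1)]

# Hypothetical snapshot: a 3% sinusoidal perturbation on a domain of length 95.
length = 95.0
x = np.linspace(0.0, length, 128, endpoint=False)
q = 2.0 * np.pi / length
eta = 0.03 * np.sin(q * x)

eta_x, eta_xx, eta_xxx, eta_xxxx = spectral_derivatives(eta, length)
# One row per grid point: the kind of inputs a learned right-hand side would receive.
features = np.column_stack([eta, eta_x, eta_xx, eta_xxx, eta_xxxx])
```

Each row of `features` pairs the amplitude at one grid point with its local derivatives; stacking such rows over all snapshots would give one possible layout for the training set.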
This collection of data is then used to train the neural network to predict the time derivative of the amplitude from the value of the amplitude and a few spatial derivatives. Once this is done, the right-hand-side of the PDE in Eq. 2 can be used in conjunction with any method of integration in time, such as a Runge-Kutta scheme. The attractors that resulted from the integration of the learned PDE are shown in Fig. 3, for a representative selection of parameter values. The attractor of the neural network derived PDE, shown in red, is almost a perfect match with the ground-truth results of the NS (blue line).
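The time integration step can be sketched as follows with a classical fourth-order Runge-Kutta scheme. Since the trained network is not available here, a linear diffusion operator on a 2π-periodic grid plays the role of the learned right-hand side; `stand_in_rhs`, the grid and the step size are assumptions for illustration, and any callable that maps an amplitude profile to its time derivative could be passed instead.

```python
import numpy as np

def rk4_step(rhs, eta, dt):
    """Advance eta by one classical fourth-order Runge-Kutta step."""
    k1 = rhs(eta)
    k2 = rhs(eta + 0.5 * dt * k1)
    k3 = rhs(eta + 0.5 * dt * k2)
    k4 = rhs(eta + dt * k3)
    return eta + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate(rhs, eta0, dt, n_steps):
    """Integrate d(eta)/dt = rhs(eta) for n_steps steps of size dt."""
    eta = eta0.copy()
    for _ in range(n_steps):
        eta = rk4_step(rhs, eta, dt)
    return eta

# Stand-in for the learned right-hand side (an assumption, not the trained
# network): linear diffusion eta_t = eta_xx on a 2*pi-periodic grid.
n = 32
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers

def stand_in_rhs(eta):
    return np.real(np.fft.ifft(-(k ** 2) * np.fft.fft(eta)))

eta_final = integrate(stand_in_rhs, np.sin(x), dt=1e-3, n_steps=1000)
```

For this stand-in the exact solution at t = 1 is exp(-1) sin(x), which gives a simple check of the integrator before a learned model is plugged in.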
For reference and comparison, the KS results (appropriately rescaled) are shown in the same plot (black line). The KS performs well for small values of the R number (R<3.3), but then progressively starts to deviate quantitatively for increasing values of R. This is shown in Fig. 4, where a snapshot of the amplitude derived by the KS, the NS and the Black Box model is shown alongside the corresponding phase portraits. Despite the apparent failure of the KS to capture the wave dynamics accurately, it still yields qualitatively good results. We exploit this further, to infuse physical information into the data-driven amplitude equation. This is discussed in the following paragraphs.

Gray box model I: Learning an additive correction to the KS
Instead of training a neural network to learn the right-hand-side of a PDE as a "Black box", i.e. without any physical intuition about the function, the KS (the approximate analytical model) is used as a foundation upon which a correction is added to make it more accurate. This correction is discovered in a data-driven way, using the same data as before (described in the previous paragraph). In this case though, the output of the neural network is not the time derivative of the amplitude but rather the difference between the actual and the KS time-derivative; this can then be thought of as a "Residual Network", a ResNet [31]. This approach maintains the physical insight already offered by the approximate equations, but improves its accuracy in a data-driven fashion. The predictions of this corrected model, referred to as a "Gray box" model to contrast with the "Black box" model presented before, are visually very close to the ground truth, as shown in Fig. 5 for R = 4.2.
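The additive correction can be illustrated with a small synthetic sketch. The training target is the residual between a high-fidelity time derivative and a low-fidelity right-hand side, and the corrected model adds the learned residual back. Everything here is hypothetical: `f_low` is a toy low-fidelity model (not Eq. 1), the "high-fidelity" data are synthetic, and linear least squares replaces the neural network purely for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: columns play the role of (eta, eta_x, eta_xx).
features = rng.normal(size=(500, 3))

def f_low(features):
    """Hypothetical low-fidelity right-hand side (plays the role of the KS)."""
    eta, eta_x, eta_xx = features.T
    return -eta_xx - eta * eta_x

# Hypothetical high-fidelity time derivative: the low-fidelity model plus a
# term that the low-fidelity model does not capture.
true_weights = np.array([0.3, -0.1, 0.05])
eta_t_high = f_low(features) + features @ true_weights

# Training target: the discrepancy, not the full time derivative.
residual = eta_t_high - f_low(features)

# Linear least squares stands in for the neural network in this sketch.
weights, *_ = np.linalg.lstsq(features, residual, rcond=None)

def f_gray(features):
    """Corrected ("Gray box") right-hand side: low-fidelity model + learned residual."""
    return f_low(features) + features @ weights
```

Because only the discrepancy is learned, the learned term vanishes wherever the low-fidelity model is already accurate, which is one way to read the calibration interpretation given above.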

Gray Box model II: a Functional Correction.
Exploiting further the physical insight of the KS, even in parameter ranges where it is inaccurate, it is possible to use local values, in space and/or in time, of the KS right-hand-side to approximate the "correct" right-hand-side of the data-driven PDE. Now, a neural network is trained to predict the time derivative of the amplitude, given a few locally nearby values of the KS time-derivative, or a few of its derivatives with respect to the dependent variable, η, or a few nearby values of its spatial partial derivatives, e.g. η_x, η_xx. Several flavours of this approach are implemented. In the last two of them, the subscripts j, j − 1, j + 1 signify points in space where the value of the KS right-hand-side, f_KS, is taken in the same timestep, whereas the subscripts t, t − 1, t − 2 stand for different nearby points in time, where the value is taken at the same point in space.
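The space and time stencils described above can be assembled as in the following sketch. The function names, the column ordering and the periodic wrap-around in space are assumptions for illustration; the input arrays stand for values of the KS right-hand side on a uniform grid.

```python
import numpy as np

def space_stencil(f_ks):
    """Inputs (f_{j-1}, f_j, f_{j+1}) at every grid point of one time step.

    f_ks has shape (n_x,); periodic wrap-around is assumed at the ends.
    """
    return np.column_stack([np.roll(f_ks, 1), f_ks, np.roll(f_ks, -1)])

def time_stencil(f_ks_history):
    """Inputs (f_t, f_{t-1}, f_{t-2}) at every grid point, for all t >= 2.

    f_ks_history has shape (n_t, n_x); the result has shape (n_t - 2, n_x, 3).
    """
    return np.stack(
        [f_ks_history[2:], f_ks_history[1:-1], f_ks_history[:-2]], axis=-1
    )

# Small hypothetical arrays, used only to show the layout of the inputs.
f_one_step = np.arange(5.0)
f_history = np.arange(12.0).reshape(4, 3)
space_inputs = space_stencil(f_one_step)
time_inputs = time_stencil(f_history)
```

Each row of `space_inputs` (or each last-axis triple of `time_inputs`) would be paired with the corresponding high-fidelity time derivative as the training target.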

Out of sample predictions: A "Gappy DMAP" approach
We now shift our focus to exploiting NS data to recover missing information. Missing data is a critical problem in applications in flow measurement and monitoring. For example, in film flow applications, it is often easy for an experimentalist to measure the film height, whereas being able to evaluate the detailed underlying flow field is a significantly more difficult task, if not impossible in the case of opaque liquids. Our goal here is to exploit NS data to derive a predictive tool, e.g. for the full velocity field or even the flow parameter, the Reynolds number, from only partial information. More importantly, we aim to do so as efficiently as possible and with little sensitivity to the sensor positions. The proposed approach is inspired by Gappy POD [34], according to which it is possible to recover missing information from a vector that we know belongs to the subspace spanned by a few predetermined POD modes, by performing efficient interpolation in this reduced subspace. Here, the same concept is demonstrated, together with the notion of deriving a reduced description of the data with nonlinear manifold learning, in this case with Diffusion Maps [35][36][37] (details can be found in Methods). The added benefit is twofold. Firstly, Diffusion Maps identify a parsimonious parametrization of the reduced subspace, which requires significantly fewer modes than POD, especially if the data belong to a curved manifold. The second benefit is related to the fact that in Gappy POD the accuracy of the method is critically influenced by the "location" of the known elements of the vector. The reason is purely numerical, and has to do with the condition number of the Gappy matrix $M = (m\cdot\Phi)^{T}(m\cdot\Phi)$, with Φ the selected POD basis and m the mask matrix that defines which elements of the vector are known.
The implementation of "Gappy" Diffusion Maps starts by identifying a parametrization of the manifold to which the data-set belongs. It is found that three diffusion coordinates are enough to describe any vector in the dataset. Then, a second round of Diffusion Maps is implemented, in conjunction with Geometric Harmonics interpolation (details can be found in Methods), in order to map from any point on the reduced space to the high-dimensional ambient space. Having established the methods for mapping between the ambient and the reduced space, it is now possible, given partial information, to find first the corresponding reduced coordinates and then the entire ambient vector, including the missing information. The accurate performance of this workflow is demonstrated in Fig. 7, where three cases are examined: (i) 80 points along the interface are known, from which the velocity and parameter value are recovered, with a maximum error of 4%; (ii) 8 points, evenly distributed along the interface, are known, from which again the velocity values and the parameter are recovered with a maximum error of 4%; (iii) 40 points are known, belonging to only half of the interface shape, leading to prediction of the velocities and the parameter with 4% error.
The same computational experiments are conducted with Gappy POD and the results are shown in Fig. 8. First a POD basis is determined, based on the reconstruction error of the data-set, which leads to a basis with 6 POD vectors. The same points along the interface as before are considered known: (i) with 80 points along the interface, the maximum prediction error for the unknown velocity and parameter values is 10%; (ii) with 8 equidistant points along the interface, the maximum prediction error is approximately the same; (iii) when 40 points along half of the interface are considered, the maximum error soars to 35% and the predicted wave shape and velocity distribution are visibly inaccurate.

Fig. 8 Gappy POD; R=3.5; Each column of figures presents the actual velocity contours (top), predicted velocity contours (center) and error (bottom). The blue dots in the top figures signify points where the value of the amplitude is considered known. On the left column, the value of the amplitude in 80 points is considered known and the maximum error is 10%; at the center, the value at 8 equidistant points is considered known and the maximum error is 10%; on the right, the value at 40 points in the first half of the wave are considered known and the maximum error is close to 35%

Discussion
In this work we presented three different strategies for deriving accurate and economical surrogates of the amplitude evolution of falling thin films.
The first option is purely data-driven and physics-agnostic, and relies on learning, as a "Black box", an amplitude PDE from observed NS data over a range of R values. In essence, the right-hand-side of the PDE is substituted by an Artificial Neural Network, which can then be integrated in time, for various times, different initial conditions and parameter values.
As an alternative, we propose using a low-fidelity model, here the KS equation, in order to infuse physical intuition into the learned model. This is achieved in two different ways: the first, the additive "Gray Box" approach, uses a few high-fidelity data (results of the NS) to calibrate the KS, by learning, by way of a neural network, an additive correction rather than the entire right-hand-side.
The second strategy, the so-called "Functional Gray box" approach, is inspired by Takens' embedding theory, and proposes learning the accurate amplitude dynamics from a few observations of an inaccurate model right-hand side, such as the KS. Four different options are presented, utilizing either the RHS of the KS and some of its derivatives, or the value of the RHS operator at three points in space (at the same time instance) or in time (for the same point in space). This last approach is a demonstration of how a model that is qualitatively close to the ground truth, but quantitatively off, can be leveraged, in the data-driven era, into a more accurate and efficient learned model.
Finally, we presented a "Gappy DMAP" methodology, the nonlinear counterpart of Gappy POD, which allows us to infer quantities that are inaccessible to measuring devices, such as the velocity profile of the fluid below the surface, when only some measurements are known, such as the height of the thin film at certain points. This may be trivial for low R values, since the interface height is "slaved" to the velocity. Nevertheless, for slightly higher R values, the amplitude is no longer a function of just the position (hence surrogate models with more than one equation become necessary in this flow regime) [2,38-40].
The benefits of the proposed approach are twofold. First, nonlinear manifold learning methods, such as DMAPs, yield a more parsimonious description of the manifold, requiring only a few coordinates to accurately reconstruct the original data. In contrast, we demonstrate that Gappy POD requires higher-dimensional hyper-planes to span the data. The second advantage of Gappy DMAPs is related to the choice of known values: some consideration is necessary for choosing points that contain rich enough information in order to achieve accurate reconstruction; it is, nevertheless, less sensitive to the position of the provided measurements than its linear counterpart.

The Navier-Stokes equations for flow on a vertical plane
The flow of a liquid on an inclined plane is described, in two dimensions, by conservation equations for mass and momentum, written in dimensionless form. In these equations, u = (u_x, u_y)^T and P are the dimensionless velocity vector and pressure, respectively, and ∇ = (∂_x, ∂_y) denotes the gradient operator for Cartesian coordinates. We also define the unit gravity vector g = (sin θ, − cos θ)^T. Using the characteristic Nusselt scales for the velocity vector and all lengths, the dimensionless groups that emerge are the Reynolds number R, the Weber number W, and the Stokes number F. In their definitions, ρ, µ and σ are the liquid density, the viscosity and the liquid/air surface tension, respectively, while Q denotes the volumetric flow rate per unit length normal to the cross section.
Along the liquid/air interface, a normal stress balance between capillary force and viscous stress is applied (Eq. 8). Here, the total stress tensor is defined as T = −PI + ∇u + (∇u)^T, and n is the unit vector normal to the interface, pointing outward with respect to the film. Note that in Eq. 8 the ambient pressure has been set equal to zero (datum pressure) without loss of generality. The mean curvature is κ = −∇_s · n, with ∇_s = (I − nn) · ∇. The rest of the boundary conditions include the no-slip condition at the liquid/solid plane interface (denoted as AB in Fig. 1) and periodic boundary conditions at the lateral domain boundaries (AD and BC in Fig. 1). The kinematic boundary condition, which specifies that the velocity of the interface normal to the boundary is equal to the velocity of the fluid normal to the boundary, ensuring no mass transfer through the interface, completes the set of governing equations.

Transformations between Navier-Stokes and Kuramoto-Sivashinsky scales
The (x, t) NS frame of reference is mapped onto the KS frame (ξ, τ), and the interfacial height, h(x, t), is related to the amplitude φ(ξ, τ), through the rescalings introduced in the derivation of the KS from the NS. To be able to compare the results between NS and KS, we employ the chain rule and derive the appropriate transformations for the time and spatial derivatives. These transformations are used to transform NS data to the KS formulation and, inversely, to map KS data to the NS formulation.

Derivation of the KS from the NS
The detailed derivation of KS from NS can be found in [40][41][42] and it is summed up here for completeness. It is based on the following assumptions:
• The mean height of the film is much larger than the deviation from the mean
Under these assumptions, it is possible to exploit the small parameter ε and employ a perturbation expansion for all dependent variables, i.e. velocities, pressure and interfacial height; e.g. the interfacial height is given by h ≈ 1 + εη + O(ε^2), where η denotes the deviation from the mean film height. Restricting to the case of laminar flow with R = O(1), F = O(1) and W = O(ε^{-2}), and neglecting higher order terms, the kinematic condition (written in terms of the deviation, η) may be reduced to η_t + F η_x = 0, indicating that waves travel with speed −F. Taking this into account, a new variable ξ = (x − F t) can be introduced to obtain constant shape waves traveling with speed F. Moreover, the amplitude is rescaled according to φ = 15  RF η. Finally, since it is known that the wave amplitude varies on a slow time scale compared with the traveling motion, a change in the time variable is introduced, i.e. τ = 4 W F 12 t. In the end, the KS (Eq. 1) is obtained as a function of the newly defined variables.

Diffusion Maps
Diffusion maps [35][36][37] is a framework that can (based upon diffusion processes) facilitate discovering meaningful low-dimensional intrinsic geometric descriptions of data sets, even when the data is high-dimensional, nonlinear and/or corrupted by (relatively small) noise. The method is based on the construction of a Markov transition probability matrix, corresponding to a random walk, on a graph whose vertices are the data points, with transition probabilities being the local similarities between pairs of data points. The leading few eigenvectors of the sparse Markov matrix can be used as data-driven coordinates that provide a reparametrization of the data.
To construct a low-dimensional embedding for a data set X of M individual points (represented as d-dimensional real vectors x_1, ..., x_M), a similarity measure d_ij between each pair of vectors x_i, x_j is computed. The standard Euclidean distance may be considered to this end. By using this similarity measure, an affinity matrix W is constructed. A popular choice is the Gaussian kernel, in which the affinity decays exponentially with the squared distance $d_{ij}^2$ on a scale set by δ, a hyperparameter which quantifies the local similarity for each data point. To recover a parametrization regardless of the sampling density, the normalization $\tilde{W} = P^{-\alpha} W P^{-\alpha}$ is performed, where $P_{ii} = \sum_{j=1}^{M} W_{ij}$ and α = 1 to factor out the density effects. A second normalization, applied on $\tilde{W}$, gives an M × M Markov matrix $K = D^{-1}\tilde{W}$, where D is a diagonal matrix collecting the row sums of $\tilde{W}$. The stochastic matrix K has a set of real eigenvalues 1 = λ_1 ≥ ... ≥ λ_M with corresponding eigenvectors φ_i.
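The construction above can be sketched in a few lines. This is a minimal illustration rather than the implementation used in this work: the kernel scaling exp(-d²/eps), the value of `eps`, the dense eigensolver and the helical test data are all assumptions. The eigenvectors of the Markov matrix are obtained through a symmetric matrix that is similar to it, which keeps the eigenvalues real in floating point.

```python
import numpy as np

def diffusion_maps(X, eps, n_coords=3):
    """Leading eigenpairs of the density-normalized (alpha = 1) Markov matrix."""
    # Pairwise squared Euclidean distances between the rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / eps)  # Gaussian kernel; this scaling is an assumption

    # First normalization (alpha = 1) factors out the sampling density.
    p = W.sum(axis=1)
    W_tilde = W / np.outer(p, p)

    # Second normalization gives the row-stochastic Markov matrix K.
    d = W_tilde.sum(axis=1)
    K = W_tilde / d[:, None]

    # K is similar to the symmetric matrix S = D^{-1/2} W_tilde D^{-1/2}.
    S = W_tilde / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1][: n_coords + 1]
    # Map the eigenvectors of S back to right eigenvectors of K.
    phi = vecs[:, order] / np.sqrt(d)[:, None]
    return vals[order], phi, K

# Hypothetical data set: 40 points on a one-dimensional curve in 3D.
t = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.cos(t), np.sin(t), t])
vals, phi, K = diffusion_maps(X, eps=0.5)
```

The first returned eigenvector is the trivial (constant) one with eigenvalue 1; candidate diffusion coordinates start from the second column of `phi`.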
To check if model (variable) reduction can be achieved, the number of retained eigenvectors has to be appropriately truncated. In practice, it is useful to consider that not all obtained eigenvectors parametrize independent directions, but rather most of them can be considered as spanning the same directions with different frequencies. Eigenvectors that parametrize the same directions in this context are called harmonics and the ones that parametrize independent directions non-harmonics. A minimal representation of the DMAP space is made possible by carefully selecting the non-harmonic coordinates, which do not necessarily correspond to the most dominant eigenmodes of the Markov matrix. This is a stark difference between Diffusion Maps and its linear counterpart, Proper Orthogonal Decomposition or Principal Component Analysis, where the dominant modes are retained for the truncated representation of the data. If the number of the non-harmonic eigenvectors is less than the number of the ambient space dimensions then model (variable) reduction is achieved.
A proposed algorithm for identifying the non-harmonic eigenvectors is presented in [43], based on local linear regression. In a nutshell, a local linear function is used in order to fit the DMAP coordinate φ_k as a function, f, of the previous vectors Φ_{k−1} = [φ_1, φ_2, ..., φ_{k−1}]. If φ_k can be accurately expressed as a function of the other DMAP coordinates, then it does not represent a new direction on the dataset, and is omitted for dimensionality reduction. On the contrary, if φ_k cannot be expressed as a function of the previous eigenvectors, then φ_k is a new independent eigendirection that must be retained for a parsimonious representation of the data. To quantify the accuracy of the fit, the following metric, the normalized residual of the fit, is used: $r_k = \sqrt{\dfrac{\sum_{i=1}^{M}\left(\phi_k(i) - f(\Phi_{k-1}(i))\right)^2}{\sum_{i=1}^{M}\phi_k(i)^2}}$. A small value of r_k is associated with a φ_k that is a harmonic function of the previous eigenmodes, whereas a higher value of r_k signifies that φ_k is a new independent direction on the data manifold. It has been shown in [43] that selecting only the eigenvectors that correspond to higher values of r_k leads to a parsimonious representation of the data. Eventually, the vector x_i is mapped to a vector whose first component is the i-th component of the first selected nontrivial eigenvector, whose second component is the i-th component of the second selected nontrivial eigenvector, etc.
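A rough sketch of this test is given below. It uses a leave-one-out local linear fit with Gaussian weights and returns the residual normalized by the norm of the candidate coordinate. The weighting kernel, the `bandwidth` value and the synthetic coordinates are assumptions chosen for illustration and are not necessarily those of [43].

```python
import numpy as np

def local_linear_residual(Phi_prev, phi_k, bandwidth):
    """Normalized leave-one-out residual of a local linear fit of phi_k on Phi_prev."""
    n = phi_k.size
    pred = np.empty(n)
    for i in range(n):
        diff = Phi_prev - Phi_prev[i]            # centered at point i
        w = np.exp(-np.sum(diff ** 2, axis=1) / bandwidth ** 2)
        w[i] = 0.0                               # leave point i out of its own fit
        A = np.column_stack([np.ones(n), diff])  # local model: intercept + slope
        Aw = A * w[:, None]
        coef, *_ = np.linalg.lstsq(Aw.T @ A, Aw.T @ phi_k, rcond=None)
        pred[i] = coef[0]                        # fitted value at point i
    return np.sqrt(np.sum((phi_k - pred) ** 2) / np.sum(phi_k ** 2))

# Synthetic illustration: cos(2*theta) is a function of cos(theta) (a harmonic),
# whereas random noise is not.
rng = np.random.default_rng(0)
theta = np.linspace(0.0, np.pi, 200)
phi_1 = np.cos(theta)[:, None]
r_harmonic = local_linear_residual(phi_1, np.cos(2.0 * theta), bandwidth=0.1)
r_independent = local_linear_residual(phi_1, rng.normal(size=200), bandwidth=0.1)
```

In this toy setting the harmonic coordinate gives a small residual and the unrelated one a residual of order one, which is the contrast the selection criterion relies on.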
To map a new point, x_new, from the ambient space to DMAP space, a mathematically elegant approach known as the Nyström extension, introduced in [3], is used; it is summarized here for completeness. The starting point of the Nyström extension is to compute the distances, d(·, x_new), between the new point, x_new, and the M data points in the original data set. The same normalizations used for DMAP need to be applied here as well. The Nyström extension formula reads $\phi_j(x_{new}) = \frac{1}{\lambda_j}\sum_{i=1}^{M} \tilde{k}(x_i, x_{new})\,\phi_j(x_i)$, where λ_j is the j-th eigenvalue, φ_j(x_i) is the i-th component of the j-th eigenvector and $\tilde{k}(x_i, x_{new})$ denotes the normalized kernel between x_i and the new point.
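The extension step can be sketched as follows. The helper names (`gaussian_kernel`, `fit_dmap`, `nystrom`), the kernel scaling exp(-d²/eps) and the test data are assumptions for illustration. The new point receives the same density and row normalizations as the training points, and the eigenvector values are then averaged with the resulting weights and divided by the eigenvalue.

```python
import numpy as np

def gaussian_kernel(A, B, eps):
    """Gaussian kernel between rows of A and rows of B (scaling is an assumption)."""
    d2 = (np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / eps)

def fit_dmap(X, eps, n_coords):
    """Leading eigenpairs of the alpha = 1 diffusion-maps Markov matrix."""
    W = gaussian_kernel(X, X, eps)
    p = W.sum(axis=1)                       # density estimate
    W_tilde = W / np.outer(p, p)
    d = W_tilde.sum(axis=1)
    S = W_tilde / np.sqrt(np.outer(d, d))   # symmetric, similar to the Markov matrix
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1][: n_coords + 1]
    return vals[order], vecs[:, order] / np.sqrt(d)[:, None], p

def nystrom(x_new, X, eps, vals, phi, p):
    """Diffusion-map coordinates of a new point via the Nystrom extension."""
    # Density normalization; the new point's own density cancels in the next step.
    w = gaussian_kernel(np.atleast_2d(x_new), X, eps)[0] / p
    k = w / w.sum()                         # row normalization
    return (k @ phi) / vals

t = np.linspace(0.0, 1.0, 40)
X = np.column_stack([np.cos(t), np.sin(t), t])
vals, phi, p = fit_dmap(X, eps=0.5, n_coords=3)
phi_new = nystrom(np.array([np.cos(0.33), np.sin(0.33), 0.33]), X, 0.5, vals, phi, p)
phi_train = nystrom(X[7], X, 0.5, vals, phi, p)
```

A useful sanity check, used in this sketch, is that extending a point that already belongs to the data set reproduces its original coordinates.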

Geometric Harmonics
Geometric Harmonics was introduced in [35], inspired by the Nyström extension, as a scheme for extending a function f defined on the data X, f(X) : X → R, to new points x_new ∉ X. This out-of-sample extension is achieved by using a particular set of basis functions called Geometric Harmonics. Those functions are computed as eigenvectors of the symmetric M × M matrix W. The eigendecomposition of the symmetric and positive semidefinite matrix W leads to a set of orthonormal eigenvectors ψ_1, ψ_2, ..., ψ_M with non-negative eigenvalues σ_1 ≥ σ_2 ≥ ... ≥ σ_M ≥ 0. From this set of eigenvectors, to avoid numerical issues, we consider a truncated subset $S_\delta = \{\alpha : \sigma_\alpha \geq \delta\sigma_1\}$, where δ > 0.

Double Diffusion Maps and their Latent Harmonics
A slight twist of the Geometric Harmonics is presented in this section. As discussed above, Geometric Harmonics constructs an input-output mapping between the ambient coordinates X and a function of interest f defined on X. However, if the data are lower-dimensional, it is possible to construct the map in terms of only the non-harmonic eigenvectors. This is achieved, similarly to the traditional Geometric Harmonics, by first constructing an affinity matrix $w(i, j) = \exp\left(-\|\phi_i - \phi_j\|^2\right)$.
In this case the affinity matrix is constructed in terms of only the non-harmonic DMAPs coordinates. To distinguish the notation between Geometric Harmonics and Double Diffusion Maps we will use a tilde (˜). As in the traditional Geometric Harmonics, the function f is projected onto a truncated set of the obtained eigenvectors. The extension of f for φ_new is achieved by first extending the values of the Geometric Harmonic functions Ψ_β for φ_new, and then estimating the value of f at φ_new.

Gappy POD
In this section the Gappy POD method is summarized for completeness. Consider a data set X of M vectors (represented as d-dimensional real vectors x_1, ..., x_M). A POD basis, $\Phi \in \mathbb{R}^{N\times M}$, of X is computed, such that X can be approximated as a linear combination of p vectors, $X \approx \sum_{j=1}^{p} c_j \Phi_j$, or, in matrix-vector format, $\tilde{X} = \Phi c$. The size of the truncated POD basis Φ is selected based on the error between the actual vector X and the reconstructed approximation $\tilde{X}$: reconstruction error = $\|X - \tilde{X}\|$. Consider now a vector X' that is spanned by the same basis Φ and of which only some values are known, as selected by the mask matrix m, so that the partial vector can be defined as $X_{partial} = m\cdot X'$. The goal is to find coefficients c', such that an approximation $\tilde{X}'$ of the vector X' can be defined as $\tilde{X}' = \Phi c'$; then $X_{partial} \approx m\cdot\Phi c'$. Finding the values of c' that satisfy the above leads to an optimization problem solved through the linear system $M c' = (m\cdot\Phi)^{T} X_{partial}$, where $M = (m\cdot\Phi)^{T}(m\cdot\Phi)$ is the Gappy matrix.
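The Gappy POD reconstruction and its sensitivity to the sensor locations can be illustrated with the following sketch. The data are synthetic (three trigonometric modes on a periodic grid) and the function name `gappy_pod` and the index sets are hypothetical; the mask is represented by the list of known indices, so the masked basis is the corresponding set of rows of Φ.

```python
import numpy as np

def gappy_pod(Phi, known_idx, x_known):
    """Reconstruct a full vector from its known entries using the POD basis Phi."""
    Phi_m = Phi[known_idx, :]                  # masked basis (rows of known entries)
    M = Phi_m.T @ Phi_m                        # Gappy matrix
    c = np.linalg.solve(M, Phi_m.T @ x_known)  # normal equations for the coefficients
    return Phi @ c, np.linalg.cond(M)

# Synthetic snapshots spanned by three smooth modes (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
modes = np.column_stack([np.sin(x), np.cos(x), np.sin(2.0 * x)])
snapshots = modes @ rng.normal(size=(3, 20))

# POD basis from the SVD of the snapshot matrix, truncated to three vectors.
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
Phi = U[:, :3]

# A new vector in the span of the basis, observed only at a few points.
x_full = modes @ np.array([0.5, -0.2, 0.1])
even_idx = np.arange(0, 200, 25)               # 8 equidistant points
half_idx = np.arange(0, 100, 5)                # 20 points in the first half only

x_rec_even, cond_even = gappy_pod(Phi, even_idx, x_full[even_idx])
x_rec_half, cond_half = gappy_pod(Phi, half_idx, x_full[half_idx])
```

With noise-free data both sensor layouts recover the vector, but the Gappy matrix is noticeably worse conditioned when all sensors sit in one half of the domain, which is the numerical mechanism behind the sensitivity to sensor location discussed earlier.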

Fig. 1
Fig. 1 Cross-section of a film flowing on plane, inclined with respect to the horizontal by angle θ.H = 1 is the dimensionless Nusselt film height

Fig. 2
Fig. 2 Bifurcation diagram of the Navier-Stokes and the Kuramoto-Sivashinsky

Fig. 3
Fig. 3 Attractors for different R numbers derived by (i) the Navier-Stokes (shown in blue), (ii) the Kuramoto-Sivashinsky (shown in black) and (iii) the NN-derived PDE (red); all results are rescaled in the NS scaling.

Fig. 4
Fig. 4 Left: Attractors; Right: time-instance of the amplitude; derived by (i) the Navier-Stokes (shown in blue), (ii) the Kuramoto-Sivashinsky (shown in orange) and (iii) the NN-derived PDE (black); Top row: R=1.95; Bottom row: R=4.2; all results are rescaled in the NS scaling.

Fig. 6
Fig. 6 R=4.2; For each case of Functional Gray box model (denoted below each row of figures): Ground truth wave vs Functional Gray box model attractor comparison (left); wave comparison at a specific time-step (center); Absolute error at each point in space and time between "Gray box II" and ground truth (right).

Fig. 7
Fig. 7 Double DMAPs; R=3.5; Each column of figures presents the actual velocity contours (top), predicted velocity contours (center) and error (bottom). The blue dots in the top figures signify points where the value of the amplitude is considered known. On the left column, the value of the amplitude in 80 points is considered known and the maximum error is 4%; at the center, the value at 8 equidistant points is considered known and the maximum error is 4%; on the right, the value at 40 points in the first half of the wave are considered known and the maximum error is again close to 4%

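Given the truncated set $S_\delta = \{\alpha : \sigma_\alpha \geq \delta\sigma_1\}$, where δ > 0, the extension of f for a new point x_new is accomplished by first projecting the function of interest onto the (truncated) computed set of eigenvectors, $f \rightarrow P_\delta f = \sum_{\alpha\in S_\delta} \langle f, \psi_\alpha\rangle\,\psi_\alpha$, and then extending the function f for x_new ∉ X: $(Ef)(x_{new}) = \sum_{\alpha\in S_\delta} \langle f, \psi_\alpha\rangle\,\Psi_\alpha(x_{new})$, where $\Psi_\alpha(x_{new}) = \sigma_\alpha^{-1}\sum_{i=1}^{M} w(x_{new}, x_i)\,\psi_\alpha(x_i)$ and $w(x_{new}, x_i)$ is the Gaussian kernel evaluated at the distance $d_i = \|x_{new} - x_i\|_2$.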
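The projection and extension steps of Geometric Harmonics can be sketched as below. The kernel scaling exp(-d²/eps), the values of `eps` and `delta`, and the one-dimensional test function are assumptions for illustration; the eigenvalue of the kernel matrix is used in the extension of each retained eigenvector.

```python
import numpy as np

def gaussian_kernel(A, B, eps):
    """Gaussian kernel between rows of A and rows of B (scaling is an assumption)."""
    d2 = (np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(d2, 0.0) / eps)

def fit_geometric_harmonics(X, f, eps, delta):
    """Eigenpairs of W above the cut-off and the projection coefficients of f."""
    W = gaussian_kernel(X, X, eps)
    sigma, psi = np.linalg.eigh(W)
    keep = sigma >= delta * sigma.max()   # truncated set S_delta
    sigma, psi = sigma[keep], psi[:, keep]
    coeffs = psi.T @ f                    # inner products <f, psi_alpha>
    return sigma, psi, coeffs

def extend(X_new, X, eps, sigma, psi, coeffs):
    """Evaluate the extension (E f) at the rows of X_new."""
    w = gaussian_kernel(np.atleast_2d(X_new), X, eps)
    Psi_new = (w @ psi) / sigma           # extended eigenvectors Psi_alpha(x_new)
    return Psi_new @ coeffs

# Hypothetical example: extend sin(x), sampled at 60 points, to x = 3.0.
X = np.linspace(0.0, 2.0 * np.pi, 60)[:, None]
f = np.sin(X[:, 0])
sigma, psi, coeffs = fit_geometric_harmonics(X, f, eps=0.5, delta=1e-4)
f_at_3 = extend(np.array([[3.0]]), X, 0.5, sigma, psi, coeffs)[0]
```

At the training points the extension coincides with the truncated projection of f, so the cut-off δ controls the trade-off between fidelity to the data and numerical stability of the division by small eigenvalues.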