
Leveraging scale separation and stochastic closure for data-driven prediction of chaotic dynamics

Published online by Cambridge University Press:  20 April 2026

Ismaël Zighed*
Affiliation:
Institut Jean le Rond d’Alembert, Sorbonne Université, Paris, France; ISIR, Sorbonne Université, Paris, France; SCAI, Sorbonne Université, Paris, France
Nicolas Thome
Affiliation:
ISIR, Sorbonne Université, Paris, France; Institut Universitaire de France, France
Patrick Gallinari
Affiliation:
ISIR, Sorbonne Université, Paris, France; Criteo AI lab, France
Taraneh Sayadi
Affiliation:
M2N, Conservatoire National des Arts et Metiers, Paris, France
Corresponding author: Ismaël Zighed; Email: ismael.zighed@sorbonne-universite.fr

Abstract

Simulating turbulent fluid flows is computationally prohibitive, as it requires resolving fine-scale structures and capturing complex nonlinear interactions across multiple scales. Consequently, extensive research has focused on analysing turbulent flows from a data-driven perspective. However, due to the complex and chaotic nature of these systems, traditional models often become unstable. To overcome these limitations, we propose a purely stochastic approach that separately addresses the evolution of large-scale coherent structures and the closure towards high-fidelity statistics. To this end, the dynamics of the filtered data are learnt using an autoregressive model that combines a variational autoencoder (VAE) with a Transformer architecture. The VAE projection is probabilistic, ensuring consistency between the model’s stochasticity and the flow’s statistical properties. The mean realisation of stochastically sampled trajectories from our model shows relative $ {L}_1 $ and $ {L}_2 $ distances of 6% and 10%, respectively. Moreover, our framework enables the construction of meaningful confidence intervals, achieving a prediction interval coverage probability of 80% with minimal interval width. To recover high-fidelity velocity fields from the filtered space, Gaussian process (GP) regression is employed. This strategy is tested on a Kolmogorov flow exhibiting chaotic behaviour. We compare our model with state-of-the-art probabilistic baselines, including a VAE and a diffusion model, and demonstrate that our GP-based closure outperforms these baselines in capturing first- and second-moment statistics on this particular test bed, while providing robust and adaptive confidence intervals.
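The prediction interval coverage probability (PICP) quoted in the abstract is simply the fraction of ground-truth values that fall inside the predicted interval, reported alongside the average interval width. A minimal sketch of both metrics (the function names and the Gaussian toy intervals are ours, not the paper's):

```python
import numpy as np

def picp(y_true, lower, upper):
    """Fraction of true values falling inside the predicted interval."""
    inside = (y_true >= lower) & (y_true <= upper)
    return inside.mean()

def mean_interval_width(lower, upper):
    """Average width of the prediction intervals."""
    return (upper - lower).mean()

# Toy check: +/- 1.28 sigma intervals on a standard normal have
# a nominal two-sided coverage of roughly 80%.
rng = np.random.default_rng(0)
y = rng.standard_normal(10_000)
lo, hi = -1.28, 1.28
coverage = picp(y, lo, hi)  # close to the 0.80 nominal coverage
```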

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Velocity fields and corresponding kinetic energy signal.


Table 1. Evaluation metrics of the probabilistic forecasts


Figure 2. Time-averaged energy spectrum displayed on a logarithmic scale in the wavenumber space.


Figure 3. Low-pass filter threshold keeping 90% of the energy.
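The 90%-energy threshold of Figure 3 amounts to picking the smallest cut-off wavenumber whose cumulative spectral energy reaches the target fraction of the total. An illustrative sketch, assuming a 1-D spectrum array indexed by wavenumber (the `energy_cutoff` helper and the $k^{-5/3}$ toy spectrum are ours, not from the paper):

```python
import numpy as np

def energy_cutoff(spectrum, frac=0.9):
    """Smallest wavenumber index whose cumulative energy reaches `frac`
    of the total energy, i.e. the low-pass cut-off used for filtering."""
    cumulative = np.cumsum(spectrum) / np.sum(spectrum)
    return int(np.searchsorted(cumulative, frac))

# Toy decaying spectrum E(k) ~ k^(-5/3), mimicking an inertial range
k = np.arange(1, 65)
E = k ** (-5.0 / 3.0)
kc = energy_cutoff(E, 0.9)  # cut-off index retaining 90% of the energy
```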


Figure 4. Low-pass filter on the energy spectrum and filtered energy spectrum in the log-scale of the wavenumber space.


Figure 5. Effect of low-pass filter on velocity fields and kinetic energy.


Figure 6. Reduced-order model architecture, used to predict filtered trajectories.


Figure 7. Comparison of predicted and validation trajectories for both flow fields at a given time ($ 0.25\tau $). The plot also shows an ensemble of sampled trajectories generated by the ROM at different locations over $ 2\tau $. The solid red line represents the ROM mean prediction, the dashed black line indicates the validation trajectory, and the shaded areas denote the $ \pm \sigma $ and $ \pm 3\sigma $ confidence intervals.


Figure 8. Probability density functions (PDFs) for fields U and V.


Figure 9. Gaussian process prediction example for a single training and inference point ($ N=M=1 $).
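The posterior illustrated in Figure 9 follows the standard zero-mean Gaussian-process regression equations. A self-contained sketch with a squared-exponential kernel (the paper's actual kernel choice and hyperparameters may differ):

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel k(a, b) for 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_test, noise=1e-8, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf(x_train, x_train, length_scale) + noise * np.eye(len(x_train))
    Ks = rbf(x_test, x_train, length_scale)
    Kss = rbf(x_test, x_test, length_scale)
    alpha = np.linalg.solve(K, y_train)
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))

# Single training and inference point (N = M = 1), as in Figure 9:
# at the training location the posterior collapses onto the data.
mean, std = gp_predict(np.array([0.0]), np.array([1.0]), np.array([0.0]))
```

Far from the training point the posterior reverts to the prior (zero mean, unit variance), which is what produces the widening confidence bands in the figure.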


Figure 10. Mapping between low-pass filtered POD space and full-scale POD space.
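The POD bases underlying this mapping are conventionally obtained from a thin SVD of the snapshot matrix (space x time). A generic sketch, not the paper's implementation (the `pod_basis` helper and the rank-2 toy data are ours):

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """Leading POD modes and singular values of a snapshot matrix
    of shape (n_space, n_time), computed via a thin SVD."""
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes], s[:n_modes]

# Rank-2 toy data: exactly two non-negligible singular values
t = np.linspace(0.0, 2.0 * np.pi, 50)
X = np.outer(np.ones(30), np.sin(t)) + np.outer(np.arange(30), np.cos(t))
modes, sv = pod_basis(X, 3)
```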


Figure 11. Gaussian Process Regression (GPR) train and validation over 3 POD coefficients.


Table 2. Evaluation metrics on validation set


Figure 12. Training samples kernel $ {\mathcal{K}}_{XX} $.


Figure 13. Data pipeline.


Figure 14. Averaged energy spectrograms in the wavenumber space.


Figure 15. Energy at a snapshot T = 120.


Figure 16. Space-time evolution of the kinetic energy $ k=\frac{1}{2}\left({u}^2+{v}^2\right) $ averaged along the $ x $-direction. The color scale represents $ {\left\langle k\left(y,t\right)\right\rangle}_x $ as a function of the wall-normal coordinate $ y $ and the dimensionless time. The data distribution for each $ \tau $ interval is compared with the PDF of the test set.
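The x-averaged kinetic energy shown here can be computed directly from the velocity fields; a sketch assuming arrays of shape (nx, ny, nt), which is our assumed layout, not necessarily the paper's:

```python
import numpy as np

def mean_kinetic_energy_profile(u, v, x_axis=0):
    """<k(y, t)>_x with k = 0.5 * (u**2 + v**2), averaged over x."""
    k = 0.5 * (u ** 2 + v ** 2)
    return k.mean(axis=x_axis)

# Toy field: uniform u = 2, v = 0 gives k = 2 everywhere
u = np.full((8, 16, 4), 2.0)
v = np.zeros_like(u)
profile = mean_kinetic_energy_profile(u, v)  # shape (ny, nt) = (16, 4)
```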


Figure 17. Statistical consistency of long roll-outs, quantified by the Wasserstein distance of energy distributions and the preservation of chaotic dynamics.
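For equal-size empirical samples in one dimension, the Wasserstein-1 distance used in this consistency check reduces to the mean absolute difference of the sorted samples. An illustrative sketch, not the paper's implementation:

```python
import numpy as np

def wasserstein_1d(samples_a, samples_b):
    """Wasserstein-1 distance between two equal-size 1-D empirical
    samples: the mean absolute difference of the sorted samples."""
    a = np.sort(np.asarray(samples_a))
    b = np.sort(np.asarray(samples_b))
    return np.abs(a - b).mean()

# Shifting a distribution by c moves it a Wasserstein distance of c
rng = np.random.default_rng(1)
e = rng.random(1000)
dist = wasserstein_1d(e, e + 0.3)  # approximately 0.3 (a pure shift)
```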


Figure 18. Comparison of turbulent energy spectra and dissipative manifolds for reduced bases of 3 and 18 POD modes.


Table 3. Comparison of evaluation metrics between our Gaussian Process Regression, the Variational Autoencoder (VAE), and the diffusion model


Figure A1. Evolution of the dissipative manifold ($ k,\epsilon $) with $ k $ the kinetic energy and $ \epsilon $ the dissipation rate, under spectral filtering, with a cut-off frequency preserving 100% of the total kinetic energy (left), 90% (centre), and 80% (right). Note the transition from a chaotic cloud to a one-dimensional limit cycle, indicating the suppression of unsteady dynamics.


Table A1. Model and data configuration parameters
