
Long-term stability and generalization of observationally-constrained stochastic data-driven models for geophysical turbulence

Published online by Cambridge University Press:  10 January 2023

Ashesh Chattopadhyay
Affiliation:
Department of Mechanical Engineering, Rice University, Houston, Texas, USA
Jaideep Pathak
Affiliation:
NVIDIA, Santa Clara, California, USA
Ebrahim Nabizadeh
Affiliation:
Department of Mechanical Engineering, Rice University, Houston, Texas, USA
Wahid Bhimji
Affiliation:
Lawrence Berkeley National Laboratory, Berkeley, California, USA
Pedram Hassanzadeh*
Affiliation:
Department of Mechanical Engineering, Rice University, Houston, Texas, USA
*
*Corresponding author. E-mail: ph25@rice.edu

Abstract

Recent years have seen a surge of interest in building deep learning-based, fully data-driven models for weather prediction. Such deep learning models, if trained on observations, can mitigate certain biases in current state-of-the-art weather models, some of which stem from inaccurate representation of subgrid-scale processes. However, these data-driven models, being over-parameterized, require large amounts of training data, which may not be available from reanalysis (observational) products. Moreover, an accurate, noise-free initial condition from which to start forecasting with a data-driven weather model is not available in realistic scenarios. Finally, deterministic data-driven forecasting models suffer from long-term instability and unphysical climate drift, which makes them unsuitable for computing climate statistics. Given these challenges, previous studies have tried to pre-train deep learning-based weather forecasting models on large amounts of imperfect long-term climate model simulations and then re-train them on the available observational data. In this article, we propose a convolutional variational autoencoder (VAE)-based stochastic data-driven model that is pre-trained on an imperfect climate model simulation of a two-layer quasi-geostrophic flow and re-trained, using transfer learning, on a small number of noisy observations from a perfect simulation. This re-trained model then performs stochastic forecasting from a noisy initial condition sampled from the perfect simulation. We show that our ensemble-based stochastic data-driven model outperforms a baseline deterministic encoder–decoder-based convolutional model in short-term forecast skill, while remaining stable in long-term climate simulations and yielding an accurate climatology.
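The ensemble-based stochastic forecasting described above can be illustrated with a minimal sketch: perturb the initial condition with noise of relative amplitude $ \eta $, roll each ensemble member forward autoregressively, and report the ensemble mean and spread. Everything here is a hypothetical stand-in — `step` is a toy linear surrogate, not the paper's convolutional VAE (whose stochasticity also comes from sampling the latent space), and the field is one-dimensional for brevity.

```python
import numpy as np

rng = np.random.default_rng(42)

def step(x):
    # Toy stand-in for one autoregressive step of a trained forecast model;
    # the paper's model is a convolutional VAE, not this damped shift map.
    return 0.95 * np.roll(x, 1)

def ensemble_forecast(x0, eta=0.05, n_ens=100, n_steps=10):
    # Perturb the initial condition with Gaussian noise whose standard
    # deviation is eta times the field's standard deviation (the noise
    # level eta used in the experiments).
    scale = eta * np.std(x0)
    members = np.stack([x0 + rng.normal(0.0, scale, x0.shape)
                        for _ in range(n_ens)])
    # Roll every ensemble member forward autoregressively.
    for _ in range(n_steps):
        members = np.array([step(m) for m in members])
    # Ensemble mean is the forecast; ensemble spread quantifies uncertainty.
    return members.mean(axis=0), members.std(axis=0)

x0 = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))  # toy initial condition
mean, spread = ensemble_forecast(x0)
```

With a real model, `step` would be the re-trained VAE's decode-of-sampled-latent forward map, and the spread would correspond to the shaded uncertainty bands in the figures below.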

Information

Type
Application Paper
Creative Commons
Creative Commons License: CC-BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Figure 1. Time-averaged zonal-mean velocity (over 20,000 days), $ \left\langle \overline{u}\right\rangle $, of the imperfect and perfect systems. The difference between $ \left\langle \overline{u}\right\rangle $ of the perfect and imperfect systems indicates a challenge for DDWP models to seamlessly generalize from one system to the other.


Table 1. Number of layers and filters in the convolutional VAE architecture used in this article.


Figure 2. A schematic of the transfer learning framework, with a VAE pre-trained on imperfect simulations and transfer-learned on noisy observations from the perfect system. Here, $ {x}^m\left(k\Delta t\right) $ are states obtained from the imperfect system and $ {x}^o\left(n\Delta t\right) $ are noisy observations from the perfect system, with $ \Delta t\hskip0.35em =\hskip0.35em 40\Delta {t}_n $. Note that our proposed VAE is convolutional; the schematic is only representative. Details of the architecture are given in Table 1.


Figure 3. Performance of the stochastic convolutional VAE compared to that of the baseline convolutional encoder–decoder-based model, when both are trained on the imperfect system and predict from a noisy initial condition sampled from the imperfect system. (a) Anomaly correlation coefficient (ACC) between predicted $ {\psi}_1 $ and true $ {\psi}_1 $ (Murphy and Epstein, 1989) from the imperfect system with noise level $ \eta \hskip0.35em =\hskip0.35em 5\% $ for the initial condition. (b) Same as panel (a), but for RMSE. (c) Prediction horizon (number of $ \Delta {t}_n $ until ACC $ \le \hskip0.35em 0.60 $) for different noise levels added to the initial condition. (d) Error averaged over 2 days for different noise levels of the initial condition. The shading shows the standard deviation across 100 ensemble members generated by the VAE model during inference.
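The ACC and prediction-horizon metrics used in Figures 3 and 4 can be sketched as follows. This is a minimal illustration, assuming the standard definition of ACC as the centered correlation of forecast and truth anomalies relative to a climatological mean (following Murphy and Epstein, 1989); the variable names are hypothetical.

```python
import numpy as np

def acc(forecast, truth, climatology):
    # Anomaly correlation coefficient: correlate forecast and truth
    # anomalies, both taken relative to the climatological mean field.
    fa = (forecast - climatology).ravel()
    ta = (truth - climatology).ravel()
    return float(fa @ ta / np.sqrt((fa @ fa) * (ta @ ta)))

def prediction_horizon(acc_series, threshold=0.60):
    # Number of forecast steps until ACC first drops to or below the
    # threshold (0.60 in the figures); if it never does, the horizon is
    # the full length of the series.
    below = np.nonzero(np.asarray(acc_series) <= threshold)[0]
    return int(below[0]) if below.size else len(acc_series)
```

Applied to a sequence of forecast fields, `acc` at each lead time gives the curves in panel (a), and `prediction_horizon` on that ACC series gives the bars in panel (c).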


Figure 4. Performance of the stochastic convolutional VAE compared to that of the baseline convolutional encoder–decoder-based model and the imperfect numerical model, when the data-driven models are trained on the imperfect system, re-trained on noisy observations from the perfect system, and initialized with a noisy initial condition sampled from the perfect system. (a) Anomaly correlation coefficient (ACC) between predicted $ {\psi}_1 $ and true $ {\psi}_1 $ from the perfect system with noise level $ \eta \hskip0.35em =\hskip0.35em 5\% $ for the initial condition. (b) Same as panel (a), but for RMSE. (c) Prediction horizon (number of $ \Delta {t}_n $ until ACC $ \le 0.60 $) for different noise levels added to the initial condition. (d) Prediction horizon of the models for different sample sizes of noisy observation data (as percentages of the original training sample size) from the perfect system. Note that the numerical model is not trained on data and is hence unaffected.


Figure 5. Long-term climatology obtained from integrating the models for 20,000 days. $ \left\langle {\overline{\psi}}_1\right\rangle $ is the time-mean zonally averaged $ {\psi}_1 $, $ \left\langle \overline{u}\right\rangle $ is the time-mean zonally averaged upper-layer velocity, and EOF1 is the first empirical orthogonal function (EOF) of $ \left\langle \overline{u}\right\rangle $. The VAE yields a non-drifting, physical climate. The baseline encoder–decoder model is not shown since it becomes unstable within 100 days of seamless integration.