
GenFormer: a deep-learning-based approach for generating multivariate stochastic processes

Published online by Cambridge University Press:  27 February 2025

Haoran Zhao*
Affiliation:
Department of Civil and Environmental Engineering, Cornell University, Ithaca, NY, USA
Wayne Uy
Affiliation:
Center for Applied Mathematics, Cornell University, Ithaca, NY, USA
Corresponding author: Haoran Zhao; Email: hz289@cornell.edu

Abstract

Stochastic generators are essential for producing synthetic realizations that preserve target statistical properties. We propose GenFormer, a stochastic generator for spatio-temporal multivariate stochastic processes. It is built on a Transformer-based deep learning model that learns a mapping between a Markov state sequence and time series values. The synthetic data generated by the GenFormer model preserve the target marginal distributions and approximately capture other desired statistical properties, even in challenging applications involving a large number of spatial locations and a long simulation horizon. The GenFormer model is applied to simulate synthetic wind speed data at various stations in Florida to calculate exceedance probabilities for risk management.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Table 1. Illustrated mapping between the observed rainfall data and the corresponding Markov states for $ m=2 $ locations. Observations are mapped to the Markov states depending on which location experiences rainfall
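To make the mapping concrete, the following minimal Python sketch encodes binary rainfall indicators at $ m=2 $ locations as Markov states. The specific state numbering is an assumption for illustration and need not match the paper's coding.

```python
import numpy as np

def rainfall_to_markov_states(rain: np.ndarray) -> np.ndarray:
    """Map binary rainfall indicators at m = 2 locations to Markov states.

    `rain` has shape (n_times, 2); rain[j, k] = 1 if location k is wet
    at time t_j. The four wet/dry patterns are encoded as states 1..4
    (an assumed coding): (dry, dry) -> 1, (wet, dry) -> 2,
    (dry, wet) -> 3, (wet, wet) -> 4.
    """
    return 1 + rain[:, 0] + 2 * rain[:, 1]

rain = np.array([[0, 0], [1, 0], [1, 1], [0, 1]])
print(rainfall_to_markov_states(rain))  # [1 2 4 3]
```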

Table 2. Reshuffling of a hypothetical two-station example with five time stamps. (a) Samples of $ \tilde{\boldsymbol{x}}(t) $ at $ {t}_1,\dots, {t}_5 $ are generated from the resampling step described in Section 2.1.2, with $ {\tilde{\boldsymbol{r}}}_1 $ and $ {\tilde{\boldsymbol{r}}}_2 $ being the corresponding ranks; (b) Synthetic samples $ {\boldsymbol{z}}_1 $ and $ {\boldsymbol{z}}_2 $ are simulated from the marginal distribution at each location; (c) Samples are reshuffled according to the ranks $ {\tilde{\boldsymbol{r}}}_1 $ and $ {\tilde{\boldsymbol{r}}}_2 $
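The reshuffling step can be sketched in a few lines of Python: samples drawn from the marginal distribution are sorted and then reordered so that their ranks match those of the resampled series. The standard-normal marginal below is a placeholder assumption, shown for one station over five time stamps.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# (a) Resampled series at one station over five time stamps and its ranks.
x_tilde = np.array([0.8, 0.1, 0.5, 0.9, 0.3])
ranks = stats.rankdata(x_tilde).astype(int)   # rank 1 = smallest value

# (b) Synthetic samples drawn from the station's marginal distribution
#     (a standard normal is used here purely as a placeholder).
z = rng.standard_normal(5)

# (c) Reshuffle: the j-th reshuffled value is the z-sample whose rank
#     equals the rank of x_tilde at time t_j, so the temporal rank
#     pattern of x_tilde is imposed on samples from the target marginal.
z_sorted = np.sort(z)
reshuffled = z_sorted[ranks - 1]
```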

Figure 1. Deep learning model architecture based on the encoder-decoder framework. The model processes inputs through an embedding layer (red block), generating the hidden representation, which undergoes further updates in the encoder layers (gray block). The decoder (green block), in conjunction with a linear layer (purple block), utilizes the hidden representation from the encoder for generative inference, yielding the predicted sequence (highlighted in yellow).
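A minimal PyTorch sketch of such an encoder-decoder model is given below; the layer sizes and counts are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

class Seq2SeqSketch(nn.Module):
    """Minimal encoder-decoder Transformer: embed, encode, decode, project."""

    def __init__(self, n_features: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)    # embedding layer
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.head = nn.Linear(d_model, n_features)     # linear output layer

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        # Encoder consumes the embedded source; the decoder attends to the
        # encoder's hidden representation to produce the predicted sequence.
        h = self.transformer(self.embed(src), self.embed(tgt))
        return self.head(h)
```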

Table 3. Illustrated mapping of Markov states to a 3-variate process based on $ K $-means clustering. We have $ {y}_j=1 $ when $ {x}_1\left({t}_j\right) $, $ {x}_2\left({t}_j\right) $, $ {x}_3\left({t}_j\right) $ are near $ 12 $, and $ {y}_j=2 $ when $ {x}_1\left({t}_j\right) $, $ {x}_2\left({t}_j\right) $, $ {x}_3\left({t}_j\right) $ are near $ 2 $
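The $ K $-means-based state assignment can be sketched as follows. The synthetic two-cluster data mimic the caption's example with values near 12 and near 2; since cluster labels are arbitrary, the label-to-state correspondence here is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Toy 3-variate time series whose values cluster near 12 or near 2.
x = np.vstack([12 + rng.normal(0, 0.5, (50, 3)),
               2 + rng.normal(0, 0.5, (50, 3))])

# Cluster the m-variate observations; each cluster label (shifted to
# start at 1) serves as the Markov state y_j at time t_j.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(x)
y = km.labels_ + 1
```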

Figure 2. Transformer-based deep learning model with Markov state embedding. The proposed approach includes a Markov state embedding in addition to the value and time embedding present in the embedding layer. The remainder of the model architecture is the same as in Figure 1.
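A sketch of an embedding layer that combines value, time, and Markov state embeddings is given below. Summing the three embeddings is an assumption about how they are combined, and all dimensions and index sets are illustrative.

```python
import torch
import torch.nn as nn

class EmbeddingWithState(nn.Module):
    """Embedding layer combining value, time, and Markov state embeddings."""

    def __init__(self, n_features: int, n_states: int, n_times: int,
                 d_model: int = 64):
        super().__init__()
        self.value_embed = nn.Linear(n_features, d_model)   # time series values
        self.time_embed = nn.Embedding(n_times, d_model)    # discrete time index
        self.state_embed = nn.Embedding(n_states, d_model)  # Markov state

    def forward(self, values, times, states):
        # Sum the three embeddings (an assumed combination rule) to form
        # the hidden representation passed to the encoder/decoder blocks.
        return (self.value_embed(values)
                + self.time_embed(times)
                + self.state_embed(states))
```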

Figure 3. Deep learning model for Markov state sequence generation when the Markov order $ p\ge 2 $. We adopt a decoder-only structure without a cross-attention mechanism. The input to the model is the Markov states at the previous $ p $ time stamps, concatenated with a vector of length 1. This input is passed to an embedding layer and multiple decoder blocks. The softmax layer normalizes the weights over the Markov states into probabilities, from which the multinomial random variable generator draws synthetic Markov states.
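The generation loop described in the caption might look as follows in PyTorch, where `model` stands for a hypothetical decoder-only network mapping the previous $ p $ states to unnormalized weights over the possible next states.

```python
import torch

def generate_states(model, init_states, n_steps, p):
    """Autoregressively extend a Markov state sequence.

    `model` (hypothetical) maps the previous p states, shape (1, p),
    to unnormalized weights over possible next states, shape (1, n_states).
    """
    states = list(init_states)
    for _ in range(n_steps):
        context = torch.tensor(states[-p:]).unsqueeze(0)  # last p states
        probs = torch.softmax(model(context), dim=-1)     # softmax layer
        nxt = torch.multinomial(probs.squeeze(0), 1).item()
        states.append(nxt)                                # synthetic state
    return states
```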

Figure 4. Construction of input–output data pairs. For each sequence of realizations, we apply a sliding window of length $ {q}_{\mathrm{in}}^{\mathrm{enc}}+{q}_{\mathrm{out}} $ to the time series matrix $ \mathbf{\mathcal{X}} $ and the vectors $ \mathbf{\mathcal{Y}} $ and $ \mathbf{\mathcal{T}} $ of Markov state and time sequences. The first $ {q}_{\mathrm{in}}^{\mathrm{enc}} $ components of the window are inputs to the deep learning model while the subsequent $ {q}_{\mathrm{out}} $ components constitute the target output sequence for the model.
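In code, the sliding-window construction amounts to the following sketch, shown for the time series matrix alone (the Markov state and time sequences are windowed identically); variable names are illustrative, with `q_in` standing for $ {q}_{\mathrm{in}}^{\mathrm{enc}} $.

```python
import numpy as np

def make_pairs(X, q_in, q_out):
    """Slide a window of length q_in + q_out over the time axis of X.

    The first q_in components of each window are the encoder input and
    the remaining q_out components form the target output sequence.
    """
    inputs, targets = [], []
    for s in range(X.shape[0] - q_in - q_out + 1):
        inputs.append(X[s : s + q_in])
        targets.append(X[s + q_in : s + q_in + q_out])
    return np.stack(inputs), np.stack(targets)

X = np.arange(20).reshape(10, 2)    # 10 time stamps, m = 2 locations
enc_in, out = make_pairs(X, q_in=4, q_out=2)
print(enc_in.shape, out.shape)      # (5, 4, 2) (5, 2, 2)
```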

Table 4. Standard hyperparameters of the GenFormer model for tuning

Figure 5. Scatter plot of the normalized frequencies of Markov states in the observed and simulated sequences. Generating Markov state sequences by estimating the transition matrix from data is computationally challenging for large Markov order $ p $. This example shows that for large $ p $, the trained deep learning model for Markov state sequence generation can closely reproduce the frequencies of Markov states in the observed Markov state sequence data.
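For small $ p $, the transition matrix can indeed be estimated by counting, as in the sketch below; the row count of $ {n}_{\mathrm{states}}^p $ makes clear why direct estimation becomes infeasible as $ p $ grows.

```python
import numpy as np

def estimate_transition_matrix(states, n_states, p=1):
    """Estimate an order-p Markov transition matrix by counting.

    The matrix has n_states**p rows (one per length-p history) and
    n_states columns, which is why direct estimation is prohibitive
    for large p.
    """
    counts = np.zeros((n_states**p, n_states))
    for j in range(p, len(states)):
        # Encode the length-p history as a single row index.
        row = 0
        for s in states[j - p : j]:
            row = row * n_states + s
        counts[row, states[j]] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums,
                     out=np.zeros_like(counts), where=row_sums > 0)
```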

Table 5. Comparison of the Transformer-based deep learning model, the LSTM, and the MLP for Section 3.1.2. The Transformer-based model achieves the lowest loss value in this example

Figure 6. Target versus synthetic time series produced by the deep learning model for inference of $ m $-variate processes. The Transformer-based model produces accurate inference of the target based on the same Markov state sequence.

Figure 7. Target spatial correlation matrix of $ \boldsymbol{X}(t) $ (a), various approximations (b), (c), (d), and analytical spatial correlation matrix of $ \boldsymbol{V}(t) $ (e). In this example, the relative error of the estimate produced by the GenFormer model is 9 times smaller than that of the deep learning model alone without the post-processing steps. This highlights the need for the post-processing procedure as a supplement to the deep learning model to capture key statistical properties such as the spatial correlation matrix.

Figure 8. Auto-correlation functions of $ \boldsymbol{X}(t) $ and various approximations. The proposed GenFormer model adequately preserves the second-moment properties of the given realizations.
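Comparisons of this kind rest on the standard sample auto-correlation estimator, sketched below for a univariate series.

```python
import numpy as np

def acf(x, max_lag):
    """Sample auto-correlation function of a univariate series x."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    var = np.dot(x, x) / len(x)
    # Biased estimator: normalize lagged products by n * variance,
    # so acf(x, max_lag)[0] == 1.
    return np.array([np.dot(x[:len(x) - k], x[k:]) / (len(x) * var)
                     for k in range(max_lag + 1)])
```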

Figure 9. Marginal densities of $ \boldsymbol{V}(t) $ and various approximations. The reshuffling technique in the GenFormer model reduces the $ {L}_1 $ relative error by one order of magnitude in this example, because the reshuffling procedure samples directly from the target marginal distributions.

Figure 10. Exceedance probability of $ S(t) $. The relative error in the return period attained by the proposed GenFormer model is approximately an order of magnitude lower than those of the translation model, the MLP, and the LSTM. The GenFormer model can capture higher-order statistical properties of $ \boldsymbol{X}(t) $ beyond the second moment in this example.
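Exceedance probabilities of this kind are straightforward to estimate from synthetic realizations by Monte Carlo, as in the sketch below; the return period then follows as the reciprocal of the exceedance probability (per simulation period).

```python
import numpy as np

def exceedance_probability(samples, thresholds):
    """Monte Carlo estimate of P(S > s) from synthetic realizations."""
    samples = np.asarray(samples)
    return np.array([(samples > s).mean() for s in thresholds])

# Return period (in number of simulation periods) for a threshold s:
# T(s) = 1 / P(S > s), estimated from the synthetic samples.
```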

Table 6. Weather stations in Florida selected in this work

Figure 11. Scatter plot of the normalized frequencies of Markov states in the observed and simulated sequences. Estimating the transition matrix for large $ p $ is prohibitive since the matrix would have dimension $ {300}^{36}\times 300 $. In this example, the trained deep learning model for Markov state sequence generation offers a computationally feasible alternative that produces synthetic Markov state sequences whose state frequencies closely match the observed ones.

Table 7. Comparison of the Transformer-based deep learning model, the LSTM, and the MLP for Section 3.1.2. The Transformer architecture most effectively captures the temporal patterns in the time series data in this example

Figure 12. Target versus synthetic time series produced by the deep learning model for inference of transformed wind speed time series. Even though the time series data in this example are higher-dimensional and exhibit more volatility, autoregressive inference with the Transformer-based deep learning model still approximates the target time series effectively in modest computational time. The inferred time series does not diverge from the target despite the long simulation horizon.

Figure 13. Target spatial correlation matrix of collected wind speeds (a) and various approximations (b), (c), (d). The estimate of the spatial correlation matrix produced by the GenFormer model is 12 times more accurate than the estimate from the deep learning model without post-processing. In contrast, the transformation based on the Cholesky decomposition preserves the spatial correlation matrix irrespective of the number of locations $ m $.
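The Cholesky-based transformation mentioned in the caption can be sketched as follows: if the components of the input series are uncorrelated with unit variance, multiplying by the lower Cholesky factor of the target correlation matrix yields a process with exactly that correlation. The 3-location matrix below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target spatial correlation matrix for m = 3 locations (illustrative).
C = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.5],
              [0.3, 0.5, 1.0]])

# Uncorrelated series v(t): shape (n_times, m), unit variance per column.
v = rng.standard_normal((10_000, 3))

# Impose the target correlation via the lower Cholesky factor L:
# if Cov[v] = I, then x = v @ L.T has Cov[x] = L L^T = C, so the
# transformation preserves C irrespective of the number of locations m.
L = np.linalg.cholesky(C)
x = v @ L.T
```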

Figure 14. Auto-correlation functions of $ {X}_1(t),{X}_3(t),{X}_6(t) $ and various approximations. The trained deep learning model provides satisfactory approximations to the auto-correlation functions, with the post-processing procedure introducing only minimal, visually indiscernible deviations.

Figure 15. Marginal density estimates of $ {X}_1(t),{X}_3(t),{X}_6(t) $. The model post-processing procedure in the GenFormer model, specifically the reshuffling technique, reduces the $ {L}_1 $ relative error by a factor of 11. The deep learning model alone is unable to produce samples with accurate marginal distributions because the training procedure only penalizes the discrepancy in the inferred values.
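One plausible way to compute an $ {L}_1 $ relative error between marginal density estimates is sketched below; the paper's precise definition may differ.

```python
import numpy as np
from scipy.stats import gaussian_kde

def l1_relative_error(target_samples, synthetic_samples, grid):
    """L1 relative error between kernel density estimates on a 1-D grid."""
    f_target = gaussian_kde(target_samples)(grid)
    f_synth = gaussian_kde(synthetic_samples)(grid)
    # Integrate |f_target - f_synth| and normalize by the L1 norm of
    # the target density estimate.
    return (np.trapz(np.abs(f_target - f_synth), grid)
            / np.trapz(f_target, grid))
```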

Figure 16. Synthetic realizations of $ {X}_1(t),{X}_3(t),{X}_6(t) $ produced by the GenFormer model. The synthetic transformed wind speed records appear realistic and can therefore be used for downstream applications of interest.

Figure 17. Exceedance probability of $ S $. The estimate obtained from the GenFormer model is $ 7.6 $, $ 4.0 $, and $ 2.3 $ times more accurate than the estimates of the translation, MLP, and LSTM models, respectively. The predictive capabilities of the Transformer-based deep learning model, coupled with the statistical post-processing techniques, enable the GenFormer model to capture higher-order statistical properties.
