
Principal component density estimation for scenario generation using normalizing flows

Published online by Cambridge University Press:  25 March 2022

Eike Cramer
Affiliation:
Institute of Energy and Climate Research—Energy Systems Engineering (IEK-10), Forschungszentrum Jülich GmbH, Jülich 52425, Germany RWTH Aachen University, Aachen 52062, Germany
Alexander Mitsos
Affiliation:
Institute of Energy and Climate Research—Energy Systems Engineering (IEK-10), Forschungszentrum Jülich GmbH, Jülich 52425, Germany JARA Center for Simulation and Data Sciences, Jülich 52425, Germany Process Systems Engineering (AVT.SVT), RWTH Aachen University, Aachen 52074, Germany
Raúl Tempone
Affiliation:
Chair of Uncertainty Quantification, RWTH Aachen University, Aachen 52062, Germany Computer, Electrical, and Mathematical Sciences & Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
Manuel Dahmen*
Affiliation:
Institute of Energy and Climate Research—Energy Systems Engineering (IEK-10), Forschungszentrum Jülich GmbH, Jülich 52425, Germany
*
*Corresponding author. E-mail: m.dahmen@fz-juelich.de

Abstract

Neural network-based learning of the distribution of non-dispatchable renewable electricity generation from sources such as photovoltaics (PV) and wind, as well as of load demands, has recently gained attention. Normalizing flow density models are particularly well suited for this task because they are trained through direct log-likelihood maximization. However, research from the field of image generation has shown that standard normalizing flows can only learn smeared-out versions of manifold distributions. Previous works on normalizing flow-based scenario generation do not address this issue, and the smeared-out distributions result in the sampling of noisy time series. In this paper, we exploit the isometry of the principal component analysis (PCA), which sets up the normalizing flow in a lower-dimensional space while maintaining direct and computationally efficient likelihood maximization. We train the resulting principal component flow (PCF) on data of PV and wind power generation as well as load demand in Germany in the years 2013–2015. The results show that the PCF preserves critical features of the original distributions, such as the probability density and the frequency behavior of the time series. The application of the PCF is, however, not limited to renewable power generation but extends to any dataset, time series or otherwise, that can be efficiently reduced using PCA.

Information

Type
Research Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Real non-volume preserving transformation (Dinh et al., 2017) with two coupling layers with alternating identity and affine transformations. Arrows point in the generative direction. Compositions may include more than two coupling layers. Functions $ {\mathbf{s}}_{\boldsymbol{\theta}}^I\left({\mathbf{z}}_{1:d}\right) $, $ {\mathbf{t}}_{\boldsymbol{\theta}}^I\left({\mathbf{z}}_{1:d}\right) $, $ {\mathbf{s}}_{\boldsymbol{\theta}}^{II}\left({\mathbf{z}}_{d+1:D}^I\right) $, and $ {\mathbf{t}}_{\boldsymbol{\theta}}^{II}\left({\mathbf{z}}_{d+1:D}^I\right) $ are trainable artificial neural networks with parameters θ.
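The coupling layers in Figure 1 can be sketched as follows. This is an illustrative numpy sketch of the RealNVP affine coupling transform, not the authors' implementation; the lambdas stand in for the trainable networks $s_{\theta}$ and $t_{\theta}$, and alternating layers would swap which half is transformed:

```python
import numpy as np

def coupling_forward(z, s_net, t_net):
    # Identity on the first half; affine transform of the second half,
    # conditioned on the first half (one coupling layer, generative direction).
    d = z.shape[-1] // 2
    z1, z2 = z[..., :d], z[..., d:]
    x2 = z2 * np.exp(s_net(z1)) + t_net(z1)
    log_det = np.sum(s_net(z1), axis=-1)   # log|det Jacobian| of the layer
    return np.concatenate([z1, x2], axis=-1), log_det

def coupling_inverse(x, s_net, t_net):
    # Exact inverse: the conditioning half passes through unchanged,
    # so s and t can be recomputed from it.
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    z2 = (x2 - t_net(x1)) * np.exp(-s_net(x1))
    return np.concatenate([x1, z2], axis=-1)

# Toy stand-ins for the trainable networks s_theta and t_theta.
s = lambda h: np.tanh(h)
t = lambda h: 0.5 * h

z = np.random.default_rng(1).normal(size=(4, 6))
x, log_det = coupling_forward(z, s, t)
assert np.allclose(coupling_inverse(x, s, t), z)   # invertible by construction
```

The triangular structure is what makes both the inverse and the log-determinant cheap, which in turn enables the direct log-likelihood training mentioned in the abstract.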


Figure 2. Real non-volume preserving transformation (RealNVP; Dinh et al., 2017) trained on 1D manifold in 2D space (x1,x2). Left: samples of 2D Gaussian (blue) and training data after transformation to Gaussian (orange). Center: samples from trained RealNVP (blue) and true (orange) data distribution. Right: training and validation loss over number of epochs.


Figure 3. Real non-volume preserving transformation (RealNVP; Dinh et al., 2017) trained on 2D kite-shaped distribution in 2D space (x1,x2). Left: samples of 2D Gaussian (blue) and training data after transformation to Gaussian (orange). Center: samples from trained RealNVP (blue) and true (orange) data distribution. Right: training and validation loss over number of epochs.


Figure 4. Principal component flow structure with principal component analysis layer as last layer in generative direction and real non-volume preserving transformation (see Figure 1) as trainable normalizing flow in lower-dimensional space.


Table 1. Number of principal components for cumulative explained variance (CEV). Data with 15-min resolution (96 dimensions) from Open Power Systems Data (2019).
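Component counts like those in Table 1 follow from thresholding the cumulative explained variance. A minimal numpy sketch on synthetic stand-in data (the real inputs would be the 96-dimensional profiles from the power-system dataset):

```python
import numpy as np

rng = np.random.default_rng(2)
# Stand-in for 96-dimensional time-series data with low intrinsic dimension.
X = rng.normal(size=(1000, 12)) @ rng.normal(size=(12, 96))

# Eigenvalues of the covariance are proportional to the squared singular values.
_, S, _ = np.linalg.svd(X - X.mean(axis=0), full_matrices=False)
explained = S**2 / np.sum(S**2)
cev = np.cumsum(explained)            # cumulative explained variance (CEV)

def n_components(cev, threshold):
    """Smallest number of principal components reaching the CEV threshold."""
    return int(np.searchsorted(cev, threshold) + 1)

for thr in (0.9, 0.99, 0.999):
    print(thr, n_components(cev, thr))
```

Since the synthetic data is built from 12 latent factors, the reported counts saturate at 12 regardless of the threshold; on real data the counts grow with the threshold, as in Table 1.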


Figure 5. Comparison of the probability density function from kernel density estimation (Parzen, 1962). Top: historical data (target, solid lines), generated data from full-space normalizing flow (dashed lines), and generated data from principal component flow (dotted lines and dash-dotted lines). Bottom: historical data (target, solid lines), generated data from the Copula (dotted lines), and generated data from Wasserstein generative adversarial network (dash-dotted lines). Left: photovoltaic capacity factor. Center: wind capacity factor. Right: demand factor.
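Curves like those in Figure 5 come from Parzen-window (kernel) density estimation of the marginal values. A hedged sketch with `scipy.stats.gaussian_kde` on synthetic stand-in samples (the paper's data are historical and generated capacity/demand factors):

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(3)
# Stand-ins for flattened historical and generated capacity factors in [0, 1].
historical = rng.beta(2, 5, size=2000)
generated = rng.beta(2, 5, size=2000)

grid = np.linspace(0.0, 1.0, 200)
pdf_hist = gaussian_kde(historical)(grid)   # Parzen-window density estimate
pdf_gen = gaussian_kde(generated)(grid)

# Both estimates integrate to roughly 1 over the support (small boundary leakage).
dx = grid[1] - grid[0]
print(float(np.sum(pdf_hist) * dx), float(np.sum(pdf_gen) * dx))
```

Overlaying `pdf_hist` and `pdf_gen` on a common grid is what allows the visual comparison of target and generated marginal densities in the figure.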


Table 2. p-values (∈ [0, 1]) of the Kolmogorov–Smirnov test (Hodges, 1958): statistical comparison of historical and generated data. High p-values (≥ .1) indicate a high probability that the generated scenarios follow the same distribution as the historical scenarios. Results indicating a good match of the distributions are marked with an asterisk (*).
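The comparison in Table 2 uses the two-sample Kolmogorov–Smirnov test. A small sketch with `scipy.stats.ks_2samp` on synthetic stand-in samples (not the paper's data), contrasting a matching distribution with a shifted one:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(4)
historical = rng.normal(0.3, 0.1, size=1000)
same_dist = rng.normal(0.3, 0.1, size=1000)   # drawn from the same distribution
shifted = rng.normal(0.4, 0.1, size=1000)     # mean shifted by one std

# Two-sample KS test; the table flags p >= .1 as indicating a good match.
p_same = ks_2samp(historical, same_dist).pvalue
p_shift = ks_2samp(historical, shifted).pvalue
print(p_same, p_shift)   # p_shift is tiny; p_same is typically large
```

A high p-value only means the test fails to distinguish the samples, so it supports, but does not prove, that the generator matches the historical distribution.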


Figure 6. Power spectral density with Welch transform (Welch, 1967). Top: historical data (target, solid lines), generated data from full-space normalizing flow (dashed lines), and generated data from PCF (dotted lines and dash-dotted lines). Bottom: historical data (target, solid lines), generated data from the Copula (dotted lines), and generated data from Wasserstein generative adversarial network (dash-dotted lines). Left: photovoltaic capacity factor. Center: wind capacity factor. Right: demand factor.
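The frequency comparison in Figure 6 relies on Welch's power spectral density estimate. A hedged sketch with `scipy.signal.welch` on a synthetic daily-cycle signal standing in for a capacity-factor time series (96 samples per day at 15-min resolution):

```python
import numpy as np
from scipy.signal import welch

rng = np.random.default_rng(5)
fs = 96                      # samples per day at 15-min resolution
t = np.arange(14 * fs) / fs  # two weeks of data, time measured in days
# Stand-in signal: daily cycle plus noise, loosely mimicking a PV capacity factor.
x = 0.3 + 0.2 * np.sin(2 * np.pi * t) + 0.05 * rng.normal(size=t.size)

f, Pxx = welch(x, fs=fs, nperseg=4 * fs)   # frequencies in cycles per day
peak = f[np.argmax(Pxx[1:]) + 1]           # skip the DC bin
print(float(peak))                         # dominant frequency near 1 cycle/day
```

Matching the Welch spectra of historical and generated scenarios checks that the generator reproduces the characteristic periodicities, not only the marginal densities.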


Figure 7. Five photovoltaic (PV) capacity factor scenarios from target, FSNF, PCF16, PCF62, Copula, and W-GAN, respectively. Time frame between midnight and 4 am. Abbreviations: FSNF, full-space normalizing flow; PCF, principal component flow; W-GAN, Wasserstein generative adversarial network.


Table 3. Marginal mean and variance values for photovoltaic scenarios between midnight and 4 am. Historical scenarios in comparison to FSNF-, PCF16-, and PCF62-generated scenarios as well as Copula- and W-GAN-generated scenarios.
