Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet

Published online by Cambridge University Press: 03 January 2025

Harish Baki*
Affiliation:
Faculty of Civil Engineering and Geosciences, TU Delft, Delft, the Netherlands
Sukanta Basu
Affiliation:
Atmospheric Sciences Research Center, University at Albany, Albany, NY, USA; Department of Environmental and Sustainable Engineering, University at Albany, Albany, NY, USA
*Corresponding author: Harish Baki; Email: h.baki@tudelft.nl

Abstract

The growing demand for global wind power production, driven by the critical need for sustainable energy sources, requires reliable estimation of wind speed vertical profiles for accurate wind power prediction and comprehensive wind turbine performance assessment. Traditional methods that rely on empirical equations or similarity theory face challenges because their applicability is restricted beyond the surface layer. Although recent studies have used various machine learning techniques to extrapolate wind speeds vertically, they often focus on single levels and lack a holistic approach to predicting entire wind profiles. As an alternative, this study introduces a proof-of-concept methodology that uses TabNet, an attention-based sequential deep learning model, to estimate wind speed vertical profiles from coarse-resolution meteorological features extracted from a reanalysis dataset. To ensure that the methodology is applicable across diverse datasets, Chebyshev polynomial approximation is employed to model the wind profiles. Trained with the meteorological features as inputs and the Chebyshev coefficients as targets, TabNet predicts unseen wind profiles with reasonable accuracy for different wind conditions, such as high shear, low shear/well-mixed, low-level jet, and high wind. Additionally, this methodology quantifies the correlation of wind profiles with prevailing atmospheric conditions through a systematic feature importance assessment.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Table 1. Description of the meteorological variables adopted from the ERA5 reanalysis

Figure 1. Column 1: an illustration of fourth-order Chebyshev polynomials plotted against the normalized height $ z\in \left[-1,1\right] $. The remaining columns display the vertical profiles of wind speed from CERRA alongside those approximated by Chebyshev polynomials, for four well-known categories of wind regimes: high shear, low shear/well-mixed, low-level jets (LLJ), and high wind.
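
The approximation described in this caption can be sketched with NumPy's Chebyshev routines. This is a minimal illustration, not the authors' implementation: the linear min-max height normalization, the function names `fit_profile` and `reconstruct_profile`, and the synthetic power-law profile are all assumptions made here for demonstration. The only details taken from the article are the fourth-order fit, the five coefficients $ {C}_0 $ to $ {C}_4 $, and the normalized height range of -1 to 1.

```python
import numpy as np
from numpy.polynomial import chebyshev as C


def fit_profile(heights, speeds, order=4):
    """Least-squares Chebyshev fit of a wind profile on normalized height.

    Heights are mapped linearly onto [-1, 1] (an assumed normalization).
    Returns the order + 1 coefficients (C0..C4 for order=4) and the
    normalized heights z.
    """
    heights = np.asarray(heights, dtype=float)
    speeds = np.asarray(speeds, dtype=float)
    span = heights.max() - heights.min()
    z = 2.0 * (heights - heights.min()) / span - 1.0
    coeffs = C.chebfit(z, speeds, order)
    return coeffs, z


def reconstruct_profile(coeffs, z):
    """Evaluate the Chebyshev series with the given coefficients at z."""
    return C.chebval(z, coeffs)


# Synthetic power-law profile (hypothetical values, not CERRA data).
heights = np.array([10.0, 50.0, 100.0, 150.0, 200.0, 300.0, 400.0, 500.0])
speeds = 8.0 * (heights / 100.0) ** 0.14

coeffs, z = fit_profile(heights, speeds)
recon = reconstruct_profile(coeffs, z)
rmse = float(np.sqrt(np.mean((recon - speeds) ** 2)))
print("coefficients:", np.round(coeffs, 3))
print("reconstruction RMSE (m/s):", round(rmse, 4))
```

Because the profile is reduced to five numbers, a regression model only has to predict those coefficients; the full profile can then be evaluated at any set of heights within the fitted range.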

Figure 2. (a) Flowchart of the experimental setup used in this study to train the TabNet. (b) Our strategy for splitting the entire dataset into train, validation, and test sets. (c) Loss curves of one of the trained models, in which the train and validation RMSE values are plotted against the training epochs.

Figure 3. First row: a comparison of Chebyshev coefficients ($ {C}_0,{C}_1,{C}_2,{C}_3 $ and $ {C}_4 $) between the test data and the model predictions using bivariate histograms. The probability of occurrence is represented on a log scale with the color increasing from dark (low probability) to light (high probability). The evaluation scores, namely MAE, $ {R}^2 $, and RMSE for each coefficient are provided in the text boxes. Second row: the combined feature importance of input variables based on the test data.

Figure 4. A comparison of vertical profiles of wind speed from CERRA and the 10 ML model predictions, for four instances of test data for the selected wind regimes. The blue line represents the 50th percentile of the ensemble, the darker shade represents the ensemble between the 25th and 75th percentiles, and the lighter shade represents the ensemble between the 10th and 90th percentiles. The wind speeds from ERA5 at 10 m ($ {\mathbf{W}}_{10} $) and 100 m ($ {\mathbf{W}}_{100} $) are illustrated using green diamonds. The evaluation scores, RMSE and MAPE, computed between the CERRA profile and the median profile for each wind regime, are provided in the text boxes.
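
The ensemble summary in this caption can be sketched as follows. This is an illustrative example under stated assumptions: the function names, the synthetic reference profile, and the Gaussian noise standing in for the spread of the 10 models are hypothetical, and MAPE is assumed here to mean the mean absolute percentage error relative to the reference profile.

```python
import numpy as np


def ensemble_bands(profiles):
    """Percentile profiles across an ensemble.

    profiles has shape (n_models, n_levels); percentiles are taken over
    the model axis, giving one value per height level.
    """
    profiles = np.asarray(profiles, dtype=float)
    p10, p25, p50, p75, p90 = np.percentile(
        profiles, [10, 25, 50, 75, 90], axis=0
    )
    return {"p10": p10, "p25": p25, "p50": p50, "p75": p75, "p90": p90}


def rmse(reference, predicted):
    """Root-mean-square error between two profiles."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - reference) ** 2)))


def mape(reference, predicted):
    """Mean absolute percentage error relative to the reference (assumed definition)."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(100.0 * np.mean(np.abs((predicted - reference) / reference)))


# Synthetic stand-ins: a reference profile and 10 noisy "model" profiles.
rng = np.random.default_rng(0)
reference = np.linspace(5.0, 12.0, 8)
ensemble = reference + rng.normal(0.0, 0.5, size=(10, 8))

bands = ensemble_bands(ensemble)
print("RMSE of median profile (m/s):", round(rmse(reference, bands["p50"]), 3))
print("MAPE of median profile (%):", round(mape(reference, bands["p50"]), 2))
```

Plotting `bands["p50"]` as a line and shading between `p25`/`p75` and `p10`/`p90` would reproduce the layout the caption describes.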

Figure 5. Same as Figure 4, but for a different set of time instances of the test data for the selected wind regimes.

Author comment: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR1

Comments

To:

The Editor-in-Chief,

Environmental Data Science,

Cambridge University Press.

RE: Submission of a research manuscript

Dear Prof. Monteleoni,

We would like to submit a research manuscript entitled “Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet” to be considered for publication in “Environmental Data Science”, as part of the Climate Informatics 2024 Proceedings. We confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication anywhere else.

We believe that the findings of our manuscript would have substantial standing in the field of sustainable wind resource modeling, specifically for wind power prediction and comprehensive wind turbine performance assessment. In this study, we introduced a proof-of-concept methodology utilizing TabNet, an attention-based sequential deep learning model, to estimate vertical wind profiles from coarse-resolution meteorological features extracted from a reanalysis dataset. The methodology has been designed to be applicable across diverse datasets through the use of Chebyshev polynomial approximation. To mimic the measure-correlate-predict (MCP) approach, the TabNet model is trained on one year of data and predictions are obtained for a different year. The model more-or-less accurately predicts unseen wind profiles for different wind conditions, such as high shear, low shear/well-mixed, low-level jet, and high wind. Our overall methodology will also be helpful for studies focusing on quantifying the correlation of wind profiles with prevailing atmospheric conditions through a systematic feature importance assessment. We are sure that this communication will be of interest to the broad readership of “Environmental Data Science.”

Best Regards,

Harish Baki.

On behalf of the manuscript authors.

Review: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

>Summary: In this section please explain in your own words what problem the paper addresses and what it contributes to solving it.

The paper presents a proof-of-concept method for estimating wind profiles with a deep learning model, TabNet, which has potential for wind energy applications in climate mitigation. The paper shows the utility of deep learning for wind profile estimation, with relatively good generalizability, which is important for applications.

>Relevance and Impact: Is this paper a significant contribution to interdisciplinary climate informatics?

The paper demonstrates the value of deep learning for climate mitigation applications and addresses the gap in existing methods for estimating wind speed for the renewable energy industry.

>Detailed Comments

The paper is overall well written, with a clear description of the methods and well-designed experiments. It is among the top application papers/submissions that I have reviewed in this year’s conference. The results are carefully analyzed and presented to demonstrate the strengths and limitations of the proof-of-concept model.

To help improve the paper, I have a few minor comments:

1. I suggest that the authors explain the variables used as model inputs, if space allows.

2. In equation 3, please define the variable “x”.

3. The target of the model is the set of coefficients used to approximate the profile, so Figure 3 compares the estimated and reference coefficients. However, it would also be valuable to compare the estimated and reference wind profiles using quantitative metrics, in a more comprehensive way than the selected examples in Figures 4 and 5.

4. Currently, the authors use ~30 variables from ERA5 to estimate the coefficients. Are all of those variables necessary? Judging from the variable importance, the answer seems to be no. The authors should perhaps consider variable selection before feeding the variables in as input features, which may improve model explainability and reduce the sensitivity of the model to errors in the input features (e.g., Figure 5).

Recommendation: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR3

Comments

This article was accepted into the Climate Informatics 2024 Conference after the authors addressed the comments in the reviews provided. It has been accepted for publication in Environmental Data Science on the strength of the Climate Informatics Review Process.

Decision: Estimating high-resolution profiles of wind speeds from a global reanalysis dataset using TabNet — R0/PR4

Comments

No accompanying comment.