Sinh-arcsinh-normal distributions to add uncertainty to neural network regression tasks: Applications to tropical cyclone intensity forecasts

Published online by Cambridge University Press: 15 June 2023

Elizabeth A. Barnes*
Affiliation:
Department of Atmospheric Science, Colorado State University, Fort Collins, CO, USA
Randal J. Barnes
Affiliation:
Department of Civil, Environmental, and Geo-Engineering, University of Minnesota, Minneapolis, MN, USA
Mark DeMaria
Affiliation:
Cooperative Institute for Research in the Atmosphere, Colorado State University, Fort Collins, CO, USA
Corresponding author: Elizabeth A. Barnes; Email: eabarnes@colostate.edu

Abstract

A simple method for adding uncertainty to neural network regression tasks in earth science via estimation of a general probability distribution is described. Specifically, we highlight the sinh-arcsinh-normal distributions as particularly well suited for neural network uncertainty estimation. The methodology supports estimation of heteroscedastic, asymmetric uncertainties by a simple modification of the network output and loss function. Method performance is demonstrated by predicting tropical cyclone intensity forecast uncertainty and by comparison with two other common methods for neural network uncertainty quantification (i.e., Bayesian neural networks and Monte Carlo dropout). The simple approach described here is intuitive and applicable when no prior exists and one simply wishes to parameterize the output and its uncertainty according to some previously defined family of distributions. The authors believe it will become a powerful, go-to method moving forward.
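
The core of the method described in the abstract is to have the network output the parameters of a sinh-arcsinh-normal (SHASH) distribution and to train with the negative log-likelihood as the loss. As a rough illustration only (not the authors' code), here is a numpy/scipy sketch of the SHASH log-likelihood and CDF under the Jones and Pewsey (2009) parameterization; the parameter names `loc`, `scale`, `skewness`, and `tailweight` are assumptions for this sketch:

```python
import numpy as np
from scipy.stats import norm

def shash_nll(x, loc, scale, skewness, tailweight):
    """Negative log-likelihood of observations x under a sinh-arcsinh-normal
    distribution (Jones & Pewsey 2009 parameterization)."""
    y = (x - loc) / scale
    w = tailweight * np.arcsinh(y) - skewness   # transformed coordinate
    s = np.sinh(w)                              # standard-normal variate
    # Log pdf: change-of-variables Jacobian terms plus the base normal density.
    log_pdf = (np.log(tailweight) - np.log(scale)
               + np.log(np.cosh(w)) - 0.5 * np.log1p(y ** 2)
               + norm.logpdf(s))
    return -np.sum(log_pdf)

def shash_cdf(x, loc, scale, skewness, tailweight):
    """CDF: the normal CDF applied to the sinh-arcsinh transform of (x - loc)/scale."""
    y = (x - loc) / scale
    return norm.cdf(np.sinh(tailweight * np.arcsinh(y) - skewness))
```

With skewness 0 and tailweight 1, the distribution collapses to a normal with mean `loc` and standard deviation `scale`. In a training setup of the kind the paper describes, the distribution parameters would be the network outputs and `shash_nll` the loss averaged over a batch.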

Information

Type
Methods Paper
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press
Table 1. Summary of the 12 predictors used in the model.

Figure 1. (a) Neural network architecture and (b) example sinh-arcsinh-normal (SHASH) distributions for combinations of parameters.

Figure 2. Schematic showing how the loss function probability $ p_i $ depends on the predicted probability distribution.

Figure 3. (a) Predicted distributions for testing year 2020. The thick black line denotes the climatological Consensus error distribution across all training and validation samples. (b–e) Example predicted conditional distributions using the SHASH architecture, along with the Best Track verification and the Consensus forecast. This example is for the Eastern/Central Pacific at 48 hr lead time.

Figure 4. (a) Six example predictions to demonstrate the probability integral transform (PIT) calculation where colored shading denotes different percentile bins in increments of 10%. (b) Final PIT histogram computed for the validation and testing data for the network trained with 2020 as the leave-one-out year. The calibration deviation statistic, $ D $, is printed in gray; the expected calibration deviation for a perfectly calibrated forecast is given in parentheses. These examples are for the Eastern/Central Pacific at 48 hr lead time.
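
The PIT value for each sample is the predicted CDF evaluated at the observed outcome; a perfectly calibrated forecast yields PIT values that are uniform on [0, 1], i.e., a flat histogram. A minimal sketch of the histogram and a calibration-deviation statistic, taking $ D $ to be the root-mean-square departure of the bin frequencies from uniformity (an assumption about the exact definition used in the paper):

```python
import numpy as np

def pit_histogram(pit_values, n_bins=10):
    """Bin PIT values into n_bins equal-width bins on [0, 1] and compute a
    calibration deviation D: the RMS departure of bin frequencies from 1/n_bins."""
    counts, _ = np.histogram(pit_values, bins=n_bins, range=(0.0, 1.0))
    freqs = counts / counts.sum()
    D = np.sqrt(np.mean((freqs - 1.0 / n_bins) ** 2))
    return freqs, D
```

Exactly uniform PIT values give D = 0; over- or under-dispersed forecasts pile mass into the center or tails of the histogram and inflate D.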

Figure 5. (a) Neural network (NN) mean absolute error versus the network-predicted inter-quartile range (IQR). The error is defined as the median of the predicted SHASH distribution minus the Best Track verification. (b) As in (a), but for the Consensus error versus the network-predicted IQR. For both panels, the statistics are computed over the validation and training sets to increase the sample size. This example is for the Eastern/Central Pacific at 48 hr lead time.
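
The inter-quartile range used as the spread measure here follows directly from the SHASH quantile function, which inverts the sinh-arcsinh transform of a standard normal. A minimal numpy sketch, again assuming the Jones and Pewsey (2009) parameterization with `loc`, `scale`, `skewness`, and `tailweight` parameters:

```python
import numpy as np
from scipy.stats import norm

def shash_quantile(p, loc, scale, skewness, tailweight):
    """Quantile function: map the normal quantile through the inverse
    sinh-arcsinh transform."""
    z = norm.ppf(p)
    return loc + scale * np.sinh((np.arcsinh(z) + skewness) / tailweight)

def shash_iqr(loc, scale, skewness, tailweight):
    """Inter-quartile range of the predicted SHASH distribution."""
    return (shash_quantile(0.75, loc, scale, skewness, tailweight)
            - shash_quantile(0.25, loc, scale, skewness, tailweight))
```

With skewness 0 and tailweight 1, this reduces to the normal IQR, roughly 1.35 times `scale`; nonzero skewness and tailweight widen or shift the quartiles asymmetrically.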

Figure 6. SHASH performance metrics across a range of basins and lead times for each of the nine leave-one-out testing years (denoted by dots). The x-axis label convention is such that AL denotes the Atlantic basin and EPCP denotes the combined Eastern and Central Pacific basins. Numbers denote the forecast lead time in hours. Panel (a) shows testing results only, while panels (b)–(d) show results over validation and testing sets to increase the sample size for the computed statistics. See the text for details on the calculation of each metric.

Figure 7. (a) Example case study demonstrating the utility of the SHASH for predicting the probability of rapid intensification (Pr(RI); pink shading). The pink curve denotes the predicted conditional distribution by the SHASH at 48-hr lead time for Hurricane Michael on October 8, 2018. The gray vertical line denotes the storm intensity at the time of the forecast and the vertical dashed line denotes the Best Track verification. (b) Predicted probability of rapid intensification at various lead times for all East/Central Pacific and Atlantic storms for all nine leave-one-out testing years. The gray box plot denotes predicted probabilities for storms that did not undergo rapid intensification, while the pink box plot denotes predicted probabilities for storms that did. (c) As in (b), but the precision-recall curves for RI as a function of lead time. The dashed black line denotes the baseline precision in the case of no skill.
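
Because the network predicts a full conditional distribution, the probability of rapid intensification follows directly as one minus the predicted CDF at the RI threshold. A hedged sketch of that exceedance calculation; the 30 kt threshold and the example parameter values below are purely illustrative (the conventional 24-hr RI definition), not the paper's exact per-lead-time thresholds:

```python
import numpy as np
from scipy.stats import norm

def prob_exceed(threshold, loc, scale, skewness, tailweight):
    """Pr(X > threshold) under a SHASH(loc, scale, skewness, tailweight)
    distribution: one minus the CDF at the threshold."""
    y = (threshold - loc) / scale
    return 1.0 - norm.cdf(np.sinh(tailweight * np.arcsinh(y) - skewness))

# Illustrative only: probability that the predicted intensity change
# exceeds a 30 kt rapid-intensification threshold (made-up parameters).
p_ri = prob_exceed(30.0, loc=18.0, scale=12.0, skewness=0.4, tailweight=1.0)
```

The pink shading in panel (a) corresponds to exactly this tail integral of the predicted distribution beyond the RI threshold.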

Figure 8. (a) Neural network gradient for the 48-hr SHASH prediction for Hurricane Michael on October 8, 2018. The gradient describes how small increases in each input predictor would have changed the prediction of each of the three SHASH parameters (the tailweight $ \tau $ is fixed to 1.0). (b) As in (a), but for Hurricane Marie on September 30, 2020.

Figure 9. Example conditional distributions predicted by four neural network probabilistic methods. Vertical dashed line denotes the Best Track verification.

Figure 10. As in Figure 6, but for the SHASH, BNN, and MC-Dropout approaches for Atlantic (AL) and Pacific (EPCP) forecasts at 48-hr lead times for all nine leave-one-out testing years.