
Evaluating probabilistic forecasts for maritime engineering operations

Published online by Cambridge University Press:  09 June 2023

Lachlan Astfalck*
Affiliation:
School of Physics, Mathematics and Computing, The University of Western Australia, Crawley, Western Australia, Australia Oceans Graduate School, The University of Western Australia, Crawley, Western Australia, Australia
Michael Bertolacci
Affiliation:
School of Mathematics and Applied Statistics, University of Wollongong, Wollongong, New South Wales, Australia
Edward Cripps
Affiliation:
School of Physics, Mathematics and Computing, The University of Western Australia, Crawley, Western Australia, Australia
*
Corresponding author: Lachlan Astfalck; Email: lachlan.astfalck@uwa.edu.au

Abstract

Maritime engineering relies on model forecasts for many different processes, including meteorological and oceanographic forcings, structural responses, and energy demands. Understanding the performance and evaluation of such forecasting models is crucial in instilling reliability in maritime operations. Evaluation metrics that assess the point accuracy of the forecast (such as root-mean-squared error) are commonplace, but with the increased uptake of probabilistic forecasting methods, such metrics may not consider the full forecast distribution. The statistical theory of proper scoring rules provides a framework in which to score and compare competing probabilistic forecasts, but it is seldom appealed to in applications. This translational paper presents the underlying theory and principles of proper scoring rules, develops a simple panel of rules that may be used to robustly evaluate the performance of competing probabilistic forecasts, and demonstrates this with an application to forecasting surface winds at an asset on Australia’s North–West Shelf. Where appropriate, we relate the statistical theory to common requirements of the maritime engineering industry. The case study is from a body of work that was undertaken to quantify the value resulting from an operational forecasting product and is a clear demonstration of the downstream impacts that statistical and data science methods can have in maritime engineering operations.

Information

Type
Translational paper
Creative Commons
Creative Commons License - CC-BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open materials
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Figure 1. Graphical interpretation of the CRPS. The distance in the integral in Equation (9) is represented by the shaded area. Forecast $ {F}_t(z) $ is a CDF-valued quantity.
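The CRPS depicted in Figure 1 compares the forecast CDF $ {F}_t(z) $ against the step function of the observation. A standard identity expresses it in kernel (energy) form, $ \mathrm{CRPS}(F,y)=\mathbb{E}|X-y|-\tfrac{1}{2}\mathbb{E}|X-X^{\prime}| $, which can be estimated directly from forecast samples. The sketch below is illustrative only and is not taken from the paper; `crps_ensemble` is a hypothetical helper, and the Gaussian ensemble is synthetic.

```python
import numpy as np

def crps_ensemble(samples, y):
    """Estimate the CRPS of an ensemble forecast at observation y using the
    kernel form: E|X - y| - 0.5 * E|X - X'|."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

rng = np.random.default_rng(0)
ens = rng.normal(loc=0.0, scale=1.0, size=2000)  # N(0, 1) forecast ensemble

# For a standard Gaussian forecast and y = 0, the exact CRPS is
# (sqrt(2) - 1) / sqrt(pi) ~= 0.2337; the estimate should be close.
print(crps_ensemble(ens, 0.0))
```

Because the score grows as the observation moves away from the forecast mass, `crps_ensemble(ens, 3.0)` is larger than `crps_ensemble(ens, 0.0)`, consistent with lower scores indicating better forecasts.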


Figure 2. Browse Basin within Australia’s North–West Shelf. Measurements are taken from the site labeled TWC260.


Figure 3. Wind northing and easting measurements from July 4th, 2018 to July 11th, 2018. The red dashed line denotes the time, $ t $, at which the forecast is made, the thick blue line is the deterministic forecast from the physics-based model, the black solid line shows the observed measurements, and the shaded black line shows the as-yet-unobserved measurements. This forecast corresponds to the NWP model used in evaluating the scoring rules in Figures 8 and 9.


Figure 4. 2D histogram of surface wind easting and northing. The data are hourly measurements spanning July 17th, 2018 to July 19th, 2019.


Figure 5. ACF and pACF plots of wind easting and northing components.
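The ACF summarized in Figure 5 can be computed from the raw series itself. A minimal sketch is below; `acf` is a hypothetical helper (packages such as statsmodels provide equivalents), and the AR(1)-style series is synthetic, not the paper's wind data.

```python
import numpy as np

def acf(x, max_lag):
    """Empirical autocorrelation of series x at lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[: len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Synthetic AR(1) series with coefficient 0.8: strong short-lag correlation.
rng = np.random.default_rng(1)
e = rng.normal(size=500)
x = np.empty(500)
x[0] = e[0]
for t in range(1, 500):
    x[t] = 0.8 * x[t - 1] + e[t]

print(acf(x, 5))  # lag-0 is exactly 1; later lags decay geometrically
```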


Figure 6. Example forecast up to a 120-hr prediction horizon from the fitted MOS model. The red dashed line denotes the time at which the forecast has been made, the solid blue line is the mean forecast, the shaded region denotes the 80% prediction interval, the dashed blue line is the forecast from the numerical model, and the black line shows the measurements subsequently observed. This forecast corresponds to the MOS model used in evaluating the scoring rules in Figures 8 and 9.
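At its simplest, an MOS-style correction regresses observations on the raw numerical forecast and uses the residual spread to form a probabilistic forecast. The sketch below is a minimal illustration on synthetic data; the coefficients, noise level, and Gaussian error assumption are ours, not the paper's fitted MOS model.

```python
import numpy as np

rng = np.random.default_rng(2)
nwp = rng.normal(5.0, 2.0, size=300)               # raw NWP forecasts
obs = 1.5 + 0.8 * nwp + rng.normal(0, 0.5, 300)    # biased, noisy observations

# Least-squares fit of obs = beta0 + beta1 * nwp.
X = np.column_stack([np.ones_like(nwp), nwp])
beta, *_ = np.linalg.lstsq(X, obs, rcond=None)
resid_sd = np.std(obs - X @ beta, ddof=2)          # residual spread

# Corrected probabilistic forecast for a new NWP value: Gaussian with
# mean beta0 + beta1 * nwp_new and standard deviation resid_sd.
nwp_new = 6.0
mean_new = beta[0] + beta[1] * nwp_new
print(mean_new, resid_sd)
```

The fitted intercept and slope should recover the synthetic bias (1.5 and 0.8), and `resid_sd` the noise level (0.5), up to sampling error.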


Figure 7. Example forecast up to a 48-hr prediction horizon from the fitted VAR model. The red dashed line denotes the time at which the forecast has been made, the solid blue line is the mean forecast, the shaded region denotes the 80% prediction interval, the dashed blue line is the forecast from the numerical model, and the black line shows the measurements subsequently observed. This forecast corresponds to the VAR model used in evaluating the scoring rules in Figures 8 and 9.
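A VAR(1) model couples the easting and northing components through a single transition matrix, and multi-step mean forecasts follow by iterating the recursion $ x_{t+1} = c + A x_t $. The sketch below uses illustrative coefficients chosen to be stable (eigenvalues of $A$ inside the unit circle); they are not the paper's fitted values.

```python
import numpy as np

# Illustrative (not fitted) VAR(1) coefficients for (easting, northing).
A = np.array([[0.90, 0.10],
              [-0.05, 0.85]])   # transition matrix
c = np.array([0.2, -0.1])       # intercept
x_t = np.array([3.0, -1.5])     # current state

# Iterate x_{t+1} = c + A x_t to get h-step-ahead mean forecasts.
h = 48
forecasts = []
x = x_t
for _ in range(h):
    x = c + A @ x
    forecasts.append(x)
forecasts = np.array(forecasts)

print(forecasts[0])    # one-step-ahead mean
print(forecasts[-1])   # approaches the stationary mean (I - A)^{-1} c
```

Because the eigenvalues of $A$ lie inside the unit circle, the forecast mean relaxes toward the stationary mean $(I-A)^{-1}c = (1, -1)$ as the horizon grows, mirroring how the VAR forecast in Figure 7 reverts over long lead times.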


Figure 8. Calculated scores from the numeric, MOS, and VAR models over the 0- to 48-hr prediction horizons. All scores are defined so that lower scores indicate better model performance. For all scores, the VAR model performs best for the first three prediction horizons, after which the MOS model performs best. The abbreviation NWP (numerical weather prediction) refers to the deterministic numeric forecast model.


Figure 9. Probability of best score from the numeric, MOS, and VAR models over the 0- to 48-hr prediction horizons. For all scores, the VAR model has the highest probability of scoring best for the first three prediction horizons, after which the MOS model performs best. The abbreviation NWP (numerical weather prediction) refers to the deterministic numeric forecast model.
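The "probability of best score" in Figure 9 is, in its simplest empirical form, the fraction of forecast issue times at which each model attains the lowest score. A minimal sketch, using a hypothetical score matrix rather than the paper's results:

```python
import numpy as np

# Hypothetical scores: rows are forecast issue times, columns are models.
scores = np.array([
    [1.2, 0.9, 0.7],
    [1.1, 0.8, 0.9],
    [1.3, 0.7, 1.0],
    [1.0, 0.9, 0.8],
])
models = ["NWP", "MOS", "VAR"]

best = np.argmin(scores, axis=1)  # index of the lowest (best) score per time
prob_best = np.bincount(best, minlength=len(models)) / len(best)

for name, p in zip(models, prob_best):
    print(f"{name}: {p:.2f}")
```

Here MOS and VAR each win half the time and NWP never wins, so the empirical probabilities are 0.0, 0.5, and 0.5.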
