
Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network

Published online by Cambridge University Press:  24 February 2026

Lucas Howard*
Affiliation:
Atmospheric and Oceanic Science, University of Colorado Boulder, USA
Aneesh C. Subramanian
Affiliation:
Atmospheric and Oceanic Science, University of Colorado Boulder, USA
Jithendra Raju Nadimpalli
Affiliation:
Physical Science and Engineering, King Abdullah University of Science and Technology, Saudi Arabia
Donata Giglio
Affiliation:
Atmospheric and Oceanic Science, University of Colorado Boulder, USA
Ibrahim Hoteit
Affiliation:
Physical Science and Engineering, King Abdullah University of Science and Technology, Saudi Arabia
*
Corresponding author: Lucas Howard; Email: lucas.howard@colorado.edu

Abstract

Marine heat waves (MHWs) are prolonged periods of elevated ocean temperatures that can devastate marine ecosystems, fisheries, and coastal communities. Skillfully predicting these events with sufficient lead time is crucial for mitigating their adverse effects. This study presents a probabilistic subseasonal MHW forecast tool using a U-Net-based neural network architecture, with a focus on the Northern Indian Ocean and the Arabian Sea. The model was trained using sea surface temperature and sea surface height reanalysis data. The U-Net-based forecast tool demonstrated significant predictive skill up to 10 weeks in advance across various deterministic and probabilistic skill metrics. The model outperformed persistence and climatology-based benchmarks, especially in the tropical warm pool. Future applications of explainable artificial intelligence (XAI) methods have the potential to identify the sources of predictive skill, inform understanding of underlying dynamics, and improve dynamic subseasonal to seasonal forecast models.
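The abstract's definition of an MHW as a prolonged period of elevated ocean temperature is commonly operationalized with a percentile threshold over a baseline climatology (Hobday-style detection). As a rough illustration only, a minimal sketch of this idea in pure Python is shown below; the variable names and the weekly, single-grid-point setup are illustrative assumptions, not the paper's actual detection procedure.

```python
# Illustrative sketch (not the paper's method): flag MHW weeks as those where
# SST exceeds the 90th percentile of a baseline climatological sample.
# Assumption: `sst` holds weekly SST values for a single grid point.

def percentile(values, q):
    """Nearest-rank percentile (q in [0, 100]) of a list of numbers."""
    s = sorted(values)
    k = max(0, min(len(s) - 1, round(q / 100 * (len(s) - 1))))
    return s[k]

def detect_mhw(sst, baseline, q=90.0):
    """Return a binary MHW mask: 1 where SST exceeds the baseline's q-th percentile."""
    threshold = percentile(baseline, q)
    return [1 if t > threshold else 0 for t in sst]

baseline = [26.0 + 0.1 * i for i in range(20)]  # toy climatological sample
mask = detect_mhw([26.5, 28.5, 29.0, 26.0], baseline)  # -> [0, 1, 1, 0]
```

The binary mask is what probabilistic and categorical skill scores such as the Brier Skill Score and SEDI are ultimately computed against.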

Information

Type
Application Paper
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press

Figure 1. Boundaries of the study domain in the Indian Ocean and Arabian Sea are delineated in red. This represents the region in which ECMWF forecast skill will be evaluated, as well as the domain of the ML forecast model.


Figure 2. Schematic of the U-Net architecture. The input is SST and SSH at the current time step and the four previous time steps. The output is predicted SST and SST uncertainty, represented as a standard deviation, at lead times of 1–10 weeks. There are four levels of downsampling via max pooling (green arrows), followed by four levels of upsampling via transposed convolution (red arrows). Standard convolutional layers are applied after each up- or downsampling step. After each upsampling layer and before the convolutional layer is applied, feature maps from the downsampling path at the matching resolution are concatenated (gray arrows).


Figure 3. Model performance during training. Panel (a) shows the CRPS loss, and panel (b) shows the RMSE, each plotted for both the training (black) and test (red) datasets. Decreasing values indicate improving performance, and the convergence of the curves suggests limited overfitting.
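The CRPS loss shown in panel (a) rewards probabilistic forecasts that are both sharp and well calibrated. Since the network outputs a mean SST and a standard deviation, one natural implementation is the closed-form CRPS for a Gaussian predictive distribution; the sketch below shows that standard formula, under the assumption (not confirmed by this page) that the paper's loss takes this Gaussian form.

```python
import math

# Closed-form CRPS of a Gaussian forecast N(mu, sigma) against observation x.
# Standard result for Gaussian predictive distributions; whether the paper's
# loss uses exactly this form is an assumption here.

def crps_gaussian(mu, sigma, x):
    z = (x - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal pdf at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal cdf at z
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))
```

For a perfectly centered forecast (x = mu) the CRPS reduces to about 0.234 * sigma, and it grows as the observation moves into the tails, which is why decreasing CRPS in panel (a) indicates improving probabilistic skill.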


Figure 4. Forecast skill metrics for the U-Net as a function of lead time. Panels (a) and (b) show deterministic metrics: RMSE (a) and ACC (b). For comparison, both include a persistence forecast, and the climatology RMSE is also shown in (a). Panel (c) shows the BSS averaged over space and time, and panel (d) shows the SEDI. Ninety-five percent confidence intervals are included for all metrics, and the SEDI significance threshold is shown in (d) as a dashed black line. Metrics in panels (a–c) are based on SST predictions, while panel (d) uses binary MHW forecasts.


Figure 5. BSS (a) and SEDI (b) for the U-Net, shown by season as a function of lead time. Seasons are defined as: December–February (DJF, boreal winter), March–May (MAM, boreal spring), June–August (JJA, boreal summer), and September–November (SON, boreal fall). For panel (a), 95% confidence intervals are shown for each lead time. For panel (b), the significance threshold is indicated by a dashed black line. BSS is computed from SST forecasts, while SEDI is computed from binary MHW forecasts.


Figure 6. Calibration curves for all lead times of U-Net and ECMWF forecasts. For each lead time, predicted probabilities are binned along the x-axis. The actual incidence of these events is then shown on the y-axis. A perfectly calibrated forecast, where events occur at exactly the predicted frequency, is included for reference. A climatology-based prediction is also included in blue for all lead times. Ninety-five percent confidence intervals are included in both the x and y directions for each point. Curves below the 1:1 line indicate overprediction, while curves above indicate underprediction.


Figure 7. Maps showing the spatial distribution of the Brier Skill Score for the ECMWF forecast (a) and U-Net forecast (b). Areas with skill scores less than zero are shown in blue, with green shades representing skillful forecasts compared to climatology.
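The Brier Skill Score mapped in Figure 7 has a standard definition: BSS = 1 - BS_forecast / BS_climatology, where the Brier score BS is the mean squared difference between forecast probabilities and binary outcomes. A minimal sketch of that definition (illustrative variable names; the paper's per-grid-point computation may differ in detail):

```python
# Brier Skill Score: BSS = 1 - BS_forecast / BS_climatology.
# BS = mean((p - o)^2) over forecast probabilities p and binary outcomes o.
# BSS > 0 (green areas in the maps) means the forecast beats climatology.

def brier_score(probs, outcomes):
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

def brier_skill_score(probs, outcomes, clim_prob):
    bs = brier_score(probs, outcomes)
    bs_clim = brier_score([clim_prob] * len(outcomes), outcomes)
    return 1.0 - bs / bs_clim
```

A climatology forecast scores BSS = 0 by construction, so any positive value indicates added skill over the climatological base rate.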


Figure 8. Spatial distribution of the Symmetric Extremal Dependence Index for the ECMWF forecast (a) and U-Net forecast (b). Areas with negative skill (SEDI < 0) are shown in blue. For the U-Net, a binary forecast was generated by applying a 50% threshold to the predicted MHW incidence probability (equation 2.2). Positive SEDI values indicate skillful detection of extreme events relative to climatology.
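The Symmetric Extremal Dependence Index used in Figure 8 is a standard categorical score for rare events, computed from the hit rate H and false-alarm rate F of a 2x2 contingency table. A sketch of the textbook formula follows; the function and argument names are illustrative, not taken from the paper's code.

```python
import math

# Symmetric Extremal Dependence Index from a 2x2 contingency table of binary
# MHW forecasts vs. observations. H = hit rate, F = false-alarm rate.
# SEDI > 0 indicates skill in detecting the extreme event; a random forecast
# (H = F) scores exactly 0.

def sedi(hits, misses, false_alarms, correct_negatives):
    h = hits / (hits + misses)                              # hit rate H
    f = false_alarms / (false_alarms + correct_negatives)   # false-alarm rate F
    num = math.log(f) - math.log(h) - math.log(1 - f) + math.log(1 - h)
    den = math.log(f) + math.log(h) + math.log(1 - f) + math.log(1 - h)
    return num / den
```

Because the score uses logarithms of H, F, 1-H, and 1-F, it remains informative as the event base rate shrinks, which is what makes it suitable for extremes such as MHWs.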

Supplementary material: File

Howard et al. supplementary material (File, 34 KB)

Author comment: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R0/PR1

Comments

Dear Dr. Monteleoni,

Enclosed is our manuscript entitled “Skillful Subseasonal Forecasts of Marine Heat Waves using Machine Learning”, to be considered for publication in the special collection “Connecting Data-Driven and Physical Approaches: Application to Climate Modeling and Earth System Observation” in Environmental Data Science. We believe that this is one of the first studies to show skillful probabilistic subseasonal to seasonal predictions of marine heat waves using a machine learning forecast method. Marine heatwaves have only recently become a major area of oceanographic and climate research, and many gaps remain in our understanding of these events. These gaps in understanding and prediction accuracy highlight the need for innovative approaches to improve forecast skill for these extreme events. While traditional forecasting models struggle with S2S predictions of marine heatwaves, machine learning offers a promising path forward for capturing the complex, nonlinear interactions that drive these events.

Reliable S2S forecasts of marine heat waves are crucial to help mitigate their detrimental effect on the health of marine ecosystems by providing actionable information for stakeholders in affected regions. Environmental Data Science is an ideal platform for this work due to its interest in publishing data-driven approaches for understanding environmental phenomena. It also fits the scope of the special collection as an extreme event forecasting application that compares a data-driven method to a traditional dynamical one. Our methodology can also serve as a framework for other S2S ocean and climate extreme event prediction studies. Hence, this research sets a foundation for broader applications of machine learning-based predictions in climate resilience and adaptation.

The manuscript is not under consideration or review in any other journals. The authors further declare no competing interests or known conflicts of interest.

Please feel free to reach out with any questions, concerns, or requests for additional material. Thank you for your consideration.

Sincerely,

Lucas Howard

PhD Candidate, Department of Atmospheric and Oceanic Sciences, University of Colorado, Boulder

Lucas.Howard@Colorado.edu

Review: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

1. Should “Marine Heat Waves” be abbreviated as MHW or MHWs? There are multiple instances in the text where the abbreviation is used both with and without the plural “s”; consistency should be maintained.

2. “With 256 grid points in both the latitudinal and longitudinal directions.” While this is convenient for calculations, differences in the span of latitude and longitude result in variations in the grid distances along the latitudinal and longitudinal directions.

3. In “2.2 Machine Learning Model”, should the domestic and international application cases of U-Net be placed in the introduction?

4. On Page 4, line 40: “we provide SST and SSH at the forecast initialization time as well as at weeks 1-4 before the forecast initialization time, similar to the approach used by Davenport et al for interannual SST forecasts.” The source of SST and SSH at the forecast initialization time is not clearly stated.

5. In Figure 4, please provide the full expansions of abbreviations such as DJF, MAM, JJA, and SON.

6. From Figure 6, it can be seen that after week 7, the green areas in the U-Net results are unevenly distributed, with blue areas occupying a large proportion. In contrast, in Figure 7, the green distribution is very uniform from week 1 to week 7. Should the analysis also consider whether the training method of the U-Net leads to relative independence between grid points, while the ECMWF model has good flow dependence?

Review: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R0/PR3

Conflict of interest statement

Reviewer declares none.

Comments

This paper describes a data-driven subseasonal forecasting system for sea surface temperature in the North Indian Ocean and the Arabian Sea, used to forecast marine heatwaves. Sea surface temperature and sea surface height from a high-resolution global reanalysis are used to train a U-Net neural network architecture. The resulting forecasts show significant deterministic skill for SST and MHW occurrence several weeks ahead across the domain, with exceptions noted. The proposed system outperforms benchmark models and compares well with a state-of-the-art dynamical system.

The paper is well written and largely logical, and the figures are clear. A suitable range of validation techniques is used to explore geographical patterns and seasonality of skill. The development of data-driven models, particularly for this area, is justified and of worth to the subseasonal forecasting community.

There are some major concerns to be addressed. I am particularly concerned about the probabilistic nature of the forecast system, as the method used to create the ensemble is not well explained in the manuscript, and no adequate references are provided.

Major Comments

Probabilistic forecasting. To my knowledge (which may be lacking), a true data-driven forecast ensemble has only been produced for short-term forecasting (Price et al., 2025). A much more detailed explanation is required here on the method used to generate the ensemble. What does this probability distribution of SST represent? How is it calculated? How does the U-Net output this? How large is the ensemble? This information is crucial to any paper on S2S forecasting.

Price, I., Sanchez-Gonzalez, A., Alet, F. et al. Probabilistic weather forecasting with machine learning. Nature 637, 84–90 (2025). https://doi.org/10.1038/s41586-024-08252-9

More importantly, how does it compare to a dynamical ensemble, which represents uncertainty in initial conditions and the chaotic nature of the evolution of the Earth system (particularly for atmospheric and sea surface variables on S2S timescales)? If the nature or size of the ensembles of the data-driven and dynamical models are different, then comparisons are not fair and the skill scores are potentially used incorrectly.

Describe the persistence forecast in some more detail. What does “current SST anomaly” refer to – daily, weekly, before the start date?

The beginning of the results section is hard to follow, as the training, validation, and test sets have not been clearly defined. Moreover, I find the captions could do with more explicit information. Please guide the reader better.

Since the first map does not appear until Figure 6, consider reordering the figures or adding a description and definition of the domain.

In the discussion, the authors state ideas for potential improvements (Pg 12, Lines 47-51). The authors should state why they think these are useful next steps. The authors reduced the resolution of the reanalysis training data used; why not show from the beginning the skill of using the original dataset (1/12)?

Please explain the hyperparameter tuning so that it can be reproduced.

Justify choice of training data. Why use a reanalysis for SST and SSH for training over the 1993-2021 period when this is covered by observational data?

Why did the authors regrid the reanalysis data?

I strongly recommend to specify the target domain in the title.

Introduction

Line 4: MHWs are not just mere anomalies.

Line 34: Years missing from references.

Line 48: sources

Line 48: sources on what timescales?

Methods

Pg 3

Line 19: 8km at the equator?

Line 26: This is an unusual way to describe the new resolution. Can you be explicit?

Line 29: It is unclear what you mean by “power of two”.

Line 30: Is the detrending performed individually for each week of the year?

Line 39: One could argue that the “original definition” of MHWs is the fixed climatology used in Hobday 2016. I would recommend changing the wording.

Line 42: Introduce NN acronym here.

Line 33: Why were these three architectures chosen? What do they represent?

Line 43: How is the approach similar to Davenport et al?

Pg 5

Line 47: Please explicitly state the training and testing periods. Also, isn’t there also a validation period?

Pg 6

Lines 16-21: This paragraph seems to have lost some sentences and is not clear.

Line 35: At the end of Section 2.3, you state that the training results are not included, but here you discuss them. Please clarify.

Line 43: “more accurate”.

Pg 7

Line 47: Is there a more precise term than “proper”?

Pg 9

Lines 40-44: I disagree that skill degradation is consistent. There is much more skill degradation in the U-Net SEDI.

Pg 12

Line 45. Do you mean Figure 5?

Line 48: What becomes more extreme? Unclear.

Pg 13

Line 6: Represented in the SSH, perhaps?

Line 23: Would be good to mention in the methods already that this is a binary forecast.

Pg 14

Line 4: Please be realistic here, as sometimes the skill is worse than the dynamical system e.g. SEDI scores.

Figures

Figure 2: Here, and in general, are we looking at area-averaged skill or skill of area-averages?

Figure 3: Specify if the top panels are for SST and the bottom panels are for MHWs.

Figures 6 & 7: To allow for easier comparison, please consider putting the skill scores for the two systems side-by-side. Also, there is a colorbar missing.

Recommendation: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R0/PR4

Comments

Following the reviewers' comments, I recommend major revisions. Nevertheless, please note that the revisions needed are mostly clarifications and elaborations. In particular, regarding the probabilistic prediction: even if the approach is described in referenced papers, it is relevant to give more details on its application in your particular case.

Decision: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R0/PR5

Comments

No accompanying comment.

Author comment: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R1/PR6

Comments

Dear Dr. Monteleoni,

Enclosed is a revised manuscript responsive to reviewer and handling editor suggestions, initially entitled “Skillful Subseasonal Marine Heat Wave Forecasts using a Neural Network", now “Skillful Subseasonal Indian Ocean Marine Heat Wave Forecasts using a Neural Network”, manuscript number EDS-2025-0052. In it we demonstrate skillful probabilistic sea surface temperature forecasts using a neural network up to 10 weeks in advance, and show it is competitive with an operational dynamical seasonal forecast model (the ECMWF S2S forecast).

The manuscript has not been submitted elsewhere, and the authors declare no known conflicts of interest. Thank you for considering it for publication in Environmental Data Science, and please let me know if any other information or materials would be helpful during the second phase of review.

Sincerely,

Lucas Howard, PhD Candidate

Department of Atmospheric and Oceanic Sciences

University of Colorado, Boulder

Review: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R1/PR7

Conflict of interest statement

Reviewer declares none.

Comments

All the relevant issues raised in my first-round review have been adopted and revised by the authors. I recommend acceptance for publication.

Recommendation: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R1/PR8

Comments

Thank you for submitting your revised version. As you can see, only one reviewer provided feedback. I have reviewed your responses and the modifications you made to address Reviewer 2’s comments. Based on these, I recommend minor revisions for this version. Please see my comments below (based on Reviewer 2’s assessment):

Use of the term “downscaling”: The term is misleading. Downscaling generally refers to increasing resolution, which is the opposite of downsampling. In the manuscript, the two terms seem to be used as synonyms. Could you please correct this throughout the manuscript?

P3L29:

“Reanalysis was chosen rather than a data product constructed from observations alone as reanalyses are dynamically constrained and coupled to processes that act as sources of S2S predictability.”

I don’t fully understand this point. Observations represent the real system, which is also dynamically constrained and coupled by nature and contains the true sources of S2S predictability. Do you mean that biases, noise, or insufficient resolution in observations make reanalysis more relevant? Or is there a reference indicating that reanalyses are more accurate than observations?

P4L38: Hyperparameter tuning is not detailed here. You might consider moving the explanation currently in P7L36–37 to this section.

P4L40–43: Your reasoning for using three architectures is very assertive. In theory, all architectures can represent both large and small scales. In practice, differences may exist, but given the small differences between the three architectures, could you consider moving this to the appendix (as an ablation study) and focusing only on the U-Net in the main paper?

Table 1: Is it referenced in the text? Also, the term “validation” is used, but you acknowledged in your response to Reviewer 2 that you did not have a validation set.

Decision: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R1/PR9

Comments

No accompanying comment.

Author comment: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R2/PR10

Comments

Dear Dr. Monteleoni,

Enclosed is our submission of a revised manuscript for publication in Environmental Data Science, initially entitled “Skillful Subseasonal Marine Heat Wave Forecasts using a Neural Network", now “Skillful Subseasonal Indian Ocean Marine Heat Wave Forecasts using a Neural Network”, manuscript number EDS-2025-0052. Thank you for considering it for publication and please let me know if you need any additional materials.

Sincerely,

Lucas Howard

Recommendation: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R2/PR11

Comments

Thanks for addressing the last minor comments. I recommend the article for publication in EDS. Congratulations! One minor thing I would suggest changing: in the caption of Figure 3 (the embedded caption in the figure), the term “validation” is still used, while it would be more consistent to put “Test”.

Decision: Skillful subseasonal Indian Ocean marine heatwave forecasts using a neural network — R2/PR12

Comments

No accompanying comment.