The challenge of land in a neural network ocean model

Published online by Cambridge University Press:  02 January 2025

Rachel Furner*
Affiliation:
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom Polar Oceans Team, British Antarctic Survey, Cambridge, United Kingdom
Peter Haynes
Affiliation:
Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom
Dani C. Jones
Affiliation:
Polar Oceans Team, British Antarctic Survey, Cambridge, United Kingdom School for Environment and Sustainability, Cooperative Institute for Great Lakes Research, University of Michigan, Ann Arbor, MI, USA
Dave Munday
Affiliation:
Polar Oceans Team, British Antarctic Survey, Cambridge, United Kingdom
Brooks Paige
Affiliation:
University College London, London, United Kingdom
Emily Shuckburgh
Affiliation:
Department of Computer Science and Technology, University of Cambridge, Cambridge, United Kingdom
*
Corresponding author: Rachel Furner; Email: raf59@cam.ac.uk

Abstract

Machine learning (ML) techniques have emerged as a powerful tool for predicting weather and climate systems. However, much of the progress to date focuses on predicting the short-term evolution of the atmosphere. Here, we look at the potential for ML methodology to predict the evolution of the ocean. The presence of land in the domain is a key difference between ocean modeling and previous work on atmospheric modeling. We train a convolutional neural network (CNN) to emulate a process-based General Circulation Model (GCM) of the ocean, in a configuration which contains land. We assess performance on predictions over the entire domain and near to land (coastal points). Our results show that the CNN replicates the underlying GCM well when assessed over the entire domain: RMS errors over the test dataset are low in comparison to the signal being predicted, and the CNN model gives an order of magnitude improvement over a persistence forecast. When we partition the domain into near-land and ocean-interior regions and assess performance over each, we see that the model performs notably worse over the near-land region, where RMS scores are comparable to those from a simple persistence forecast. Our results indicate that ocean interaction with land is something the network struggles with, and highlight that this may be an area where advanced ML techniques specifically designed for, or adapted to, the geosciences could bring further benefits.

Information

Type
Application Paper
Creative Commons
CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open materials
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Figure 1. Example 12-hour averaged MITgcm fields for temperature, SSH, and Eastward and Northward velocity components.

Figure 2. Example 12-hour increments to MITgcm fields for temperature, SSH, and Eastward and Northward velocity components.

Table 1. RMS errors for predictions from the neural network and from a persistence forecast, along with the neural network predictions normalized by the persistence forecast

Figure 3. Mean MITgcm fields for temperature, SSH, and Eastward and Northward velocity components.

Figure 4. Standard deviation of temperature, SSH, and Eastward and Northward velocity components from the MITgcm dataset.

Figure 5. Schematic of the neural network used.

Figure 6. RMS error over training epochs, for the training and validation datasets, along with the training dataset separated by variable. Here, variables are all normalized using the mean and range, allowing fair comparison between variables.

Figure 7. Log density scatter plots for predicted increments from the test dataset against true increments for temperature (top left), SSH (top right), Eastward component of velocity (bottom left), and Northward component of velocity (bottom right).

Figure 8. Spatial RMS errors averaged over all samples in the test dataset.

Figure 9. Example predicted 12-hour increments to MITgcm fields for temperature, SSH, and Eastward and Northward velocity components, from the test set.

Figure 10. Errors in the above predicted 12-hour increments to MITgcm fields for temperature, SSH, and Eastward and Northward velocity components.

Figure 11. Log density scatter plots for predicted increments from the test dataset against true increments for temperature (top left), SSH (top right), Eastward component of velocity (bottom left), and Northward component of velocity (bottom right), for coastal points (a) and ocean interior points (b).

Table 2. RMS errors and RMS errors normalized by persistence for the neural network predictions broken down into coastal points and the ocean interior

Figure A1. Histograms of input variables.

Figure A2. Histograms of target variables—that is, the change in each variable over 12 hours.

Author comment: The challenge of land in a neural network ocean model — R0/PR1

Comments

Dear Sir/Madam,

We wish to submit an original research article entitled “The challenge of land in a neural network ocean model” for consideration by Environmental Data Science.

We confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere.

In this paper, we report on the challenge of including land in neural network (NN) models. We show that assessing the performance of NN models of the ocean globally hides the performance near land, and that simple methods used within the literature give very poor results near land. This is significant because NN models are being used ever more frequently for atmospheric weather forecasting, and we need to better represent land within these models in order to harness their promise in ocean models.

We have no conflicts of interest to disclose.

Please address all correspondence concerning this manuscript to me at rachel.furner.rf@gmail.com.

Thank you for your consideration of this manuscript.

Sincerely,

Rachel Furner

Review: The challenge of land in a neural network ocean model — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

The authors investigate the use of convolutional neural networks (CNNs) for ocean modelling in an idealised setup based on MITgcm simulations. In their results, they show that the CNN outperforms a persistence forecast averaged across all grid points. Digging deeper, they find that for grid points near land the quality of the prediction is only on par with persistence, indicating possible issues with incorporating land grid points into CNNs for, e.g., the ocean.

The manuscript is comprehensive, very well written, and neat. Nevertheless, I have one possibly longer discussion point and a few smaller technical remarks, which should be addressed before I can recommend accepting the paper.

General remark:

The raised issues with land grid points are specific to CNNs. Other architectures, such as transformer-based or graph NNs, can naturally take land grid points into account by masking. To let such architectures learn specific features for specific regions, e.g. near-land points, one can give an embedding of the spatial position as additional input to the NN.

If the replacement of the land grid points by zeros is applied after normalization, it acts similarly to zero padding. The effects of land grid points are then predominantly defined by the biases of the convolutional layers, which can lead to artifacts (Liu et al., 2018). To avoid these artifacts, one solution is to replace zero padding by partial convolutions, where the sum within the convolution is renormalised by the number of valid (non-land) surrounding points. Quite recently, this approach has been applied to sea-ice surrogate modelling to take land points into account (Durand et al., 2024). CNNs with partial convolutions can be seen as a graph neural network with message passing on a fixed Cartesian mesh, where the land points are masked out. This in fact also imitates how numerical models are applied in the presence of land points.
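As a minimal illustration of the renormalisation described above, the following NumPy sketch applies a "valid" 2D convolution in which the sum at each output point is rescaled by the fraction of non-land points under the kernel. The function name and direct-loop implementation are purely illustrative; this is not the implementation of Liu et al. (2018), Durand et al. (2024), or the manuscript.

```python
import numpy as np

def partial_conv2d(field, mask, kernel):
    """Partial 2D convolution over a land-sea mask (1 = ocean, 0 = land).

    The convolution sum is renormalised by the number of valid (ocean)
    points under the kernel, so land points neither contribute values
    nor dilute the result; fully-land windows yield zero output.
    """
    kh, kw = kernel.shape
    out_h = field.shape[0] - kh + 1
    out_w = field.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    n_total = kh * kw
    for i in range(out_h):
        for j in range(out_w):
            m = mask[i:i + kh, j:j + kw]
            n_valid = m.sum()
            if n_valid == 0:
                continue  # window is entirely land: output stays 0
            window = field[i:i + kh, j:j + kw] * m  # zero out land values
            # Renormalise by the fraction of valid points in the window.
            out[i, j] = (window * kernel).sum() * n_total / n_valid
    return out
```

A useful property to check is that a constant ocean field stays constant next to land, which is exactly where plain zero-masking plus ordinary convolution would introduce boundary artifacts.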

Since the field seems to have established itself around such more advanced methods, even for the atmosphere, I would like to see more reasoning for why CNNs are applied in this manuscript. One reason could be, for example, that convolutions encode a physical inductive bias into the NN, which could help in low-data settings.

Additionally, as the manuscript is built as a proof of concept and to expose limitations of CNNs, I would also like to see an extensive discussion of how other architectures might solve the issue of land points, and if not, why not. So, in general, I would appreciate the presentation of more ideas on how this issue might be solved.

These two additional points would improve and complete the manuscript.

Smaller remarks:

1) General: Please use “1 \times 10^{-x}” notation instead of the “1e-x”, it would improve the readability of the manuscript.

2) Page 4: How is the dataset split into training/validation/testing?

From Page 6, it seems the split is 75%/15%/10%. Does this mean that the first 37.5 years are used for training, the 7.5 years after that for validation, and the last 5 years for testing? If yes, please make this clearer. Additionally, why are there more validation years than testing years? Normally it is the other way around.

3) Page 5: I would remind the reader that the salinity is constant and replace “… are the complete simulator variables, with the exception of salinity.” by “… are the prognostic model variables, except the constant salinity.”

4) Page 6: Just out of curiosity: Have you tested the effect of the autocorrelation in your samples, e.g., training without throwing away training data? For sure, the autocorrelated data violates the assumption that the samples are independent of each other. Nevertheless, even autocorrelated samples might provide additional information for the training of the neural network.

5) Page 6: I guess that “… the network is provided with the land-sea mask?” means that the NN gets the land-sea mask as an additional input. Please clarify this in the manuscript.

6) Page 6: Values at grid points with land are set to zero. Are they set to zero before or after normalization?

7) Page 7: “… MITgcm simulations, and no padding at the North and South boundaries.” Is no padding really applied at these boundaries? This would mean that the output field is much smaller than the input fields. So, I guess that zero padding is applied instead, which is quite often the default?

8) Page 7: How many parameters does the resulting neural network have?

9) Page 7: The root-mean-squared errors for the different variables indicate that they are differently hard to predict, as seen in Figure 6. Since the mean-squared error is computed on data normalized over the training dataset, the loss contribution of each variable is implicitly given by the standard-deviation component of its normalization. Consequently, the resulting root-mean-squared error might also be impacted by this implicit weighting.
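To make this implicit weighting concrete, here is a small hypothetical NumPy example (the error scales 0.5 and 0.01 are invented for illustration and are not taken from the manuscript). Dividing each variable's errors by its own standard deviation before averaging rescales that variable's contribution to the MSE by 1/std^2, so all normalised variables end up contributing roughly equally regardless of their physical scale.

```python
import numpy as np

def mse(err):
    """Mean-squared error of an array of prediction errors."""
    return float(np.mean(err ** 2))

rng = np.random.default_rng(0)
# Hypothetical prediction errors for two variables at very different scales.
temp_err = rng.normal(0.0, 0.5, 10_000)   # e.g. temperature increments, K
ssh_err = rng.normal(0.0, 0.01, 10_000)   # e.g. SSH increments, m

# Unnormalised loss: temperature dominates by roughly (0.5 / 0.01) ** 2.
raw_ratio = mse(temp_err) / mse(ssh_err)

# After dividing by each variable's own std, both MSEs are close to 1:
# the normalisation acts as an implicit 1/std^2 weight on each variable.
norm_temp = mse(temp_err / temp_err.std())
norm_ssh = mse(ssh_err / ssh_err.std())
```

In this framing, reweighting one variable in the loss (as asked in remark 11 below) is equivalent to shrinking its normalisation constant.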

10) Page 8: “… we see that performance is excellent across all variables, …”, there is an article missing in front of “performance”. Additionally, it could be noted that a persistence forecast would result in constant predictions of 0 in Figure 7, and a one-to-one line could be added to Figure 7 to make the ideal solution and the performance of the NN clearer.

11) Page 13: “… with temperature giving larger overall errors than other variables.” Please see my comment above about the implicit weighting in 9). What would happen if, e.g., the importance/weight of the temperature in the loss function would be increased? Could we expect that the temperature prediction gets better at the cost of the other variables?

12) Page 13: “… needed to theoretically capture the concept of land, …”. I agree that the NN has all the needed information. However, because of the zero values, the behaviour at grid points near land is similar to that at grid points near the boundary with zero padding, which could introduce artifacts. Please see my large remark at the top; this is not how land is modelled/represented in numerical models. Consequently, we cannot expect the neural network to properly take land pixels into account.

13) Page 13: “… physical meaning outside of its use as land representation.” As written in the larger remark, it depends if the value is set before or after normalization.

14) Page 13: “…, and results in poor performance.” Has this been tested? If yes, please indicate so and write something like “…, and results in poor performance (not shown)” to clarify that it has been tested.

15) Page 14: Although I appreciate the discussion of the difficulties in predicting the temperature, they might be related to the implicit weighting, as in 9) and 11).

16) Page 15: “This work acts …” Please see my large remark, as this proof of concept is specific to convolutional neural networks. Please mention again that CNNs are used in the study.

17) Abbreviations: I guess “IFS” is missing in front of “Integrated Forecasting System”.

Citations:

Durand, C., Finn, T. S., Farchi, A., Bocquet, M., Boutin, G., & Ólason, E. (2024). Data-driven surrogate modeling of high-resolution sea-ice thickness in the Arctic. The Cryosphere, 18(4), 1791–1815. https://doi.org/10.5194/tc-18-1791-2024

Liu, G., Shih, K. J., Wang, T.-C., Reda, F. A., Sapra, K., Yu, Z., Tao, A., & Catanzaro, B. (2018). Partial Convolution based Padding (arXiv:1811.11718). arXiv. https://doi.org/10.48550/arXiv.1811.11718

Review: The challenge of land in a neural network ocean model — R0/PR3

Conflict of interest statement

Reviewer declares none.

Comments

This article addresses an issue encountered by neural networks that is specific to ocean simulation, namely the effect of a land mask on the quality of the predictions, which has never been clearly assessed before. To this end, the authors train a CNN to emulate a realistic ocean model, MITgcm, over a simplified topology. This allows them to highlight clearly the difficulty the NN has in making predictions near land. The article is concise, pleasant to read, and the results are explicit and convincing.

Some questions and remarks came to me while reading the paper. Most of them are more open questions and could be discussed in the Discussion section, as long as the authors feel it does not compromise the clarity of the paper:

-In the introduction, Price et al. (2023) develop GenCast, an ensemble prediction model based on diffusion models, not GraphCast.

-Do you have any insights on how the choice of architecture (here a U-Net) affects the results? Could another architecture alleviate the error near land (I am thinking that maybe a graph NN could help with the irregular geometry)?

-A simple way to better take into account the land mask in the convolution operations would be to use partial convolutions; this has been done for instance in Durand, C., Finn, T. S., Farchi, A., Bocquet, M., and Òlason, E. (2023): Data-driven surrogate modeling of high-resolution sea-ice thickness in the Arctic, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2023-1384.

-Another possibility would be to use an attention mechanism to make the NN focus on near-land dynamics.

-I suspect that using physics-informed neural networks could also help with near-land dynamics. For instance, adding constraints on the conservation of physical quantities could prevent the non-physical values of the land mask from interfering too much.

-The results shown for temperature and velocity are at 2 vertical levels below the surface. Is the error near land dependent on depth? I am guessing the surface forcings help reduce the error at the surface, so maybe it increases with depth? Also, in real applications, the land mask could expand with the vertical levels. So two things could happen:

-The deeper the layer, the more land mask there would be, and the more the NN would struggle.

-It would increase the amount of land-mask/ocean interactions, which could help the NN learn more correct dynamics.

-Could this issue be solved simply by increasing the size of the training dataset or the number of epochs (Figure 6 seems to indicate that the error could be further reduced)?

-Instead of giving the whole domain as input, one could think of dividing it into patches. Maybe that could help the NN learn different dynamics on different patches?

-In the case of this paper, it was only necessary to remove columns so that the domain would have the right size for a U-Net. Sometimes one has to add columns/rows that then act like a supplementary land mask, so the results presented here are also important to take into account in that situation.

-In the conclusion, I find the sentence “Many machine learning techniques, including the CNNs used for this work, have an inherent assumption that data used have a Gaussian distribution” requires a reference, and I do not feel it correctly explains the NN's struggle with temperature. CNNs are perfectly capable of predicting non-Gaussian fields.

Recommendation: The challenge of land in a neural network ocean model — R0/PR4

Comments

Dear Authors,

Our apologies for this very late review. There were some unforeseen delays in the reviewing process that took time to resolve. This was by no means related to the quality of your manuscript.

As you can see, the reviewers suggest only minor corrections which consist of technical details and an extended discussion, especially regarding the choice of neural network architecture.

Thanks for your patience.

Decision: The challenge of land in a neural network ocean model — R0/PR5

Comments

No accompanying comment.

Author comment: The challenge of land in a neural network ocean model — R1/PR6

Comments

Following the review of our paper, we submit an updated manuscript and our responses to the reviewers' comments.

Apologies that this has taken longer than the original deadline.

Many thanks,

Rachel

Review: The challenge of land in a neural network ocean model — R1/PR7

Conflict of interest statement

Reviewer declares none.

Comments

Thank you for the clear and detailed answers.

Review: The challenge of land in a neural network ocean model — R1/PR8

Conflict of interest statement

Reviewer declares none.

Comments

The authors have addressed my previous comments to my full satisfaction. I thank the authors for their responses and for the constructive revision round. From my side, there are no open remarks, and I recommend to accept the manuscript as it is.

Recommendation: The challenge of land in a neural network ocean model — R1/PR9

Comments

Dear author,

Thanks for addressing the reviewers' comments. Owing to timing, it was not possible to incorporate the comments of the last reviewer. I therefore suggest “Minor Revision” so you can include revisions about the normalization (point 1), the land-sea mask (point 2), the nice discussion about persistence (point 4), and the bathymetry (point 5).

Also, the link to the code in the code availability section is missing.

Decision: The challenge of land in a neural network ocean model — R1/PR10

Comments

No accompanying comment.

Author comment: The challenge of land in a neural network ocean model — R2/PR11

Comments

Please find attached the updated manuscript as discussed,

Rachel Furner

Recommendation: The challenge of land in a neural network ocean model — R2/PR12

Comments

No accompanying comment.

Decision: The challenge of land in a neural network ocean model — R2/PR13

Comments

No accompanying comment.