
Multimodal learning–based reconstruction of high-resolution spatial wind speed fields

Published online by Cambridge University Press:  07 January 2025

Matteo Zambra*
Affiliation:
IMT Atlantique, Brest, France Lab-STICC, Brest, France
Nicolas Farrugia
Affiliation:
IMT Atlantique, Brest, France Lab-STICC, Brest, France
Dorian Cazau
Affiliation:
Lab-STICC, Brest, France ENSTA Bretagne, Brest, France
Alexandre Gensse
Affiliation:
Naval Group, Toulon, France
Ronan Fablet
Affiliation:
IMT Atlantique, Brest, France Lab-STICC, Brest, France
*
Corresponding author: Matteo Zambra; Email: matteo.zambra1@gmail.com

Abstract

Wind speed at the sea surface is a key quantity for a variety of scientific applications and human activities. Given its importance, many observation techniques exist, ranging from in situ to satellite observations. However, none of these techniques can capture the spatial and temporal variability of the phenomenon at the same time. Reanalysis products, obtained from data assimilation methods, represent the state of the art for sea-surface wind speed monitoring, but they may be biased by model errors and their spatial resolution is not competitive with satellite products. In this work, we propose a scheme based on both data assimilation and deep learning concepts to process spatiotemporally heterogeneous input sources and reconstruct high-resolution time series of spatial wind speed fields. This method allows us to make the most of the complementary information conveyed by the different sources of sea-surface information typically available in operational settings. We use synthetic wind speed data to emulate satellite images, in situ time series, and reanalyzed wind fields. Starting from these pseudo-observations, we run extensive numerical simulations to assess the impact of each input source on the model's reconstruction performance. We show that our proposed framework outperforms a deep learning–based inversion scheme and can successfully exploit the spatiotemporally complementary information of the different input sources. We also show that the model can learn the possible bias in reanalysis products and attenuate it in the output reconstructions.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Qualitative description of the dataset. Panel (a): Geographical region and buoy positions. Panel (b): In situ pseudo-observations. Panel (c): Original HR fields and the emulated LR fields.


Figure 2. Schematic illustration of the 4DVarNet framework. The observations and the state variable are prepared as concatenations of the available input sources. The symbol $ \boldsymbol{\theta} $ represents the parameters of the networks $ \Phi $ and $ \Gamma $, and $ n $ denotes the training epoch index. The symbol $ \mathbf{u} $ represents the ground truths. The statement $ {\mathbf{x}}^0\leftarrow \mathbf{y} $ means that the initial guess of the state variable is initialized with the observations. The red box contains the inversion part, which solves for the state variable $ \mathbf{x} $ with the parameters $ \boldsymbol{\theta} $ fixed from the previous training iteration. The green box highlights the parameter training part, which solves for $ \boldsymbol{\theta} $ with the state variable $ \mathbf{x} $ fixed as returned by the inversion part.
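The alternating scheme described in the caption (inner inversion of the state with the parameters fixed, then a parameter update with the state fixed) can be illustrated with a short toy sketch. The snippet below is not the authors' implementation: it runs only the inner inversion part on a 1-D field, with a hand-crafted circular moving average standing in for the learned prior $ \Phi $, and plain gradient descent standing in for the learned solver; the names `smoothing_prior`, `variational_cost_grad`, and `inversion` are illustrative.

```python
import numpy as np

def smoothing_prior(x):
    """Toy stand-in for the learned prior Phi: a circular moving average.

    A field consistent with the assumed dynamics should be close to a
    fixed point of this operator, i.e. x ~ Phi(x).
    """
    return (np.roll(x, -1) + x + np.roll(x, 1)) / 3.0

def variational_cost_grad(x, y, lam):
    """Gradient of U(x) = ||y - x||^2 + lam * ||x - Phi(x)||^2.

    The circular moving average is symmetric, so (I - S)^T = (I - S)
    and the gradient has a simple closed form.
    """
    r = x - smoothing_prior(x)                     # (I - S) x
    return 2.0 * (x - y) + 2.0 * lam * (r - smoothing_prior(r))

def inversion(y, lam=5.0, step=0.05, n_iters=200):
    """Inner inversion loop: minimize U(x) by gradient descent,
    starting from x^0 <- y (state initialized with the observations),
    with the prior held fixed."""
    x = y.copy()
    for _ in range(n_iters):
        x = x - step * variational_cost_grad(x, y, lam)
    return x
```

In the full framework, $ \Phi $ and the solver $ \Gamma $ are trainable networks; the green-box step would then update their parameters $ \boldsymbol{\theta} $ by backpropagating a reconstruction loss against the ground truth $ \mathbf{u} $ through this inner loop.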


Table 1. Global overview of the benchmark experiments. Table A: abbreviations used in the presentation of the results. The symbol ✘ indicates that a given input data source is missing. 1 h, 6 h, and 12 h stand, respectively, for observation sampling frequencies of 1, 6, and 12 hours.


Table B: Benchmark test results. To contextualize our results, the first two rows report the typical reconstruction errors expected when using SAR Sentinel-1A imagery and the wind speed reanalyses of the ECMWF ERA-5 catalog. In the second part of this table, we report the RMSE scores for our simulations, as expressed by Equations 8 and 9. Relative gains are expressed as percentages, relative to the LDI-SR baseline, marked in orange. Black boldface highlights the best result. We follow the naming conventions stated in part A of this table.
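For reference, if Equations 8 and 9 follow the standard definitions of the RMSE over a spatial domain $ \Omega $ and time window $ T $ and of the relative gain with respect to a baseline (the exact forms in the paper may differ), they would read:

$ \mathrm{RMSE} = \sqrt{\frac{1}{|\Omega|\,T}\sum_{t=1}^{T}\sum_{s\in \Omega}\left({\hat{x}}_t(s) - {u}_t(s)\right)^2}, \qquad \mathrm{Gain} = 100\times\frac{\mathrm{RMSE}_{\mathrm{baseline}} - \mathrm{RMSE}_{\mathrm{model}}}{\mathrm{RMSE}_{\mathrm{baseline}}} $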


Figure 3. Average gain maps. Left panel: plain 4DVarNet, average gain of 4DVN-SM-C3 vs 4DVN-SM-C1. Right panel: 4DVarNet with the additive trainable observation term in the variational cost, 4DVN-MM-C3 vs 4DVN-MM-C1.


Table 2. Computational effort associated with the models used. Training times refer to the time required to train the ensemble of 10 models. The trainable parameters are the number of parameters of a single model. Memory size refers to the space required to save the 10-member model ensemble.


Figure 4. Biased LR field tests. Left panel: random delay. Right panel: random intensity. The suffixes “-rd” and “-ri” identify the models trained in the case of random delay and intensity, respectively.

Author comment: Multimodal learning–based reconstruction of high-resolution spatial wind speed fields — R0/PR1

Comments

Dr. Claire Monteleoni,

Editor-in-Chief

Environmental Data Science

Dear Dr. Monteleoni,

This letter supports the submission of our paper "Multimodal Learning–based Reconstruction of High-Resolution Spatial Wind Speed Fields" to Environmental Data Science.

Our work addresses the joint exploitation of ocean remote sensing and in situ observations in data-driven modeling of the sea surface state. We focus in particular on sea-surface wind speed. We propose a hybrid deep learning and variational data assimilation framework to simultaneously process heterogeneous, multi-sensor sources of information.

The originality of this work stems from the effectiveness of our proposed framework in exploiting the complementary information conveyed by the input data. The numerical experiments and results detailed in our paper show that our model is competitive with the baseline provided by the performance level of numerical weather forecast models, such as ECMWF ERA-5, and with purely learning-based schemes.

We think that this work may be of interest for operational applications and ocean engineering practitioners, as well as for researchers in the field of data-driven ocean surface modeling, particularly on the topics of multimodal learning and multi-sensor information processing and fusion.

Correspondence should be addressed to Matteo Zambra at matteo.zambra1@gmail.com and to Prof. Ronan Fablet at ronan.fablet@imt-atlantique.fr.

Thank you for your attention and consideration.

Kindest regards,

Matteo Zambra
Ph.D.
IMT Atlantique, UMR CNRS Lab-STICC
Brest, France
(Previous affiliation)
matteo.zambra1@gmail.com

Ronan Fablet
Full Professor
IMT Atlantique, UMR CNRS Lab-STICC
Brest, France
ronan.fablet@imt-atlantique.fr

Review: Multimodal learning–based reconstruction of high-resolution spatial wind speed fields — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

>>Summary: In this section please explain in your own words what problem the paper addresses and what it contributes to solving it.

The paper proposes an enhancement to an existing data assimilation method, with the specific application of estimating the wind speed at the sea surface. It makes a small modification to 4DVarNet, a neural network-based data assimilation method, in an attempt to improve performance when several different types of observations are available (specifically: satellite observations, point observations from sensors on the surface, and output from numerical weather prediction systems). The paper presents an evaluation on simulated data which shows that their method is able to take advantage of the multi-modal observations to better estimate the wind-speed field.

>>Relevance and Impact: Is this paper a significant contribution to interdisciplinary climate informatics?

The paper is highly relevant. It applies machine learning methods to solve a climate science problem.

Methodological contribution: this is a small modification to 4DVarNet

Application contribution:

- I’m not familiar with the literature on estimation of surface wind speed, so I can’t comment on if a similar approach has been tried before for this problem.

- As an application paper, I think the evaluation should be more thorough. Two baselines are included, a "vanilla learning-based inversion scheme" and 4DVarNet, but I don't think either is in common use for this application. It also seems like the vanilla scheme might have far fewer parameters than the new method.

The paper also needs quite a bit of work to improve clarity and fill in missing details.

I am unsure of the acceptance threshold, but I suspect this paper should be rejected.

Having said this, the results seem promising so I would love to see a more complete version of this work!

>>Detailed Comments:

Possible limitations of the evaluation:

- The results are not put in context for the application. What are reasonable RMSEs for this problem achieved by currently deployed systems?

- Is the comparison to the reconstruction model baseline set up fairly? It has the same architecture as the dynamical prior of 4DVarNet, just part of that model, which presumably means the full 4DVarNet model has quite a few more parameters?

- As the model is trained end-to-end, it would be nice to investigate if the various components actually learn what is expected. For example, does Phi actually learn a single step of the model dynamics?

- Multiple repeats of the experiment are performed and a median taken to calculate the results. However, no measure of spread is given. I think this is particularly important to include as the median hides outliers.

- There’s no discussion of the computational cost of any of the methods.

I found some bits of the paper difficult to understand, or missing details:

- Experiment configuration is vague. What are the architectures/sizes of the networks used?

- Configuration "SR": it says the model has to reconstruct the HR field from the LR field. Why does it reconstruct the HR field rather than the ground truth? In Table 2, does this mean the RMSE given is to the HR field, rather than to the ground truth as for the other methods?

- Section 3.4: "grad u" and "grad hat{x}" appear in the loss function but what are these gradients with respect to? How are they computed?

- The listed code repository does not exist

- Various notation isn’t defined: M, T, Omega

- I would give the dimensions/types of your variables e.g. what space is x in?

- What does "x^0 <- y" mean in Figure 2? Perhaps write out the full algorithm instead of, or in addition to, using this figure.

- Table 2: list the input data in this table, to avoid having to cross-reference C1 etc to Table 1 several pages back

- Table 2: name the methods something other than M_1, M_2. I found these difficult to remember

- "Table X resumes": should be "Table X [lists/shows/gives/etc.]"

Recommendation: Multimodal learning–based reconstruction of high-resolution spatial wind speed fields — R0/PR3

Comments

This article was accepted into the Climate Informatics 2024 conference after the authors addressed the comments provided in review. It has been accepted in Environmental Data Science on the strength of the Climate Informatics review process.

Decision: Multimodal learning–based reconstruction of high-resolution spatial wind speed fields — R0/PR4

Comments

No accompanying comment.