Scalable data assimilation with message passing

Published online by Cambridge University Press:  08 January 2025

Oscar Key*
Affiliation:
UCL Centre for Artificial Intelligence, University College London, London, United Kingdom
So Takao
Affiliation:
UCL Centre for Artificial Intelligence, University College London, London, United Kingdom; Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, United States
Daniel Giles
Affiliation:
UCL Centre for Artificial Intelligence, University College London, London, United Kingdom
Marc Peter Deisenroth
Affiliation:
UCL Centre for Artificial Intelligence, University College London, London, United Kingdom; The Alan Turing Institute, London, United Kingdom
*Corresponding author: Oscar Key; Email: oscar.key.20@ucl.ac.uk

Abstract

Data assimilation is a core component of numerical weather prediction systems. The large quantity of data processed during assimilation requires the computation to be distributed across increasingly many compute nodes; yet, existing approaches suffer from synchronization overhead in this setting. In this article, we exploit the formulation of data assimilation as a Bayesian inference problem and apply a message-passing algorithm to solve the spatial inference problem. Since message passing is inherently based on local computations, this approach lends itself to parallel and distributed computation. In combination with a GPU-accelerated implementation, we can scale the algorithm to very large grid sizes while retaining good accuracy and reasonable compute and memory requirements.
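
To make the Bayesian formulation concrete, the following is a minimal sketch of exact inference on a toy problem. It assumes a linear-Gaussian model with a sparse prior precision matrix and noisy point observations of individual grid cells; the function names, the chain-shaped prior, and the parameter `kappa` are illustrative assumptions and do not come from the article. The dense solve shown here is the kind of exact computation that becomes impractical at large grid sizes. It is not the message-passing method itself.

```python
import numpy as np

def chain_prior_precision(n, kappa=0.1):
    """Hypothetical toy prior: tridiagonal precision matrix of a 1D chain."""
    return (2.0 + kappa) * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def exact_gaussian_posterior(prior_precision, prior_mean, obs_idx, obs_values, obs_noise_var):
    """Exact posterior mean and marginal variances of a linear-Gaussian model.

    Observations are y = x[obs_idx] + noise, with independent Gaussian noise
    of variance obs_noise_var (an assumption of this sketch).
    """
    n = prior_precision.shape[0]
    m = len(obs_idx)
    # Observation operator H selects the observed grid cells.
    H = np.zeros((m, n))
    H[np.arange(m), obs_idx] = 1.0
    # Information-form update: precisions add, as do precision-weighted means.
    post_precision = prior_precision + H.T @ H / obs_noise_var
    rhs = prior_precision @ prior_mean + H.T @ obs_values / obs_noise_var
    post_mean = np.linalg.solve(post_precision, rhs)
    # Dense inverse: cubic cost in n, shown only for illustration.
    post_var = np.diag(np.linalg.inv(post_precision))
    return post_mean, post_var
```

For example, `exact_gaussian_posterior(chain_prior_precision(5), np.zeros(5), np.array([2]), np.array([1.0]), 0.5)` returns the posterior mean and variances for a five-cell chain with one observation in the middle.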

Information

Type
Methods Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press
Figure 1. Surface temperature computed by message passing from satellite observations. The lines show the locations of the observations.

Figure 2. Illustration of node A sending a message to node B. Black arrows indicate incoming messages that are combined to compute the outgoing message in blue.
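
The update illustrated in Figure 2 can be sketched in a few lines. The sketch assumes the standard Gaussian belief propagation update in information form for a pairwise model $p(x) \propto \exp(-\tfrac{1}{2}x^\top J x + h^\top x)$; the caption does not state the exact update used in the article, and the function and argument names here are hypothetical.

```python
def gabp_message(J_aa, h_a, J_ab, incoming):
    """Message from node A to node B as a (precision, shift) pair.

    J_aa, h_a: diagonal precision entry and potential of node A.
    J_ab:      off-diagonal precision entry coupling A and B.
    incoming:  (precision, shift) messages into A from every neighbour
               except B (the black arrows in the figure).
    """
    # Combine A's own potential with all incoming messages.
    precision_sum = J_aa + sum(lam for lam, _ in incoming)
    shift_sum = h_a + sum(eta for _, eta in incoming)
    # Marginalise out x_A to obtain a Gaussian factor over x_B.
    msg_precision = -J_ab ** 2 / precision_sum
    msg_shift = -J_ab * shift_sum / precision_sum
    return msg_precision, msg_shift

def marginal(J_bb, h_b, incoming):
    """Marginal mean and variance at a node from all of its incoming messages."""
    precision = J_bb + sum(lam for lam, _ in incoming)
    shift = h_b + sum(eta for _, eta in incoming)
    return shift / precision, 1.0 / precision
```

Each message depends only on quantities stored at node A and its neighbours, which is why the computation can be distributed. On tree-structured graphs these updates recover the exact marginals; on grids with loops they are applied iteratively and the result is generally an approximation.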

Figure 3. Illustration of the multigrid implementation, showing the marginal means computed at two levels ($ 128\times 128 $ to $ 256\times 256 $) of resolution on the simulated data.
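
One ingredient of a coarse-to-fine scheme like the one in Figure 3 is a transfer step that carries the coarse-level means to the finer grid. The caption does not say which transfer operator the article uses, so the sketch below assumes the simplest choice, nearest-neighbour upsampling by an integer factor; `prolong_nearest` is a hypothetical name.

```python
import numpy as np

def prolong_nearest(coarse, factor=2):
    """Upsample a 2D field by repeating each cell factor x factor times."""
    rows = np.repeat(coarse, factor, axis=0)
    return np.repeat(rows, factor, axis=1)

# Example: a 128 x 128 field of coarse means becomes a 256 x 256 initial guess.
coarse_means = np.zeros((128, 128))
fine_init = prolong_nearest(coarse_means)
```

Under this assumption, the upsampled field would serve as the starting point for inference at the finer level rather than as a final estimate.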

Table 1. Comparison on simulated data. We give the mean over three ground truths; the variance was not significant and is therefore omitted. Bold indicates which of 3D-Var and message passing performed better. R-INLA is included to show the minimal achievable error given the prior, as it computes an exact posterior.

Supplementary material: File

Key et al. supplementary material (File, 1.9 MB)

Author comment: Scalable data assimilation with message passing — R0/PR1

Comments

Dear Reviewers & Editors,

Please find attached our paper accepted at Climate Informatics, ready for publication in the special collection of Environmental Data Science.

Let me know if there is any more information you require!

Many thanks,

Oscar (corresponding author)

Review: Scalable data assimilation with message passing — R0/PR2

Conflict of interest statement

Reviewer declares none.

Comments

>Summary: In this section please explain in your own words what problem the paper addresses and what it contributes to solving it.

The paper addresses the problem of data assimilation, i.e., reconstructing the full state of a dynamical system given a prior and partial observations. Computing the Bayesian inversion exactly under Gaussian assumptions scales cubically in the number of observations. The paper proposes to use a message-passing algorithm to compute an approximation of the posterior mean with better scaling and parallelization properties.

>Relevance and Impact: Is this paper a significant contribution to interdisciplinary climate informatics?

The field of data assimilation is relevant for climate informatics. Data assimilation is ubiquitous in weather and climate models, where it is used to improve models with in-situ observations.

>Detailed Comments

High-quality paper. A few comments:

- Both the proposed approach and 3D-Var are based on iterative optimisation, hence there is a compute-performance trade-off controlled by the early stopping parameter. I would like to see such curves for 3D-Var and the proposed method, to assess Pareto-optimality for the experiments in Table 1.

- Table 1 should include an extra significant digit.

Recommendation: Scalable data assimilation with message passing — R0/PR3

Comments

This article was accepted into the Climate Informatics 2024 Conference after the authors addressed the comments in the reviews provided. It has been accepted for publication in Environmental Data Science on the strength of the Climate Informatics Review Process.

Decision: Scalable data assimilation with message passing — R0/PR4

Comments

No accompanying comment.