Data fusion of sparse, heterogeneous, and mobile sensor devices using adaptive distance attention

Published online by Cambridge University Press:  05 November 2024

Jean-Marie Lepioufle*
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Philipp Schneider
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Paul David Hamer
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Rune Åvar Ødegård
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Islen Vallejo
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Tuan Vu Cao
Affiliation:
The Climate and Environmental Research Institute NILU, Kjeller, Norway
Amir Taherkordi
Affiliation:
Department of Informatics, University of Oslo, Oslo, Norway
Marek Wojcikowski
Affiliation:
Faculty of Electronics, Gdansk University of Technology, Gdansk, Poland
*Corresponding author: Jean-Marie Lepioufle; Email: jm@jeanmarie.eu

Abstract

In environmental science, where information from sensor devices is sparse, data fusion for mapping purposes is often based on geostatistical approaches. We propose a methodology called adaptive distance attention that enables us to fuse measurements from sparse, heterogeneous, and mobile sensor devices and to predict values at locations with no previous measurement. The approach automatically weights the measurements according to a priori quality information about the sensor device, without using complex and resource-demanding data assimilation techniques. Both ordinary kriging and the general regression neural network (GRNN) are integrated into this attention, with their learnable parameters based on deep learning architectures. We evaluate this method using three static phenomena of different complexity: a simplistic phenomenon, topography over an area of 196 $ {km}^2 $, and the annual average of hourly $ {NO}_2 $ concentrations in 2019 over the Oslo metropolitan region (1026 $ {km}^2 $). We simulate networks of 100 synthetic sensor devices with six characteristics related to measurement quality and measurement spatial resolution. Overall, the outcomes are promising: we significantly improve the metrics relative to baseline geostatistical models. In addition, distance attention using the Nadaraya–Watson kernel provides metrics as good as those of the attention based on the kriging system, which opens the possibility of reducing the processing cost of fusing sparse data. These encouraging results motivate us to keep adapting distance attention to space-time phenomena evolving in complex and isolated areas.
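
As an illustration of the Nadaraya–Watson form of distance attention mentioned above, the following minimal sketch predicts a value at an unmeasured location as a kernel-weighted average of nearby measurements. It is not the authors' implementation: the function name `distance_attention`, the Gaussian kernel, the fixed `bandwidth` parameter, and the toy sensor data are all assumptions made for this example, and the learnable parameters and a priori quality weighting described in the article are deliberately left out.

```python
import math

def distance_attention(query, keys, values, bandwidth=1.0):
    """Nadaraya-Watson prediction at `query` (hypothetical helper).

    keys   : list of sensor locations (tuples of coordinates)
    values : list of measurements, one per sensor
    Returns the prediction and the attention weights.
    """
    # Squared Euclidean distance between the query and each sensor location.
    d2 = [sum((q - k) ** 2 for q, k in zip(query, key)) for key in keys]
    # Gaussian kernel exponents (assumed kernel choice).
    scaled = [d / (2.0 * bandwidth ** 2) for d in d2]
    # Shift by the smallest exponent so exp() never underflows for all sensors.
    smallest = min(scaled)
    scores = [math.exp(-(s - smallest)) for s in scaled]
    # Normalize the scores so the weights sum to one (softmax over distances).
    total = sum(scores)
    weights = [s / total for s in scores]
    # The prediction is the attention-weighted average of the measurements.
    prediction = sum(w * v for w, v in zip(weights, values))
    return prediction, weights

# Toy network of three sensors (made-up coordinates and measurements).
keys = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
values = [10.0, 20.0, 30.0]
prediction, weights = distance_attention((0.5, 0.0), keys, values, bandwidth=0.5)
```

In this toy example the query lies midway between the first two sensors, so they receive equal weight and the distant third sensor contributes almost nothing. Unlike the kriging system, this form needs only a normalization of kernel scores rather than the solution of a linear system, which is consistent with the lower processing cost noted in the abstract.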

Information

Type
Methods Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Figure 1. Sequence of two timesteps of a network of heterogeneous sensors measuring a spatial phenomenon. A crossed circle represents a sensor device whose measurement is representative at this point. A square represents a sensor device whose measurement represents an average of the phenomenon over the surrounding area. The measurement quality ranges from high (green) to medium (orange) to low (red). Sensor devices might be mobile and might not provide a measurement at every timestep.

Figure 2. Schematic description of accuracy and precision for three identical sensor devices measuring one phenomenon. The center of each target represents the phenomenon to be measured. Case (a): sensor devices with high accuracy and high precision; cases (b) and (c): sensor devices with low accuracy and high precision; case (d): sensor devices with high accuracy and low precision.

Figure 3. Accuracy-precision diagram representing the measurement quality of three types of sensor devices: a reference station, a medium-cost sensor device, and a low-cost sensor device.

Figure 4. Schematic illustration of a network where each station is identified as $ \left\{{x}_i,{V}_j\right\} $, the location of the prediction is identified as $ {x}_{\ast } $, and its area of similitude is characterized by a radius $ R $.

Figure 5. Schematic illustration of the data fusion architecture. Network $ X $ provides input X and network $ Y $ provides inputs $ K $ and $ V $. Learnable parameters are $ {W}_O $, $ {W}_Q $, and $ {W}_K $. Attention is represented by the symbol $ A $ without any subscript to ease the reading. The prediction at query $ Q $ is symbolized as $ \hat{V} $.

Table 1. Description of the 12 data fusion models, with their name, their attention, and the characteristics of their learnable parameters

Figure 6. Illustration of the static two-dimensional ground truth used in case study “Simplistic.”

Figure 7. Illustration of the static two-dimensional ground truth used in case study “Topography.” It represents a subset of the 25-m spatial resolution Digital Elevation Model EU-DEM v1.1 over an area of 196 $ {km}^2 $ in Norway.

Figure 8. Illustration of the static two-dimensional ground truth used in case study “AH $ {NO}_2 $.” It represents the annual average of hourly nitrogen dioxide concentrations in 2019 over the Oslo metropolitan region simulated with the EPISODE dispersion model.

Table 2. Information about measurement spatial resolution for the three case studies

Table 3. Parameters related to the three measurement quality types

Table 4. Metrics of the data fusion models for case study “Simplistic”. Bold values represent the best metrics.

Figure 9. Illustration of one prediction for the case study “Simplistic” based on the measurements of 100 sensor devices (left). Each type of sensor device is described as a symbol: circle: $ {R}_H{Q}_H $, triangle down: $ {R}_H{Q}_M $, square: $ {R}_H{Q}_L $, pentagon: $ {R}_L{Q}_H $, star: $ {R}_L{Q}_M $, and diamond: $ {R}_L{Q}_L $. The prediction is based on the model NWNN3 and is carried out at 6400 locations (middle). Ground truth on these 6400 locations is presented on the panel on the right.

Figure 10. Illustration of one prediction for the case study “Topography” based on the measurements of 100 sensor devices (left). Each type of sensor device is described as a symbol: circle: $ {R}_H{Q}_H $, triangle down: $ {R}_H{Q}_M $, square: $ {R}_H{Q}_L $, pentagon: $ {R}_L{Q}_H $, star: $ {R}_L{Q}_M $, and diamond: $ {R}_L{Q}_L $. The prediction is based on the model NWNN3 and is carried out at 6400 locations (middle). Ground truth on these 6400 locations is presented on the panel on the right.

Figure 11. Illustration of one prediction for the case study “AH $ {NO}_2 $” based on the measurements of 100 sensor devices (left). Each type of sensor device is described as a symbol: circle: $ {R}_H{Q}_H $, triangle down: $ {R}_H{Q}_M $, square: $ {R}_H{Q}_L $, pentagon: $ {R}_L{Q}_H $, star: $ {R}_L{Q}_M $, and diamond: $ {R}_L{Q}_L $. The prediction is based on the model krigNN2 and is carried out at 6400 locations (middle). Ground truth on these 6400 locations is presented on the panel on the right.

Figure 12. Accuracy-precision diagram for case study “Simplistic” with both a priori measurement quality related to the 6 types of sensor devices (black circles) and the metrics of the 12 data fusion models (green stars). Some green stars represent several models with identical metrics (e.g., NWNN, krigNN2, and krigNN3).

Figure 13. Accuracy-precision diagram for case study “Topography” with both a priori measurement quality related to the 6 types of sensor devices (black circles) and the metrics of the 12 data fusion models (green stars).

Figure 14. Accuracy-precision diagram for case study “AH $ {NO}_2 $” with both a priori measurement quality related to the 6 types of sensor devices (black circles) and the metrics of the 12 data fusion models (green stars).

Figure 15. 2D maps of the learnable parameters $ {W}_K $ and $ {W}_O $ for a sensor device of high spatial resolution and high measurement quality. Two sets of learnable parameters are presented, corresponding to two krigNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $. From top to bottom: $ {W}_{K,x} $, $ {W}_{K,y} $, $ {W}_{K, acc} $, $ {W}_{K, prec} $, and $ {W}_O $.

Figure 16. 2D maps of the learnable parameters $ {W}_K $ and $ {W}_O $ for a sensor device of high spatial resolution and high measurement quality. Two sets of learnable parameters are presented, corresponding to two NWNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $. From top to bottom: $ {W}_{K,x} $, $ {W}_{K,y} $, $ {W}_{K, acc} $, $ {W}_{K, prec} $, and $ {W}_O $.

Figure 17. 2D maps of the lower dispersion (top) and upper dispersion (bottom). Two sets of dispersion are presented, corresponding to two krigNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $.

Figure 18. 2D maps of the lower dispersion (top) and upper dispersion (bottom). Two sets of dispersion are presented, corresponding to two NWNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $.

Figure 19. 2D maps of the metrics RMSE (top) and variance (bottom). Two sets of dispersion are presented, corresponding to two krigNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $.

Figure 20. 2D maps of the metrics RMSE (top) and variance (bottom). Two sets of dispersion are presented, corresponding to two NWNN2 models trained using two different sequences of sensor devices evolving on the same networks $ X $ and $ Y $.

Table 5. Metrics of the data fusion models for case study “Topography”. Bold values represent the best metrics.

Table 6. Metrics of the data fusion models for case study “AH $ {NO}_2 $”. Bold values represent the best metrics.