Quantifying uncertainty in land cover mappings: An adaptive approach to sampling reference data using Bayesian inference

Published online by Cambridge University Press:  25 October 2022

Jordan Phillipson*
Affiliation:
School of Computing and Communications, Lancaster University, Lancaster, United Kingdom
Gordon Blair
Affiliation:
School of Computing and Communications, Lancaster University, Lancaster, United Kingdom
Peter Henrys
Affiliation:
Lancaster office, UK Centre for Ecology and Hydrology, Lancaster, United Kingdom
*Corresponding author. E-mail: j.phillipson@lancaster.ac.uk

Abstract

Mappings play an important role in environmental science applications by allowing practitioners to monitor changes at national and global scales. Over the last decade, it has become increasingly popular to use satellite imagery data and machine learning techniques (MLTs) to construct such maps. Given the black-box nature of many of these MLTs, though, quantifying uncertainty in these maps often relies on sampling reference data under stricter conditions. However, practical constraints can make sampling such data expensive, which forces stakeholders to make a trade-off between the degree of uncertainty in predictions and the costs of collecting appropriately sampled reference data. Furthermore, quantifying any trade-off is often difficult, as it will depend on many interdependent factors that cannot be fully understood until more data is collected. This paper investigates how a combination of Bayesian inference and an adaptive approach to sampling reference data can offer a generalizable way of managing such trade-offs. The approach is illustrated and evaluated using a woodland mapping of England as a case study in which reference data is collected under constraints motivated by COVID-19 travel restrictions. The key findings of this paper are as follows: (a) an adaptive approach to sampling reference data allows an informed approach when quantifying this trade-off; and (b) Bayesian inference is naturally suited to adaptive sampling and can make use of Monte Carlo methods when dealing with more advanced problems and analytical techniques.

Information

Type
Methods Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. (a) Woodland mapping generated from the 2007 UK land cover map. (b) A mapping of the propensity scores based on the proximity to experts’ homes.

Figure 2. (Left) A mapping of the sampling area for the initial sample (blue). (Right) A scatter plot of mapped woodland area versus the ground truth area from the initial sample.

Figure 3. The key stages of adaptive sampling are represented as an iterative process.

Figure 4. A summary diagram of how Bayesian inference and methods (M1–M3) interact with the four key stages of adaptive sampling.

Figure 5. A summary of how the features in the woodland case study create challenges across the four key stages of adaptive sampling.

Figure 6. A Bayesian kernel machine regression (bkmr) model fitted to the initial sample.

Figure 7. (a) A map of the target area for the initial sample. (b) A map of the current level of precision for woodland area predictions. (c) A map for the estimated aleatoric component of uncertainty, a measure of the maximum level of precision for predictions under this model.

Figure 8. Measures of precision across the predictive features via heat maps. The light-blue points indicate the initial sample.

Figure 9. Measures of precision across the predictive features in 3D space (mapped woodland and propensity score). The black surface represents the current level of precision. The red surfaces represent estimates for the aleatoric components (posterior mode and 95% credible surfaces).

Figure 10. Spatial mappings for the targeted areas under each sample design (design 1: blue, design 2: green, design 3: yellow).

Figure 11. The predicted precision for woodland area predictions under the three proposed sample designs presented spatially.

Figure 12. The predicted precision for woodland area predictions under the three proposed sample designs across the predictive features via heat maps. The light-blue points indicate the initial sample and the colored rectangles display the target areas for the proposed sample designs.

Figure 13. The predicted precision for woodland area predictions under the three proposed sample designs, presented across the predictive features in 3D space (mapped woodland and propensity score).

Figure 14. A summary of how the features in the woodland case study create challenges across the four key stages of adaptive sampling and how methods M1–M3 help in overcoming them.

Table A1. A generalized workflow for the procedures introduced in Section 3, alongside a worked example.