
A novel workflow for streamflow prediction in the presence of missing gauge observations

Published online by Cambridge University Press: 04 July 2023

Rendani Mbuvha*
Affiliation:
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom; School of Statistics and Actuarial Science, University of Witwatersrand, Johannesburg, South Africa
Julien Y.P. Adounkpe
Affiliation:
Institut National de l’Eau, Université d’Abomey-Calavi, Cotonou, Benin
Mandela C.M. Houngnibo
Affiliation:
Agence Nationale de la Météorologie du Bénin, Cotonou, Benin
Wilson T. Mongwe
Affiliation:
Summerland Research and Development Centre, Agriculture and Agri-Food Canada, Summerland, BC, Canada
Zahir Nikraftar
Affiliation:
School of Electronic Engineering and Computer Science, Queen Mary University of London, London, United Kingdom
Tshilidzi Marwala
Affiliation:
Summerland Research and Development Centre, Agriculture and Agri-Food Canada, Summerland, BC, Canada
Nathaniel K. Newlands
Affiliation:
School of Electrical and Electronic Engineering, University of Johannesburg, Johannesburg, South Africa
*Corresponding author: Rendani Mbuvha; Email: r.mbuvha@qmul.ac.uk

Abstract

Streamflow predictions are vital for detecting flood and drought events. Such predictions are even more critical to Sub-Saharan African regions that are vulnerable to the increasing frequency and intensity of such events. These regions are sparsely gaged, with few available gaging stations that are often plagued with missing data due to various causes, such as harsh environmental conditions and constrained operational resources. This work presents a novel workflow for predicting streamflow in the presence of missing gage observations. We leverage bias correction of the Group on Earth Observations Global Water and Sustainability Initiative ECMWF streamflow service (GESS) forecasts for missing data imputation and predict future streamflow using the state-of-the-art temporal fusion transformers (TFTs) at 10 river gaging stations in the Benin Republic. We show by simulating missingness in a testing period that GESS forecasts have a significant bias that results in poor imputation performance over the 10 Beninese stations. Our findings suggest that overall bias correction by Elastic Net and Gaussian Process regression achieves superior performance relative to traditional imputation by established methods. We also show that the TFT yields high predictive skill and further provides explanations for predictions through the weights of its attention mechanism. The findings of this work provide a basis for integrating Global streamflow prediction model data and the state-of-the-art machine learning models into operational early-warning decision-making systems in resource-constrained countries vulnerable to drought and flooding due to extreme weather events.

Information

Type
Application Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press

Figure 1. Benin’s catchments, rivers, and hydrological stations, with missing data rates indicated in the map legend.


Figure 2. Our proposed workflow for missing data imputation and streamflow forecasting in poorly gaged areas. In step (1), historical in-situ observations and the Group on Earth Observations Global Water and Sustainability Initiative ECMWF streamflow service (GESS) hindcasts are collected for overlapping periods. In step (2), an elastic net is trained to bias-correct the GESS hindcasts against the in-situ observations, yielding a completed dataset in step (3). The completed in-situ observational dataset is then augmented with temporal climate variables and static catchment characteristic data. A temporal fusion transformer is trained in step (4) to produce streamflow forecasts in step (5).
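Steps (2) and (3) of the workflow can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single predictor (the GESS hindcast value on the same day), uses the closed-form elastic-net solution that exists for one predictor, and clips imputed flows at zero. The function names, the penalty settings `alpha` and `l1_ratio`, and the toy numbers are all hypothetical.

```python
from statistics import fmean


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 shrinkage step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_elastic_net_1d(x, y, alpha=0.1, l1_ratio=0.5):
    """Fit y ~ intercept + slope * x with an elastic-net penalty on the slope.

    Minimises (1 / 2n) * sum((y - intercept - slope * x)^2)
              + alpha * l1_ratio * |slope|
              + 0.5 * alpha * (1 - l1_ratio) * slope^2,
    which has a closed-form solution when there is a single predictor.
    """
    n = len(x)
    x_mean, y_mean = fmean(x), fmean(y)
    cov_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / n
    var_x = sum((xi - x_mean) ** 2 for xi in x) / n
    slope = soft_threshold(cov_xy, alpha * l1_ratio) / (var_x + alpha * (1 - l1_ratio))
    intercept = y_mean - slope * x_mean
    return intercept, slope


def impute_with_bias_correction(observed, hindcast, alpha=0.1, l1_ratio=0.5):
    """Fill None entries in observed with bias-corrected hindcast values."""
    # Train only on days where both the gauge and the hindcast are available.
    pairs = [(h, o) for h, o in zip(hindcast, observed) if o is not None]
    intercept, slope = fit_elastic_net_1d(
        [h for h, _ in pairs], [o for _, o in pairs], alpha, l1_ratio
    )
    # Assumption: streamflow cannot be negative, so corrected values are clipped at 0.
    return [
        o if o is not None else max(0.0, intercept + slope * h)
        for h, o in zip(hindcast, observed)
    ]


# Toy data (made up): the hindcast runs at roughly six times the gauge values.
hindcast = [60.0, 120.0, 180.0, 240.0, 300.0, 360.0]
observed = [10.0, 20.0, None, 40.0, None, 60.0]
completed = impute_with_bias_correction(observed, hindcast)
print(completed)
```

On this toy series the two gaps are filled with values close to 30 and 50, while the observed entries are left unchanged. A practical setup would likely use more predictors and tune the penalty by cross-validation.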


Figure 3. Mean Kling–Gupta efficiency (KGE) and Nash–Sutcliffe efficiency (NSE) of each imputation method at varying levels of missingness across the 10 stations. The dashed blue line marks an NSE of zero. Imputation with the raw Group on Earth Observations Global Water and Sustainability Initiative ECMWF streamflow service forecasts (in red) produces the lowest KGE and NSE values of all competing methods.
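For readers unfamiliar with the two scores, the following sketch computes them from paired observed and simulated series. It assumes the standard NSE definition and the original KGE formulation of Gupta et al. (2009), with correlation, variability ratio, and mean ratio; this excerpt does not say which KGE variant the authors used. The function names and toy values are hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the observed mean."""
    obs_mean = fmean(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - residual / spread


def kge(observed, simulated):
    """Kling-Gupta efficiency (2009 form): 1 is perfect.

    Combines linear correlation r, variability ratio alpha (std_sim / std_obs)
    and bias ratio beta (mean_sim / mean_obs). Undefined for constant series.
    """
    n = len(observed)
    obs_mean, sim_mean = fmean(observed), fmean(simulated)
    obs_std, sim_std = pstdev(observed), pstdev(simulated)
    cov = sum((o - obs_mean) * (s - sim_mean) for o, s in zip(observed, simulated)) / n
    r = cov / (obs_std * sim_std)
    alpha = sim_std / obs_std
    beta = sim_mean / obs_mean
    return 1.0 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Toy values (made up): a simulation that is perfectly correlated with the
# observations but six times too large still scores very poorly.
obs = [10.0, 20.0, 30.0, 40.0, 50.0]
scaled = [6.0 * o for o in obs]
print(nse(obs, scaled), kge(obs, scaled))
```

Predicting the observed mean at every time step gives an NSE of exactly zero, which is why a zero-NSE reference line is a natural baseline: any method below it is worse than the climatological mean.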


Table 1. KGE of each imputation method at 20% missingness at each respective gaging station.


Figure 4. The infilled streamflow time series at the Koubéri station using the Group on Earth Observations Global Water and Sustainability Initiative ECMWF streamflow service (GESS) Lookup Bias Correction (a) and Elastic Net Bias Correction (b). The difference in y-axis scale between the two panels shows that the GESS forecasts are biased by a factor of about six relative to the in-situ data.


Table 2. A summary of the performance of each forecasting model during the testing period at each respective gaging station.


Figure 5. Average attention profiles of the temporal fusion transformer models in the testing period. The profiles indicate that the observation time points are the most influential in making predictions.

Supplementary material: PDF

Mbuvha et al. supplementary material
