Impact of data accuracy on the evaluation of COVID-19 mitigation policies

Published online by Cambridge University Press:  28 October 2021

Michele Starnini*
Affiliation:
ISI Foundation, via Chisola 5, 10126 Turin, Italy
Alberto Aleta
Affiliation:
ISI Foundation, via Chisola 5, 10126 Turin, Italy
Michele Tizzoni
Affiliation:
ISI Foundation, via Chisola 5, 10126 Turin, Italy
Yamir Moreno
Affiliation:
ISI Foundation, via Chisola 5, 10126 Turin, Italy; Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, Zaragoza, Spain; Department of Theoretical Physics, University of Zaragoza, Zaragoza, Spain
*Corresponding author. E-mail: michele.starnini@gmail.com

Abstract

Evaluating the effectiveness of nonpharmaceutical interventions (NPIs) to mitigate the COVID-19 pandemic is crucial to maximize epidemic containment while minimizing the social and economic impact of these measures. However, this endeavor relies heavily on surveillance data publicly released by health authorities, which can have several hidden limitations. In this article, we quantify the impact of inaccurate data on the estimation of the time-varying reproduction number $ R(t) $, a pivotal quantity for gauging the change in transmissibility caused by the implementation of different NPIs. We focus on Italy and Spain, two of the European countries most severely hit by the COVID-19 pandemic. For these two countries, we highlight several biases of case-based surveillance data, as well as temporal and spatial limitations in the data regarding the implementation of NPIs. We also demonstrate that an unbiased estimation of $ R(t) $ could have had direct consequences on the decisions taken by the Spanish and Italian governments during the first wave of the pandemic. Our study shows that extreme care should be taken when evaluating intervention policies through publicly available epidemiological data and calls for an improvement in the process of COVID-19 data collection, management, storage, and release. Better data policies will allow a more precise evaluation of the effects of containment measures, empowering public health authorities to make more informed decisions.

Information

Type
Research Article
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited
Open Practices
Open data
Copyright
© The Author(s), 2021. Published by Cambridge University Press

Figure 1. Time series of the incidence per 10,000 individuals in (a) Spain and (b) Italy. In red, the incidence reported by the European Centre for Disease Prevention and Control (ECDC), based on the daily notification of cases provided by each European region. In light blue, the incidence according to updated data from the corresponding Health Ministries, indexed by symptom onset date. Time series are smoothed with a 7-day moving average. The dotted red lines show the estimated delay (in days) between the two curves (right scale), which is not constant in time.

Figure 2. Evolution of mobility during the first wave of COVID-19 in (a) Spain and (b) Italy, represented by the change in workplace mobility as reported by Google’s mobility data. Estimated value of $ R(t) $ during the first wave of COVID-19 in (c) Spain and (d) Italy. In all plots, the hardest-hit region in each country (Madrid and Lombardy, respectively) is colored in red, while the same values obtained with data aggregated at the country level are plotted in light blue.

Figure 3. Evolution of $ R(t) $ using the old data available until June or the updated datasets in (a) Spain and (b) Italy. Solid lines represent the value of $ R(t) $ estimated using the data that were made publicly available by the authorities of each country, while the dashed lines use the updated data on symptom onset. The lock icon marks the date when each country went into lockdown. The factory icon with a lock shows when a further closure of some industries was imposed, and the factory icon without a lock shows when that closure was lifted.

Table 1. First day for which the value of $ R(t) $ is consistently below 1 within a 95% credible interval for each time-series
