
Application of recommender systems and time series models to monitor quality at HIV/AIDS health facilities

Published online by Cambridge University Press:  11 July 2022

Jonathan Friedman*
Affiliation:
Monitoring, Evaluation, Learning, and Analytics (MELA) Department, Palladium, Washington, District of Columbia, USA
Zola Allen
Affiliation:
Monitoring, Evaluation, Learning, and Analytics (MELA) Department, Palladium, Washington, District of Columbia, USA
Allison Fox
Affiliation:
Monitoring, Evaluation, Learning, and Analytics (MELA) Department, Palladium, Washington, District of Columbia, USA
Jose Webert
Affiliation:
United States Agency for International Development, Office of HIV/AIDS, Washington, District of Columbia, USA
Andrew Devlin
Affiliation:
United States Agency for International Development, Office of HIV/AIDS, Washington, District of Columbia, USA
*
*Corresponding author. E-mail: jonathan.friedman@thepalladiumgroup.com

Abstract

The US government invests substantial sums to control the HIV/AIDS epidemic. To monitor progress toward epidemic control, PEPFAR, the President’s Emergency Plan for AIDS Relief, oversees a data reporting system that includes standard indicators, reporting formats, information systems, and data warehouses. These data, reported quarterly, inform understanding of the global epidemic, resource allocation, and identification of trouble spots. PEPFAR has developed tools to assess the quality of the data reported. These tools have made important contributions but are limited in the methods they use to identify anomalous data points. The most advanced consider univariate probability distributions, whereas correlations between indicators suggest that a multivariate approach is better suited. For temporal analysis, the same tool compares values to the averages of preceding periods but does not consider underlying trends and seasonal factors. To that end, we apply two methods to identify anomalous data points among routinely collected facility-level HIV/AIDS data. The first is recommender systems, an unsupervised machine learning method that captures relationships between users and items. We apply the approach in a novel way: predicting reported values, comparing predicted to reported values, and identifying the greatest deviations. For a temporal perspective, we apply time series models flexible enough to include trend and seasonality. Results of these methods were validated against manual review (95% agreement on non-anomalies and 56% agreement on anomalies for recommender systems; 96% agreement on non-anomalies and 91% agreement on anomalies for time series). This tool will bring greater methodological sophistication to monitoring data quality in an accelerated and standardized manner.
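The temporal approach described above can be illustrated with a minimal sketch. The abstract does not specify the authors' model, so the code below is an assumption-laden stand-in: it decomposes a quarterly series into a linear trend (ordinary least squares), quarterly seasonal offsets, and residuals, then flags points whose residual exceeds a z-score threshold. The function name `flag_anomalies`, the period of 4, and the threshold of 3 standard deviations are all illustrative choices, not taken from the paper.

```python
# Hedged sketch of trend-plus-seasonality anomaly detection on one
# facility-level indicator. Illustrative only; not the paper's model.
from statistics import mean, stdev

def flag_anomalies(values, period=4, z_thresh=3.0):
    """values: chronological list of quarterly counts for one indicator.
    Returns indices whose residual exceeds z_thresh residual standard
    deviations after removing a linear trend and seasonal offsets."""
    n = len(values)
    xs = list(range(n))
    # Linear trend component via ordinary least squares.
    x_bar, y_bar = mean(xs), mean(values)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values)) / \
            sum((x - x_bar) ** 2 for x in xs)
    intercept = y_bar - slope * x_bar
    detrended = [y - (intercept + slope * x) for x, y in zip(xs, values)]
    # Seasonal component: mean of the detrended series within each quarter.
    seasonal = [mean(detrended[q::period]) for q in range(period)]
    residuals = [d - seasonal[i % period] for i, d in enumerate(detrended)]
    sd = stdev(residuals)
    return [i for i, r in enumerate(residuals) if sd and abs(r) > z_thresh * sd]
```

A series that follows its trend and seasonal pattern yields no flags, while a single implausible spike, of the kind a reporting error might produce, stands out against the residual distribution.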

Information

Type
Translational Article
Creative Commons
Creative Commons License - CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2022. Published by Cambridge University Press

Figure 1. Recommender systems approach for anomaly detection. In the illustrative flow, an HIV data set is linked to the tool, which has no previous knowledge of the data. The tool examines data points and learns which variables correlate with other variables. The tool then calculates a covariance value for each pair that describes the relationship between indicators. Based on these relationships, the tool predicts each value from the other values observed for the facility. The tool compares each predicted value to the actual value in the original data set and flags instances where the two differ greatly.
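The flow in Figure 1 can be sketched in its simplest two-indicator form. The caption does not give the authors' implementation, so the code below is an illustrative stand-in: it estimates the covariance between two correlated indicators across facilities, uses the implied least-squares line to predict one indicator from the other, and flags the facilities whose reported values deviate most from their predictions. The function name `predict_and_flag` and the 2-standard-deviation threshold are assumptions for illustration.

```python
# Hedged sketch of covariance-based prediction and deviation flagging
# across facilities. Illustrative only; not the authors' implementation.
from statistics import mean, stdev

def predict_and_flag(x, y, z_thresh=2.0):
    """x, y: parallel lists of two correlated indicators, one pair per
    facility. Predicts y from x via the covariance-based least-squares
    line and returns (predictions, indices of flagged facilities)."""
    x_bar, y_bar = mean(x), mean(y)
    n = len(x)
    cov_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)
    var_x = sum((a - x_bar) ** 2 for a in x) / (n - 1)
    slope = cov_xy / var_x
    preds = [y_bar + slope * (a - x_bar) for a in x]
    residuals = [b - p for b, p in zip(y, preds)]
    sd = stdev(residuals)
    flagged = [i for i, r in enumerate(residuals) if sd and abs(r) > z_thresh * sd]
    return preds, flagged
```

With more than two indicators, the same idea generalizes to a full covariance matrix, as in Tables 1 and 2: each indicator is predicted from its correlated neighbors and the largest prediction errors are surfaced for review.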


Table 1. Covariance matrix with variance heatmap


Table 2. Covariance matrix with covariance heatmap


Table 3. Example output of actual and predicted values


Table 4. Example time series output


Table 5. Results of the experts’ validation of the recommender results


Table 6. Results of the experts’ validation of the time series results


Table 7. Results on validation against DQAs
