Predictive model for genital tract infections among men and women in Ghana: An application of LASSO penalized cross-validation regression model

Published online by Cambridge University Press:  06 December 2024

Michael Yao Ntumy
Affiliation:
Department of Obstetrics and Gynaecology, University of Ghana Medical School, College of Health Sciences, Accra, Ghana
John Tetteh*
Affiliation:
Department of Community Health, University of Ghana Medical School, College of Health Sciences, Accra, Ghana
Stephen Aguadze
Affiliation:
Research Unit, Korle Bu Teaching Hospital, Accra, Ghana
Swithin M. Swaray
Affiliation:
National Cardiothoracic Centre, Korle Bu Teaching Hospital, Accra, Ghana
Emilia Asuquo Udofia
Affiliation:
Department of Community Health, University of Ghana Medical School, College of Health Sciences, Accra, Ghana
Alfred Edwin Yawson
Affiliation:
Department of Community Health, University of Ghana Medical School, College of Health Sciences, Accra, Ghana
*Corresponding author: John Tetteh; Email: bigjayasamoah@gmail.com

Abstract

To enhance the capacity for early and effective management of genital tract infections at the primary and secondary levels of the healthcare system, we developed and internally validated a prediction model to estimate an individual’s risk of self-reported genital tract infections (sGTIs) at the community level in Ghana. The study involved 32,973 men and women aged 15–49 years from three rounds of the Ghana Demographic Health Survey conducted between 2003 and 2014. The outcome was sGTIs. We applied least absolute shrinkage and selection operator (LASSO) penalized regression with 10-fold cross-validation to 11 predictors selected from a prior review of the literature. Bootstrapping was also employed as a sensitivity analysis to produce a robust model. We further used discrimination and calibration analyses to evaluate the performance of the model. Statistical significance was set at P-value <0.05. The mean ± standard deviation age was 29.1 ± 9.7 years, with a female preponderance (60.7%). The prevalence of sGTIs over the period was 11.2% (95% CI = 4.5–17.8), ranging from 5.4% (95% CI = 4.8–5.86) in 2003 to 17.5% (95% CI = 16.4–18.7) in 2014. The LASSO regression model retained all 11 predictors. With bootstrapping, the area under the curve, which measures the model’s ability to discriminate between those with and those without sGTIs, was approximately 73.50% (95% CI = 72.50–74.26). The calibration belt plot with bootstrapping showed no evidence of miscalibration (test statistic = 17.30; P-value = 0.060). The model performance was judged to be good and acceptable. In the absence of clinical measurement, this prediction tool can be used to identify individuals aged 15–49 years at high risk of sGTIs at the community level in Ghana. Frontline healthcare staff can use this tool for screening and early detection. We therefore propose external validation of the model to confirm its generalizability and reliability in different populations.
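
As an illustration of the discrimination analysis described in the abstract, the following minimal sketch computes an area under the curve (AUC) and a percentile bootstrap confidence interval. It is not the authors' analysis code. The function names (`auc`, `bootstrap_auc_ci`), the number of bootstrap resamples, the seed and the eight example values are all assumptions chosen for illustration, and the data are hypothetical rather than drawn from the GDHS.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as half).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    n_boot, alpha and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    n = len(labels)
    estimates = []
    while len(estimates) < n_boot:
        # Resample individuals with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one outcome class; draw again
        estimates.append(auc(ys, [scores[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical predicted risks for eight people (1 = reported an infection).
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.80]
point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f}, 95% CI = {low:.3f} to {high:.3f}")
```

The pairwise comparison is quadratic in the sample size, which is acceptable for a small illustration. For a sample the size of the one analysed here, a rank-based implementation or a statistical package would be the more practical choice.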

Information

Type
Original Paper
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Components involved in the model-building process

Table 2. Socio-demographic characteristics of men and women aged 15–49 years, GDHS 2003–2014

Figure 1. Prevalence of self-reported genital tract infections among men and women aged 15–49 years, GDHS 2003–2014.

Figure 2. LASSO penalized tenfold cross-validation regression model discrimination and calibration belt: analysis of the receiver operating characteristic (ROC) curve showing the area under the curve (AUC) for both cross-validation and calibration belt. The 45° diagonal line represents a model that discriminates no better than chance (AUC = 50%); the x-axis shows the proportion without sGTIs who were incorrectly classified as reporting sGTIs (false positive rate, or 1 − specificity); the y-axis shows the proportion with sGTIs who were correctly classified as reporting sGTIs (true positive rate, or sensitivity). cvAUC = mean cross-validated area under the curve.

Table 3. Predictive model building and checking

Table 4. Predictors of sexually transmitted infection from LASSO penalized tenfold cross-validation regression model among men and women aged 15–49 years, GDHS 2003–2014

Figure 3. Predicted probability of self-reported genital tract infections among men and women aged 15–49 years, GDHS 2003–2014.

Supplementary material: File

Ntumy et al. supplementary material