Predicting cardiovascular disease in patients with mental illness using machine learning

Published online by Cambridge University Press:  08 January 2025

Martin Bernstorff*
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Lasse Hansen
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark; Center for Humanities Computing, Aarhus University, Aarhus, Denmark
Kevin Kris Warnakula Olesen
Affiliation:
Department of Cardiology, Aarhus University Hospital, Aarhus, Denmark
Andreas Aalkjær Danielsen
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
Søren Dinesen Østergaard
Affiliation:
Department of Affective Disorders, Aarhus University Hospital – Psychiatry, Aarhus, Denmark; Department of Clinical Medicine, Aarhus University, Aarhus, Denmark
*Corresponding author: Martin Bernstorff; Email: manber@rm.dk

Abstract

Background

Cardiovascular disease (CVD) is twice as prevalent among individuals with mental illness as in the general population. Prevention strategies exist but require accurate risk prediction. This study aimed to develop and validate a machine learning model for predicting incident CVD among patients with mental illness using routine clinical data from electronic health records.

Methods

A cohort study was conducted using data from 74,880 patients with 1.6 million psychiatric service contacts in the Central Denmark Region from 2013 to 2021. Two machine learning models (XGBoost and regularised logistic regression) were trained on 85% of the data from six hospitals using 234 potential predictors. The best-performing model was externally validated on the remaining 15% of patients from another three hospitals. CVD was defined as myocardial infarction, stroke, or peripheral arterial disease.

Results

The best-performing model (hyperparameter-tuned XGBoost) demonstrated acceptable discrimination, with an area under the receiver operating characteristic curve of 0.84 on the training set and 0.74 on the validation set. It identified high-risk individuals 2.5 years before CVD events. For the psychiatric service contacts in the top 5% of predicted risk, the positive predictive value was 5%, and the negative predictive value was 99%. The model issued at least one positive prediction for 39% of patients who developed CVD.
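The positive and negative predictive values above depend on which share of contacts is flagged as positive. The following is a minimal sketch, not the authors' implementation, of how PPV and NPV could be computed when the top fraction of predicted risks is flagged. The function names `confusion_at_ppr` and `ppv_npv`, and the rule of rounding the flagged count to the nearest integer, are assumptions made for illustration.

```python
import math


def confusion_at_ppr(probs, labels, ppr=0.05):
    """Flag the top `ppr` share of predictions as positive and count outcomes.

    probs: predicted probabilities, one per prediction time.
    labels: 1 if a CVD event occurred within the lookahead window, else 0.
    """
    n = len(probs)
    n_flagged = max(1, round(n * ppr))  # assumption: at least one flagged
    # Rank prediction times by descending predicted probability.
    order = sorted(range(n), key=lambda i: probs[i], reverse=True)
    flagged = set(order[:n_flagged])
    tp = sum(1 for i in flagged if labels[i] == 1)
    fp = n_flagged - tp
    fn = sum(1 for i in range(n) if i not in flagged and labels[i] == 1)
    tn = n - n_flagged - fn
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def ppv_npv(counts):
    """Return (PPV, NPV); NaN where a denominator is zero."""
    pos = counts["tp"] + counts["fp"]
    neg = counts["tn"] + counts["fn"]
    ppv = counts["tp"] / pos if pos else math.nan
    npv = counts["tn"] / neg if neg else math.nan
    return ppv, npv
```

With a low event rate, a high NPV is expected almost regardless of the model, so the PPV at a given predicted positive rate is usually the more informative of the two numbers.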

Conclusions

A machine learning model can accurately predict CVD risk among patients with mental illness using routinely collected electronic health record data. A decision support system building on this approach may aid primary CVD prevention in this high-risk population.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of European Psychiatric Association
Figure 1. Extraction of data and outcome, dataset splitting, prediction time filtering, specification of predictors and flattening, model training, testing, and evaluation. (A) Data were extracted from the electronic health records. (B) Potential CVD was identified. (C) The dataset was split geographically into an independent training dataset (85%) and test dataset (15%), with no patient present in both groups. (D) Prediction times were removed if their lookbehind window extended beyond the start of the dataset or their lookahead window extended beyond the end of the dataset. Prediction times were also removed after a patient developed CVD. (E) Predictors were grouped. (F) Predictors for each prediction time were extracted by aggregating the variables within the lookbehind window with multiple aggregation functions. As a result, each row in the dataset represents a specific prediction time with a column for each predictor. (G) Predictor layers were added until model performance no longer improved. (H) Models were trained and optimised on the training set using five-fold cross-validation. Hyperparameters were tuned to optimise AUROC. (I) The best candidate model was evaluated on the independent test set. True positives were predictions with a predicted probability above the decision threshold where the patient had a CVD event within the lookahead window. False positives were predictions with a predicted probability above the decision threshold where the patient did not have a CVD event within the lookahead window. False negatives had predicted probabilities below the threshold, but the patient had a CVD event within the lookahead window. True negatives had predicted probabilities below the threshold, and the patient did not have a CVD event within the lookahead window.
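Steps (D) and (F) of the caption can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: the names `is_eligible` and `flatten`, the inclusive and exclusive window boundaries, and the treatment of a prediction time falling on the CVD date are all hypothetical choices that the caption does not specify.

```python
from datetime import date, timedelta
from statistics import mean


def is_eligible(pred_time, data_start, data_end,
                lookbehind_days, lookahead_days, cvd_date=None):
    """Step (D): keep a prediction time only if both windows fit in the data."""
    if pred_time - timedelta(days=lookbehind_days) < data_start:
        return False  # lookbehind extends before the start of the dataset
    if pred_time + timedelta(days=lookahead_days) > data_end:
        return False  # lookahead extends beyond the end of the dataset
    # Assumption: a prediction time on the CVD date itself is also dropped.
    if cvd_date is not None and pred_time >= cvd_date:
        return False
    return True


def flatten(pred_time, measurements, lookbehind_days, aggs):
    """Step (F): aggregate one variable within the lookbehind window.

    measurements: list of (date, value) pairs for a single patient.
    aggs: mapping from aggregation name to function, e.g. {"mean": mean}.
    Returns one column per aggregation; None when the window is empty.
    """
    window_start = pred_time - timedelta(days=lookbehind_days)
    # Assumption: the window is [window_start, pred_time).
    values = [v for t, v in measurements if window_start <= t < pred_time]
    return {
        f"{name}_{lookbehind_days}d": (fn(values) if values else None)
        for name, fn in aggs.items()
    }


# Toy usage: one prediction time, one variable, three aggregation functions.
row = flatten(
    date(2015, 7, 1),
    [(date(2013, 1, 1), 9.0), (date(2015, 1, 1), 5.0), (date(2015, 6, 1), 7.0)],
    lookbehind_days=365,
    aggs={"mean": mean, "min": min, "max": max},
)
```

Repeating `flatten` for every variable and lookbehind length yields one wide row per prediction time, which matches the "one row per prediction time, one column per predictor" layout described in the caption.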

Table 1. Descriptive statistics for service contacts (A) and patients (B) that were eligible for prediction

Figure 2. Results from model training of all models (A) and on geographically independent (external/test) data (B–E). (A) Results of experiments across aggregation methods (mean vs. min, mean, and max), lookbehinds (730 days vs. 90, 365, and 730 days), predictor layers (1, +2, +3, +4), and hyperparameter tuning. Note that results for each layer also include the features of the prior layers. (B) Receiver operating characteristics (ROC) curve. (C) Confusion matrix. PPV, positive predictive value; NPV, negative predictive value. (D) Sensitivity by months from prediction time to event, stratified by desired predicted positive rate (PPR). Note that the numbers do not match those in Table 1, since all prediction times with insufficient lookahead distance have been dropped. (E) Time (months) from the first positive prediction to the patient developing CVD at a 5% predicted positive rate (PPR).
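The quantity in panel (E), the time from a patient's first positive prediction to the CVD event, could be derived along the following lines. This is a hedged sketch: the function name `days_from_first_positive`, the input layout, and the choice to report days rather than months are assumptions for illustration only.

```python
from datetime import date


def days_from_first_positive(predictions, cvd_dates):
    """Days from each patient's first positive prediction to their CVD event.

    predictions: iterable of (patient_id, prediction_date, is_positive).
    cvd_dates: mapping from patient_id to the date of the CVD event,
        containing only patients who developed CVD.
    Patients without any positive prediction before the event are omitted.
    """
    first_positive = {}
    for patient_id, when, is_positive in predictions:
        if not is_positive or patient_id not in cvd_dates:
            continue
        if when >= cvd_dates[patient_id]:
            continue  # only predictions made before the event count
        if patient_id not in first_positive or when < first_positive[patient_id]:
            first_positive[patient_id] = when
    return {
        pid: (cvd_dates[pid] - when).days
        for pid, when in first_positive.items()
    }


def share_with_positive(predictions, cvd_dates):
    """Share of patients with CVD who received at least one positive prediction."""
    if not cvd_dates:
        raise ValueError("no patients with CVD")
    return len(days_from_first_positive(predictions, cvd_dates)) / len(cvd_dates)
```

The second helper corresponds to a patient-level sensitivity: it counts a patient as detected if any of their prediction times was flagged before the event.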

Table 2. Performance by predicted positive rate for the best performing model (XGBoost) with 5 years of lookahead on the test set

Figure 3. Robustness of the best performing model across stratifications on geographically independent (external/test) data. The line is the area under the receiver operating characteristic curve. Bars represent the proportion of prediction times in each bin. Error bars are 95% confidence intervals from a 100-fold bootstrap.
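Error bars of this kind can be approximated with a percentile bootstrap of the AUROC. The sketch below is one plausible reading of "100-fold bootstrap" and not the authors' procedure: resampling individual prediction times (rather than patients), the percentile method, and the names `auroc` and `bootstrap_auroc_ci` are all assumptions.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(labels, scores, n_boot=100, alpha=0.05, seed=0):
    """Percentile confidence interval for AUROC from `n_boot` resamples."""
    auroc(labels, scores)  # fails early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that lost one of the classes
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Because a patient contributes many prediction times, resampling at the patient level would typically give wider intervals than resampling individual prediction times. The pairwise AUROC above is quadratic in the number of observations and is meant for small illustrations only.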

Supplementary material: File

Bernstorff et al. supplementary material
