
When does machine learning outperform clinicians? A comparison of prediction accuracy for PTSD treatment outcomes

Published online by Cambridge University Press:  11 December 2025

Philip Held*
Affiliation:
Rush University Medical Center, USA
Dale L. Smith
Affiliation:
University of Illinois Chicago, USA
Daniel R. Szoke
Affiliation:
Rush University Medical Center, USA
Sarah A. Pridgen
Affiliation:
Rush University Medical Center, USA
*Corresponding author: Philip Held; Email: philip_held@rush.edu

Abstract

Background

Machine learning (ML) models show promise in predicting post-traumatic stress disorder (PTSD) treatment outcomes, but it is unknown how their predictions compare to those of clinicians. This study directly compared the accuracy of clinicians’ predictions of patient treatment outcomes with those of three ML models.

Methods

Twenty clinicians providing cognitive processing therapy repeatedly predicted outcomes for 194 veterans. We compared the accuracy of their predictions with that of three ML models (a recurrent neural network, a mixed-effects random forest, and a generalized linear mixed-effects model) on two key endpoints: clinically meaningful symptom reduction (≥10-point PCL-5 decrease) and posttreatment severity (final PCL-5 < 33). We analyzed prediction accuracy and the association between clinician confidence and accuracy using logistic mixed-effects models.
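As an illustration of the two endpoints defined above, the following minimal sketch shows how each could be coded as a binary outcome and how prediction accuracy could be scored against it. It assumes the decrease is measured from a baseline PCL-5 score to the final score; the names `classify_endpoints` and `accuracy` and all example scores are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Endpoints:
    meaningful_reduction: bool  # PCL-5 decrease of at least 10 points
    below_cutoff: bool          # final PCL-5 score below 33


def classify_endpoints(baseline_pcl5: int, final_pcl5: int) -> Endpoints:
    """Code both endpoints for one patient.

    Assumption: the reduction is baseline minus final score.
    """
    return Endpoints(
        meaningful_reduction=(baseline_pcl5 - final_pcl5) >= 10,
        below_cutoff=final_pcl5 < 33,
    )


def accuracy(predictions: Sequence[bool], outcomes: Sequence[bool]) -> float:
    """Proportion of predictions that match the observed binary outcome."""
    if len(predictions) != len(outcomes) or not outcomes:
        raise ValueError("predictions and outcomes must be non-empty and equal length")
    return sum(p == o for p, o in zip(predictions, outcomes)) / len(outcomes)


# Hypothetical patient: baseline 50, final 40.
# The 10-point drop meets the reduction endpoint, but 40 is not below 33.
example = classify_endpoints(50, 40)
```

The two endpoints can disagree for the same patient, as in the hypothetical case above, which is one reason accuracy is reported separately for each.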

Results

ML models were significantly more accurate than clinicians at predicting whether a patient’s posttreatment PCL-5 score would be below 33 (p < .001). However, no significant difference in accuracy was found for predicting a ≥10-point symptom reduction (p = .734). Clinician confidence increased throughout treatment and was significantly associated with greater prediction accuracy for both outcomes (ORs = 1.06, ps < .001).
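To make the reported odds ratio more concrete, the sketch below shows how an OR of 1.06 compounds over several units of confidence. It assumes the OR applies to each one-unit increase in the confidence rating; the abstract does not state the scale, so the 10-unit change and the 50% starting accuracy are purely hypothetical.

```python
def odds_multiplier(odds_ratio: float, units: float) -> float:
    """Multiplicative change in odds for a given number of predictor units."""
    return odds_ratio ** units


def shifted_probability(start_prob: float, odds_ratio: float, units: float) -> float:
    """Convert a starting probability to odds, apply the OR, convert back."""
    odds = start_prob / (1.0 - start_prob) * odds_multiplier(odds_ratio, units)
    return odds / (1.0 + odds)


# Hypothetical scenario: confidence rises by 10 units from a point where
# the clinician's prediction is correct half of the time.
multiplier = odds_multiplier(1.06, 10)              # roughly 1.79
new_prob = shifted_probability(0.5, 1.06, 10)       # roughly 0.64
```

Under these assumptions, a per-unit OR that looks small can still correspond to a noticeable difference in accuracy across a wide confidence range.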

Conclusions

ML models can outperform clinicians in predicting posttreatment symptom severity, particularly early in treatment, suggesting they could be a useful tool for identifying patients at risk for suboptimal outcomes. However, ML models were not superior in predicting symptom reduction, where clinicians also performed at a high level. Findings support the selective use of ML to enhance, rather than replace, clinical judgment in PTSD treatment.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press or the rights holder(s) must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Prediction of a 10-point PCL-5 reduction. Note: LMM, linear mixed-effects model; MERF, mixed-effects random forest; PCL-5, PTSD Checklist for DSM-5; RNN, recurrent neural network.

Figure 2. Prediction of end-of-treatment scores at or below 33 points on the PCL-5. Note: LMM, linear mixed-effects model; MERF, mixed-effects random forest; PCL-5, PTSD Checklist for DSM-5; RNN, recurrent neural network.

Table 1. Prediction accuracy comparisons between clinician ratings and machine learning approaches

Figure 3. Confidence and accuracy predicting a 10-point PCL-5 reduction. Note: PCL-5, PTSD Checklist for DSM-5.

Figure 4. Confidence and accuracy predicting end-of-treatment scores at or below 33 points on the PCL-5. Note: PCL-5, PTSD Checklist for DSM-5.

Supplementary material: File

Held et al. supplementary material (File, 341.5 KB)