
Capitalizing on natural language processing (NLP) to automate the evaluation of coach implementation fidelity in guided digital cognitive-behavioral therapy (GdCBT)

Published online by Cambridge University Press:  02 April 2025

Nur Hani Zainal*
Affiliation:
Department of Psychology, National University of Singapore (NUS), Singapore
Regina Eckhardt
Affiliation:
Technical University of Munich, TUM School of Life Sciences, Freising, Germany
Gavin N. Rackoff
Affiliation:
Department of Psychology, The Pennsylvania State University, University Park, PA, USA
Ellen E. Fitzsimmons-Craft
Affiliation:
Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
Elsa Rojas-Ashe
Affiliation:
Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA
Craig Barr Taylor
Affiliation:
Department of Psychiatry and Behavioral Sciences, Stanford University, Stanford, CA, USA; Department of Psychology, Palo Alto University, Palo Alto, CA, USA
Burkhardt Funk
Affiliation:
Department of Information Systems and Data Science, Leuphana University Lüneburg, Lüneburg, Germany
Daniel Eisenberg
Affiliation:
Fielding School of Public Health, University of California at Los Angeles, Los Angeles, CA, USA
Denise E. Wilfley
Affiliation:
Department of Psychiatry, Washington University School of Medicine, St. Louis, MO, USA
Michelle G. Newman
Affiliation:
Department of Psychology, The Pennsylvania State University, University Park, PA, USA
*Corresponding author: Nur Hani Zainal; Email: hanizainal@nus.edu.sg

Abstract

Background

As the use of guided digitally-delivered cognitive-behavioral therapy (GdCBT) grows, pragmatic analytic tools are needed to evaluate coaches’ implementation fidelity.

Aims

We evaluated how natural language processing (NLP) and machine learning (ML) methods might automate the monitoring of coaches’ implementation fidelity to GdCBT delivered as part of a randomized controlled trial.

Method

Coaches served as guides to 6-month GdCBT with 3,381 assigned users with or at risk for anxiety, depression, or eating disorders. CBT-trained and supervised human coders used a rubric to rate the implementation fidelity of 13,529 coach-to-user messages. NLP methods abstracted data from text-based coach-to-user messages, and 11 ML models predicting coach implementation fidelity were evaluated.

Results

Inter-rater agreement by human coders was excellent (intra-class correlation coefficient = .980–.992). Coaches achieved behavioral targets at the start of the GdCBT and maintained strong fidelity throughout most subsequent messages. Coaches also avoided prohibited actions (e.g. reinforcing users’ avoidance). Sentiment analyses generally indicated a higher frequency of coach-delivered positive than negative sentiment words and predicted coach implementation fidelity with acceptable performance metrics (e.g. area under the receiver operating characteristic curve [AUC] = 74.48%). The final best-performing ML algorithms that included a more comprehensive set of NLP features performed well (e.g. AUC = 76.06%).
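To make the reported metric concrete, the following is a minimal sketch of how a lexicon-based sentiment score could be evaluated against human fidelity ratings using AUC. It is not the authors' pipeline: the word lists, the example messages, their fidelity labels, and the function names `net_sentiment` and `auc` are all hypothetical stand-ins chosen for illustration (the study itself used the Bing sentiment lexicon and 11 ML models).

```python
import re

# Hypothetical toy lexicon for illustration only; the study used the Bing lexicon.
POSITIVE = {"great", "glad", "progress", "helpful", "well"}
NEGATIVE = {"worry", "difficult", "hard", "sad", "avoid"}


def net_sentiment(message: str) -> int:
    """Return the count of positive words minus the count of negative words."""
    tokens = re.findall(r"[a-z']+", message.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return pos - neg


def auc(scores, labels):
    """AUC as the probability that a randomly chosen high-fidelity message
    scores above a randomly chosen low-fidelity one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up coach messages with made-up labels (1 = high fidelity, 0 = suboptimal).
messages = [
    ("Great progress this week, glad the exercise was helpful", 1),
    ("You did well on the exposure task", 1),
    ("It is fine to avoid the hard situations", 0),
    ("That sounds difficult", 0),
]
scores = [net_sentiment(text) for text, _ in messages]
labels = [label for _, label in messages]
example_auc = auc(scores, labels)
```

On this toy data the score separates the two classes perfectly, which real messages would not; an AUC of 0.5 corresponds to chance-level ranking, so values such as those reported above indicate ranking that is better than chance but far from perfect.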

Conclusions

NLP and ML tools could help clinical supervisors automate monitoring of coaches’ implementation fidelity to GdCBT. These tools could maximize allocation of scarce resources by reducing the personnel time needed to measure fidelity, potentially freeing up more time for high-quality clinical care.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Figure 1. Schematic diagram of data analytic steps. Note. ML, machine learning; NLP, natural language processing. Please refer to Supplementary Table S1 for more information on the best coach fidelity rubric codes.

Table 1. Examples of high coach-to-user implementation fidelity messages

Table 2. Examples of suboptimal coach-to-user implementation fidelity messages

Figure 2. Top 20 most frequently used words by GdCBT coaches when writing messages to users. Note. GdCBT, guided digital cognitive-behavioral therapy; n, frequency (word count).

Figure 3. Frequency of emotion sentiment words using the Bing sentiment lexicon.

Table 3. ML predictive performance of sentiment analyses with NLP to predict unique coach implementation fidelity

Table 4. Interpretation of performance metrics in predicting coach implementation fidelity

Table 5. Model performance of various classifiers to automate the evaluation of coach implementation fidelity

Supplementary material: File

Zainal et al. supplementary material
