Validation of a Bayesian Diagnostic and Inferential Model for Evidence-Based Neuropsychological Practice

Published online by Cambridge University Press:  07 April 2022

William F. Goette*
Affiliation:
Division of Psychology, Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
Anne R. Carlew
Affiliation:
Division of Psychology, Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
Jeff Schaffert
Affiliation:
Division of Psychology, Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
Ben K. Mokhtari
Affiliation:
Division of Psychology, Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
C. Munro Cullum
Affiliation:
Division of Psychology, Department of Psychiatry, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Neurology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Department of Neurological Surgery, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
*Correspondence and reprint requests to: William F. Goette, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, MC9044, Dallas, TX 75390, USA. Email: William.Goette@UTSouthwestern.edu

Abstract

Objective:

Evidence-based diagnostic methods have clinical and research applications in neuropsychology. A flexible Bayesian model was developed to yield diagnostic posttest probabilities from a single person’s neuropsychological score profile by utilizing sample descriptive statistics of the test battery across diagnostic populations of interest.

Methods:

Three simulation studies examined the model’s performance. The first examined the accuracy with which true z-scores were estimated. The second, a diagnostic accuracy simulation, utilized descriptive statistics from two popular neuropsychological tests, the Wechsler Adult Intelligence Scale–IV (WAIS-IV) and the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). The final simulation compared posterior predicted scores with those reported in the WAIS-IV manual.

Results:

The model produced minimally biased z-score estimates (root mean square errors: .02–.18) with appropriate credible intervals (95% credible interval empirical coverage rates: .94–1.00). The model correctly classified 80.87% of simulated normal, mild cognitive impairment, and Alzheimer’s disease cases using a four-subtest WAIS-IV and the RBANS, compared to accuracies of 60.67–65.60% from alternative methods. The posterior predictions of raw scores closely aligned with percentile estimates published in the WAIS-IV manual.

Conclusion:

This model permits estimation of posttest probabilities for various combinations of neuropsychological tests across any number of clinical populations with the principal limitation being the accessibility of applicable reference samples. The model produced minimally biased estimates of true z-scores, high diagnostic classification rates, and accurate predictions of multiple reported percentiles while using only simple descriptive statistics from reference samples. Future nonsimulation research on clinical data is needed to fully explore the utility of such diagnostic prediction models.
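
As a rough illustration of how posttest probabilities can follow from simple descriptive statistics, the Python sketch below applies Bayes' theorem across diagnostic groups. It is not the authors' model: it assumes each test is an independent normal z-score within each group and ignores uncertainty in the reference statistics. The function names, group means, standard deviations, and base rates are all hypothetical and are not taken from the paper.

```python
import math


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - (x - mean) ** 2 / (2 * sd ** 2)


def posttest_probabilities(scores, groups, base_rates):
    """Posttest probability of each group given one person's score profile.

    Assumes tests are independent normals within each group, which is a
    simplification made for this sketch only.
    """
    log_post = {}
    for name, stats in groups.items():
        log_lik = sum(normal_logpdf(x, m, s) for x, (m, s) in zip(scores, stats))
        log_post[name] = math.log(base_rates[name]) + log_lik
    # Normalize with the log-sum-exp trick for numerical stability.
    top = max(log_post.values())
    total = sum(math.exp(v - top) for v in log_post.values())
    return {name: math.exp(v - top) / total for name, v in log_post.items()}


# Hypothetical (mean, SD) pairs in z-score units for two tests per group.
groups = {
    "normal": [(0.0, 1.0), (0.0, 1.0)],
    "MCI": [(-1.0, 1.0), (-0.8, 1.0)],
    "AD": [(-2.0, 1.0), (-1.8, 1.0)],
}
# Hypothetical pretest probabilities (base rates) for the setting.
base_rates = {"normal": 0.6, "MCI": 0.25, "AD": 0.15}

probs = posttest_probabilities([-1.9, -1.7], groups, base_rates)
for name, p in probs.items():
    print(f"{name}: {p:.3f}")
```

Working on the log scale keeps the calculation stable when many tests are combined, since the product of many small densities would otherwise underflow.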

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © INS. Published by Cambridge University Press, 2022
Table 1. Sampling distributions under each condition

Table 2. Parameter recovery results

Table 3. Diagnostic accuracy results

Table 4. Comparison of posterior predicted and WAIS-IV-manual reported percentiles of the sum of scaled scores

Table 5. Comparison of posterior predicted and WAIS-IV-manual reported discrepancy scores

Fig. 1. Plotted are the posterior predictions of the WAIS subtests with the full vertical lines reflecting the 95% credible interval, the darker interior interval the 50% interval, and the large circle the mean score. The dark circles plotted over these are the observed scores for the client. These intervals appear large, encompassing nearly the full range of scaled scores in some cases; however, it is important to note that some subtests span much smaller ranges and that the internal 50% credible intervals vary in their boundaries across tests, suggesting that standard qualitative cutoffs for these tests may not be particularly informative (e.g., “average” being between scaled scores of 8 and 11). The individual, relative to the estimated population distribution of scores, can be said to have scored above average on Block Design (BD), Symbol Search (SS), and Coding (CD) but below average on Information (IN). The remaining scores for Similarities (SI), Digit Span (DS), Matrix Reasoning (MR), Vocabulary (VC), Arithmetic (AR), and Visual Puzzles (VP) may be considered to be within the average range. This interpretation is subject to clinical judgement, of course, with the current interpretations based on whether the scores fall within the 50% credible interval (and thus average) or outside of this range (and thus either above or below average).
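
The interval-based reading described in this caption can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes posterior predictive draws for a subtest are already available as a list of numbers, and the function name and the simulated draws are hypothetical, not output from the authors' model.

```python
import random
import statistics


def classify_against_interval(observed, draws):
    """Label a score relative to the central 50% interval of the draws.

    The 25th and 75th percentiles of the draws bound the 50% interval;
    scores inside it are labeled average, scores outside it above or below.
    """
    q1, _, q3 = statistics.quantiles(draws, n=4)
    if observed < q1:
        return "below average"
    if observed > q3:
        return "above average"
    return "average"


# Hypothetical stand-in for posterior predictive draws of one scaled score.
rng = random.Random(0)
draws = [rng.gauss(10, 3) for _ in range(4000)]
print(classify_against_interval(14, draws))
```

Because the interval is computed from the draws for each subtest separately, the cutoffs can differ across subtests rather than being fixed at the same scaled scores.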

Fig. 2. Plotted in this figure are the average subtest scores from the posterior predictions (the histogram) and the observed average subtest score for the client. While this plot is currently limited to just the overall average, similar plots could be created by index to visualize how the client's averages compare to a population-based estimate of averages. Plots of differences could likewise be created to show the posterior predicted population frequencies of certain discrepancies between average scores (e.g., Vocabulary minus average of subtests in the Verbal Comprehension Index). Any composite measure of the posterior predictions could be generated. For example, a similar plot could show the variability of subtests in the posterior predicted population against the variability of subtests in a particular client.
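
A comparison of this kind can be sketched as follows, assuming the posterior predictions are available as a list of simulated subtest profiles. The function name and all example values are hypothetical; the sketch only shows how a composite such as the mean or the standard deviation of subtests could be located within its predicted distribution.

```python
import statistics


def composite_percentile(client_scores, predicted_profiles, composite=statistics.mean):
    """Proportion of predicted profiles whose composite is at or below the client's.

    `composite` is any function that maps a list of subtest scores to a
    single number, such as a mean or a standard deviation.
    """
    client_value = composite(client_scores)
    predicted = [composite(profile) for profile in predicted_profiles]
    at_or_below = sum(value <= client_value for value in predicted)
    return at_or_below / len(predicted)


# Hypothetical predicted profiles of two subtests each, and one client profile.
predicted_profiles = [[8, 10], [10, 12], [12, 14], [9, 9]]
client_scores = [10, 10]

mean_rank = composite_percentile(client_scores, predicted_profiles)
spread_rank = composite_percentile(client_scores, predicted_profiles, statistics.pstdev)
print(mean_rank, spread_rank)
```

Passing the composite as a function means the same routine covers averages, discrepancies, and variability without separate code for each.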

Supplementary material: File

Goette et al. supplementary material