Accuracy of diagnostic classification algorithms using cognitive-, electrophysiological-, and neuroanatomical data in antipsychotic-naïve schizophrenia patients

Published online by Cambridge University Press: 18 December 2018

Bjørn H. Ebdrup*
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Faculty of Health and Medical Sciences, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
Martin C. Axelsen
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Cognitive Systems, DTU Compute, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
Nikolaj Bak
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark
Birgitte Fagerlund
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Department of Psychology, University of Copenhagen, Copenhagen, Denmark
Bob Oranje
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Faculty of Health and Medical Sciences, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark; Department of Psychiatry, Brain Center Rudolf Magnus, University Medical Center Utrecht, Utrecht, The Netherlands
Jayachandra M. Raghava
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Department of Clinical Physiology and Nuclear Medicine, Rigshospitalet, University of Copenhagen, Glostrup, Denmark
Mette Ø. Nielsen
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Faculty of Health and Medical Sciences, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
Egill Rostrup
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark
Lars K. Hansen
Affiliation:
Cognitive Systems, DTU Compute, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark
Birte Y. Glenthøj
Affiliation:
Centre for Neuropsychiatric Schizophrenia Research & Centre for Clinical Intervention and Neuropsychiatric Schizophrenia Research, Mental Health Centre Glostrup, University of Copenhagen, Copenhagen, Denmark; Faculty of Health and Medical Sciences, Department of Clinical Medicine, University of Copenhagen, Copenhagen, Denmark
*Author for correspondence: Dr Bjørn H. Ebdrup, E-mail: bebdrup@cnsr.dk

Abstract

Background

A wealth of clinical studies have identified objective biomarkers that separate schizophrenia patients from healthy controls at the group level, yet current diagnostic systems rely solely on clinical symptoms. In this study, we investigate whether machine learning algorithms applied to multimodal data can serve as a framework for clinical translation.

Methods

Forty-six antipsychotic-naïve, first-episode schizophrenia patients and 58 controls underwent neurocognitive tests, electrophysiology, and magnetic resonance imaging (MRI). Patients underwent clinical assessments before and after 6 weeks of antipsychotic monotherapy with amisulpride. Nine configurations of different supervised machine learning algorithms were applied, first to estimate the unimodal diagnostic accuracy and then to estimate the multimodal diagnostic accuracy. Finally, we explored the predictability of symptom remission.

Results

Cognitive data significantly discriminated patients from controls (accuracies = 60–69%; p values = 0.0001–0.009). Accuracies of electrophysiology, structural MRI, and diffusion tensor imaging did not exceed chance level. Multimodal analyses with cognition plus any combination of one or more of the remaining three modalities did not outperform cognition alone. None of the modalities predicted symptom remission.

Conclusions

In this multivariate and multimodal study in antipsychotic-naïve patients, only cognition significantly discriminated patients from controls, and no modality appeared to predict short-term symptom remission. Overall, these findings add to the growing call for cognition to be included in the definition of schizophrenia. To realize the full potential of machine learning algorithms in first-episode, antipsychotic-naïve schizophrenia patients, careful a priori variable selection based on independent data, as well as inclusion of other modalities, may be required.

Information

Type
Original Articles
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
Copyright © Cambridge University Press 2018
Table 1. Demographic and clinical data. Lifetime use of tobacco, alcohol, cannabis, stimulants, hallucinogens, and opioids was categorized on an ordinal five-item scale (0 = never tried, 1 = tried a few times, 2 = use regularly, 3 = harmful use, 4 = dependency).

Fig. 1. Diagram of the multivariate analysis pipeline. Forty-six patients and 58 healthy controls were included in the baseline analyses. ‘Data’ refer to input variables from cognition, electrophysiology, structural magnetic resonance imaging, and diffusion tensor imaging. For each of the 100 splits, 2/3 of subjects were used for training and 1/3 for testing. Training data were scaled (zero mean, unit variance), and the test sets were scaled using the same parameters. Missing data were imputed using K-nearest neighbor imputation with K = 3 (Bak and Hansen, 2016); only subjects with complete data were included in the test sets. Finally, nine different configurations of machine learning algorithms were applied to predict diagnosis. CV = cross-validation. See text for details.
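The preprocessing steps named in this caption can be illustrated with a minimal sketch. It is not the authors' code: the function name `one_split`, the `MISSING` marker, the toy data, and the choice of an unstratified shuffle are all assumptions, as is dropping incomplete subjects from the test third rather than reassigning them. The imputation method of Bak and Hansen (2016) may differ in detail from the plain K-nearest-neighbor averaging shown here, and the classifier step is omitted.

```python
import math
import random

MISSING = None  # marker for a missing measurement (assumption of this sketch)


def one_split(X, y, rng, k=3):
    """Return (X_train, y_train, X_test, y_test) for one random 2/3 : 1/3 split."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_train = round(len(idx) * 2 / 3)
    train = idx[:n_train]
    # Test subjects must have complete data; incomplete ones are dropped here.
    test = [i for i in idx[n_train:] if MISSING not in X[i]]

    # Scaling parameters are estimated from observed training values only.
    mean, std = [], []
    for j in range(len(X[0])):
        obs = [X[i][j] for i in train if X[i][j] is not MISSING]
        m = sum(obs) / len(obs)
        s = math.sqrt(sum((v - m) ** 2 for v in obs) / len(obs))
        mean.append(m)
        std.append(s or 1.0)  # guard against constant columns

    def scale(row):
        return [v if v is MISSING else (v - mean[j]) / std[j]
                for j, v in enumerate(row)]

    X_train = [scale(X[i]) for i in train]
    X_test = [scale(X[i]) for i in test]  # same parameters as training data

    def distance(a, b):
        # Mean squared difference over features observed in both rows.
        shared = [(u - v) ** 2 for u, v in zip(a, b)
                  if u is not MISSING and v is not MISSING]
        return sum(shared) / len(shared) if shared else math.inf

    # K-nearest-neighbor imputation within the training set.
    imputed = []
    for row in X_train:
        filled = list(row)
        for j, v in enumerate(row):
            if v is MISSING:
                donors = [r for r in X_train
                          if r is not row and r[j] is not MISSING]
                donors.sort(key=lambda r: distance(row, r))
                nearest = donors[:k]
                filled[j] = sum(r[j] for r in nearest) / len(nearest)
        imputed.append(filled)

    return imputed, [y[i] for i in train], X_test, [y[i] for i in test]


# Toy data with the study's group sizes; values and missingness are made up.
rng = random.Random(0)
y = [1] * 46 + [0] * 58
X = [[rng.gauss(0.5 * label, 1.0) for _ in range(5)] for label in y]
for row in X:
    for j in range(5):
        if rng.random() < 0.05:
            row[j] = MISSING

splits = [one_split(X, y, rng) for _ in range(100)]
print(len(splits), len(splits[0][0]), len(splits[0][2]))
```

Estimating the scaling parameters on the training subjects alone, and reusing them for the test subjects, keeps information from the test set out of the model, which is the point of the ordering described in the caption.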

Fig. 2. Unimodal diagnostic accuracies for cognition (Cog), electrophysiology (EEG), structural magnetic resonance imaging (sMRI), and diffusion tensor imaging (DTI) for each of the nine different configurations of machine learning algorithms. X-axes show the accuracies (acc), and y-axes show the sum of correct classifications for each of the 100 random subsamples (see Fig. 1). Dotted vertical black line indicates chance accuracy (56%). With cognitive data, all nine configurations of algorithms significantly classified ‘patient v. control’ (p values = 0.001–0.009). No algorithm using EEG, sMRI, or DTI data resulted in accuracies exceeding chance. The nine different configurations of machine learning algorithms were: nB, naïve Bayes; LR, logistic regression without regularization; LR_r, logistic regression with regularization; SVM_l, support vector machine with linear kernel; SVM_h, SVM with heuristic parameters; SVM_o, SVM optimized through cross-validation; DT, decision tree; RF, random forest; AS, auto-sklearn. See text for details.
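The caption does not say how the 56% chance level was derived. One reading that is consistent with the reported group sizes (46 patients, 58 controls) is the accuracy of always predicting the larger group; the short check below works through that arithmetic under this assumption, with variable names chosen for illustration.

```python
n_patients = 46
n_controls = 58

# Accuracy obtained by always predicting the larger group (healthy controls).
majority_baseline = max(n_patients, n_controls) / (n_patients + n_controls)
print(f"{majority_baseline:.1%}")  # 55.8%, i.e. about 56%
```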

Fig. 3. (a) Manhattan plot with univariate t tests of all variables along the x-axis [cognition (Cog), electrophysiology (EEG), structural magnetic resonance imaging (sMRI), and diffusion tensor imaging (DTI)] and log-transformed p values along the y-axis. Lower dashed horizontal line indicates the significance level of p = 0.05. Upper dashed lines indicate the Bonferroni-corrected p value for each modality. (b) Colored horizontal lines show the fraction of data splits (see Fig. 1) in which individual variables were included in the final machine learning model that determined the diagnostic accuracy (presented in Fig. 2). Specification of variables is provided in the online Supplementary Material. Only the six configurations of machine learning algorithms that included feature selection are shown. nB, naïve Bayes; LR, logistic regression without regularization; LR_r, logistic regression with regularization; SVM_l, support vector machine with linear kernel; DT, decision tree; RF, random forest.
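The relation between the dashed lines in panel (a) can be sketched as follows, assuming the usual Bonferroni rule (significance level divided by the number of tests in a modality) and a negative base-10 logarithm on the y-axis. The variable counts per modality are placeholders, not the study's actual numbers, and the function names are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def manhattan_height(p):
    """Height of a p value on a -log10 axis (assumed transform)."""
    return -math.log10(p)


# Placeholder counts of variables per modality (NOT taken from the study).
n_vars = {"Cog": 10, "EEG": 20, "sMRI": 100, "DTI": 50}

uncorrected_line = manhattan_height(0.05)
corrected_lines = {
    modality: manhattan_height(bonferroni_threshold(0.05, n))
    for modality, n in n_vars.items()
}

print(f"uncorrected: {uncorrected_line:.2f}")
for modality, height in corrected_lines.items():
    print(f"{modality}: {height:.2f}")
```

Under this rule, a modality with more variables has a stricter corrected threshold and therefore a higher dashed line, which is why each modality gets its own upper line.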

Supplementary material: File

Ebdrup et al. supplementary material (File, 1.8 MB)