Evaluating the performance of machine learning models for automatic diagnosis of patients with schizophrenia based on a single site dataset of 440 participants

Published online by Cambridge University Press:  23 December 2021

Lung-Hao Lee
Affiliation:
Department of Electrical Engineering, National Central University, Taoyuan City, Taiwan; Department of Medical Humanities and Education, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan; Pervasive Artificial Intelligence Research (PAIR) Labs, Hsinchu, Taiwan
Chang-Hao Chen
Affiliation:
Department of Electrical Engineering, National Central University, Taoyuan City, Taiwan; Pervasive Artificial Intelligence Research (PAIR) Labs, Hsinchu, Taiwan
Wan-Chen Chang
Affiliation:
Department of Biomedical Engineering, National Yang Ming Chiao Tung University, Taipei, Taiwan; Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan
Po-Lei Lee
Affiliation:
Department of Electrical Engineering, National Central University, Taoyuan City, Taiwan; Pervasive Artificial Intelligence Research (PAIR) Labs, Hsinchu, Taiwan
Kuo-Kai Shyu
Affiliation:
Department of Electrical Engineering, National Central University, Taoyuan City, Taiwan; Pervasive Artificial Intelligence Research (PAIR) Labs, Hsinchu, Taiwan
Mu-Hong Chen
Affiliation:
Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Faculty of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan
Ju-Wei Hsu
Affiliation:
Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Faculty of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan
Ya-Mei Bai
Affiliation:
Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Faculty of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan; Institute of Brain Science, National Yang-Ming Chiao Tung University, Taipei, Taiwan
Tung-Ping Su
Affiliation:
Department of Psychiatry, Faculty of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan; Institute of Brain Science, National Yang-Ming Chiao Tung University, Taipei, Taiwan; Department of Psychiatry, Cheng Hsin General Hospital, Taipei, Taiwan
Pei-Chi Tu*
Affiliation:
Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Taipei Veterans General Hospital, Taipei, Taiwan; Department of Psychiatry, Faculty of Medicine, National Yang-Ming Chiao Tung University, Taipei, Taiwan; Institute of Philosophy of Mind and Cognition, National Yang-Ming Chiao Tung University, Taipei, Taiwan
*Author for correspondence: Pei-Chi Tu, E-mail: peichitu@gmail.com

Abstract

Background

Support vector machines (SVMs) based on brain-wise functional connectivity (FC) have been widely adopted for single-subject prediction of patients with schizophrenia, but most such studies had small sample sizes. This study aimed to evaluate the performance of SVMs based on a large single-site dataset and to investigate the effects of demographic homogeneity and training sample size on classification accuracy.

Methods

The resting-state functional magnetic resonance imaging (fMRI) dataset comprised 220 patients with schizophrenia and 220 healthy controls. Brain-wise FCs were calculated for each participant, and linear SVMs were developed for automatic classification of patients and controls. First, we evaluated the SVMs based on all participants and on homogeneous subsamples of men, women, younger (18–30 years), and older (31–50 years) participants by 10-fold nested cross-validation. Then, we held out a fixed test set of 40 participants (20 patients and 20 controls) and evaluated the SVMs based on incremental training sample sizes (N = 40, 80, …, 400).
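
The first evaluation step described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: it assumes scikit-learn, uses synthetic time series in place of fMRI data, takes Pearson correlation as the FC measure, and searches a hypothetical grid of C values in the inner loop. The helper names (fc_features, make_synthetic_subject, nested_cv_accuracy) are invented for this example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fc_features(timeseries):
    """Return the upper triangle of the region-by-region Pearson correlation
    matrix as a feature vector. timeseries has shape (n_timepoints, n_regions)."""
    corr = np.corrcoef(timeseries, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)  # skip the diagonal
    return corr[upper]


def make_synthetic_subject(rng, is_patient, n_timepoints=60, n_regions=10):
    """Hypothetical stand-in for one participant's regional time series."""
    ts = rng.normal(size=(n_timepoints, n_regions))
    if is_patient:
        # Assumption for the demo only: couple regions 0 and 1 in one group.
        ts[:, 1] = ts[:, 0] + 0.5 * rng.normal(size=n_timepoints)
    return ts


def nested_cv_accuracy(X, y, n_splits=10, seed=0):
    """Nested CV: the inner loop tunes C, the outer loop estimates accuracy."""
    inner = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed + 1)
    pipeline = make_pipeline(StandardScaler(), SVC(kernel="linear"))
    # The grid of C values is an assumption; the source does not state one.
    search = GridSearchCV(pipeline, {"svc__C": [0.01, 0.1, 1.0]}, cv=inner)
    return cross_val_score(search, X, y, cv=outer).mean()


rng = np.random.default_rng(0)
y = np.array([0, 1] * 40)  # 40 synthetic controls, 40 synthetic patients
X = np.array([fc_features(make_synthetic_subject(rng, label == 1)) for label in y])
accuracy = nested_cv_accuracy(X, y)
print(f"Nested 10-fold CV accuracy on synthetic data: {accuracy:.3f}")
```

Because the hyperparameter search happens only inside each outer training fold, the outer-fold accuracy is not biased by tuning on the test participants, which is the reason for nesting.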

Results

We found that the SVMs based on all participants had an accuracy of 85.05%. The SVMs based on male, female, younger, and older participants yielded accuracies of 84.66, 81.56, 80.50, and 86.13%, respectively. Although the SVMs based on the older subsample performed better than those based on all participants, they generalized poorly to younger participants (77.24%). For incremental training sizes, the classification accuracy increased stepwise from 72.6 to 83.3%, with >80% accuracy achieved at training sample sizes >240.
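
The incremental-training analysis could be reproduced in outline as below. This is a hedged sketch on synthetic features, assuming scikit-learn: the balanced, nested way of drawing each training subset, the fixed C = 1.0, and the helper name incremental_accuracy are assumptions of this example, since the abstract does not specify them.

```python
import numpy as np
from sklearn.svm import SVC


def incremental_accuracy(X_train, y_train, X_test, y_test, sizes, seed=0):
    """Train a linear SVM on growing class-balanced subsets and score each
    model on the same fixed test set. Returns {training size: accuracy}."""
    rng = np.random.default_rng(seed)
    idx0 = rng.permutation(np.flatnonzero(y_train == 0))
    idx1 = rng.permutation(np.flatnonzero(y_train == 1))
    results = {}
    for n in sizes:
        half = n // 2  # assumption: equal numbers of patients and controls
        idx = np.concatenate([idx0[:half], idx1[:half]])
        clf = SVC(kernel="linear", C=1.0).fit(X_train[idx], y_train[idx])
        results[n] = clf.score(X_test, y_test)
    return results


# Synthetic stand-in for 440 participants with 20 features each.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 220)
X = rng.normal(size=(440, 20))
X[y == 1, :5] += 0.5  # weak group difference, for the demo only

# Hold out 20 participants per group as the fixed test set.
test_idx = np.concatenate([np.flatnonzero(y == 0)[:20], np.flatnonzero(y == 1)[:20]])
train_mask = np.ones(len(y), dtype=bool)
train_mask[test_idx] = False

curve = incremental_accuracy(
    X[train_mask], y[train_mask], X[test_idx], y[test_idx], sizes=range(40, 401, 40)
)
for n, acc in curve.items():
    print(f"N = {n:3d}: accuracy = {acc:.3f}")
```

Keeping the test set fixed across all training sizes means that differences along the curve reflect the amount of training data rather than changes in the evaluation sample.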

Conclusions

The findings indicate that SVMs based on a large dataset yield high classification accuracy, and that establishing models from a large sample with heterogeneous properties is recommended for single-subject prediction of schizophrenia.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike licence (http://creativecommons.org/licenses/by-nc-sa/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of the European Psychiatric Association

Table 1. Demographic and clinical features of the patients and controls in this study.

Figure 1. Automatic classifications of schizophrenic patients and healthy controls based on brain-wise functional connectivity. Brain-wise functional connectivity was calculated for each participant according to three different parcellations and linear support vector machines were developed and evaluated for performance. AAL-3 = the automated anatomical labeling atlas version 3; AAL-2 = the automated anatomical labeling atlas version 2.

Table 2. The performance of support vector machines based on different parcellations for automatic classifications of patients with schizophrenic disorder and healthy controls.

Table 3. The performance of support vector machines based on different homogeneous subsamples for automatic classifications of patients with schizophrenic disorder and healthy controls.

Figure 2. The effects of demographic homogeneity and training sample size on the performance of support vector machines (SVMs). (a) The classification accuracy of SVMs based on all participants and of those based on homogeneous subsamples of men, women, younger, and older participants. The SVMs based on homogeneous subsamples were also applied to participants with different demographic properties to assess their generalizability. (b) The classification accuracy of SVMs based on incremental training sample sizes improved consistently from 72.61 to 83.32%, and >81% accuracy was achieved at training sample sizes >240.

Table 4. The performance of generalization of support vector machines to participants with different demographic characteristics for automatic classifications of patients with schizophrenic disorder and healthy controls.

Table 5. The performance of support vector machines based on different training sample sizes for automatic classifications of patients with schizophrenic disorder and healthy controls.

Table 6. The functional connectivity features with greatest contributions to single subject classification of patients with schizophrenia.

Figure 3. The cortical and subcortical structures involved in the functional connectivities with greatest contributions to single subject classification of patients with schizophrenia.

Supplementary material: File

Lee et al. supplementary material

Tables S1-S3
