
Age-specific prevalence and predictors of lifetime suicide attempts using machine learning in Chinese adults: a nationwide multi-centre survey

Published online by Cambridge University Press:  20 October 2025

Yu Wu
Affiliation:
Department of Population Health and Aging Science, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Yihao Zhao
Affiliation:
Department of Population Health and Aging Science, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Panliang Zhong
Affiliation:
Department of Population Health and Aging Science, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Chen Chen
Affiliation:
Department of Population Health and Aging Science, School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
Yibo Wu
Affiliation:
School of Public Health, Peking University, Beijing, China
Xiaoying Zheng*
Affiliation:
China Center for Environmental and Energy Economics (C2E3), Peking University, Chengze Yuan, Beijing, China
*Corresponding author: Xiaoying Zheng; Email: zhengxiaoying@sph.pumc.edu.cn

Abstract

Aims

The epidemiology and age-specific patterns of lifetime suicide attempts (LSA) in China remain unclear. We aimed to examine age-specific prevalence and predictors of LSA among Chinese adults using machine learning (ML).

Methods

We analyzed 25,047 adults in the 2024 Psychology and Behavior Investigation of Chinese Residents (PBICR-2024), stratified into three age groups (18–24, 25–44, ≥ 45 years). Thirty-seven candidate predictors across six domains—sociodemographic, physical health, mental health, lifestyle, social environment, and self-injury/suicide history—were assessed. Five ML models—random forest, logistic regression, support vector machine (SVM), Extreme Gradient Boosting (XGBoost), and Naive Bayes—were compared. SHapley Additive exPlanations (SHAP) were used to quantify feature importance.
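As a minimal sketch of the age stratification described above, the following assigns each participant to one of the three groups. The function name `age_group`, the ASCII group labels, and the example ages are illustrative assumptions; the source does not describe how PBICR-2024 records age.

```python
from collections import Counter


def age_group(age: int) -> str:
    """Map an age in whole years to one of the three strata named in the Methods.

    Assumes integer ages; the survey's actual age coding is not given in the source.
    """
    if age < 18:
        raise ValueError("the analysis covers adults aged 18 years or older")
    if age <= 24:
        return "18-24"
    if age <= 44:
        return "25-44"
    return ">=45"


# Hypothetical ages, for illustration only (not study data).
example_ages = [19, 24, 25, 44, 45, 70]
counts = Counter(age_group(a) for a in example_ages)
print(dict(counts))
```

Keeping the boundaries in one function makes it easier to fit and evaluate a separate model per stratum without the cut-points drifting between steps.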

Results

The overall prevalence of LSA was 4.57% (1,145/25,047), with significant age differences: 8.10% in young adults (18–24), 4.67% in adults aged 25–44, and 2.67% in older adults (≥45). SVM achieved the best test-set performance across all ages [area under the curve (AUC) 0.88–0.94, sensitivity 0.79–0.87, specificity 0.81–0.88], showing superior calibration and net clinical benefit. SHAP analysis identified both shared and age-specific predictors. Suicidal ideation, adverse childhood experiences, and suicide disclosure were consistent top predictors across all ages. Sleep disturbances and anxiety symptoms stood out in young adults; marital status, living alone, and perceived stress in mid-life; and functional limitations, poor sleep, and depressive symptoms in older adults.
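The overall prevalence can be checked directly from the reported counts (1,145 cases among 25,047 adults). The sketch below also computes a Wilson 95% interval as one common way to express uncertainty around a proportion; the interval method is an assumption added here for illustration, not something the abstract reports.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = events / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the Results paragraph.
cases, total = 1145, 25047
prevalence = cases / total
low, high = wilson_interval(cases, total)

print(f"LSA prevalence: {prevalence:.2%}")
print(f"Wilson 95% interval (illustrative): {low:.2%} to {high:.2%}")
```

The age-specific figures (8.10%, 4.67%, 2.67%) cannot be recomputed the same way from the abstract alone, because the per-group case counts and denominators are not given there.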

Conclusions

LSA prevalence in Chinese adults is relatively high, with a clear age gradient peaking in young adulthood. Risk profiles revealed both shared and age-specific predictors, reflecting distinct life-stage vulnerabilities. These findings support age-tailored suicide prevention strategies in China.

Information

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Copyright
© The Author(s), 2025. Published by Cambridge University Press.

Fig. 1. Flow diagram of sample selection.


Fig. 2. Age-specific prevalence and subgroup differences in lifetime suicide attempts (LSA) among adults in China.

Note: (A) LSA prevalence by age group in the overall population; (B) Number of LSA cases by age group; (C) LSA prevalence by age group and sex; (D) LSA prevalence by age group and residential location (urban vs. rural). LSA prevalence differed by age, with all pairwise comparisons significant (all p 

Table 1. Basic characteristics of participants with lifetime suicide attempts across age groups


Fig. 3. Model performance in predicting lifetime suicide attempts across age groups.

Note: (A–C) Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) for 18–24y, 25–44y, and ≥ 45y age groups; (D–F) Calibration curves comparing predicted versus observed probabilities for each model across the three age groups; (G–I) Decision Curve Analyses showing the net benefit of using Support Vector Machine (SVM) models across a range of threshold probabilities. Models included: Logistic Regression (LR), Random Forest (RF), Support Vector Machine (SVM), Extreme Gradient Boosting (XGBoost), and Naive Bayes
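To make the quantities in these panels concrete, the sketch below computes AUC, sensitivity, specificity, and decision-curve net benefit on a tiny made-up example. The labels and scores are hypothetical and are not study data. The net-benefit formula is the standard decision curve definition, TP/n minus FP/n weighted by pt/(1 - pt); whether the authors used exactly this implementation is not stated in the source.

```python
def auc_rank(y_true, scores):
    """AUC as the probability that a random case outscores a random non-case.

    Ties count as 0.5. Runs in O(n_pos * n_neg), which is fine for a small sketch.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def confusion(y_true, scores, threshold):
    """Return (tp, fp, tn, fn) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if predicted_positive and y == 1:
            tp += 1
        elif predicted_positive and y == 0:
            fp += 1
        elif y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def net_benefit(y_true, scores, threshold):
    """Decision-curve net benefit at threshold probability pt (0 < pt < 1)."""
    tp, fp, _, _ = confusion(y_true, scores, threshold)
    n = len(y_true)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical labels (1 = LSA) and predicted probabilities, for illustration only.
y = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
p = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]

tp, fp, tn, fn = confusion(y, p, threshold=0.5)
print("AUC:", round(auc_rank(y, p), 3))
print("sensitivity:", round(tp / (tp + fn), 3))
print("specificity:", round(tn / (tn + fp), 3))
print("net benefit at pt=0.5:", round(net_benefit(y, p, 0.5), 3))
```

A decision curve repeats the net-benefit calculation over a range of threshold probabilities and compares the model with the "treat all" and "treat none" strategies.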

Table 2. Comparison of model performance in predicting lifetime suicide attempts (LSA) on the test set data


Fig. 4a. Top 10 features identified by SHAP using the best-performing model (SVM) in the 18–24y group.

Note: 1. Left panel: SHAP bar plot of mean absolute SHAP values (global importance; features ordered by mean SHAP). Right panel: SHAP summary plot (each dot = one participant). Colors encode feature values (red = higher, blue = lower). Positive SHAP values indicate an increase, and negative values a decrease, in the model-predicted probability of LSA. SHAP values were computed on the test set; 2. Abbreviations: SHAP = SHapley Additive exPlanations; SVM = support vector machine; LSA = lifetime suicide attempts; SI = suicidal ideation; ACEs = adverse childhood experiences; AS = anxiety symptoms; SP = suicide plan
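The left-panel ranking (mean absolute SHAP value per feature) can be illustrated with exact Shapley values on a toy model. Everything in this sketch is hypothetical: the three binary features, the coefficients in `toy_model`, and the four participants are invented for illustration and have nothing to do with the fitted SVM. It also uses a single all-zero baseline in place of the background dataset that SHAP explainers typically average over.

```python
from itertools import permutations


def toy_model(f):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    return 0.02 + 0.30 * f["si"] + 0.10 * f["aces"] + 0.05 * f["si"] * f["poor_sleep"]


def shapley_values(model, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orders.

    Features not yet 'added' keep their baseline value. Cost grows factorially
    with the number of features, so this only suits very small examples.
    """
    names = list(x)
    totals = {name: 0.0 for name in names}
    orders = list(permutations(names))
    for order in orders:
        current = dict(baseline)
        previous = model(current)
        for name in order:
            current[name] = x[name]
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orders) for name, total in totals.items()}


baseline = {"si": 0, "aces": 0, "poor_sleep": 0}
participants = [
    {"si": 1, "aces": 1, "poor_sleep": 1},
    {"si": 1, "aces": 0, "poor_sleep": 0},
    {"si": 0, "aces": 1, "poor_sleep": 1},
    {"si": 0, "aces": 0, "poor_sleep": 0},
]

per_person = [shapley_values(toy_model, x, baseline) for x in participants]

# Global importance: mean absolute Shapley value per feature, largest first.
importance = {
    name: sum(abs(phi[name]) for phi in per_person) / len(per_person)
    for name in baseline
}
ranking = sorted(importance, key=importance.get, reverse=True)
print(ranking)
```

For each participant the values sum to the difference between that participant's model output and the baseline output, which is what lets the summary plot read positive values as pushing the prediction up and negative values as pushing it down.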

Fig. 4b. Top 10 features identified by SHAP using the best-performing model (SVM) in the 25–44y group.

Note: 1. Left panel: SHAP bar plot of mean absolute SHAP values (global importance; features ordered by mean SHAP). Right panel: SHAP summary plot (each dot = one participant). Colors encode feature values (red = higher, blue = lower). Positive SHAP values indicate an increase, and negative values a decrease, in the model-predicted probability of LSA. SHAP values were computed on the test set; 2. Abbreviations: SHAP = SHapley Additive exPlanations; SVM = support vector machine; LSA = lifetime suicide attempts; SI = suicidal ideation; ACEs = adverse childhood experiences; IPV = intimate partner violence; ADL = activities of daily living

Fig. 4c. Top 10 features identified by SHAP using the best-performing model (SVM) in the ≥ 45y group.

Note: 1. Left panel: SHAP bar plot of mean absolute SHAP values (global importance; features ordered by mean SHAP). Right panel: SHAP summary plot (each dot = one participant). Colors encode feature values (red = higher, blue = lower). Positive SHAP values indicate an increase, and negative values a decrease, in the model-predicted probability of LSA. SHAP values were computed on the test set; 2. Abbreviations: SHAP = SHapley Additive exPlanations; SVM = support vector machine; LSA = lifetime suicide attempts; SI = suicidal ideation; ACEs = adverse childhood experiences; ADL = activities of daily living; DS = depressive symptoms; AS = anxiety symptoms; IPV = intimate partner violence; NSSI = non-suicidal self-injury
Supplementary material: File

Wu et al. supplementary material
