Early prediction of ADHD symptoms from perinatal characteristics: A machine learning study

Published online by Cambridge University Press: 10 November 2025

Yee-Lam Ho*
Affiliation:
Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
Bonnie Auyeung
Affiliation:
Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
Aja Murray
Affiliation:
Department of Psychology, School of Philosophy, Psychology and Language Sciences, University of Edinburgh, Edinburgh, UK
*Corresponding author: Yee-Lam Ho; Email: elimhylacademic@gmail.com

Abstract

Early identification of risk for attention-deficit hyperactivity disorder (ADHD) symptoms can enable more timely interventions and improve long-term outcomes. While previous research has linked various maternal and perinatal factors to ADHD, few studies have examined these predictors collectively in a single comprehensive analysis. This study aimed to assess whether later ADHD symptoms can be predicted from information available at birth, specifically ethnicity, maternal metabolic markers, mental health, and socioeconomic status. It additionally aimed to identify the most influential predictors. Using data from the Born in Bradford (BiB) study, we applied multiple linear regression (LR) and machine learning techniques to predict ADHD symptoms as measured by the Hyperactivity/Inattention subscale of the Strengths and Difficulties Questionnaire (SDQ). A 10-fold cross-validated LR model explained 6.97% of the variance in SDQ scores. In the random forest model, infant male sex and maternal smoking during pregnancy emerged as the top predictors. These findings provide proof of principle for early identification of children at risk of ADHD. Future models may benefit from incorporating additional perinatal data to improve predictive accuracy.

Information

Type
Regular Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Table 1. Descriptive statistics of continuous variables

Table 2. Descriptive statistics of categorical variables

Table 3. Multiple LR results

Table 4. Accuracy metrics of the multiple LR, CART, and RF models

Figure 1. Bar plot of the feature importance for the multiple linear regression (LR) model.

Figure 2. Bar plot of the feature importance for the classification and regression trees (CART) model.

Figure 3. Bar plot of the feature importance for the random forest (RF) model.

Supplementary material: File

Ho et al. supplementary material

File 16.9 MB