
Machine learning for identifying randomised controlled trials when conducting systematic reviews: Development and evaluation of its impact on practice

Published online by Cambridge University Press: 21 March 2025

Xuan Qin, Minghong Yao, Xiaochao Luo, Jiali Liu, Yu Ma, Yanmei Liu, Hao Li, Ke Deng, Kang Zou, Ling Li*, Xin Sun*
Affiliation (all authors):
Institute of Integrated Traditional Chinese and Chinese Evidence-based Medicine Center and Cochrane China Center and MAGIC China Center, West China Hospital, Sichuan University, Chengdu, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Additional affiliation (Xin Sun):
Sichuan University West China College of Public Health/West China Fourth Hospital, Chengdu, China
*Corresponding authors: Xin Sun and Ling Li; Emails: sunxin@wchscu.cn; liling@wchscu.cn

Abstract

Machine learning (ML) models have been developed to identify randomised controlled trials (RCTs) to accelerate systematic reviews (SRs). However, their use has been limited by concerns about their performance and practical benefit. We developed a high-recall ensemble learning model using Cochrane RCT data to enhance the identification of RCTs for rapid title and abstract screening in SRs, and evaluated the model externally with our own annotated RCT datasets. We also assessed its practical impact, in terms of labour time savings and recall improvement, under two scenarios: ML-assisted double screening (where ML and one reviewer screened all citations in parallel) and ML-assisted stepwise screening (where ML flagged all potential RCTs, and at least two reviewers subsequently filtered the flagged citations). Our model achieved twice the precision of the existing SVM model while maintaining a recall of 0.99 in both internal and external tests. In the practical evaluation, ML-assisted double screening led to significant labour time savings (average 45.4%) and improved recall (average 0.998, compared with 0.919 for a single reviewer). ML-assisted stepwise screening performed similarly to standard manual screening but with average labour time savings of 74.4%. In conclusion, compared with existing methods, the proposed model can reduce workload while maintaining comparable recall when identifying RCTs during the title and abstract screening stage, thereby accelerating SRs. We propose practical recommendations for applying ML-assisted manual screening effectively when conducting SRs, depending on reviewer availability (ML-assisted double screening) or time constraints (ML-assisted stepwise screening).
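The two screening scenarios described in the abstract can be illustrated with a toy calculation. The Python sketch below is not the authors' evaluation code: the function names, the tiny citation sets, the rule that standard manual screening keeps a citation if either of two reviewers includes it, and the labour-time proxy (one unit of time per reviewer per citation read) are all assumptions made for illustration. Because it ignores things such as conflict resolution and differences in reading speed, this naive count is not expected to reproduce the 45.4% and 74.4% savings reported above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResult:
    included: frozenset      # citation IDs kept after screening
    reviewer_screens: int    # number of (human reviewer, citation) reads


def recall(included, true_rcts):
    """Fraction of true RCTs that survive screening."""
    if not true_rcts:
        raise ValueError("true_rcts must not be empty")
    return len(set(included) & set(true_rcts)) / len(true_rcts)


def standard_manual(all_ids, reviewer_a, reviewer_b):
    # Assumed baseline: two reviewers read every citation;
    # a citation is kept if either reviewer includes it.
    return ScreeningResult(frozenset(reviewer_a | reviewer_b), 2 * len(all_ids))


def ml_double(all_ids, ml_flags, reviewer_a):
    # ML and one reviewer screen all citations in parallel;
    # a citation is kept if either the model or the reviewer includes it.
    return ScreeningResult(frozenset(ml_flags | reviewer_a), len(all_ids))


def ml_stepwise(ml_flags, reviewer_a, reviewer_b):
    # ML flags potential RCTs first; two reviewers then read only the
    # flagged citations and filter them.
    kept = ml_flags & (reviewer_a | reviewer_b)
    return ScreeningResult(frozenset(kept), 2 * len(ml_flags))


def time_saved(result, baseline):
    """Relative reduction in human reads versus the baseline."""
    return 1 - result.reviewer_screens / baseline.reviewer_screens


# Hypothetical toy data: 10 citations, 4 of which are true RCTs.
all_ids = set(range(10))
true_rcts = {0, 1, 2, 3}
ml_flags = {0, 1, 2, 3, 4, 5}   # high recall, some false positives
reviewer_a = {0, 1, 2}          # misses citation 3
reviewer_b = {0, 1, 3, 6}       # misses citation 2, wrongly includes 6

standard = standard_manual(all_ids, reviewer_a, reviewer_b)
double = ml_double(all_ids, ml_flags, reviewer_a)
stepwise = ml_stepwise(ml_flags, reviewer_a, reviewer_b)

for name, res in [("standard", standard), ("double", double), ("stepwise", stepwise)]:
    print(name, recall(res.included, true_rcts), time_saved(res, standard))
```

In this toy setting a single reviewer misses one RCT, whereas pairing that reviewer with a high-recall model recovers it while halving the number of human reads. Stepwise screening keeps recall only as long as the model flags every true RCT, which is why the high-recall design of the model matters for that scenario.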

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Open Practices
Open data
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of The Society for Research Synthesis Methodology
Figure 1 Model Frame.

Figure 2 Overview of workflow and labour time in different screening scenario.

Table 1 Algorithmic evaluation of models on the internal test from Cochrane Crowd RCT dataset

Table 2 Single Model Performance in RCT Identifying: Algorithmic evaluation and practical evaluation on the internal and external test datasets

Table 3 Comparative analysis of epidemiological metrics and labour time in screening scenarios for practical evaluation

Figure 3 Comparative analysis of labor time saved (%) compared to standard manual screening across different screening scenarios.

Table 4 Recommendations for SR reviewers to apply ML-assisted manual screening to identify RCTs for rapid title and abstract screening in SRs

Supplementary material: File

Qin et al. supplementary material (File, 20 KB)