Uncovering key clinical trial features influencing recruitment

Published online by Cambridge University Press:  04 September 2023

Betina Idnay
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Yilu Fang
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Alex Butler
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Joyce Moran
Affiliation:
Department of Neurology, Columbia University Irving Medical Center, NY Research, New York, NY, USA
Ziran Li
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Junghwan Lee
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Casey Ta
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Cong Liu
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Chi Yuan
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Huanyao Chen
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Edward Stanley
Affiliation:
Compliance Applications, Information Technology, Columbia University, New York, NY, USA
George Hripcsak
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
Elaine Larson
Affiliation:
School of Nursing, Columbia University Irving Medical Center, New York, NY, USA; New York Academy of Medicine, New York, NY, USA
Karen Marder
Affiliation:
Department of Neurology, Columbia University Irving Medical Center, NY Research, New York, NY, USA
Wendy Chung
Affiliation:
Department of Pediatrics, Columbia University Irving Medical Center, New York, NY, USA
Brenda Ruotolo
Affiliation:
Institutional Review Board for Human Subjects Research, Columbia University, New York, NY, USA
Chunhua Weng*
Affiliation:
Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, USA
*Corresponding author: Chunhua Weng, PhD; Email: chunhua@columbia.edu

Abstract

Background:

Randomized clinical trials (RCTs) are the foundation for medical advances, but participant recruitment remains a persistent barrier to their success. This retrospective data analysis aims to (1) identify clinical trial features associated with successful participant recruitment, measured by accrual percentage, and (2) compare the characteristics of the RCTs with the most and least successful recruitment, defined by paired accrual percentage thresholds (≥ 90% vs ≤ 10%, ≥ 80% vs ≤ 20%, and ≥ 70% vs ≤ 30%).
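The paired thresholds can be read as a simple grouping rule: a trial at or above the upper cutoff falls in the most successful group, a trial at or below the lower cutoff falls in the least successful group, and anything in between is left out of that comparison. The following is a minimal sketch of that reading, not the authors' code; the function name, group labels, and trial values are hypothetical.

```python
from typing import Optional

# Threshold pairs named in the abstract: (upper cutoff, lower cutoff), in percent.
CUTOFFS = [(90.0, 10.0), (80.0, 20.0), (70.0, 30.0)]


def recruitment_group(accrual_pct: float, high: float, low: float) -> Optional[str]:
    """Assign a trial to a comparison group for one threshold pair.

    Returns None for trials that fall strictly between the two cutoffs,
    since they belong to neither group under this reading.
    """
    if accrual_pct >= high:
        return "most_successful"
    if accrual_pct <= low:
        return "least_successful"
    return None


# Hypothetical accrual percentages, for illustration only (not study data).
trials = {"trial_a": 95.0, "trial_b": 75.0, "trial_c": 25.0, "trial_d": 5.0, "trial_e": 50.0}

# One grouping per threshold pair, keyed by (upper cutoff, lower cutoff).
groups_by_cutoff = {
    (high, low): {name: recruitment_group(pct, high, low) for name, pct in trials.items()}
    for high, low in CUTOFFS
}
```

Under this sketch, loosening the cutoffs from 90/10 to 70/30 moves more trials into the two groups, which is why the same feature can be tested under each pair separately.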

Methods:

Data from the internal research registry at Columbia University Irving Medical Center and Aggregated Analysis of ClinicalTrials.gov were collected for 393 randomized interventional treatment studies closed to further enrollment. We compared two regularized linear regression models and six tree-based machine learning models for predicting accrual percentage (i.e., reported accrual to date divided by the target accrual). The best-performing model and Tree SHapley Additive exPlanations (SHAP) were used to analyze feature importance for participant recruitment. The identified features were then compared between the most and least successful recruitment subgroups.
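As a small worked illustration of the outcome variable, accrual percentage can be computed as the reported accrual to date divided by the target accrual. The sketch below scales the ratio by 100 so it is directly comparable to percentage thresholds; that scaling, the function name, and the handling of a non-positive target are assumptions for illustration, not details taken from the study.

```python
def accrual_percentage(accrual_to_date: int, target_accrual: int) -> float:
    """Return reported accrual as a percentage of the target accrual.

    Values above 100 are possible when a trial enrolls more participants
    than originally targeted. A non-positive target is rejected because
    the ratio would be undefined or meaningless (an assumption here).
    """
    if target_accrual <= 0:
        raise ValueError("target_accrual must be positive")
    return 100.0 * accrual_to_date / target_accrual


# Hypothetical enrollment counts, for illustration only (not study data).
on_track = accrual_percentage(45, 50)         # 45 of 50 targeted participants
under_enrolled = accrual_percentage(10, 100)  # 10 of 100 targeted participants
over_enrolled = accrual_percentage(120, 100)  # enrollment exceeded the target
```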

Results:

The CatBoost regressor outperformed the other seven models. Key features positively associated with recruitment success, as measured by accrual percentage, include government funding and compensation. In contrast, cancer research and non-conventional recruitment methods (e.g., websites) are negatively associated with recruitment success. Statistically significant subgroup differences (corrected p-value < .05) were found in 15 of the top 30 most important features.

Conclusion:

This multi-source retrospective study highlighted key features influencing RCT participant recruitment, offering actionable steps for improvement, including flexible recruitment infrastructure and appropriate participant compensation.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The Association for Clinical and Translational Science

Figure 1. Overall study methodology.

Figure 2. Randomized clinical trial (RCT) selection in the Research Compliance and Administration System (RASCAL) and ClinicalTrials.gov registries. Each box illustrates the number of RCTs after applying the inclusion and exclusion criteria. AACT = Aggregate Analysis of ClinicalTrials.gov.

Table 1. Features of the included RCTs

Table 2. Target clinical domain for the included RCTs according to Medical Subject Headings (MeSH) category extracted from AACT (n = 393)

Table 3. Performances of the eight regression models for accrual percentage prediction

Figure 3. Tree SHapley Additive exPlanations (SHAP) summary plot with the top 30 most important features associated with RCT recruitment success. The SHAP values have been log scaled. *Features marked with an asterisk are continuous variables, whereas the others are binary variables. C16 = congenital, hereditary, and neonatal diseases and abnormalities; C04 = neoplasms; C10 = nervous system diseases; C06 = digestive system diseases; C19 = endocrine system diseases; C15 = hemic and lymphatic diseases.

Table 4. Features with a significant difference (corrected p-value < .05) between the best and worst recruitment groups under different cutoffs (i.e., ≥ 90% vs ≤ 10%, ≥ 80% vs ≤ 20%, and ≥ 70% vs ≤ 30%) among the top 30 most important features

Supplementary material: File

Idnay et al. supplementary material 1
File 2.2 MB
Supplementary material: File

Idnay et al. supplementary material 2
File 148 KB