
OP218 Searching Preprint Repositories For COVID-19 Therapeutics Using A Semi-Automated Text-Mining Tool

Published online by Cambridge University Press:  03 December 2021


Abstract

Introduction

The COVID-19 pandemic led to a significant surge in clinical research activity in the search for effective and safe treatments. As researchers sought to disseminate early findings from clinical trials and accelerate patient access to promising treatments, the use of preprint repositories rose. In the UK, the NIHR Innovation Observatory (NIHRIO) provided primary horizon-scanning intelligence on global trials to a multi-agency initiative on COVID-19 therapeutics. This intelligence included signals from preliminary results to support the selection and prioritisation of, and access to, promising medicines.

Methods

A semi-automated text-mining tool written in Python 3 uses the trial IDs (identifiers) of ongoing and completed studies, selected from major clinical trial registries according to pre-determined criteria. Two sources, BioRxiv and MedRxiv, are searched using the IDs as search terms. Each week, the tool automatically searches both sources, de-duplicates the results, excludes reviews, and extracts the title, authors, publication date, URL and DOI of each preprint. The output is verified by two reviewers, who manually screen the preprints and exclude studies that do not report results.
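The abstract does not describe how the tool is implemented. As an illustration only, the de-duplication, review exclusion and field extraction steps might be sketched as follows. The record layout, the `FIELDS` tuple, the function names `is_review` and `screen`, and the title-keyword heuristic for spotting reviews are all assumptions made for this sketch, not details of the NIHRIO tool; the retrieval step is omitted, and the sample values are placeholders.

```python
from typing import Dict, Iterable, List

# Fields the abstract says are extracted; the key names are assumptions.
FIELDS = ("title", "authors", "date", "url", "doi")


def is_review(record: Dict[str, str]) -> bool:
    """Crude heuristic (an assumption): flag titles that mention a review."""
    title = record.get("title", "").lower()
    return "review" in title or "meta-analysis" in title


def screen(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """De-duplicate records, drop likely reviews, keep only FIELDS."""
    seen = set()
    kept = []
    for record in records:
        # Prefer the DOI as the identity key; fall back to the title.
        doi = record.get("doi", "").strip().lower()
        key = doi or record.get("title", "").strip().lower()
        if not key or key in seen:
            continue
        seen.add(key)
        if is_review(record):
            continue
        kept.append({field: record.get(field, "") for field in FIELDS})
    return kept


# Placeholder records standing in for one week's search hits.
hits = [
    {"title": "Drug A in hospitalised patients (NCT00000000)",
     "authors": "A. Author", "date": "2020-05-01",
     "url": "https://example.org/1", "doi": "10.1101/EXAMPLE-1",
     "abstract": "Preliminary results ..."},
    {"title": "Drug A in hospitalised patients (NCT00000000)",
     "authors": "A. Author", "date": "2020-05-01",
     "url": "https://example.org/1", "doi": "10.1101/example-1"},
    {"title": "A systematic review of drug A",
     "authors": "B. Author", "date": "2020-05-02",
     "url": "https://example.org/2", "doi": "10.1101/example-2"},
]
shortlist = screen(hits)
```

In this sketch the shortlist would then go to the two reviewers for manual screening; a keyword heuristic of this kind can misclassify titles, which is one reason the manual step remains necessary.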

Results

A total of 36,771 publications were uploaded to BioRxiv and MedRxiv between 3 March and 9 November 2020. Approximately 20–30 COVID-19 preprints per week were pre-selected by the tool. After manual screening and selection, a total of 123 preprints reporting preliminary clinical trial results were included. Additionally, 50 preprints that presented results of other study types on new vaccines and repurposed medicines for COVID-19 were reported.

Conclusions

Using text mining to identify preliminary clinical trial results proved an efficient way to handle the large volume of information. Semi-automated searching increased efficiency, allowing the reviewers to focus on relevant papers. More consistent reporting of trial IDs would support automation. Comparing the accuracy of the tool when screening titles/abstracts versus full papers may support further refinement and increase efficiency gains.
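To illustrate why consistent reporting of trial IDs matters for automation, the following minimal sketch normalises a few common ways of writing registry identifiers. It is not taken from the tool described here: the function name `find_trial_ids`, the choice of registries, and the tolerated separators are assumptions, and the patterns rely only on the standard identifier shapes (NCT or ISRCTN followed by eight digits, and the EudraCT form YYYY-NNNNNN-CC).

```python
import re

# Tolerate an optional space, hyphen or colon between prefix and digits.
TRIAL_ID_PATTERNS = {
    "NCT": re.compile(r"\bNCT[\s\-:]?(\d{8})\b", re.IGNORECASE),
    "ISRCTN": re.compile(r"\bISRCTN[\s\-:]?(\d{8})\b", re.IGNORECASE),
    "EudraCT": re.compile(r"\b(\d{4}-\d{6}-\d{2})\b"),
}


def find_trial_ids(text: str) -> set:
    """Return trial IDs found in text, normalised to one canonical form."""
    found = set()
    for registry, pattern in TRIAL_ID_PATTERNS.items():
        for match in pattern.finditer(text):
            number = match.group(1)
            if registry == "EudraCT":
                found.add(number)  # EudraCT numbers carry no prefix
            else:
                found.add(f"{registry}{number}")
    return found


# Placeholder identifiers written in three different styles.
sample = "Registered as nct 00000000 and ISRCTN-12345678; EudraCT 2020-000000-00"
ids = find_trial_ids(sample)
```

Variants that fall outside the tolerated forms, such as an ID split across a line break or given without its registry prefix, would still be missed, which is the gap that more consistent reporting would close.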

This project is funded by the NIHR [(HSRIC-2016-10009)/Innovation Observatory]. The views expressed are those of the author(s) and not necessarily those of the NIHR or the Department of Health and Social Care.

Type
Oral Presentations
Copyright
Copyright © The Author(s), 2021. Published by Cambridge University Press