Using automated text classification to explore uncertainty in NICE appraisals for drugs for rare diseases

Published online by Cambridge University Press:  05 January 2024

Lea Wiedmann*
Affiliation:
Department of Health Services Research and Policy, Faculty of Public Health and Policy, London School of Hygiene & Tropical Medicine, UK
Jack Blumenau
Affiliation:
Department of Political Science, Faculty of Social & Historical Sciences, University College London, UK
Orlagh Carroll
Affiliation:
Department of Health Services Research and Policy, Faculty of Public Health and Policy, London School of Hygiene & Tropical Medicine, UK
John Cairns
Affiliation:
Department of Health Services Research and Policy, Faculty of Public Health and Policy, London School of Hygiene & Tropical Medicine, UK
*Corresponding author: Lea Wiedmann; Email: lea.wiedmann@lshtm.ac.uk

Abstract

Objective

This study examined the feasibility and validity of applying supervised learning models to classify text referencing uncertainty in appraisals of rare disease treatments (RDTs), and used the classification results to analyze differences between appraisals.

Methods

We analyzed appraisals for RDTs (n = 94) published by the National Institute for Health and Care Excellence (NICE) between January 2011 and May 2023. We used Naïve Bayes, Lasso, and Support Vector Machine models in a binary text classification task (classifying paragraphs as either referencing uncertainty in the evidence base or not). To illustrate the results, we tested hypotheses in relation to the appraisal guidance, advanced therapy medicinal product (ATMP) status, disease area, and age group.
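
As an illustration of the binary classification task described above, the following is a minimal sketch of a multinomial Naïve Bayes classifier in plain Python. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the function names, and the four training paragraphs are all assumptions made up for this example, and the real study used a stemmed document-feature matrix built from NICE appraisal documents.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase whitespace tokenization (no stemming).
    return text.lower().split()


def train_naive_bayes(paragraphs, labels):
    """Count token frequencies per class (1 = references uncertainty, 0 = does not)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(paragraphs, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return the estimated probability that a paragraph references uncertainty."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for c in (0, 1):
        n_tokens = sum(word_counts[c].values())
        score = math.log(class_counts[c] / total_docs)  # log prior
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                # Add-one (Laplace) smoothed log likelihood.
                score += math.log((word_counts[c][tok] + 1) / (n_tokens + len(vocab)))
        log_scores[c] = score
    # Normalize in log space to avoid underflow.
    top = max(log_scores.values())
    weights = {c: math.exp(s - top) for c, s in log_scores.items()}
    return weights[1] / (weights[0] + weights[1])


def classify(model, text, threshold=0.5):
    """Label a paragraph as 1 if its probability meets the threshold, else 0."""
    return int(predict_proba(model, text) >= threshold)


# Hypothetical toy training data, invented for illustration only.
train_paragraphs = [
    "the committee noted considerable uncertainty in the survival estimates",
    "long term benefit was uncertain because trial data were immature",
    "the company submitted a cost utility model",
    "the drug is administered orally once daily",
]
train_labels = [1, 1, 0, 0]

model = train_naive_bayes(train_paragraphs, train_labels)
label_uncertain = classify(model, "the evidence was uncertain")
label_other = classify(model, "the company submitted a model")
```

The `threshold` argument corresponds to the kind of classification cut-off that the abstract refers to as the base case threshold of 0.5; varying it changes which paragraphs are counted as uncertainty paragraphs.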

Results

The best performing (Lasso) model achieved 83.6 percent classification accuracy (sensitivity = 74.4 percent, specificity = 92.6 percent). Paragraphs classified as referencing uncertainty were significantly more likely to arise in highly specialized technology (HST) appraisals compared to appraisals from the technology appraisal (TA) guidance (adjusted odds ratio = 1.44, 95 percent CI 1.09, 1.90, p = 0.004). There was no significant association between paragraphs classified as referencing uncertainty and appraisals for ATMPs, non-oncology RDTs, and RDTs indicated for children only or adults and children. These results were robust to the threshold value used for classifying paragraphs but were sensitive to the choice of classification model.
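
The accuracy, sensitivity, and specificity figures reported above are standard confusion-matrix summaries. The sketch below shows how such figures can be computed from predicted probabilities at a chosen threshold. The labels and probabilities are invented for illustration and are not the study's data, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count true/false positives and negatives at a given probability threshold."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity (true positive rate), and specificity (true negative rate)."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Hypothetical hand-coded labels (1 = uncertainty paragraph) and model probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.3, 0.4, 0.2, 0.55, 0.1]

base_case = classification_metrics(*confusion_counts(y_true, y_prob, threshold=0.5))
stricter = classification_metrics(*confusion_counts(y_true, y_prob, threshold=0.65))
```

In this toy example, raising the threshold from 0.5 to 0.65 trades sensitivity for specificity, which is why a robustness check across threshold values of the kind the abstract mentions is informative.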

Conclusion

Using supervised learning models for text classification in NICE appraisals for RDTs is feasible, but the results of downstream analyses may be sensitive to the choice of classification model.

Information

Type
Method
Creative Commons
Creative Commons Attribution (CC BY) licence
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Characteristics of analyzed RDT appraisals (2011–2023) and their corresponding paragraphs (stemmed DFM, base case threshold of 0.5)

Figure 1. Proportion of uncertainty paragraphs per appraisal over time (stemmed DFM, base case threshold of 0.5). DFM, document-feature matrix; HST, highly specialized technology appraisal guidance; TA, technology appraisal guidance.

Figure 2. Average proportion of uncertainty paragraphs per decile (stemmed DFM, base case threshold of 0.5). ATMP, advanced therapy medicinal product; DFM, document-feature matrix; HST, highly specialized technology appraisal guidance.

Table 2. Multivariable logistic regression model with uncertainty paragraphs as dependent variable (stemmed DFM, base case threshold of 0.5, N = 4958)

Supplementary material: File

Wiedmann et al. supplementary material