
Application of natural language processing to predict final recommendation of Brazilian health technology assessment reports

Published online by Cambridge University Press:  12 April 2024

Marilia Mastrocolla de Almeida Cardoso*
Affiliation:
Health Technology Assessment Unit, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil
Juliana Machado-Rugolo
Affiliation:
Health Technology Assessment Unit, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil
Lehana Thabane
Affiliation:
Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada; Biostatistics Unit, St Joseph’s Healthcare Hamilton, Hamilton, ON, Canada; Faculty of Health Sciences, University of Johannesburg, Johannesburg, South Africa
Naila Camila da Rocha
Affiliation:
Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil
Abner Mácula Pacheco Barbosa
Affiliation:
Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Department of Ophthalmology, Otorhinolaryngology and Head and Neck Surgery, Medical School (FMB) of São Paulo State University, Botucatu, Brazil
Denis Satoshi Komoda
Affiliation:
Department of Collective Health, University of Campinas, Campinas, Brazil
Juliana Tereza Coneglian de Almeida
Affiliation:
Health Technology Assessment Unit, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil
Daniel da Silva Pereira Curado
Affiliation:
Department of Management and Incorporation of Health Technologies, Ministry of Health, Brasilia, Distrito Federal, Brazil
Silke Anna Theresa Weber
Affiliation:
Health Technology Assessment Unit, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Department of Ophthalmology, Otorhinolaryngology and Head and Neck Surgery, Medical School (FMB) of São Paulo State University, Botucatu, Brazil
Luis Gustavo Modelli de Andrade
Affiliation:
Laboratory of Data Science and Predictive Analysis in Health, Hospital das Clínicas da Faculdade de Medicina de Botucatu, Botucatu, Brazil; Department of Internal Medicine, Medical School (FMB) of São Paulo State University, Botucatu, Brazil
*Corresponding author: Marilia Mastrocolla de Almeida Cardoso; Email: marilia.cardoso@unesp.br

Abstract

Introduction

Health technology assessment (HTA) plays a vital role in healthcare decision-making globally. Because HTA agencies face a significant workload, identifying the key factors that influence evaluation outcomes is necessary.

Objectives

The aim of this study was to predict the approval status of evaluations conducted by the Brazilian Committee for Health Technology Incorporation (CONITEC) using natural language processing (NLP).

Methods

Data encompassed CONITEC’s official report summaries from 2012 to 2022. Textual data were tokenized for NLP analysis. Six models, namely Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, logistic regression, support vector machine, random forest, neural network, and extreme gradient boosting (XGBoost), were evaluated for accuracy, area under the receiver operating characteristic curve (ROC AUC) score, precision, and recall. Cluster analysis using the k-modes algorithm categorized entries into two clusters (approved, rejected).
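The tokenization step and the four evaluation metrics named above can be illustrated with a minimal, standard-library sketch. It is not the authors' pipeline: the function names (`tokenize`, `classification_metrics`, `roc_auc`) are hypothetical, the tokenizer is a plain word splitter, and the labels and scores in the usage lines are made-up toy values, not study data. A label of 1 is assumed to mean "approved".

```python
import re


def tokenize(text):
    """Lowercase a report summary and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, and recall for binary labels (1 = approved)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def roc_auc(y_true, scores):
    """ROC AUC as the probability that a random positive outranks a
    random negative, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values for illustration only (not taken from the study).
toy_true = [1, 1, 0, 0, 1, 0]
toy_pred = [1, 0, 0, 1, 1, 0]
toy_scores = [0.9, 0.4, 0.3, 0.6, 0.8, 0.1]
toy_metrics = classification_metrics(toy_true, toy_pred)
toy_auc = roc_auc(toy_true, toy_scores)
```

The pairwise formulation of ROC AUC is quadratic in the number of records, which is acceptable for a sketch; library implementations typically use a rank-based computation instead.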

Results

The neural network model exhibited the highest performance metrics (precision at 0.815, accuracy at 0.769, ROC AUC at 0.871, and recall at 0.746), followed by the XGBoost model. The lexical analysis uncovered linguistic markers, such as references to international HTA agencies’ experiences and the government as demandant, that potentially influence CONITEC’s decisions. Cluster and XGBoost analyses emphasized that approved evaluations mainly concerned drug assessments, often government-initiated, while non-approved ones also frequently evaluated drugs, with industry as the requester.

Conclusions

An NLP model can predict health technology incorporation outcomes, opening avenues for future research using HTA reports from other agencies. Such a model has the potential to enhance HTA system efficiency by offering initial insights and decision-making criteria, thereby benefiting healthcare experts.

Information

Type
Assessment
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press

Table 1. Performance (accuracy, ROC AUC, precision, and recall) of each model, Brazil, 2023


Figure 1. The twenty most important features for the XGBoost model, as determined by SHAP values. The mean absolute SHAP values on the left side show global feature importance, while the local explanation summary on the right side indicates the relationship between a variable and the process outcome. Positive SHAP values indicate approval, while negative values indicate non-approval.
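How the two panels of Figure 1 relate to the underlying SHAP values can be sketched as follows. This is a hedged illustration, not the authors' code: it assumes a matrix of per-record SHAP values has already been computed by some explainer, and the function names (`global_importance`, `direction`), the feature names, and the numbers are all hypothetical.

```python
def global_importance(shap_matrix, feature_names):
    """Mean absolute SHAP value per feature, sorted from most to least
    important (the quantity plotted as global importance)."""
    n_rows = len(shap_matrix)
    ranked = [
        (name, sum(abs(row[j]) for row in shap_matrix) / n_rows)
        for j, name in enumerate(feature_names)
    ]
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def direction(shap_value):
    """Read the sign of one SHAP value as in the caption: positive pushes
    the prediction toward approval, negative toward non-approval."""
    if shap_value > 0:
        return "approval"
    if shap_value < 0:
        return "non-approval"
    return "neutral"


# Hypothetical SHAP values: one row per report, one column per feature.
toy_features = ["government", "industry", "drug"]
toy_shap = [
    [0.4, -0.3, 0.1],
    [0.2, -0.5, -0.1],
]
toy_ranking = global_importance(toy_shap, toy_features)
```

Taking the absolute value before averaging is what lets a feature that consistently pushes toward non-approval still rank as globally important.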


Figure 2. Composition of groups 0 and 1 by proponent: government, industry, and society.


Figure 3. Composition of groups 0 and 1 by type: drugs, procedures, and products.


Figure 4. Composition of groups 0 and 1 by action: expansion, incorporation, and modification.
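Groups 0 and 1 in Figures 2 to 4 come from k-modes clustering of categorical attributes (proponent, type, action). The sketch below shows the general idea of k-modes under stated assumptions: Hamming distance between records, cluster centers updated to the per-attribute mode, and a deterministic initialization from the first k distinct records, which is a simplification rather than the initialization the authors used. The function names and the toy records are hypothetical and are not study data.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k=2, max_iter=20):
    """Minimal k-modes: assign each record to the nearest mode, then
    recompute each mode as the most frequent value per attribute."""
    # Simplified initialization (an assumption): first k distinct records.
    modes = []
    for record in records:
        if record not in modes:
            modes.append(record)
        if len(modes) == k:
            break
    if len(modes) < k:
        raise ValueError("need at least k distinct records")

    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: hamming(record, modes[c]))
            for record in records
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [r for r, lab in zip(records, labels) if lab == c]
            if members:
                modes[c] = tuple(
                    Counter(column).most_common(1)[0][0]
                    for column in zip(*members)
                )
    return labels, modes


# Toy records (proponent, type, action) for illustration only.
toy_records = [
    ("government", "drug", "incorporation"),
    ("industry", "drug", "incorporation"),
    ("government", "drug", "expansion"),
    ("industry", "drug", "incorporation"),
    ("government", "procedure", "incorporation"),
    ("industry", "product", "incorporation"),
]
toy_labels, toy_modes = k_modes(toy_records, k=2)
```

On these toy records the two modes differ only in the proponent attribute, which mirrors the kind of group composition the figures summarize, but the cluster numbering itself carries no meaning and depends on initialization.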

Supplementary material: File

Cardoso et al. supplementary material
