
OP86 Chatbot-Based Symptom-Checkers: A Systematic Review

Published online by Cambridge University Press: 23 December 2022


Abstract

Introduction

Symptom-checkers are digital health applications (DHA) with diagnostic algorithms that claim to improve the diagnostic process and patient guidance. After asking the user to describe their symptoms via a chatbot interface, they offer a list of potential diagnoses and/or recommend an appropriate action (self-care, doctor’s visit, or emergency care). Because these diagnostic DHA are growing in number and use, the evidence on them needs to be evaluated.

Methods

We updated a British evidence synthesis on symptom-checkers from the National Institute for Health Research (NIHR, 2019). For the systematic update search, we searched four databases. The following endpoints were selected: effectiveness, safety, diagnostic accuracy, triage accuracy, and organizational and patient-relevant endpoints. For accuracy studies identified in the update search, we assessed the risk of bias (RoB) using the Quality Assessment of Diagnostic Accuracy Studies tool (QUADAS-2).

Results

The NIHR report included 27 studies, and the update search added 14. One randomized controlled trial (RCT) reported a prolonged illness duration when using symptom-checkers (not statistically significant). No harms from using symptom-checkers were identified (six observational studies). Diagnostic accuracy ranged from 14 to 84.3 percent (ten observational studies), and triage accuracy ranged from 33 to 100 percent (eleven observational studies). For organizational endpoints, the results were inconsistent (one RCT, six observational studies). From the patient perspective, the usability of symptom-checkers was rated as high, but the limited description of symptoms and the missing verbal interaction with health personnel were mentioned as hindering factors (nine survey studies). According to the QUADAS-2 assessment, the RoB was low in one study and high in seven.

Conclusions

The studies were often conducted using fictitious case vignettes, which limits the validity of the evidence. The results for diagnostic and triage accuracy are therefore insufficient to demonstrate a benefit in real-world settings. Additionally, there is concern about misdiagnosis and overdiagnosis. We recommend continuous monitoring of these diagnostic DHA using high-quality studies.

Type
Oral Presentations
Copyright
© The Author(s), 2022. Published by Cambridge University Press