
Walking Backward to Ensure Risk Management of Large Language Models in Medicine

Published online by Cambridge University Press:  27 June 2025

Daria Onitiu*
Affiliation:
Oxford Internet Institute, University of Oxford, United Kingdom; Hasso Plattner Institute, Potsdam, Germany
Sandra Wachter
Affiliation:
Oxford Internet Institute, University of Oxford, United Kingdom; Hasso Plattner Institute, Potsdam, Germany
Brent Mittelstadt
Affiliation:
Oxford Internet Institute, University of Oxford, United Kingdom
Corresponding author: Daria Onitiu; Email: daria.onitiu@oii.ox.ac.uk

Abstract

This paper examines how providers of specialized Large Language Models (LLMs) pre-trained and/or fine-tuned on medical data conduct risk management, that is, how they define, estimate, mitigate, and monitor safety risks under the EU Medical Device Regulation (MDR). Using the example of an Artificial Intelligence (AI)-based medical device for lung cancer detection, we review the current risk management process in the MDR, which entails a “forward-walking” approach: providers articulate the medical device’s clear intended use and then move sequentially through the definition, mitigation, and monitoring of risks. We note that the forward-walking approach clashes with the MDR requirement to articulate an intended use and circumvents providers’ reasoning about the risks of specialized LLMs. It inadvertently introduces different intended users, new hazards for risk control, and new use cases, producing unclear and incomplete risk management for the safety of LLMs. Our contribution is to show that the MDR risk management framework requires a backward-walking logic. This concept, similar to the notion of “backward reasoning” in computer science, entails sub-goals for providers: examine a system’s intended user(s), the risks of new hazards, and different use cases, and then reason about the task-specific options, inherent risks at scale, and trade-offs for risk management.

Information

Type
Independent Articles
Creative Commons
Creative Commons License: CC BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of American Society of Law, Medicine & Ethics

Figure 1. An illustration of the risk management lifecycle based on the manufacturer’s articulation of the intended purpose and use in the MDR. We describe this approach as “forward-walking” to emphasize that risk assessment stems from a clearly articulated intended use and progresses through the stages of risk management to ensure the safety and performance of the device. These stages reinforce each other, constituting an iterative process and a feedback loop for ensuring patient safety.


Figure 2. A non-exhaustive outline of the types of concerns that could arise based on the articulation of the intended purpose and use, which in turn require a set of different actions. We contend that these actions form three deviations from a risk policy. As a result, medical LLMs pose issues for complete specifications for risk management, while undermining the feedback loop on the estimation, mitigation, and monitoring of risks.


Figure 3. A revised logic of the MDR risk management framework using a “backward-walking” approach. Providers use the model’s “general capabilities,” such as how well the medical LLM summarizes medical knowledge and engages in medical question-answering, to define and reevaluate an intended use. The backward-walking logic works alongside the different deviations (intended users, new hazards, and potential use cases) to identify common and connecting factors. These allow providers to identify task-specific options and trade-offs, and to consider inherent risks at scale.