The LOOSE-Rescue Paradox: How Standard Clinical-AI Adjunctness Scoring Mechanically Rescues Documented Complexity on 697,021 Synthetic Encounters

01 June 2026, Version 1
This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Background. Clinical AI systems routinely score notes against curator-authored canonical-symptom lists. The dominant rule, k=1 set-membership matching, is rarely evaluated under encounter-complexity stratification. We pre-registered a substrate-level test of the Inverse Care Law direction at the data layer: high-complexity encounters were predicted to show higher misalignment than low-complexity ones.

Methods. We computed a per-encounter complexity-density score across 697,021 synthetic encounters (Synthea seed_42_n10000) and stratified adjunctness outcomes by complexity decile against a pre-registered falsification protocol. Adjunctness was scored under two Mapping-Curator lookups (hand-curated SNOMED, n=73; MEDLINEPLUS-derived, n=66) and four locked weight regimes including a zero-history sensitivity probe. The pre-registration locked the score, decile boundaries, four falsification triggers, and nine predicted bands.

Results. The predicted direction was falsified. SNOMED MISALIGNED rate fell from 63.56 % at decile 1 to 19.98 % at decile 10 (Spearman rho = -1.000; Delta = -43.58 pp). MEDLINEPLUS fell from 81.19 % to 58.73 % (rho = -0.903; Delta = -22.46 pp). Three of four triggers fired in the inverse direction. The mechanism is condition-side LOOSE-rescue: encounters with more documented conditions accumulate more rescue paths, so notes that would otherwise score MISALIGNED are reclassified as LOOSE. The gradient is robust across all four weight regimes.

Conclusions. Standard set-membership symptom-adjunctness scoring rewards documented complexity by mechanically expanding the rescue pool, producing inverted Tudor-Hart algorithmic-forgiveness inequity at the data layer before any human decision. The finding generalises to any k=1 construct whose canonical-symptom-set size grows with comorbidity count. We recommend complexity-stratified evaluation with pre-registered directional hypotheses before deployment.
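
The condition-side rescue mechanism described in the abstract can be illustrated with a toy sketch. It is not the authors' scoring code: the lookup table, the function name `score_note`, and the exact rule separating ALIGNED, LOOSE, and MISALIGNED are assumptions made for illustration. Only the k=1 membership test and the idea that additional documented conditions enlarge the rescue pool are taken from the text.

```python
from typing import Dict, List, Set

# Hypothetical curator lookup: condition -> canonical symptom set.
# Conditions and symptoms are invented for illustration only.
CANONICAL: Dict[str, Set[str]] = {
    "asthma": {"wheezing", "cough"},
    "diabetes": {"thirst", "fatigue"},
    "hypertension": {"headache", "dizziness"},
}


def score_note(note_symptoms: Set[str], primary: str,
               other_conditions: List[str]) -> str:
    """Toy k=1 set-membership score: one overlapping symptom is a match.

    Assumed rule (not confirmed by the source): a match against the
    primary condition is ALIGNED; a match against any other documented
    condition rescues the note to LOOSE; otherwise it is MISALIGNED.
    """
    if note_symptoms & CANONICAL.get(primary, set()):
        return "ALIGNED"
    # The rescue pool is the union of canonical sets over the other
    # documented conditions, so it can only grow with comorbidity count.
    rescue_pool: Set[str] = set()
    for cond in other_conditions:
        rescue_pool |= CANONICAL.get(cond, set())
    if note_symptoms & rescue_pool:
        return "LOOSE"
    return "MISALIGNED"


note = {"fatigue"}
low = score_note(note, "asthma", [])                            # no comorbidities
high = score_note(note, "asthma", ["diabetes", "hypertension"])  # two comorbidities
print(low, high)
```

Under these assumptions the same note is MISALIGNED on the low-complexity encounter and LOOSE on the high-complexity one, purely because the second encounter documents more conditions.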

Keywords

clinical AI evaluation
symptom adjunctness
set-membership scoring
complexity stratification
synthetic cohort
pre-registration
falsification
algorithmic forgiveness
