Artificial Epidemiology: How self-evolving clinical AI manufactures disease prevalence from administrative coding artifacts

02 April 2026, Version 1
This content is an early or alternative research output and has not been peer-reviewed by Cambridge University Press at the time of posting.

Abstract

Self-evolving AI agents in clinical documentation may manufacture distorted disease prevalence at population scale. This paper formalizes the risk as a partially observable Markov decision process (POMDP) in which the AI observes only the administrative state (coded data) while the clinical state remains hidden. This creates a structural reward asymmetry: administrative rewards are immediately measurable, whereas clinical outcomes are delayed and noisy. We introduce the artificial epidemiology divergence D(C) = P_a(C) - P_c(C), which measures the gap between the administrative prevalence P_a(C) and the clinical prevalence P_c(C) of a condition C, and propose five experimentally testable markers for detecting population-level distortion. A computational pre-study on synthetic electronic health records (Synthea v3.x, n = 11,475 patients, 415,464 SNOMED-CT coded conditions) operationalizes D(C) for five sentinel conditions. Results show substantial baseline divergence: D(C) = +0.378 for diabetes (4,334 patients coded without supporting laboratory evidence), D(C) = -0.024 for hypertension (1,741 patients with clinical evidence but no code), and D(C) = +0.101 for obesity. Comorbidity co-occurrence ratios exceed expected values by 1.5x to 2.4x across all sentinel condition pairs. A documentation-action gap of 75.7% is observed for coded diabetes (diagnosed but pharmacologically untreated). A complementary governance simulation demonstrates that the reification feedback loop amplifies coding distortion by 17x to 21x over five iterative cycles under business-first and equilibrium governance scenarios. All data are synthetic; these results establish measurement methodology and construct operationalization, not clinical evidence. Validation on real-world EHR data with linked clinical registries is the necessary next step.
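The divergence defined in the abstract can be illustrated with a minimal sketch. It assumes that each prevalence is simply the share of the cohort in a set of patient identifiers: one set for patients carrying the code, one for patients with supporting clinical evidence. The function name `divergence`, the dictionary keys, and the toy patient IDs are hypothetical and are not taken from the paper's analysis script.

```python
from typing import Dict, Set


def divergence(coded: Set[str], clinical: Set[str], population: int) -> Dict[str, float]:
    """Compute D(C) = P_a(C) - P_c(C) and the two discordant patient counts.

    coded      -- patients carrying the administrative code for condition C
    clinical   -- patients with supporting clinical evidence for C
    population -- total cohort size (assumed denominator for both prevalences)
    """
    if population <= 0:
        raise ValueError("population must be positive")
    p_admin = len(coded) / population      # administrative prevalence P_a(C)
    p_clin = len(clinical) / population    # clinical prevalence P_c(C)
    return {
        "P_a": p_admin,
        "P_c": p_clin,
        "D": p_admin - p_clin,
        # Coded, but no supporting evidence (pushes D upward)
        "coded_without_evidence": len(coded - clinical),
        # Evidence present, but no code (pushes D downward)
        "evidence_without_code": len(clinical - coded),
    }


# Toy cohort of 10 hypothetical patients (illustrative only, not study data)
coded = {"p1", "p2", "p3", "p4"}
clinical = {"p3", "p4", "p5"}
result = divergence(coded, clinical, population=10)
print(result)
```

Under this set-based reading, D(C) equals the number of patients coded without evidence minus the number with evidence but no code, divided by the cohort size. A positive value therefore indicates net over-coding and a negative value net under-coding.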

Keywords

artificial epidemiology
self-evolving AI
reification feedback loop
POMDP
clinical coding
synthetic EHR
health data governance
EHDS

Supplementary materials

Title: Results: Complementary Corpus Experiments
Description: JSON output of seven additional analyses on the same Synthea dataset, including drift sentinel validation (JSD trajectory), cost of drift (4.32x cost ratio), terminology coverage gap (95%), semantic entropy (H = 5.35 bits), and cross-organisation coding variance (CV = 0.67).

Title: Analysis Script: D(C) Divergence Computation
Description: Python script computing the artificial epidemiology divergence D(C) for five sentinel conditions, together with an upcoding simulation, comorbidity inflation, the documentation-action gap, and temporal drift analysis on Synthea v3.x synthetic EHR data (n = 11,475).

Title: Results: D(C) Divergence Pre-Study
Description: Complete JSON output of the D(C) analysis: divergence values for five sentinel conditions, upcoding simulation at 10/20/30% shift, yearly temporal drift trajectory (2000-2026), comorbidity co-occurrence ratios, and documentation-action gap rates.
