P.184 Synthetic neurosurgical data generation using large language models

Published online by Cambridge University Press: 10 July 2025

AA Barr (Calgary)*
E Guo (Calgary)
E Sezgin (Columbus)

Abstract

Background: The use of neurosurgical data for research and machine learning model development is often constrained by privacy regulations, small sample sizes, and resource-intensive data preprocessing. We explored the feasibility of using the large language model (LLM) GPT-4o to generate synthetic neurosurgical data.

Methods: A plain-language prompt instructed GPT-4o to generate synthetic data based on the univariate and bivariate statistical properties of 12 perioperative parameters from a real-world, open-access neurosurgical dataset (n = 139). The prompt was submitted in independent trials to generate 10 datasets matching the reference size (n = 139), followed by one additional dataset representing a ten-fold amplification (n = 1390). Fidelity was assessed using t-tests, two-sample proportion tests, Jensen-Shannon divergence, two-sample Kolmogorov-Smirnov tests, and Pearson's product-moment correlation.

Results: The generated data preserved the distributional characteristics of, and the relationships between, the target parameters. In every generated dataset, including the amplified one, at least 11 of 12 parameters (91.67%) showed no statistically significant difference in means or proportions from the real data. Five of the synthetic datasets showed no significant difference in any of the 12 parameters.

Conclusions: The findings demonstrate that a zero-shot prompting approach can generate synthetic neurosurgical data and amplify sample sizes with consistently high fidelity to real-world data. This underscores the potential of LLMs to address data availability challenges in neurosurgical research.
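
The fidelity metrics named in the Methods can be illustrated with a minimal, standard-library Python sketch. This is not the authors' code: the function names, the choice of which metric applies to which kind of parameter, and the toy values at the bottom are all assumptions made for illustration. The t-test is omitted because its p-value needs a t-distribution, which the standard library does not provide. Note that the sketch returns the Jensen-Shannon divergence in base 2 (range 0 to 1), whereas some libraries report its square root, the Jensen-Shannon distance.

```python
import math
from bisect import bisect_right
from collections import Counter


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided two-sample proportion z-test; returns (z, p_value)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 0.0, 1.0  # both samples are all-0 or all-1: no difference
    z = (x1 / n1 - x2 / n2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    gap = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)
        cdf_b = bisect_right(b, v) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def js_divergence(a, b):
    """Jensen-Shannon divergence (base 2) between two samples of categories."""
    ca, cb = Counter(a), Counter(b)
    keys = set(ca) | set(cb)
    p = {k: ca[k] / len(a) for k in keys}
    q = {k: cb[k] / len(b) for k in keys}
    m = {k: 0.5 * (p[k] + q[k]) for k in keys}

    def kl(x):
        # Kullback-Leibler divergence of x from the mixture m
        return sum(x[k] * math.log2(x[k] / m[k]) for k in keys if x[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sum((xi - mx) ** 2 for xi in x)
    sy = sum((yi - my) ** 2 for yi in y)
    return cov / math.sqrt(sx * sy)


# Hypothetical toy values, NOT taken from the study's dataset.
real_age = [54, 61, 47, 70, 66, 58]
synth_age = [52, 63, 49, 68, 64, 60]
ks_gap = ks_statistic(real_age, synth_age)
js = js_divergence(["M", "F", "F", "M"], ["F", "F", "M", "F"])
z, p = two_proportion_z_test(x1=2, n1=4, x2=1, n2=4)
```

In such a sketch, a small Kolmogorov-Smirnov gap and a Jensen-Shannon divergence near 0 would indicate that a synthetic parameter's distribution is close to the real one, while comparing `pearson_r` values computed separately on the real and synthetic data would indicate whether bivariate relationships are preserved.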

Information

Type: Abstracts
Copyright: © The Author(s), 2025. Published by Cambridge University Press on behalf of Canadian Neurological Sciences Federation