
A method to enable clinical and translational research teams with custom real-world data from electronic health record systems

Published online by Cambridge University Press: 02 January 2026

Thomas R. Campion Jr.*
Affiliation:
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, USA; Department of Pediatrics, Weill Cornell Medicine, New York, NY, USA
Evan T. Sholle
Affiliation:
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
Xiaobo Fuld
Affiliation:
Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
Cindy Chen
Affiliation:
Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
Marcos A. Davila
Affiliation:
Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
Vinay I. Varughese
Affiliation:
Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA
Curtis L. Cole
Affiliation:
Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA; Information Technologies & Services Department, Weill Cornell Medicine, New York, NY, USA; Clinical & Translational Science Center, Weill Cornell Medicine, New York, NY, USA; Department of Medicine, Weill Cornell Medicine, New York, NY, USA
*Corresponding author: T. R. Campion Jr.; Email: thc2015@med.cornell.edu

Abstract

Introduction:

Custom transformations of real-world data (RWD) from electronic health record (EHR) systems are necessary because physicians in multiple specialties and basic scientists from a variety of disciplines define study variables describing health and disease statuses differently. To increase RWD use, we hypothesized that a solution supporting three workflows – discovery, collection, and analysis – built on existing rather than novel tools and requiring financial commitment from investigators would scale to meet the needs of clinical and translational research teams and ensure regulatory compliance at an academic medical center.

Materials and methods:

Weill Cornell Medicine (WCM) implemented custom research data repositories (RDRs) consisting of i2b2 for discovery, REDCap for collection, and Microsoft SQL Server for analysis. WCM subsidized the central information technology (IT) department to manage RDRs and required investigators to commit $50,000 for RDR startup and $7,500 for annual maintenance.

Results:

From 2013 through 2025, WCM launched more than 17 custom RDRs for pediatrics, myeloproliferative neoplasms, obstetrics and gynecology, pulmonary and critical care, chronic kidney disease, and ophthalmology, among other areas. Custom RDRs enabled academic output (e.g., publications, grants) as well as local quality improvement activities.

Discussion:

Custom RDRs facilitated delivery of fit-for-purpose data sets derived from EHR systems and other RWD sources. Over time, RDRs have evolved from an infrastructure product delivered by central IT to a data partnership between investigators and IT.

Conclusion:

Custom RDRs and data partnerships may help increase the use of RWD from EHR and other sources by clinical and translational research teams.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Association for Clinical and Translational Science

Figure 1. A custom research data repository (RDR) aggregates data from disparate sources, transforms data into research-ready formats, and supports three workflows using off-the-shelf tools.

Table 1. Research data repository (RDR) activities by investigator group

Figure 2. Spectrum of transformation of real-world data from electronic health record systems to enable analytics. OMOP = Observational Medical Outcomes Partnership; CDM = common data model.