Sociome Data Commons: A scalable and sustainable platform for investigating the full social context and determinants of health

Published online by Cambridge University Press: 07 November 2023

Sandra Tilmon*
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
Sharmilee Nyenhuis
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA; Medicine, University of Chicago, Chicago, IL, USA
Anthony Solomonides
Affiliation:
NorthShore University Health System, Research Institute, Evanston, IL, USA
Bruno Barbarioli
Affiliation:
Computer Science, University of Chicago, Chicago, IL, USA
Ankur Bhargava
Affiliation:
Chicago Medicine, Chicago, IL, USA
Suzi Birz
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
Kathryn Bouzein
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
Celine Cardenas
Affiliation:
Wake Forest University, Winston-Salem, NC, USA
Bradley Carlson
Affiliation:
Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
Ellen Cohen
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
Emily Dillon
Affiliation:
Psychiatry and Behavioral Sciences, Rush University Medical Center, Chicago, IL, USA
Brian Furner
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
Zhong Huang
Affiliation:
Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
Julie Johnson
Affiliation:
Clinical Research Informatics, University of Chicago, Chicago, IL, USA
Nivedha Krishnan
Affiliation:
University of Illinois at Chicago, Chicago, IL, USA
Kevin Lazenby
Affiliation:
Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
Kaitlyn Li
Affiliation:
University of Chicago, Chicago, IL, USA
Sonya Makhni
Affiliation:
Chicago Medicine, Chicago, IL, USA
Doriane Miller
Affiliation:
Medicine, University of Chicago, Chicago, IL, USA
Jonathan Ozik
Affiliation:
Decision and Infrastructure Sciences Division, Argonne National Laboratory, Lemont, IL, USA
Carlos Santos
Affiliation:
Internal Medicine, Rush University Medical Center, Chicago, IL, USA
Marc Sleiman
Affiliation:
Pritzker School of Medicine, University of Chicago, Chicago, IL, USA
Julian Solway
Affiliation:
Medicine, University of Chicago, Chicago, IL, USA
Sanjay Krishnan
Affiliation:
Computer Science, University of Chicago, Chicago, IL, USA
Samuel Volchenboum
Affiliation:
Pediatrics, University of Chicago, Chicago, IL, USA
*Corresponding author: S. Tilmon, MS, MPH; Email: stilmon@bsd.uchicago.edu

Abstract

Background/Objective:

Non-clinical aspects of life, such as social, environmental, behavioral, psychological, and economic factors, which we collectively call the sociome, play significant roles in shaping patient health and health outcomes. This paper introduces the Sociome Data Commons (SDC), a new research platform that enables large-scale data analysis for investigating such factors.

Methods:

This platform focuses on “hyper-local” data, i.e., at the neighborhood or point level, a geospatial scale of data not adequately considered in existing tools and projects. We enumerate key insights gained regarding data quality standards, data governance, and organizational structure for long-term project sustainability. A pilot use case investigating sociome factors associated with asthma exacerbations in children residing on the South Side of Chicago used machine learning and six SDC datasets.

Results:

The pilot use case reveals one dominant spatial cluster for asthma exacerbations and important roles of housing conditions and cost, proximity to Superfund pollution sites, urban flooding, violent crime, lack of insurance, and a poverty index.

Conclusion:

The SDC has been purposefully designed to support and encourage extension of the platform into new datasets as well as the continued development, refinement, and adoption of standards for dataset quality, dataset inclusion, metadata annotation, and data access/governance. The asthma pilot has served as the first driver use case and demonstrates promise for future investigation into the sociome and clinical outcomes. Additional projects will be selected in part for their ability to exercise and grow the capacity of the SDC to meet its ambitious goals.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of The Association for Clinical and Translational Science

Figure 1. Sociome Data Commons (SDC): guiding principles (left), interface showing four datasets for illustration (right).

Table 1. Sociome Data Commons (SDC) standards

Table 2. University of Chicago asthma visits for Chicago pediatric patients, 2017–2019: A. All patients residing in Chicago; B. All patients residing on the South Side of Chicago; and C. All patients residing in spatial cluster 1

Figure 2. Asthma visits 2017–2019 by census tract. a. Asthma visit counts (continuous); b. Exacerbations as a proportion of all asthma visits; and c. Spatial clustering for exacerbations. The University of Chicago hospital is in red and its 5-mile perimeter is represented with a dashed red line.

Figure 3. Sociome Data Commons (SDC): data pipeline.

Figure 4. Variable importance: a. Top 10 variables for the full South Side; b. Top 10 variables for cluster 1; c. Histogram of gain for all variables, full South Side; and d. Histogram of gain for all variables, cluster 1 only.

Figure 5. Select South Side maps: a. Average housing age; b. Median rent; c. Urban flood susceptibility; d. Violent crime rate; e. Proximity to Superfund sites; and f. Poverty principal component analysis (PCA). Cluster 1 census tracts are outlined in white, and the University of Chicago hospital is in red with its 5-mile perimeter represented with a dashed red line.

Supplementary material: File

Tilmon et al. supplementary material (File, 342.9 KB)