Multifaceted Neuroimaging Data Integration via Analysis of Subspaces

Published online by Cambridge University Press:  16 June 2025

Andrew Ackerman
Affiliation:
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Zhengwu Zhang
Affiliation:
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan Hannig
Affiliation:
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jack Prothero
Affiliation:
Statistical Engineering Division, National Institute of Standards and Technology, Boulder, CO, USA
J. S. Marron*
Affiliation:
Department of Statistics and Operations Research, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
*Corresponding author: J. S. Marron; marron@unc.edu

Abstract

Neuroimaging studies, such as the Human Connectome Project (HCP), often collect multifaceted data to study the human brain. However, these data are often analyzed in a pairwise fashion, which can hinder our understanding of how different brain-related measures interact. In this study, we analyze the multi-block HCP data using data integration via analysis of subspaces (DIVAS). We integrate structural and functional brain connectivity, substance use, cognition, and genetics in an exhaustive five-block analysis. This gives rise to the important finding that genetics is the single data modality most predictive of brain connectivity, outside of brain connectivity itself. Nearly 14% of the variation in functional connectivity (FC) and roughly 12% of the variation in structural connectivity (SC) are attributed to shared spaces with genetics. Moreover, investigations of shared space loadings provide interpretable associations between particular brain regions and drivers of variability. Novel Jackstraw hypothesis tests are developed for the DIVAS framework to establish statistically significant loadings. For example, in the (FC, SC, and substance use) subspace, these novel hypothesis tests highlight largely negative functional and structural connections, suggesting the brain’s role in physiological responses to increased substance use. Our findings are validated on genetically relevant subjects not studied in the main analysis.

Information

Type
Application and Case Studies - Original
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Psychometric Society
Figure 1 Schematic representation of five preprocessed HCP-YA data blocks submitted to DIVAS. We present the transpose of each data block to preserve vertical space. Each block is represented by a different color and lists its number of observations (bottom left corner), number of variables (bottom right corner), and the range of values that this data type realizes (centered above the block). For example, FC has 375 observations of 3591 variables taking values between -0.87 and 0.67. A black frame is provided at the same vertical height within each colored box to illustrate that the blocks are linked through common human participants (rows in this transpose orientation).

Table 1 FC/SC variational decomposition

Table 2 Cog/use variational decomposition

Figure 2 DIVAS diagnostic plot for five-block run on FC, SC, Cognition (Cog), Substance-Use (Use), and Genetics (Gene). Rank of each subspace is presented within the colored box corresponding to this subspace. Gray boxes indicate that no variation of that subtype is distinguished. For example, the rank 1 FC-SC-Use partially shared space will be investigated in Section 4.2.

Table 3 Genetics variational decomposition

Figure 3 FC and SC loadings adjacency matrix corresponding to the rank 1 FC-SC-Use subspace. Rows 1–19 represent subcortical (subcort) regions. Rows 20–53 and 54–87 represent the left cortical and right cortical regions, respectively. The upper triangle represents the FC loadings, and the lower triangle represents the SC loadings. Hence, this matrix is not symmetric. SC is sparser than FC, but both FC and SC appear to be driven by predominantly negative loadings.
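
As a minimal sketch of the layout this caption describes, the following Python packs two vectors of pairwise loadings into one square matrix, with one set above the diagonal and the other below it. The function name `combine_loadings`, the row-by-row ordering of region pairs, and the toy values are illustrative assumptions, not the authors' code or data.

```python
def combine_loadings(fc, sc, n_regions):
    """Place fc above the diagonal and sc below it in an n x n matrix.

    Both inputs hold one loading per region pair (i, j) with i < j,
    ordered row by row. That ordering is an assumption of this sketch.
    """
    n_pairs = n_regions * (n_regions - 1) // 2
    if len(fc) != n_pairs or len(sc) != n_pairs:
        raise ValueError("each loading vector needs one entry per region pair")
    matrix = [[0.0] * n_regions for _ in range(n_regions)]
    k = 0
    for i in range(n_regions):
        for j in range(i + 1, n_regions):
            matrix[i][j] = fc[k]  # upper triangle: FC loading
            matrix[j][i] = sc[k]  # lower triangle: SC loading
            k += 1
    return matrix


# Toy example with 3 regions, hence 3 region pairs: (0,1), (0,2), (1,2).
toy = combine_loadings([-0.5, 0.1, -0.2], [-0.3, 0.0, -0.4], 3)
```

Because the two triangles hold different modalities, `toy[i][j]` and `toy[j][i]` generally differ, which is why the displayed matrix is not symmetric.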

Figure 4 FC (left) and SC (right) significant connections in the rank 1 FC-SC-Use subspace. FC regions are reordered to correspond to SC regions. These regions correspond to the adjacency matrix in Figure 3. Brain regions are abbreviated as frontal lobe (FL), parietal lobe (PL), occipital lobe (OL), and temporal lobe (TL). Left and right hemispheres are denoted by “-l” and “-r,” respectively, and the subcortical regions are distinguished from the cortical regions by “Subcort.” Observe that the vast majority of both FC and SC significant loadings are negative.

Figure 5 Substance use loadings corresponding to the rank 1 FC-SC-Use partially shared space. Bars are color-coded according to type of substance use. For example, marijuana use traits are all depicted in yellow. Jackstraw-significant traits are shown at full opacity, while insignificant traits are translucent. Substance use loadings appear to be predominantly driven by alcohol use measures.

Table 4 Comprehensive principal angle analysis across original and validation runs
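
To illustrate what a principal angle measures, here is a minimal sketch for the simplest case of two rank 1 subspaces, each spanned by a single direction vector. The function name `principal_angle_deg` and the toy vectors are assumptions made for illustration and do not come from the article; for higher-rank subspaces the angles are instead obtained from the singular values of the product of orthonormal bases, which this sketch does not cover.

```python
import math


def principal_angle_deg(u, v):
    """Angle in degrees between the lines spanned by vectors u and v."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("a zero vector does not span a subspace")
    # A subspace has no sign, so use the absolute cosine; clamp at 1
    # to guard against rounding slightly above 1.
    cosine = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.degrees(math.acos(cosine))


# Hypothetical directions: a sign flip spans the same line, so the
# angle is (numerically) zero; orthogonal directions give 90 degrees.
angle_same = principal_angle_deg([0.6, -0.8, 0.0], [-0.6, 0.8, 0.0])
angle_orth = principal_angle_deg([1.0, 0.0], [0.0, 1.0])
```

A small angle between a direction estimated in one run and the corresponding direction in another run indicates that the two runs recovered nearly the same subspace.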

Figure B1 DIVAS loadings (left) and scores (right) diagnostic plot corresponding to discovery run. Blocks are ordered top-to-bottom as FC-SC-Cog-Use-Gene. Within each row, two angles are presented. The perturbation angle bound is denoted by the dashed line, and the random direction bound is denoted by the dot-dashed line.

Figure B2 DIVAS loadings (left) and scores (right) diagnostic plot corresponding to validation run. This figure shows the extent to which results from the discovery set are reproduced in the validation set. Within each row, two angles are presented. The perturbation angle bound is denoted by the dashed line, and the random direction bound is denoted by the dot-dashed line.

Figure B3 Histogram and overlaid kernel density estimate of common normalized scores associated with the rank 1 FC-SC-Use partially shared space. The distribution is unimodal with no outliers.
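
For readers unfamiliar with kernel density estimates, the sketch below shows one common way such a curve can be computed from a set of scores. The Gaussian kernel, the normal-reference bandwidth rule, the function name `gaussian_kde`, and the toy scores are all assumptions of this illustration; the article does not state which kernel or bandwidth was used.

```python
import math


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in samples) / (n - 1))
        # Normal-reference rule of thumb (an assumed default).
        bandwidth = 1.06 * std * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    scale = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum of Gaussian bumps, one centered on each sample.
        return scale * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical scores, for illustration only.
scores = [-1.2, -0.4, -0.1, 0.0, 0.3, 0.5, 1.1]
kde = gaussian_kde(scores)
```

Evaluating `kde` on a grid of points and drawing the result over a histogram of the same scores gives the kind of overlay this caption describes.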