Extended correlation functions for spatial analysis of multiplex imaging data

Published online by Cambridge University Press:  15 February 2024

Joshua A. Bull*
Affiliation:
Wolfson Centre for Mathematical Biology, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK
Eoghan J. Mulholland
Affiliation:
Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
Simon J. Leedham
Affiliation:
Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK; Translational Gastroenterology Unit, John Radcliffe Hospital, University of Oxford, Oxford OX3 9DU, UK; Oxford NIHR Biomedical Research Centre, John Radcliffe Hospital, University of Oxford, Oxford OX3 9DU, UK
Helen M. Byrne
Affiliation:
Wolfson Centre for Mathematical Biology, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK; Ludwig Institute for Cancer Research, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7DQ, UK
*Corresponding author: Joshua A. Bull; Email: bull@maths.ox.ac.uk

Abstract

Imaging platforms for generating highly multiplexed histological images are being continually developed and improved. Significant improvements have also been made in the accuracy of methods for automated cell segmentation and classification. However, less attention has focused on the quantification and analysis of the resulting point clouds, which describe the spatial coordinates of individual cells. We focus here on a particular spatial statistical method, the cross-pair correlation function (cross-PCF), which can identify positive and negative spatial correlation between cells across a range of length scales. However, limitations of the cross-PCF hinder its widespread application to multiplexed histology. For example, it can only consider relations between pairs of cells, and cells must be classified using discrete categorical labels (rather than continuous labels such as stain intensity). In this paper, we present three extensions to the cross-PCF which address these limitations and permit more detailed analysis of multiplex images: topographical correlation maps can visualize local clustering and exclusion between cells; neighbourhood correlation functions can identify colocalization of two or more cell types; and weighted-PCFs describe spatial correlation between points with continuous (rather than discrete) labels. We apply the extended PCFs to synthetic and biological datasets in order to demonstrate the insight that they can generate.
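
To make the role of the cross-PCF concrete, the following is a minimal sketch of a standard annulus-based estimator, not the authors' implementation. It assumes a rectangular or otherwise known domain area, applies no boundary correction (so estimates at large $ r $ are biased low), and uses hypothetical names (`cross_pcf`, `points_a`, `points_b`). Values above 1 indicate more type-B cells at distance $ r $ from type-A cells than expected under complete spatial randomness (CSR), and values below 1 indicate fewer.

```python
import math


def cross_pcf(points_a, points_b, area, r_max, dr):
    """Estimate g_AB(r) on annuli [k*dr, (k+1)*dr) up to r_max.

    Assumption: no boundary correction is applied, so this is only
    reliable for r much smaller than the domain size.
    """
    n_bins = int(round(r_max / dr))
    counts = [0] * n_bins
    # Count ordered pairs (a, b) by the annulus that their separation falls in.
    for ax, ay in points_a:
        for bx, by in points_b:
            k = int(math.hypot(bx - ax, by - ay) // dr)
            if k < n_bins:
                counts[k] += 1
    density_b = len(points_b) / area
    g = []
    for k, observed in enumerate(counts):
        inner, outer = k * dr, (k + 1) * dr
        annulus_area = math.pi * (outer ** 2 - inner ** 2)
        # Expected number of pairs in this annulus if B were CSR.
        expected = len(points_a) * density_b * annulus_area
        g.append(observed / expected)
    return g


# Usage: one type-A cell with type-B neighbours at distances 3 and 8.
g = cross_pcf([(50.0, 50.0)], [(50.0, 53.0), (50.0, 58.0)],
              area=100.0 * 100.0, r_max=10.0, dr=5.0)
```

The double loop is quadratic in the number of cells; for large point clouds a spatial index would be the usual replacement.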

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press
Table 1. List of markers and opals used in the multiplex panel

Figure 1. Obtaining point cloud data from a multiplex image. (a) 1 mm × 1 mm ROI from a multiplex IHC image of murine colorectal carcinoma (blue – DAPI; orange – CD4; green – CD68; magenta – Ly6G; maroon – FoxP3; red – CD8; white – E-cadherin). The epithelial cells (E-cadherin+) are cancer cells which form dense “tumor nests” that are surrounded by stromal regions. Immune cells are largely restricted to the stroma between tumor nests, so the region shows spatial correlation between immune cell subtypes (particularly macrophage, neutrophil, and T helper cell) within the stroma, and anticorrelation between immune cells and epithelial cells. (b) Cell segmentation (HALO) for the region in panel a. The edges of E-cadherin positive cells are shown in pink to aid comparison with panel a. (c) Pixel intensity from the color channel corresponding to the CD4 stain only. (d) Composite point cloud formed by classifying each cell type stained in panel a, with points placed at the centroids of segmented cells. Lower row: Magnified $ 500\hskip0.4em \mu \mathrm{m}\times 500\hskip0.4em \mu \mathrm{m} $ zoom from the upper panels.

Table 2. Cell types present in the ROI, with markers and number of cells present. Note that all cells must also contain sufficient DAPI staining to be classified as a cell. Due to low numbers of cytotoxic T cells and regulatory T cells, we exclude them from subsequent analyses

Figure 2. Motivating example I: Cross-PCF and topographical correlation map. (a) Synthetic dataset I: a synthetic point pattern involving two cell types, with labels $ {C}_1 $ and $ {C}_2 $. For $ 0\le x\le 500 $, points with labels $ {C}_1 $ and $ {C}_2 $ cluster together; for $ 500<x\le 1000 $, points of types $ {C}_1 $ and $ {C}_2 $ form distinct, homogeneous clusters. (b) The cross-PCFs $ {g}_{C_1{C}_2}(r) $ and $ {g}_{C_2{C}_1}(r) $ for the point pattern in panel a. The cross-PCF detects the short-range clustering between cells of types $ {C}_1 $ and $ {C}_2 $, which is present for $ 0\le x\le 500 $. The cross-PCFs are almost identical, differing only for large $ r $ because of boundary correction terms. (c) Function used to linearize the mark $ {m}_{C_1{C}_2} $ in Equation (6), used to calculate the TCM, for $ \alpha =5 $. Dashed lines represent $ {m}_{C_1{C}_2}=1/\alpha, 1,\alpha $, which correspond to the maximum detectable exclusion, CSR, and the maximum detectable clustering. (d, e) TCMs $ {\Gamma}_{C_1{C}_2}\left(r=50,\mathbf{x}\right) $ and $ {\Gamma}_{C_2{C}_1}\left(r=50,\mathbf{x}\right) $. The TCM identifies colocalization between cells of types $ {C}_1 $ and $ {C}_2 $ in $ 0\le x\le 500 $, and distinguishes between the dense cluster in the top left quadrant and smaller clusters in the bottom left quadrant. The TCM also identifies exclusion between the two cell populations in $ 500\le x\le 1000 $ and shows this to be less pronounced than the clustering in $ 0\le x\le 500 $. Note that while the regions of positive correlation are similar between panels d and e, the regions of negative correlation differ.
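
Equation (6) is not reproduced in this listing, so the exact linearizing function of panel c cannot be quoted here. As an illustration only, the sketch below shows one transform that is consistent with the caption: the mark is clipped to $ \left[1/\alpha, \alpha \right] $, CSR ($ m=1 $) maps to zero, and clustering and exclusion of equal strength ($ m $ and $ 1/m $) map to values of equal magnitude and opposite sign. The function name `linearize_mark` and the specific piecewise form are assumptions.

```python
def linearize_mark(m, alpha=5.0):
    """Map a ratio-like mark m >= 0 to a score in [-(alpha - 1), alpha - 1].

    Assumed form (not taken from the paper): m - 1 for clustering (m >= 1)
    and 1 - 1/m for exclusion (m < 1), after clipping m to [1/alpha, alpha].
    """
    if m < 0:
        raise ValueError("mark must be non-negative")
    clipped = min(max(m, 1.0 / alpha), alpha)
    if clipped >= 1.0:
        return clipped - 1.0
    return 1.0 - 1.0 / clipped


# Usage: CSR maps to 0; the clipping bounds map to the two extremes.
scores = [linearize_mark(m) for m in (0.0, 0.2, 0.5, 1.0, 2.0, 5.0, 100.0)]
```

Clipping at $ 1/\alpha $ and $ \alpha $ is what makes these values the maximum detectable exclusion and clustering: any stronger signal saturates.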

Figure 3. Motivating example II: Neighbourhood Correlation Function. (a, e) Synthetic dataset II: point patterns in which three cell types are spatially correlated pairwise (a) or in triplets (e). In (a), each cluster contains only two cell types, so that all three cell types are never in close proximity. In (e), all three cell types are in close proximity in each cluster. Hence, in both point patterns, there is positive correlation between pairwise combinations of cell types, but the three-way correlations differ between the panels. (b, f) Cross-PCFs for the point patterns in panels a and e, respectively. These cross-PCFs appear identical, showing strong short-range correlation between the cell types (inside a cluster), exclusion from $ r=0.2 $ to $ r=0.4 $, and a second peak of correlation around $ r=0.6 $ (between clusters). (c, g) Minimum enclosing circle for every combination of three points with marks $ {C}_1 $, $ {C}_2 $, and $ {C}_3 $ (up to circles with a radius of $ r=0.3 $). Circles with small radii arise when all three cell types are in close proximity (panel g). Circles are colored according to their radius. (d, h) NCFs for the point patterns in panels a and e, respectively. The NCF in panel d correctly identifies short-range exclusion between the three cell types in panel a, while the NCF in panel h identifies strong short-range correlation between the three cell types.
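
The geometric step in panels c and g can be sketched as follows. For three points, the minimum enclosing circle is either the circle whose diameter is the longest side (right or obtuse triangles, and degenerate cases) or the circumcircle (acute triangles). The code counts triplets, one point per cell type, whose minimum enclosing circle has radius at most $ r $. The function names are hypothetical, and the normalization by the count expected under CSR, which turns this count into the NCF, is not shown.

```python
import math
from itertools import product


def min_enclosing_radius(p, q, s):
    """Radius of the smallest circle containing the three 2D points."""
    a, b, c = sorted((math.dist(q, s), math.dist(p, s), math.dist(p, q)))
    # Right, obtuse, collinear or coincident: the longest side is a diameter.
    if c * c >= a * a + b * b:
        return c / 2.0
    # Acute triangle: circumradius R = abc / (4 * triangle area).
    twice_area = abs((q[0] - p[0]) * (s[1] - p[1])
                     - (s[0] - p[0]) * (q[1] - p[1]))
    return a * b * c / (2.0 * twice_area)


def count_triplets_within(cells_1, cells_2, cells_3, r):
    """Count (C1, C2, C3) triplets whose enclosing circle has radius <= r."""
    return sum(
        1
        for p, q, s in product(cells_1, cells_2, cells_3)
        if min_enclosing_radius(p, q, s) <= r
    )


# Usage: one tight triplet and one distant C3 cell.
n = count_triplets_within([(0.0, 0.0)], [(0.1, 0.0)],
                          [(0.05, 0.05), (0.9, 0.9)], r=0.3)
```

Enumerating every triplet scales with the product of the three cell counts, so a practical implementation would first restrict candidates to points within $ 2r $ of one another.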

Figure 4. Motivating example III: weighted-PCF. (a) Synthetic dataset I: the same point pattern from Figure 2, now shown with the continuous mark $ m $ associated with cells of type $ {C}_2 $. Recall that cells of type $ {C}_2 $ with $ 0\le x\le 500 $ have $ 0\le m<0.5 $, while those with $ 500<x\le 1000 $ have $ 0.5\le m\le 1 $. (b) The wPCF, $ wPCF\left(r,{C}_1,m\right) $, for the point pattern in panel a identifies differences in clustering between cells of type $ {C}_1 $ and cells of type $ {C}_2 $ with marks above or below $ m=0.5 $. (c) Cross sections of the wPCF in panel b. These plots distinguish the strong clustering of cells of type $ {C}_1 $ with cells of type $ {C}_2 $ that have $ m<0.5 $ and their weak exclusion from cells of type $ {C}_2 $ that have $ m>0.5 $.
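
One way to read $ wPCF\left(r,{C}_1,m\right) $ is as a cross-PCF in which each continuously marked cell contributes a weight reflecting how close its mark is to the target value $ m $. The sketch below illustrates that idea under stated assumptions: the triangular weighting kernel, its width, the absence of boundary correction, and all function names are choices made here, not details taken from this listing.

```python
import math


def triangular_weight(mark, target, width):
    """Weight 1 when mark == target, falling linearly to 0 at distance width."""
    return max(0.0, 1.0 - abs(mark - target) / width)


def wpcf(points_a, points_b, marks_b, target, area, r, dr, width=0.2):
    """Weighted-PCF estimate on the annulus [r, r + dr) for one target mark.

    Assumptions: triangular kernel, no boundary correction.
    """
    weights = [triangular_weight(m, target, width) for m in marks_b]
    total_weight = sum(weights)
    if total_weight == 0.0 or not points_a:
        return float("nan")  # no marked points near the target value
    weighted_count = 0.0
    for ax, ay in points_a:
        for (bx, by), w in zip(points_b, weights):
            if r <= math.hypot(bx - ax, by - ay) < r + dr:
                weighted_count += w
    annulus_area = math.pi * ((r + dr) ** 2 - r ** 2)
    # Expected weighted count if the marked points were CSR.
    expected = len(points_a) * (total_weight / area) * annulus_area
    return weighted_count / expected


# Usage: a low-mark cell sits near the type-A cell, a high-mark cell far away.
a = [(0.0, 0.0)]
b = [(3.0, 0.0), (30.0, 0.0)]
marks = [0.1, 0.9]
low = wpcf(a, b, marks, target=0.1, area=10000.0, r=0.0, dr=5.0)
high = wpcf(a, b, marks, target=0.9, area=10000.0, r=0.0, dr=5.0)
```

Evaluating this over a grid of $ r $ and target values gives a surface like panel b, and fixing the target gives cross sections like panel c.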

Figure 5. Cross-PCFs for pairwise combinations of cell types in the ROI. We observe exclusion between epithelium and all immune cell subtypes, and strong pairwise correlation between macrophages, neutrophils, and T helper cells on short length scales. Results involving regulatory and cytotoxic T cells are omitted as their cell counts are low in this ROI.

Figure 6. PCF and TCM for positively correlated cell types. (a) Locations of T helper cells (CD4+, orange) and macrophages (CD68+, green) in the ROI (with DAPI, blue). These cell types colocalize in the tissue between epithelial cell islands. (b) Cell centers identified as T helper cells (orange) and macrophages (green). (c) Cross-PCF for T helper cells to macrophages, $ {g}_{ThM}(r) $. These cell types are spatially colocated over a wide range of distances, that is, $ {g}_{ThM}(r)>1 $ for $ 0\hskip0.35em \lesssim \hskip0.35em r\hskip0.35em \lesssim \hskip0.35em 75\hskip0.3em \mu \mathrm{m} $. (d) TCM for T helper cells to macrophages, $ {\Gamma}_{ThM} $, for $ r=50\hskip0.3em \mu \mathrm{m} $. Red regions indicate colocalization of the cell types in stromal regions, while blue regions correspond to isolated T helper cells.

Figure 7. PCF and TCM for negatively correlated cell types. (a) Locations of T helper (CD4+, orange) and epithelial cells (E-cadherin+, white) in the ROI (with DAPI, blue). Epithelial cells exist in clumped “nests,” with T helper cells restricted to the stromal regions between them. (b) Cell centers of T helper cells (orange) and epithelial cells (blue). (c) PCF for T helper cells to epithelial cells, $ {g}_{ThE}(r) $. We observe strong spatial exclusion, as $ {g}_{ThE}(r)<1 $ for $ r\hskip0.35em \lesssim \hskip0.35em 75\hskip0.3em \mu \mathrm{m} $. (d) TCM for T helper cells to epithelial cells, $ {\Gamma}_{ThE}\left(r,\mathbf{x}\right) $, for $ r=50\hskip0.3em \mu \mathrm{m} $. The blue regions showing strong exclusion indicate subregions of the ROI which are devoid of epithelial cells. The strongest signals occur where T helper cells are organized in large clusters, while regions with few T helper cells do not contribute significantly to the cross-PCF.

Figure 8. The NCF identifies spatial colocalization between three cell types. (a) Locations of T helper cells (orange), macrophages (green), and neutrophils (purple) extracted from the ROI. All three cell types are found in stromal regions, while macrophages and neutrophils are more likely to be observed within the epithelial islands (e.g., in the top left corner). (b) Expected and observed numbers of circles of radius $ r $. (c) NCF obtained by computing the ratio of the curves in panel b, $ {\mathrm{NCF}}_{ThMN}(r) $. For $ r\hskip0.35em \lesssim \hskip0.35em 75\hskip0.3em \mu \mathrm{m} $, neutrophils, macrophages, and T helper cells are colocalized within a circle of radius $ r $ more often than would be expected under CSR.

Figure 9. The wPCF identifies correlation between epithelial cells and cells with different CD4 expression levels. (a) Epithelial cell centers. (b) Cell centers labeled according to the average CD4 stain intensity within each cell. (c) wPCF ($ r $, $ E $, CD4), showing clear qualitative and quantitative differences in colocalization with epithelial cells as CD4 expression levels vary. (d) Cross sections of the wPCF in panel c. Points with low CD4 expression have a different pattern of correlation than those with higher expression. The profile for cells with high CD4 expression corresponds to the cross-PCF $ {g}_{ThE}(r) $, calculated for cells which have been manually classified as T helper cells (red dashed line). Cells with low CD4 intensity colocalize with epithelial cells, likely due to many epithelial cells having low CD4 expression. Cells with higher expression of CD4 are anticorrelated with epithelial cells for $ 0\le r\hskip0.35em \lesssim \hskip0.35em 75\hskip0.3em \mu \mathrm{m} $.

Figure 10. wPCF identifies correlation between epithelial cells and pixels with varying CD4 expression. The results from Figure 9 are recovered when the wPCF is calculated from points sampled from the original multiplex image using a regular $ 5\hskip0.3em \mu \mathrm{m} $ lattice, showing that the spatial correlation between T helper cells and epithelial cells can be identified without segmentation or classification. (a) Pixel intensities of the Opal 520 marker (associated with CD4), sampled across the ROI on a regular $ 5\hskip0.3em \mu \mathrm{m} $ lattice. (b) Pixels marked as Opal 780 positive (associated with epithelial cells), determined via thresholding, sampled across the ROI on a regular $ 5\hskip0.3em \mu \mathrm{m} $ lattice. (c) wPCF describing correlation between pixels positive for Opal 780 and the pixel intensity of Opal 520. (d) Cross sections of the wPCF in panel c have the same shape as the cross-PCF in panel d for pixels with high CD4 intensity.
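
The sampling step described in panels a and b can be sketched as below. The image is treated as a 2D grid of single-channel intensities, points are taken on a regular lattice, and a threshold marks lattice points as positive for a second channel. The pixel size, the threshold value, and the function names are assumptions made for illustration; only the $ 5\hskip0.3em \mu \mathrm{m} $ spacing comes from the caption.

```python
def sample_on_lattice(image, pixel_size_um, spacing_um=5.0):
    """Return (x_um, y_um, intensity) for pixels on a regular lattice.

    image is a list of rows of intensities; pixel_size_um is assumed known.
    """
    step = max(1, round(spacing_um / pixel_size_um))
    samples = []
    for row in range(0, len(image), step):
        for col in range(0, len(image[row]), step):
            samples.append(
                (col * pixel_size_um, row * pixel_size_um, image[row][col])
            )
    return samples


def positive_points(samples, threshold):
    """Keep the coordinates of lattice points at or above the threshold."""
    return [(x, y) for x, y, value in samples if value >= threshold]


# Usage: a 4 x 4 toy image with 2.5 um pixels, sampled every 5 um.
toy = [[row * 4 + col for col in range(4)] for row in range(4)]
lattice = sample_on_lattice(toy, pixel_size_um=2.5)
positive = positive_points(lattice, threshold=8)
```

The sampled intensities then play the role of continuous marks, and the thresholded points the role of the categorical population, so no segmentation or classification is required.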

Supplementary material: File

Bull et al. supplementary material

File 18.2 MB