Coleman R Harris, Eliot T McKinley, Joseph T Roland, Qi Liu, Martha J Shrubsole, Ken S Lau, Robert J Coffey, Julia Wrobel, Simon N Vandekar.
Abstract
MOTIVATION: Multiplexed imaging is a nascent single-cell assay with a complex data structure susceptible to technical variability that disrupts inference. These in situ methods are valuable in understanding cell-cell interactions, but few standardized processing steps or normalization techniques for multiplexed imaging data are available.
Year: 2022 PMID: 34983062 PMCID: PMC8896603 DOI: 10.1093/bioinformatics/btab877
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Table 1. Summary of normalization procedures implemented
| Transformation | None | ComBat | Registration (fda) |
|---|---|---|---|

[Table body not recovered: the transformation rows are missing from the source.]
Note: Transformations (rows) and normalization (columns) performed on the data. Here, y is the median cell intensity values for an arbitrary marker channel c, and μ is the slide mean for slide i of the median cell intensity values for marker channel c.
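The note above defines y and the slide mean μ, but the table body that combined them is missing. A minimal sketch of one plausible reading of the "Mean division" transformation, assuming it divides each cell's value y by the mean μ of its slide for the same marker channel. The function name `mean_divide` and the toy values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from statistics import mean


def mean_divide(values, slide_ids):
    """Divide each value by the mean of all values on the same slide.

    Assumption: this is what "Mean division" means for one marker channel.
    """
    by_slide = defaultdict(list)
    for y, slide in zip(values, slide_ids):
        by_slide[slide].append(y)
    slide_mean = {slide: mean(ys) for slide, ys in by_slide.items()}
    for slide, mu in slide_mean.items():
        if mu == 0:
            raise ValueError(f"slide {slide!r} has zero mean intensity")
    return [y / slide_mean[slide] for y, slide in zip(values, slide_ids)]


# Toy data (made up): the second slide is ten times brighter than the first.
values = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0]
slides = ["s1", "s1", "s1", "s2", "s2", "s2"]
normalized = mean_divide(values, slides)
```

In this toy case both slides end up on the same scale after division, which is the kind of slide-to-slide alignment the normalization is meant to achieve.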
Fig. 1. Visual comparison of vimentin marker densities for each transformation method. Density plots for the median cell intensity of the marker vimentin, where each color represents a different slide in the dataset. Each row is aligned with the scale transformations present in Table 1, where each column also matches with the normalization algorithms in Table 1. The ticks on the x-axis represent the Otsu thresholds for each slide for that transformed data, where the color again corresponds to the slide (such that the colors are one-to-one between threshold and density plot). Anderson–Darling test statistics for the marker vimentin are presented for each method in the top right corner.
Table 2. Quantitative metrics comparing normalization methods
| Method | Mean AD test statistic | Mean Otsu discordance score | Adj. Rand index (slide ID) | Mean variance proportion (slide ID) |
|---|---|---|---|---|
| None; None | 275.019 | 0.085 | 0.033 | 0.138 |
| | 225.413 | 0.134 | 0.083 | 0.301 |
| | 291.900 | 0.138 | 0.089 | 0.000 |
| | 217.649 | 0.110 | 0.037 | 0.232 |
| Mean division; None | 138.774 | 0.041 | 0.007 | 0.000 |
| Mean division; ComBat | 247.612 | 0.109 | 0.064 | 0.000 |
| Mean division; Registration | 174.933 | 0.164 | 0.120 | 0.333 |
| Mean division | 114.653 | 0.055 | 0.010 | 0.046 |
| Mean division | 321.810 | 0.132 | 0.071 | 0.000 |
| Mean division | 104.330 | 0.049 | 0.018 | 0.081 |
Note: Results from the k-samples Anderson–Darling test statistic, the threshold discordance score, and the variance proportion at the slide level from the random effects modeling, all averaged across marker channels, as well as the adjusted Rand index for the slide identifiers comparing the raw data to the normalized data. For each of these metrics, small values indicate better performance for a given method.
Fig. 2. Threshold discordance and accuracy. (A) Otsu thresholds were calculated at the slide level for each marker and compared to a global Otsu threshold for each marker to calculate a discordance score to compare transformation methods. The mean difference between the slide-level Otsu thresholds and the global Otsu threshold is then calculated for each marker and presented as a point for each of the nine markers, with the white diamond representing the mean discordance score across all markers for a given method. Given that this is a discordance score, lower values indicate better agreement across slides. (B) Otsu thresholds were calculated across slides for each marker to determine marker-positive cells, which were then compared to the manual labels for the markers CD3 and CD8 to determine the accuracy of defining a cell as marker positive. This is presented as the accuracy rate of recapitulating the ground truth labels; because this is a measurement of accuracy, higher values indicate better agreement between the normalized data and labels. Note also that for each of these plots, the top row indicates the results from the raw, unadjusted data.
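The discordance score in panel (A) can be illustrated with a small sketch. It assumes a histogram-based Otsu threshold (the split that maximizes between-class variance) and takes the "difference" to be the absolute difference between each slide-level threshold and the global threshold. The caption does not specify either detail, and the function names `otsu_threshold` and `discordance_score` are hypothetical.

```python
from collections import defaultdict


def otsu_threshold(values, bins=256):
    """Histogram-based Otsu threshold: maximize between-class variance."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * x for c, x in zip(counts, centers))
    best_var, best_t = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += counts[i]
        sum0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (total_sum - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            # Threshold is the upper edge of the last bin in the lower class.
            best_var, best_t = between, lo + (i + 1) * width
    return best_t


def discordance_score(values, slide_ids, bins=256):
    """Mean absolute gap between slide-level and global Otsu thresholds."""
    global_t = otsu_threshold(values, bins)
    by_slide = defaultdict(list)
    for v, slide in zip(values, slide_ids):
        by_slide[slide].append(v)
    gaps = [abs(otsu_threshold(vs, bins) - global_t) for vs in by_slide.values()]
    return sum(gaps) / len(gaps)


# Toy data (made up): one marker on two slides, the second shifted by +5.
base = [0.0, 0.1, 0.2, 0.9, 1.0, 1.1]
ids = ["s1"] * 6 + ["s2"] * 6
aligned = discordance_score(base + base, ids)
shifted = discordance_score(base + [v + 5 for v in base], ids)
```

When the two slides share the same distribution the slide-level and global thresholds coincide and the score is zero; a slide-level shift pulls the thresholds apart and raises the score.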
Fig. 3. Proportion of variance present at slide level in random effects model. Scatter plots that denote the proportion of variance at the slide level for each normalization method for each of the marker channels in this dataset. Variance proportions were calculated using a random effects model with a random intercept for slide. Methods that perform well should reduce the slide-level variance. Note also that the top row indicates the results from the raw, unadjusted data.
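As a rough illustration of the slide-level variance proportion, the sketch below uses the one-way ANOVA method-of-moments estimator for a balanced design. This is a swapped-in stand-in, not the random effects model fit used for the figure, and it assumes every slide contributes the same number of cells. The function name `slide_variance_proportion` is hypothetical.

```python
from statistics import mean


def slide_variance_proportion(groups):
    """Estimate var(slide) / (var(slide) + var(residual)) for one marker.

    groups maps a slide identifier to a list of cell values. Assumption:
    a balanced design, so the ANOVA moment estimator applies directly.
    """
    sizes = {len(ys) for ys in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal cell counts per slide")
    n = sizes.pop()
    k = len(groups)
    if k < 2 or n < 2:
        raise ValueError("need at least two slides and two cells per slide")
    slide_means = [mean(ys) for ys in groups.values()]
    grand_mean = mean(slide_means)
    # Mean squares between slides and within slides.
    msb = n * sum((m - grand_mean) ** 2 for m in slide_means) / (k - 1)
    msw = sum(
        (y - m) ** 2
        for ys, m in zip(groups.values(), slide_means)
        for y in ys
    ) / (k * (n - 1))
    var_slide = max(0.0, (msb - msw) / n)  # truncate negative estimates at 0
    total = var_slide + msw
    return var_slide / total if total > 0 else 0.0


# Toy data (made up): raw slides differ in level, centered slides do not.
raw = {"s1": [1.0, 2.0, 3.0], "s2": [11.0, 12.0, 13.0]}
centered = {"s1": [1.0, 2.0, 3.0], "s2": [1.0, 2.0, 3.0]}
raw_prop = slide_variance_proportion(raw)
centered_prop = slide_variance_proportion(centered)
```

A proportion near 1 means most of the variation is explained by slide membership, while a proportion near 0 means the slides are indistinguishable in level, which is the direction a successful normalization should move.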
Fig. 4. UMAP embedding of data for each transformation method. UMAP embedding of the transformed data with points colored by slide identifier (A) and tissue type (B). The rectangle in (B) denotes the mixing of tissue classes present in the raw, unadjusted data UMAP embedding. Adjusted Rand index values for each embedding are presented in the top right corner.
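The adjusted Rand index reported for each embedding compares two labelings of the same cells. A minimal sketch of the standard pair-counting formula follows, assuming one labeling is the slide identifier and the other is a set of cluster labels obtained from the embedding. The clustering step is not described here, and the labels below are made up.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs that fall in the same cell of the contingency table,
    # and pairs that share a label within each labeling separately.
    same_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    same_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    same_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = same_a * same_b / total_pairs
    max_index = (same_a + same_b) / 2
    if max_index == expected:
        return 1.0
    return (same_both - expected) / (max_index - expected)


# Toy labels (made up): clusters that track slide exactly vs. not at all.
slide_ids = ["s1", "s1", "s2", "s2"]
clusters_tracking_slide = [0, 0, 1, 1]
clusters_mixing_slides = [0, 1, 0, 1]
ari_tracking = adjusted_rand_index(slide_ids, clusters_tracking_slide)
ari_mixing = adjusted_rand_index(slide_ids, clusters_mixing_slides)
```

Under this reading, a value near 1 means the clusters largely reproduce slide membership, so lower values against slide identifiers point to less residual slide structure, in line with the note to Table 2.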