Benjamin L Walker, Zixuan Cang, Honglei Ren, Eric Bourgain-Chang, Qing Nie.
Abstract
The rapid development of spatial transcriptomics (ST) techniques has allowed the measurement of transcriptional levels across many genes together with the spatial positions of cells. This has led to an explosion of interest in computational methods and techniques for harnessing both spatial and transcriptional information in analysis of ST datasets. The wide diversity of approaches in aim, methodology and technology for ST provides great challenges in dissecting cellular functions in spatial contexts. Here, we synthesize and review the key problems in analysis of ST data and methods that are currently applied, while also expanding on open questions and areas of future development.
Year: 2022 PMID: 35273328 PMCID: PMC8913632 DOI: 10.1038/s42003-022-03175-5
Source DB: PubMed Journal: Commun Biol ISSN: 2399-3642
List of software packages.
| Name | Summary | Platform | Reference |
|---|---|---|---|
| **Identifying spatially variable genes** | | | |
| Trendsceek | Statistical testing on spatial hypothesis (non-parametric) | R | [ |
| SpatialDE | Gaussian process regression | Python | [ |
| SPARK | Statistical testing - generalized linear spatial model | R | [ |
| SOMDE | Self-organizing neural map + Gaussian process regression | Python | [ |
| Sepal | Assessing spatial variance by length of time to equalize under diffusion | Python | [ |
| scGCO | Graph cuts to divide based on spatial expression | Python | [ |
| SpaGCN | Graph convolutional network, joint detection of regions | Python | [ |
| **Region segmentation** | | | |
| stLearn | Histology-based smoothing + clustering | Python | [ |
| Seurat | Non-spatial clustering combined with spatial visualization | R | [ |
| SmfishHmrf (Giotto) | Combining Gaussian expression model with hidden Markov random field | R | [ |
| SpaGCN | Graph convolutional network, joint detection of SVGs | Python | [ |
| BayesSpace | Fully Bayesian expression model, hyper-resolution segmentation | R | [ |
| SEDR | Deep auto-encoder based embedding for clustering | Python | [ |
| **Identifying cell-cell interactions** | | | |
| SpaOTsc | Optimal transport to match ligand and receptor expression | Python | [ |
| Spatial Variance Component Analysis | Gaussian process model including interaction term | Python | [ |
| Misty | Multi-component linear model including interaction term, random forest | R | [ |
| Node-centric Expression Modeling | Graph neural network combining expression data over various length scales | Python | [ |
| GCNG | Supervised training of graph neural network then allows for identification of novel interactions | Python | [ |
| **Mapping cells to spatial locations** | | | |
| Seurat | Alignment for a variety of data modalities including spatial data by pairing a subset of cells as anchors | R | [ |
| SpaOTsc | Optimal transport mapping between spatial and single cell data | Python | [ |
| DistMap | Matthews correlation coefficient computed on binarized expression | R | [ |
| DeepSC | Neural network learns to predict locations of cells in space | Python | [ |
| GLISS | Uses graph-based measure based on similarity of landmark genes | Python | [ |
| Tangram | Aligns gene expression while also accounting for spatial cell density distribution | Python | [ |
| **Cell type deconvolution/enrichment scores** | | | |
| Giotto | Several algorithms for computing enrichment scores | R | [ |
| SPOTLight | Non-negative matrix factorization using known marker genes for initialization | R | [ |
| SpatialDWLS | Dampened weighted least squares for matrix factorization | R | [ |
| RCTD | Statistical fitting of combination of Poisson distribution models | R | [ |
| DSTG | Graph neural network to learn cell types and deconvolution from data | Python | [ |
Fig. 1 Spatial Transcriptomics Data: Collection and Resolutions.
ST data can be collected with various methods and at various resolutions. a Illustration of spatial barcoding, in which spatially identified barcodes are arranged and then used to tag RNA molecules in tissue. Compare with c, but note that these methods are not restricted to multi-cell resolution. b Illustration of sequential fluorescent imaging, where RNA molecules are sequentially tagged with fluorescent probes of different colors and the color sequences are used to identify RNA species. In general, these data are collected at sub-cellular resolution, as shown in e, but are frequently combined with cell segmentation to create single-cell data, as in d. c Multi-cell resolution spots, in which the expression measured at one spatial location is collected across a number of possibly heterogeneous cells. d In single-cell resolution data, each spatial location corresponds to one cell. This allows for spatial analysis of cell identity and a single-cell understanding of tissue structure and cell-cell communication. e One type of sub-cellular resolution data is single-molecule imaging. Note that information is present both in the number of distinct RNA molecules of one type in a cell and in the localization of those molecules within the cell. Sub-cellular resolution data may be combined with cell segmentation to produce single-cell data to facilitate the corresponding analysis.
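The step from sub-cellular data (panel e) to single-cell data (panel d) can be sketched as a simple aggregation. This is a minimal illustration, not the procedure of any specific package: it assumes cell segmentation has already assigned each detected molecule to a cell, and the function name `molecules_to_cell_counts` and the toy values are hypothetical.

```python
import numpy as np

def molecules_to_cell_counts(cell_ids, gene_ids, n_cells, n_genes):
    """Aggregate per-molecule detections into a cell-by-gene count matrix.

    cell_ids[i] is the segmented cell containing molecule i (-1 = unassigned);
    gene_ids[i] is the integer index of the RNA species of molecule i.
    """
    cell_ids = np.asarray(cell_ids)
    gene_ids = np.asarray(gene_ids)
    keep = cell_ids >= 0  # drop molecules that fall outside every segmented cell
    counts = np.zeros((n_cells, n_genes), dtype=np.int64)
    # np.add.at accumulates correctly when the same (cell, gene) pair repeats
    np.add.at(counts, (cell_ids[keep], gene_ids[keep]), 1)
    return counts

# Toy input (hypothetical): six molecules, two cells, three genes
cell_ids = [0, 0, 1, 1, 1, -1]
gene_ids = [0, 0, 2, 1, 2, 0]
counts = molecules_to_cell_counts(cell_ids, gene_ids, n_cells=2, n_genes=3)
print(counts)
```

Note that this aggregation discards the within-cell localization of molecules, which is exactly the information that sub-cellular analyses retain.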
Fig. 2 Illustration of different traits that can separate spatial regions.
a–d Dotted line indicates division between two regions. Red and blue cells indicate groups with consistent expression across some set of spatially variable genes. a Regions are characterized by different gene expression, equivalent to the groups identified by cluster analysis such as the Louvain algorithm on non-spatial data. b Regions are not entirely homogeneous, but instead differ in distribution of observed expression. c Regions have similar distributions, but differ in the spatial patterning of gene expression. d Red lines connect interacting cells. Beyond cell type indicated by gene expression, regions may be distinguished by higher-level properties such as patterns of cell-cell interactions. Performing region identification downstream of other analyses could allow for detecting variance in such properties.
Fig. 3 Illustration of techniques in extracting cell-cell interactions from ST data.
a Cell-cell interactions (CCIs) occur when transfer of a ligand from a sender cell to a receiver cell triggers a downstream response, ultimately leading to changes in gene expression in the receiver cell. b Common techniques identify co-expression of known ligand-receptor (L-R) pairs in cells that are adjacent in a spatial proximity network, and use this to mark interactions between cells. c Alternatively, some methods probabilistically capture different sources explaining variance in spatial gene expression, including terms capturing intra- and inter-cellular effects. When inter-cellular effects dominate a particular gene’s expression, this indicates cell-cell interaction. d Insights from CCI analysis of spatial data include the ability to determine the interactions of a particular cell by filtering out spurious long-range connections, and investigation of the relationships among L-R interactions, mechanistic interactions and cell proximity.
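The approach in panel b can be sketched as follows. This is a minimal illustration under stated assumptions, not the scoring used by any of the listed packages: the proximity network is taken to be a k-nearest-neighbor graph, the edge score is assumed to be the product of ligand expression in the sender and receptor expression in the receiver, and the function name `lr_edge_scores` and the toy values are hypothetical.

```python
import numpy as np
from scipy.spatial import cKDTree

def lr_edge_scores(coords, ligand, receptor, k=3):
    """Score directed sender -> receiver edges in a k-nearest-neighbor graph.

    coords:   (n_cells, 2) spatial positions (assumed distinct)
    ligand:   (n_cells,) expression of one ligand
    receptor: (n_cells,) expression of its known receptor
    Returns (senders, receivers, scores) with
    score = ligand[sender] * receptor[receiver]  (an assumed scoring rule).
    """
    coords = np.asarray(coords, dtype=float)
    ligand = np.asarray(ligand, dtype=float)
    receptor = np.asarray(receptor, dtype=float)
    tree = cKDTree(coords)
    # Query k + 1 neighbors because each cell's nearest neighbor is itself
    _, idx = tree.query(coords, k=k + 1)
    senders = np.repeat(np.arange(len(coords)), k)
    receivers = idx[:, 1:].ravel()
    scores = ligand[senders] * receptor[receivers]
    return senders, receivers, scores

# Toy input (hypothetical): four cells on a line; cell 0 expresses the ligand
# and cell 1 expresses the receptor
coords = [[0, 0], [1, 0], [3, 0], [10, 0]]
ligand = [5, 0, 0, 0]
receptor = [0, 2, 0, 0]
senders, receivers, scores = lr_edge_scores(coords, ligand, receptor, k=1)
for s, r, sc in zip(senders, receivers, scores):
    print(f"{s} -> {r}: {sc}")
```

Restricting candidate edges to spatial neighbors is what removes the spurious long-range connections mentioned in panel d: only pairs of cells that are close enough to be linked in the graph are ever scored.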
Fig. 4 Enhancing spatial transcriptomics data with scRNA-seq data.
This analysis step augments ST data using scRNA-seq data. a By mapping scRNA-seq data onto the spatial dataset, the composition of individual spots can be understood in terms of single cells. This can be done by computing enrichment scores, which measure the expression of certain gene sets (such as marker genes from a particular cell type) relative to the norm, or through deconvolution, which decomposes the overall expression data from a spatial spot into a combination of contributions from several cell types. b scRNA-seq data can be used to increase the resolution of multi-cell ST data by mapping cells to spatial locations, producing a spatial dataset at single-cell resolution. The primary choices in such methods are how similarity scores between cells and spots are computed, and how the matching is computed from the similarity matrix.
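Both panels can be illustrated with a short sketch. It is not the algorithm of any listed package. For panel a it assumes deconvolution by plain non-negative least squares against per-type mean expression signatures; for panel b it assumes Pearson correlation as the similarity score and a one-to-one linear assignment as the matching step. The function names `deconvolve_spot` and `map_cells_to_spots` and all toy values are hypothetical.

```python
import numpy as np
from scipy.optimize import nnls, linear_sum_assignment

def deconvolve_spot(signatures, spot_expr):
    """Estimate cell-type proportions for one spot (panel a).

    signatures: (n_genes, n_types) mean expression per cell type from scRNA-seq
    spot_expr:  (n_genes,) expression measured at the spot
    """
    weights, _ = nnls(np.asarray(signatures, dtype=float),
                      np.asarray(spot_expr, dtype=float))
    total = weights.sum()
    # Normalize to proportions; leave all-zero weights untouched
    return weights / total if total > 0 else weights

def map_cells_to_spots(cell_expr, spot_expr):
    """Assign each cell to a distinct spot (panel b).

    cell_expr: (n_cells, n_genes); spot_expr: (n_spots, n_genes);
    requires n_cells <= n_spots and non-constant expression rows.
    """
    n_cells = len(cell_expr)
    # np.corrcoef stacks the rows of both inputs; the off-diagonal block
    # holds the cell-versus-spot correlations
    sim = np.corrcoef(cell_expr, spot_expr)[:n_cells, n_cells:]
    rows, cols = linear_sum_assignment(sim, maximize=True)
    return dict(zip(rows.tolist(), cols.tolist()))

# Toy deconvolution (hypothetical): a spot that is 25% type A and 75% type B
signatures = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
spot = 0.25 * signatures[:, 0] + 0.75 * signatures[:, 1]
proportions = deconvolve_spot(signatures, spot)
print(proportions)

# Toy mapping (hypothetical): two cells, two spots, three genes
cells = np.array([[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]])
spots = np.array([[3.0, 1.0, 0.0], [0.0, 1.0, 3.0]])
mapping = map_cells_to_spots(cells, spots)
print(mapping)
```

Swapping the correlation for another similarity score, or the one-to-one assignment for a soft matching such as optimal transport, changes the two design choices named in the caption while leaving the overall structure the same.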