Philipp Angerer, David S Fischer, Fabian J Theis, Antonio Scialdone, Carsten Marr.
Abstract
MOTIVATION: Dimensionality reduction is a key step in the analysis of single-cell RNA-sequencing data. It produces a low-dimensional embedding for visualization and as a calculation base for downstream analysis. Nonlinear techniques are most suitable to handle the intrinsic complexity of large, heterogeneous single-cell data. However, with no linear relation between gene and embedding coordinate, there is no way to extract the identity of genes driving any cell's position in the low-dimensional embedding, making it difficult to characterize the underlying biological processes.
Year: 2020 PMID: 32207520 PMCID: PMC7520047 DOI: 10.1093/bioinformatics/btaa198
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. The gene relevance concept. (a) A gene expression matrix from a scRNA-seq experiment is (b) reduced to a low-dimensional embedding s, with each dot representing a cell and the color representing the expression x of a given gene in cell c. (c) Expression changes are calculated from estimates of partial derivatives with respect to the embedding, which results in one value per cell×gene×dimension combination. (d) We score the relevance of each gene in each cell by the Euclidean norm of its partial derivatives. This score indicates how relevant each gene is within its neighborhood. (e) For local gene relevance scores, we subdivide the embedding into bins and determine, for each bin, the fraction of cells in which a given gene is among the most relevant genes (e.g. the top 10; indicated by ‘%top rank’ in the figure legend). We color bins of the embedded cells according to their local gene relevance score, and fade them according to the number of cells they contain (indicated by ‘#cells’ in the legend). (f) For the global gene relevance score, we compute the same fraction over all cells instead of per bin. In our illustrative example, Gene B has been ranked among the top 10 genes in 5.4% of all cells. (g) A gene relevance map indicates all cells where a given gene has the largest norm of partial derivatives (with or without a smoothing step; see Section 2). Such cells mark the areas where that gene has high local relevance.
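The scoring steps in panels (c), (d) and (f) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the caption does not say how the partial derivatives are estimated, so the sketch assumes finite differences to the k nearest neighbors in the embedding, aggregated by the median. The function names, the parameters `k` and `n_top`, and the toy data are all hypothetical. The local score of panel (e) would be the same fraction computed over the cells of one bin only.

```python
import math
from statistics import median

def partial_derivatives(expr, emb, k=2):
    """Estimate d(expression)/d(embedding) per cell, gene and dimension.

    expr[c][g] is the expression of gene g in cell c, emb[c][d] the
    embedding coordinate of cell c in dimension d.
    ASSUMPTION: finite differences to the k nearest neighbors, median-aggregated.
    """
    n_cells, n_genes, n_dims = len(expr), len(expr[0]), len(emb[0])
    derivs = []
    for c in range(n_cells):
        # k nearest neighbors of cell c in the embedding, excluding c itself
        neighbors = sorted((j for j in range(n_cells) if j != c),
                           key=lambda j: math.dist(emb[c], emb[j]))[:k]
        cell = []
        for g in range(n_genes):
            per_dim = []
            for d in range(n_dims):
                # difference quotients; skip neighbors with no change along d
                quotients = [(expr[j][g] - expr[c][g]) / (emb[j][d] - emb[c][d])
                             for j in neighbors if emb[j][d] != emb[c][d]]
                per_dim.append(median(quotients) if quotients else 0.0)
            cell.append(per_dim)
        derivs.append(cell)
    return derivs

def relevance_norms(derivs):
    """Euclidean norm of the partial derivatives for every cell and gene."""
    return [[math.sqrt(sum(v * v for v in gene)) for gene in cell]
            for cell in derivs]

def global_relevance(norms, n_top=1):
    """Fraction of cells in which each gene ranks among the n_top largest norms."""
    n_cells, n_genes = len(norms), len(norms[0])
    counts = [0] * n_genes
    for cell in norms:
        ranked = sorted(range(n_genes), key=lambda g: cell[g], reverse=True)
        for g in ranked[:n_top]:
            counts[g] += 1
    return [count / n_cells for count in counts]

# Toy data (hypothetical): gene 0 grows with the first embedding
# coordinate, gene 1 is constant across all cells.
embedding = [[0.0, 0.0], [1.0, 0.5], [2.0, 0.0], [3.0, 0.5], [4.0, 0.0]]
expression = [[2.0 * s[0], 1.0] for s in embedding]

norms = relevance_norms(partial_derivatives(expression, embedding, k=2))
scores = global_relevance(norms, n_top=1)
```

In this toy example the constant gene has a zero norm in every cell, so the varying gene takes the top rank everywhere. The median is used here only as one robust way to combine neighbor-wise difference quotients; other estimators would fit the caption's description equally well.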
Fig. 2. Gene relevance automatically detects drivers of embryonic blood development. (a) Diffusion map of 271 single hematopoietic progenitor cells from mostly Day 7.5 and 7.75 mouse embryos, profiled in Scialdone . (b) Global gene relevance identifies Hbb-bh1 and Hba-x as the genes that change most dramatically during hematopoietic development. (c) A gene relevance map shows which relevant genes contribute in specific regions of the process, together with the code used to create it. The genes corresponding to each color are shown in panel (b). (d) A local gene relevance plot details the areas where the contribution of genes is highest. Alox5ap shows a high local relevance in the top region of the diffusion map and has been implicated in early blood development Ibarra-Soria