
Probabilistic count matrix factorization for single cell expression data analysis.

Ghislain Durif, Laurent Modolo, Jeff E Mold, Sophie Lambert-Lacroix, Franck Picard.

Abstract

MOTIVATION: The development of high-throughput single-cell sequencing technologies now allows the investigation of the population diversity of cellular transcriptomes. The expression dynamics (gene-to-gene variability) can be quantified more accurately, thanks to the measurement of lowly expressed genes. In addition, the cell-to-cell variability is high, with a low proportion of cells expressing the same genes at the same time and level. These emerging patterns are very challenging from a statistical point of view, especially when the goal is a summarized view of single-cell expression data. Principal component analysis (PCA) is one of the most powerful tools for high-dimensional data representation: it searches for latent directions that capture the most variability in the data. Unfortunately, classical PCA is based on Euclidean distances and projections, which work poorly on over-dispersed count data with dropout events, such as single-cell expression data.
RESULTS: We propose a probabilistic Count Matrix Factorization (pCMF) approach for single-cell expression data analysis that relies on a sparse Gamma-Poisson factor model. This hierarchical model is inferred using a variational EM algorithm. It jointly builds a low-dimensional representation of cells and genes. We show how this probabilistic framework induces a geometry that is suitable for single-cell data visualization and produces a compression of the data that is very powerful for clustering purposes. We compare our method against other standard representation methods, such as t-SNE, and illustrate its performance for the representation of single-cell expression data.
AVAILABILITY AND IMPLEMENTATION: Our work is implemented in the pCMF R-package (https://github.com/gdurif/pCMF).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
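
As a rough illustration of the kind of data the abstract describes, the following Python sketch simulates counts from a toy Gamma-Poisson factor model with dropout. It is not the pCMF implementation and performs no variational EM inference; the function name `simulate_gamma_poisson_counts`, the Gamma hyperparameters, the number of factors, and the independent-dropout mechanism are all assumptions chosen for illustration only.

```python
import numpy as np


def simulate_gamma_poisson_counts(n_cells=200, n_genes=50, n_factors=2,
                                  dropout_prob=0.3, seed=0):
    """Draw a count matrix from a toy Gamma-Poisson factor model.

    All hyperparameters here are illustrative assumptions, not values
    taken from the pCMF paper or package.
    """
    rng = np.random.default_rng(seed)

    # Non-negative latent factors: one row per cell (U) and per gene (V).
    U = rng.gamma(shape=2.0, scale=1.0, size=(n_cells, n_factors))
    V = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_factors))

    # Poisson rates are given by the low-rank product U V^T.
    rates = U @ V.T
    counts = rng.poisson(rates)

    # Assumed dropout mechanism: each entry is independently set to zero
    # with probability dropout_prob.
    keep = rng.random(size=counts.shape) >= dropout_prob
    return U, V, counts * keep


U, V, X = simulate_gamma_poisson_counts()
print("count matrix shape:", X.shape)
print("fraction of zeros:", float((X == 0).mean()))
```

The rows of `U` play the role of a low-dimensional representation of cells and the rows of `V` that of genes, which is the joint representation the abstract refers to. A matrix simulated this way could serve as a test input when comparing PCA with count-based factorizations.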

Year:  2019        PMID: 30865271     DOI: 10.1093/bioinformatics/btz177

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  9 in total

1.  Alignment and integration of spatial transcriptomics data.

Authors:  Ron Zeira; Max Land; Alexander Strzalkowski; Benjamin J Raphael
Journal:  Nat Methods       Date:  2022-05-16       Impact factor: 47.990

2.  Elucidating transcriptomic profiles from single-cell RNA sequencing data using nature-inspired compressed sensing.

Authors:  Zhuohan Yu; Chuang Bian; Genggeng Liu; Shixiong Zhang; Ka-Chun Wong; Xiangtao Li
Journal:  Brief Bioinform       Date:  2021-09-02       Impact factor: 11.622

3.  Algorithmic approaches to clonal reconstruction in heterogeneous cell populations.

Authors:  Wazim Mohammed Ismail; Etienne Nzabarushimana; Haixu Tang
Journal:  Quant Biol       Date:  2019-12-07

4.  A fast and efficient count-based matrix factorization method for detecting cell types from single-cell RNAseq data.

Authors:  Shiquan Sun; Yabo Chen; Yang Liu; Xuequn Shang
Journal:  BMC Syst Biol       Date:  2019-04-05

Review 5.  Eleven grand challenges in single-cell data science.

Authors:  David Lähnemann; Johannes Köster; Ewa Szczurek; Davis J McCarthy; Stephanie C Hicks; Mark D Robinson; Catalina A Vallejos; Kieran R Campbell; Niko Beerenwinkel; Ahmed Mahfouz; Luca Pinello; Pavel Skums; Alexandros Stamatakis; Camille Stephan-Otto Attolini; Samuel Aparicio; Jasmijn Baaijens; Marleen Balvert; Buys de Barbanson; Antonio Cappuccio; Giacomo Corleone; Bas E Dutilh; Maria Florescu; Victor Guryev; Rens Holmer; Katharina Jahn; Thamar Jessurun Lobo; Emma M Keizer; Indu Khatri; Szymon M Kielbasa; Jan O Korbel; Alexey M Kozlov; Tzu-Hao Kuo; Boudewijn P F Lelieveldt; Ion I Mandoiu; John C Marioni; Tobias Marschall; Felix Mölder; Amir Niknejad; Lukasz Raczkowski; Marcel Reinders; Jeroen de Ridder; Antoine-Emmanuel Saliba; Antonios Somarakis; Oliver Stegle; Fabian J Theis; Huan Yang; Alex Zelikovsky; Alice C McHardy; Benjamin J Raphael; Sohrab P Shah; Alexander Schönhuth
Journal:  Genome Biol       Date:  2020-02-07       Impact factor: 13.583

6.  netNMF-sc: leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression analysis.

Authors:  Rebecca Elyanow; Bianca Dumitrascu; Barbara E Engelhardt; Benjamin J Raphael
Journal:  Genome Res       Date:  2020-01-28       Impact factor: 9.043

7.  Exponential-Family Embedding With Application to Cell Developmental Trajectories for Single-Cell RNA-Seq Data.

Authors:  Kevin Z Lin; Jing Lei; Kathryn Roeder
Journal:  J Am Stat Assoc       Date:  2021-02-08       Impact factor: 5.033

8.  scPNMF: sparse gene encoding of single cells to facilitate gene selection for targeted gene profiling.

Authors:  Dongyuan Song; Kexin Li; Zachary Hemminger; Roy Wollman; Jingyi Jessica Li
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937

9.  Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis.

Authors:  Shiquan Sun; Jiaqiang Zhu; Ying Ma; Xiang Zhou
Journal:  Genome Biol       Date:  2019-12-10       Impact factor: 13.583
