DIMM-SC: a Dirichlet mixture model for clustering droplet-based single cell transcriptomic data.

Zhe Sun, Ting Wang, Ke Deng, Xiao-Feng Wang, Robert Lafyatis, Ying Ding, Ming Hu, Wei Chen.

Abstract

Motivation: Single cell transcriptome sequencing (scRNA-Seq) has become a revolutionary tool to study cellular and molecular processes at single cell resolution. Among existing technologies, the recently developed droplet-based platform enables efficient parallel processing of thousands of single cells with direct counting of transcript copies using Unique Molecular Identifiers (UMIs). Despite these technological advances, statistical methods and computational tools for analyzing droplet-based scRNA-Seq data are still lacking. In particular, model-based approaches for clustering large-scale single cell transcriptomic data remain under-explored.
Results: We developed DIMM-SC, a Dirichlet Mixture Model for clustering droplet-based Single Cell transcriptomic data. This approach explicitly models UMI count data from scRNA-Seq experiments and characterizes variations across different cell clusters via a Dirichlet mixture prior. We performed comprehensive simulations to evaluate DIMM-SC and compared it with existing clustering methods such as K-means, CellTree and Seurat. In addition, we analyzed public scRNA-Seq datasets with known cluster labels and in-house scRNA-Seq datasets from a study of systemic sclerosis with prior biological knowledge to benchmark and validate DIMM-SC. Both simulation studies and real data applications demonstrated that overall, DIMM-SC achieves substantially improved clustering accuracy and much lower clustering variability compared to other existing clustering methods. More importantly, as a model-based approach, DIMM-SC is able to quantify the clustering uncertainty for each single cell, facilitating rigorous statistical inference and biological interpretations, which are typically unavailable from existing clustering methods.
Availability and implementation: DIMM-SC has been implemented in a user-friendly R package with a detailed tutorial available on www.pitt.edu/~wec47/singlecell.html.
Contact: wei.chen@chp.edu or hum@ccf.org.
Supplementary information: Supplementary data are available at Bioinformatics online.
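
Illustration (not from the paper or its R package): the Results describe modeling each cell's UMI counts with a cluster-specific Dirichlet prior and reporting a per-cell clustering uncertainty. A minimal sketch of that idea, assuming a standard Dirichlet-multinomial mixture with known parameters, is given below. The function names, the toy alpha vectors and the mixing weights are all hypothetical, and parameter estimation (which the package performs) is deliberately left out.

```python
import math


def dirichlet_multinomial_logpmf(counts, alpha):
    """Log-probability of one cell's UMI count vector under a
    Dirichlet-multinomial with concentration vector `alpha`."""
    total = sum(counts)
    alpha_sum = sum(alpha)
    # Multinomial coefficient: log(T!) - sum_g log(x_g!)
    log_p = math.lgamma(total + 1) - sum(math.lgamma(x + 1) for x in counts)
    # Ratio of Dirichlet normalizing constants
    log_p += math.lgamma(alpha_sum) - math.lgamma(alpha_sum + total)
    log_p += sum(math.lgamma(a + x) - math.lgamma(a)
                 for a, x in zip(alpha, counts))
    return log_p


def cluster_posteriors(counts, alphas, weights):
    """Posterior probability that a cell belongs to each cluster,
    given per-cluster alpha vectors and mixing weights (assumed known)."""
    log_joint = [math.log(w) + dirichlet_multinomial_logpmf(counts, a)
                 for w, a in zip(weights, alphas)]
    # Log-sum-exp normalization for numerical stability
    peak = max(log_joint)
    unnorm = [math.exp(v - peak) for v in log_joint]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]


# Toy setup (hypothetical values): 3 genes, 2 clusters
alphas = [[8.0, 1.0, 1.0], [1.0, 1.0, 8.0]]
weights = [0.5, 0.5]
cell = [20, 3, 1]  # UMI counts for one cell
post = cluster_posteriors(cell, alphas, weights)
```

Here `post` holds one probability per cluster for the cell. A value close to 1 indicates a confident assignment, while values near 0.5 flag an ambiguous cell, which is the kind of per-cell uncertainty a model-based method can report and a hard-assignment method such as K-means cannot.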
© The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com

Year:  2018        PMID: 29036318      PMCID: PMC6454475          DOI: 10.1093/bioinformatics/btx490

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  13 in total

1.  SAME-clustering: Single-cell Aggregated Clustering via Mixture Model Ensemble.

Authors:  Ruth Huh; Yuchen Yang; Yuchao Jiang; Yin Shen; Yun Li
Journal:  Nucleic Acids Res       Date:  2020-01-10       Impact factor: 16.971

2.  Celda: a Bayesian model to perform co-clustering of genes into modules and cells into subpopulations using single-cell RNA-seq data.

Authors:  Zhe Wang; Shiyi Yang; Yusuke Koga; Sean E Corbett; Conor V Shea; W Evan Johnson; Masanao Yajima; Joshua D Campbell
Journal:  NAR Genom Bioinform       Date:  2022-09-13

3.  SECANT: a biology-guided semi-supervised method for clustering, classification, and annotation of single-cell multi-omics.

Authors:  Xinjun Wang; Zhongli Xu; Haoran Hu; Xueping Zhou; Yanfu Zhang; Robert Lafyatis; Kong Chen; Heng Huang; Ying Ding; Richard H Duerr; Wei Chen
Journal:  PNAS Nexus       Date:  2022-08-19

4.  Dirichlet process mixture models for single-cell RNA-seq clustering.

Authors:  Nigatu A Adossa; Kalle T Rytkönen; Laura L Elo
Journal:  Biol Open       Date:  2022-04-04       Impact factor: 2.422

5.  A Bayesian mixture model for clustering droplet-based single-cell transcriptomic data from population studies.

Authors:  Zhe Sun; Li Chen; Hongyi Xin; Yale Jiang; Qianhui Huang; Anthony R Cillo; Tracy Tabib; Jay K Kolls; Tullia C Bruno; Robert Lafyatis; Dario A A Vignali; Kong Chen; Ying Ding; Ming Hu; Wei Chen
Journal:  Nat Commun       Date:  2019-04-09       Impact factor: 14.919

6.  VPAC: Variational projection for accurate clustering of single-cell transcriptomic data.

Authors:  Shengquan Chen; Kui Hua; Hongfei Cui; Rui Jiang
Journal:  BMC Bioinformatics       Date:  2019-05-01       Impact factor: 3.169

7.  Semisoft clustering of single-cell data.

Authors:  Lingxue Zhu; Jing Lei; Lambertus Klei; Bernie Devlin; Kathryn Roeder
Journal:  Proc Natl Acad Sci U S A       Date:  2018-12-26       Impact factor: 11.205

8.  Single-Cell Transcriptome Data Clustering via Multinomial Modeling and Adaptive Fuzzy K-Means Algorithm.

Authors:  Liang Chen; Weinan Wang; Yuyao Zhai; Minghua Deng
Journal:  Front Genet       Date:  2020-04-17       Impact factor: 4.599

9.  Benchmark and Parameter Sensitivity Analysis of Single-Cell RNA Sequencing Clustering Methods.

Authors:  Monika Krzak; Yordan Raykov; Alexis Boukouvalas; Luisa Cutillo; Claudia Angelini
Journal:  Front Genet       Date:  2019-12-11       Impact factor: 4.599

10.  BREM-SC: a Bayesian random effects mixture model for joint clustering single cell multi-omics data.

Authors:  Xinjun Wang; Zhe Sun; Yanfu Zhang; Zhongli Xu; Hongyi Xin; Heng Huang; Richard H Duerr; Kong Chen; Ying Ding; Wei Chen
Journal:  Nucleic Acids Res       Date:  2020-06-19       Impact factor: 16.971
