Model-based autoencoders for imputing discrete single-cell RNA-seq data.

Tian Tian, Martin Renqiang Min, Zhi Wei.

Abstract

Deep neural networks have been widely applied to missing data imputation. However, most existing studies have focused on imputing continuous data, while discrete data imputation remains under-explored. Discrete data is common in the real world, especially in bioinformatics, genetics, and biochemistry. In particular, large amounts of recent genomic data are discrete count data generated by single-cell RNA sequencing (scRNA-seq) technology. Most scRNA-seq studies produce a discrete matrix dominated by 'false' zero count observations (missing values). To make downstream analyses more effective, imputation, which recovers the missing values, is often conducted as the first step in pre-processing scRNA-seq data. In this paper, we propose a novel Zero-Inflated Negative Binomial (ZINB) model-based autoencoder for imputing discrete scRNA-seq data. The novelties of our method are twofold. First, in addition to optimizing the ZINB likelihood, we propose to explicitly model the dropout events that cause missing values by using the Gumbel-Softmax distribution. Second, the zero-inflated reconstruction is further optimized with respect to the raw count matrix. Extensive experiments on simulated datasets demonstrate that the zero-inflated reconstruction significantly improves imputation accuracy. Real data experiments show that the proposed imputation can enhance the separation of different cell types and improve the accuracy of differential expression analysis.
Copyright © 2020 Elsevier Inc. All rights reserved.
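
The abstract names two ingredients, the ZINB likelihood and the Gumbel-Softmax distribution, without spelling out their form. The following is a minimal sketch of both in plain Python, not the authors' implementation. It assumes the common mean/dispersion parameterization of the negative binomial (`mu`, `theta`) with a zero-inflation probability `pi`, and treats the dropout indicator as a two-class relaxed sample. All function and parameter names are hypothetical.

```python
import math
import random


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0 (assumed parameterization)."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_log_pmf(x, mu, theta, pi):
    """Log-probability of count x under a zero-inflated negative binomial.
    pi in [0, 1) is the probability of an excess ('false') zero."""
    if x == 0:
        # A zero arises either from zero inflation or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb_log_pmf(0, mu, theta)))
    # A positive count can only come from the NB component.
    return math.log1p(-pi) + nb_log_pmf(x, mu, theta)


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (continuous) one-hot sample over the given classes.
    Lower temperatures push the sample closer to a hard one-hot vector."""
    scores = []
    for logit in logits:
        u = max(rng.random(), 1e-12)        # guard against log(0)
        gumbel = -math.log(-math.log(u))    # standard Gumbel(0, 1) noise
        scores.append((logit + gumbel) / temperature)
    top = max(scores)                        # subtract max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    rng = random.Random(0)
    # Negative log-likelihood of a few counts under one ZINB setting.
    counts = [0, 0, 3, 1, 0, 7]
    nll = -sum(zinb_log_pmf(c, mu=2.0, theta=1.5, pi=0.3) for c in counts)
    print("ZINB negative log-likelihood:", nll)
    # Relaxed two-class indicator: [dropout, no dropout] (assumed ordering).
    print("Relaxed dropout indicator:", gumbel_softmax_sample([0.5, -0.5], 0.5, rng))
```

In an autoencoder setting, `mu`, `theta`, and `pi` would typically be per-gene, per-cell decoder outputs, and the summed negative log-likelihood would serve as the reconstruction loss. That wiring is omitted here because the abstract does not describe it.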

Keywords:  Deep learning; Imputation; scRNA-seq

Year:  2020        PMID: 32971193      PMCID: PMC8592282          DOI: 10.1016/j.ymeth.2020.09.010

Source DB:  PubMed          Journal:  Methods        ISSN: 1046-2023            Impact factor:   4.647


