
Correlation Imputation in Single cell RNA-seq using Auxiliary Information and Ensemble Learning.

Luqin Gan, Giuseppe Vinci, Genevera I Allen.

Abstract

Single cell RNA sequencing is a powerful technique that measures the gene expression of individual cells in a high-throughput fashion. However, sequencing inefficiency makes the data unreliable because of dropout events, technical artifacts in which genes erroneously appear to have zero expression. Many data imputation methods have been proposed to alleviate this issue. Yet effective imputation can be difficult and biased because the data is sparse and high-dimensional, resulting in major distortions in downstream analyses. In this paper, we propose a novel approach that imputes the gene-by-gene correlations rather than the data itself. We call this method SCENA: Single cell RNA-seq Correlation completion by ENsemble learning and Auxiliary information. The SCENA gene-by-gene correlation matrix estimate is obtained by model stacking of multiple imputed correlation matrices based on known auxiliary information about gene connections. In an extensive simulation study based on real scRNA-seq data, we demonstrate that SCENA not only accurately imputes gene correlations but also outperforms existing imputation approaches in downstream analyses such as dimension reduction, cell clustering, and graphical model estimation.
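
The general idea of completing correlations rather than the data can be illustrated with a minimal Python sketch. It is not the SCENA implementation: the abstract does not specify how the stacking weights are learned or how auxiliary information enters, so the sketch assumes fixed, user-supplied weights and treats every zero as a possible dropout. The function names `pairwise_complete_corr` and `stack_correlations` and the `min_cells` parameter are hypothetical.

```python
import numpy as np


def pairwise_complete_corr(X, min_cells=3):
    """Gene-by-gene Pearson correlation on a cells x genes matrix.

    Assumption: every zero is treated as a possible dropout, so each gene
    pair uses only the cells in which both genes are nonzero. Pairs with
    fewer than `min_cells` such cells (or zero variance) are left as NaN.
    """
    X = np.asarray(X, dtype=float)
    n_genes = X.shape[1]
    observed = X != 0
    C = np.full((n_genes, n_genes), np.nan)
    for i in range(n_genes):
        for j in range(i + 1, n_genes):
            both = observed[:, i] & observed[:, j]
            if both.sum() < min_cells:
                continue
            xi, xj = X[both, i], X[both, j]
            if xi.std() == 0 or xj.std() == 0:
                continue
            C[i, j] = C[j, i] = np.corrcoef(xi, xj)[0, 1]
    np.fill_diagonal(C, 1.0)
    return C


def stack_correlations(candidates, weights):
    """Entrywise weighted average of candidate correlation matrices.

    NaN entries are skipped, and the weights of the remaining candidates
    are renormalized per entry. Entries missing in all candidates stay NaN.
    Assumption: weights are fixed inputs, not learned as in model stacking.
    """
    stack = np.stack([np.asarray(c, dtype=float) for c in candidates])
    w = np.asarray(weights, dtype=float)[:, None, None]
    mask = ~np.isnan(stack)
    num = (np.where(mask, stack, 0.0) * w).sum(axis=0)
    den = (mask * w).sum(axis=0)
    out = np.full(num.shape, np.nan)
    np.divide(num, den, out=out, where=den > 0)
    return out
```

In this sketch, an incomplete estimate from `pairwise_complete_corr` could be passed to `stack_correlations` together with other candidate matrices (for example, ones completed using prior knowledge of gene connections), so that entries missing in one candidate are filled from the others.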

Keywords:  Auxiliary Information; Clustering; Correlation Completion; Dimension Reduction; Ensemble Learning; Graphical modeling; Imputation; Single Cell RNA-seq

Year: 2020; PMID: 34278382; PMCID: PMC8281968; DOI: 10.1145/3388440.3412462

Source DB: PubMed; Journal: ACM BCB


