Missing data and technical variability in single-cell RNA-sequencing experiments.

Stephanie C Hicks, F William Townes, Mingxiang Teng, Rafael A Irizarry.

Abstract

Until recently, high-throughput gene expression technology, such as RNA-sequencing (RNA-seq), required hundreds of thousands of cells to produce reliable measurements. Recent technical advances permit genome-wide gene expression measurement at the single-cell level. Single-cell RNA-seq (scRNA-seq) is the most widely used of these technologies, and numerous publications are based on data produced with it. However, RNA-seq and scRNA-seq data are markedly different. In particular, unlike RNA-seq, the majority of reported expression levels in scRNA-seq are zeros, which could be either biologically driven (genes not expressing RNA at the time of measurement) or technically driven (genes expressing RNA, but not at a sufficient level to be detected by the sequencing technology). Another difference is that the proportion of genes with a reported expression level of zero varies substantially more across single cells than across RNA-seq samples. However, it remains unclear to what extent this cell-to-cell variation is driven by technical rather than biological variation. Furthermore, while systematic errors, including batch effects, have been widely reported as a major challenge in high-throughput technologies, these issues have received minimal attention in published studies based on scRNA-seq technology. Here, we use an assessment experiment to examine data from published studies and demonstrate that systematic errors can explain a substantial percentage of observed cell-to-cell expression variability. Specifically, we present evidence that some of these reported zeros are driven by technical variation by demonstrating that scRNA-seq produces more zeros than expected and that this bias is greater for lower expressed genes. In addition, this missing data problem is exacerbated by the fact that this technical variation varies from cell to cell. Then, we show how this technical cell-to-cell variability can be confused with novel biological results. Finally, we demonstrate and discuss how batch effects and confounded experiments can intensify the problem.
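
The quantity the abstract keeps returning to, the proportion of genes reported as zero in each cell, can be computed directly from a count matrix. The following is a minimal sketch and not the authors' code: the function names, the genes-by-cells layout, and the toy matrix are all assumptions made for illustration.

```python
def zero_fraction_per_cell(counts):
    """Fraction of genes with a zero count in each cell.

    `counts` is assumed to be a list of rows, one row per gene,
    with one column per cell (a hypothetical layout).
    """
    n_genes = len(counts)
    n_cells = len(counts[0])
    return [
        sum(1 for g in range(n_genes) if counts[g][c] == 0) / n_genes
        for c in range(n_cells)
    ]


def zero_fraction_per_gene(counts):
    """Fraction of cells in which each gene has a zero count."""
    return [sum(1 for x in row if x == 0) / len(row) for row in counts]


# Toy data (invented for illustration): 4 genes x 3 cells.
toy_counts = [
    [0, 5, 0],
    [3, 0, 0],
    [0, 0, 0],
    [7, 2, 1],
]

cell_zero_fraction = zero_fraction_per_cell(toy_counts)
gene_zero_fraction = zero_fraction_per_gene(toy_counts)
```

On real data, cells whose zero fraction differs sharply from the rest are candidates for closer inspection, although the zero fraction alone cannot say whether a difference is technical or biological.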

Year:  2018        PMID: 29121214      PMCID: PMC6215955          DOI: 10.1093/biostatistics/kxx053

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


  67 in total

1.  Counting absolute numbers of molecules using unique molecular identifiers.

Authors:  Teemu Kivioja; Anna Vähärautio; Kasper Karlsson; Martin Bonke; Martin Enge; Sten Linnarsson; Jussi Taipale
Journal:  Nat Methods       Date:  2011-11-20       Impact factor: 28.547

Review 2.  Design and Analysis of Single-Cell Sequencing Experiments.

Authors:  Dominic Grün; Alexander van Oudenaarden
Journal:  Cell       Date:  2015-11-05       Impact factor: 41.582

3.  The reduction of gene expression variability from single cells to populations follows simple statistical laws.

Authors:  Vincent Piras; Kumar Selvarajoo
Journal:  Genomics       Date:  2014-12-29       Impact factor: 5.736

4.  Unbiased classification of sensory neuron types by large-scale single-cell RNA sequencing.

Authors:  Dmitry Usoskin; Alessandro Furlan; Saiful Islam; Hind Abdo; Peter Lönnerberg; Daohua Lou; Jens Hjerling-Leffler; Jesper Haeggström; Olga Kharchenko; Peter V Kharchenko; Sten Linnarsson; Patrik Ernfors
Journal:  Nat Neurosci       Date:  2014-11-24       Impact factor: 24.884

Review 5.  Computational and analytical challenges in single-cell transcriptomics.

Authors:  Oliver Stegle; Sarah A Teichmann; John C Marioni
Journal:  Nat Rev Genet       Date:  2015-01-28       Impact factor: 53.242

6.  CEL-Seq: single-cell RNA-Seq by multiplexed linear amplification.

Authors:  Tamar Hashimshony; Florian Wagner; Noa Sher; Itai Yanai
Journal:  Cell Rep       Date:  2012-08-30       Impact factor: 9.423

7.  Near-optimal probabilistic RNA-seq quantification.

Authors:  Nicolas L Bray; Harold Pimentel; Páll Melsted; Lior Pachter
Journal:  Nat Biotechnol       Date:  2016-04-04       Impact factor: 54.908

Review 8.  Tackling the widespread and critical impact of batch effects in high-throughput data.

Authors:  Jeffrey T Leek; Robert B Scharpf; Héctor Corrada Bravo; David Simcha; Benjamin Langmead; W Evan Johnson; Donald Geman; Keith Baggerly; Rafael A Irizarry
Journal:  Nat Rev Genet       Date:  2010-09-14       Impact factor: 53.242

9.  Batch effects and the effective design of single-cell gene expression studies.

Authors:  Po-Yuan Tung; John D Blischak; Chiaowen Joyce Hsiao; David A Knowles; Jonathan E Burnett; Jonathan K Pritchard; Yoav Gilad
Journal:  Sci Rep       Date:  2017-01-03       Impact factor: 4.379

10.  MAST: a flexible statistical framework for assessing transcriptional changes and characterizing heterogeneity in single-cell RNA sequencing data.

Authors:  Greg Finak; Andrew McDavid; Masanao Yajima; Jingyuan Deng; Vivian Gersuk; Alex K Shalek; Chloe K Slichter; Hannah W Miller; M Juliana McElrath; Martin Prlic; Peter S Linsley; Raphael Gottardo
Journal:  Genome Biol       Date:  2015-12-10       Impact factor: 13.583

  125 in total

1.  Gene regulatory network reconstruction using single-cell RNA sequencing of barcoded genotypes in diverse environments.

Authors:  Christopher A Jackson; Dayanne M Castro; Richard Bonneau; David Gresham; Giuseppe-Antonio Saldi
Journal:  Elife       Date:  2020-01-27       Impact factor: 8.140

Review 2.  Tutorial: guidelines for annotating single-cell transcriptomic maps using automated and manual methods.

Authors:  Zoe A Clarke; Tallulah S Andrews; Jawairia Atif; Delaram Pouyabahar; Brendan T Innes; Sonya A MacParland; Gary D Bader
Journal:  Nat Protoc       Date:  2021-05-24       Impact factor: 13.491

Review 3.  Single cell RNA-sequencing: replicability of cell types.

Authors:  Megan Crow; Jesse Gillis
Journal:  Curr Opin Neurobiol       Date:  2019-01-09       Impact factor: 6.627

Review 4.  Single-Cell RNA Sequencing: A New Window into Cell Scale Dynamics.

Authors:  Sabyasachi Dasgupta; Gary D Bader; Sidhartha Goyal
Journal:  Biophys J       Date:  2018-07-11       Impact factor: 4.033

5.  Latent cellular analysis robustly reveals subtle diversity in large-scale single-cell RNA-seq data.

Authors:  Changde Cheng; John Easton; Celeste Rosencrance; Yan Li; Bensheng Ju; Justin Williams; Heather L Mulder; Yakun Pang; Wenan Chen; Xiang Chen
Journal:  Nucleic Acids Res       Date:  2019-12-16       Impact factor: 16.971

Review 6.  Transformative Opportunities for Single-Cell Proteomics.

Authors:  Harrison Specht; Nikolai Slavov
Journal:  J Proteome Res       Date:  2018-07-19       Impact factor: 4.466

7.  A comprehensive survey of regulatory network inference methods using single-cell RNA sequencing data.

Authors:  Hung Nguyen; Duc Tran; Bang Tran; Bahadir Pehlivan; Tin Nguyen
Journal:  Brief Bioinform       Date:  2020-09-16       Impact factor: 11.622

8.  Performance Assessment and Selection of Normalization Procedures for Single-Cell RNA-Seq.

Authors:  Michael B Cole; Davide Risso; Allon Wagner; David DeTomaso; John Ngai; Elizabeth Purdom; Sandrine Dudoit; Nir Yosef
Journal:  Cell Syst       Date:  2019-04-24       Impact factor: 10.304

Review 9.  Single cell protein analysis for systems biology.

Authors:  Ezra Levy; Nikolai Slavov
Journal:  Essays Biochem       Date:  2018-10-26       Impact factor: 8.000

Review 10.  Co-expression in Single-Cell Analysis: Saving Grace or Original Sin?

Authors:  Megan Crow; Jesse Gillis
Journal:  Trends Genet       Date:  2018-08-23       Impact factor: 11.639
