Assessing reproducibility of high-throughput experiments in the case of missing data.

Roopali Singh, Feipeng Zhang, Qunhua Li.

Abstract

High-throughput experiments are an essential part of modern biological and biomedical research. Their outcomes often contain many missing observations because signals fall below detection levels. For example, most single-cell RNA-seq (scRNA-seq) protocols experience high levels of dropout due to the small amount of starting material, so that a majority of reported expression levels are zero. Although missing data carry information about reproducibility, they are often excluded from reproducibility assessments, which can make those assessments misleading. In this article, we develop a regression model to assess how the reproducibility of high-throughput experiments is affected by the choice of operational factors (e.g., platform or sequencing depth) when a large number of measurements are missing. Using a latent variable approach, we extend correspondence curve regression, a recently proposed method for assessing the effects of operational factors on reproducibility, to incorporate missing values. Using simulations, we show that our method is more accurate in detecting differences in reproducibility than existing measures of reproducibility. We illustrate the usefulness of our method using a single-cell RNA-seq dataset collected on HCT116 cells. We compare the reproducibility of different library preparation platforms and study the effect of sequencing depth on reproducibility, thereby determining the cost-effective sequencing depth required to achieve sufficient reproducibility.
© 2022 John Wiley & Sons Ltd.
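
To make the idea of rank-based reproducibility with missing values concrete, here is a minimal Python sketch of an empirical correspondence curve for two replicates. It is an illustration under stated assumptions, not the authors' latent-variable regression model: the function name `correspondence_curve`, the use of NaN to mark missing values, and the rule that missing values are ranked below every observed value and never count as "top" are all choices made for this example.

```python
import numpy as np

def correspondence_curve(x, y, thresholds):
    """Fraction of candidates ranked in the top t of BOTH replicates.

    Hypothetical helper for illustration. NaN marks a missing value;
    missing values are placed at the bottom of the ranking and are
    never counted as being in the top t.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    def descending_ranks(v):
        # Missing values become -inf so they sort last (rank n is worst).
        filled = np.where(np.isnan(v), -np.inf, v)
        order = np.argsort(-filled, kind="stable")
        ranks = np.empty(n, dtype=int)
        ranks[order] = np.arange(1, n + 1)
        return ranks

    rx, ry = descending_ranks(x), descending_ranks(y)
    observed_x, observed_y = ~np.isnan(x), ~np.isnan(y)

    curve = []
    for t in thresholds:
        cutoff = t * n + 1e-9  # small tolerance for floating-point products
        top_x = (rx <= cutoff) & observed_x
        top_y = (ry <= cutoff) & observed_y
        curve.append(np.mean(top_x & top_y))
    return np.array(curve)

# Toy replicates: five candidates, some missing in one or both replicates.
rep1 = [5.0, 4.0, 3.0, np.nan, np.nan]
rep2 = [5.0, 3.0, 4.0, np.nan, 1.0]
psi = correspondence_curve(rep1, rep2, [0.2, 0.4, 0.6, 1.0])
```

For perfectly reproducible replicates without missing values the curve follows the diagonal; in this sketch, candidates missing in either replicate cap the curve below 1, so missingness lowers the assessed reproducibility instead of being silently dropped.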

Keywords:  correspondence curve; high throughput experiments; missing data; reproducibility; sequencing depth; single cell RNA-seq

Year: 2022; PMID: 35178743; PMCID: PMC9039958; DOI: 10.1002/sim.9334

Source DB: PubMed; Journal: Stat Med; ISSN: 0277-6715; Impact factor: 2.497

