Empirical Bayes shrinkage and false discovery rate estimation, allowing for unwanted variation.

David Gerard, Matthew Stephens.

Abstract

We combine two important ideas in the analysis of large-scale genomics experiments (e.g. experiments that aim to identify genes that are differentially expressed between two conditions). The first is use of Empirical Bayes (EB) methods to handle the large number of potentially-sparse effects, and estimate false discovery rates and related quantities. The second is use of factor analysis methods to deal with sources of unwanted variation such as batch effects and unmeasured confounders. We describe a simple modular fitting procedure that combines key ideas from both these lines of research. This yields new, powerful EB methods for analyzing genomics experiments that account for both sparse effects and unwanted variation. In realistic simulations, these new methods provide significant gains in power and calibration over competing methods. In real data analysis, we find that different methods, while often conceptually similar, can vary widely in their assessments of statistical significance. This highlights the need for care in both choice of methods and interpretation of results.
© The Author 2018. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Keywords:  Batch effects; Empirical Bayes; RNA-seq; Surrogate variable analysis; Unobserved confounding; Unwanted variation

Year:  2020        PMID: 29985984      PMCID: PMC8204175          DOI: 10.1093/biostatistics/kxy029

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


