Adaptive choice of the number of bootstrap samples in large scale multiple testing.

Wenge Guo, Shyamal Peddada.

Abstract

It is common practice to use resampling methods such as the bootstrap to calculate the p-value for each test when performing large scale multiple testing. The precision of the bootstrap p-values, and that of the false discovery rate (FDR), depends on the number of bootstraps used for testing each hypothesis. Clearly, the larger the number of bootstraps, the better the precision. However, the required number of bootstraps can be computationally burdensome, and the total cost is multiplied by the number of tests to be performed. Further adding to the computational challenge, in some applications the calculation of the test statistic itself may require considerable computation time. As technology improves, one can expect the dimension of the problem to increase as well. For instance, during the early days of microarray technology, the number of probes on a cDNA chip was fewer than 10,000. Now the Affymetrix chips come with over 50,000 probes per chip. Motivated by this need, we developed a simple adaptive bootstrap methodology for large scale multiple testing, which reduces the total number of bootstrap calculations while ensuring control of the FDR. The proposed algorithm results in a substantial reduction in the number of bootstrap samples. Based on a simulation study, we found that, relative to the number of bootstraps required for the Benjamini-Hochberg (BH) procedure, which is the standard FDR methodology, the proposed methodology achieved a very substantial reduction in the number of bootstraps. In some cases the new algorithm required as little as 1/6th the number of bootstraps needed by the conventional BH procedure. Thus, if the conventional BH procedure used 1,000 bootstraps, then the proposed method required only about 160 bootstraps. This methodology has been implemented for time-course/dose-response data in our software, ORIOGEN, which is available from the authors upon request.
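To make the computational trade-off concrete, the following is a minimal Python sketch, not the authors' algorithm and not ORIOGEN code. It combines a resampling p-value, the standard BH step-up rule, and a generic two-stage early-stopping heuristic in which every test receives a small pilot batch of resamples and only promising tests are topped up to the full number. The function names, the `pilot`, `full`, and `screen` parameters, and the half-normal stand-in for the null resampling step are all assumptions made for illustration. No FDR guarantee is claimed for this heuristic.

```python
import numpy as np


def bootstrap_pvalue(observed, null_stats):
    """Resampling p-value with the usual +1 correction (never exactly 0)."""
    null_stats = np.asarray(null_stats, dtype=float)
    return (1 + np.sum(null_stats >= observed)) / (1 + null_stats.size)


def bh_reject(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up rule: boolean mask of rejected hypotheses."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest i with p_(i) <= alpha * i / m
        reject[order[:k + 1]] = True
    return reject


def two_stage_pvalues(observed, draw_null, pilot=100, full=1000,
                      screen=0.2, seed=0):
    """Hypothetical two-stage scheme (illustration only).

    Every test gets `pilot` resamples; a test is topped up to `full`
    resamples only if its pilot p-value is <= `screen`.
    Returns the p-values and the total number of resamples drawn.
    """
    rng = np.random.default_rng(seed)
    pvals = np.empty(len(observed))
    total = 0
    for j, t in enumerate(observed):
        null = draw_null(rng, pilot)
        total += pilot
        p = bootstrap_pvalue(t, null)
        if p <= screen:  # promising test: spend the remaining budget
            null = np.concatenate([null, draw_null(rng, full - pilot)])
            total += full - pilot
            p = bootstrap_pvalue(t, null)
        pvals[j] = p
    return pvals, total


def draw_null(rng, n):
    """Stand-in for 'resample the data and recompute the statistic'."""
    return np.abs(rng.standard_normal(n))


# Toy data: 900 null statistics and 100 strong signals (assumed values).
data_rng = np.random.default_rng(1)
observed = np.concatenate([np.abs(data_rng.standard_normal(900)),
                           np.full(100, 5.0)])

pvals, total = two_stage_pvalues(observed, draw_null)
rejected = bh_reject(pvals, alpha=0.05)
print("resamples used:", total, "of", len(observed) * 1000)
print("hypotheses rejected by BH:", int(rejected.sum()))
```

In this toy setting most null tests stop after the pilot batch, so the total number of resamples is well below the fixed budget of 1,000 per test. The size of the saving depends on the proportion of true nulls and on the screening cutoff, both of which are arbitrary choices here.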


Year:  2008        PMID: 18384266      PMCID: PMC2752392          DOI: 10.2202/1544-6115.1360

Source DB:  PubMed          Journal:  Stat Appl Genet Mol Biol        ISSN: 1544-6115


  6 in total

1.  Analysis of Correlated Gene Expression Data on Ordered Categories.

Authors:  Shyamal D Peddada; Shawn F Harris; Ori Davidov
Journal:  J Indian Soc Agric Stat       Date:  2010

2.  Statistical properties of an early stopping rule for resampling-based multiple testing.

Authors:  Hui Jiang; Julia Salzman
Journal:  Biometrika       Date:  2012-10-03       Impact factor: 2.445

3.  Renal Cell Carcinomas in Vinylidene Chloride-exposed Male B6C3F1 Mice Are Characterized by Oxidative Stress and TP53 Pathway Dysregulation.

Authors:  Schantel A Hayes; Arun R Pandiri; Thai-vu T Ton; Hue-Hua L Hong; Natasha P Clayton; Keith R Shockley; Shyamal D Peddada; Kevin Gerrish; Michael Wyde; Robert C Sills; Mark J Hoenerhoff
Journal:  Toxicol Pathol       Date:  2015-12-17       Impact factor: 1.902

4.  Evaluation of statistical approaches for association testing in noisy drug screening data.

Authors:  Petr Smirnov; Ian Smith; Zhaleh Safikhani; Wail Ba-Alawi; Farnoosh Khodakarami; Eva Lin; Yihong Yu; Scott Martin; Janosch Ortmann; Tero Aittokallio; Marc Hafner; Benjamin Haibe-Kains
Journal:  BMC Bioinformatics       Date:  2022-05-18       Impact factor: 3.307

5.  Pooling/bootstrap-based GWAS (pbGWAS) identifies new loci modifying the age of onset in PSEN1 p.Glu280Ala Alzheimer's disease.

Authors:  J I Vélez; S C Chandrasekharappa; E Henao; A F Martinez; U Harper; M Jones; B D Solomon; L Lopez; G Garcia; D C Aguirre-Acevedo; N Acosta-Baena; J C Correa; C M Lopera-Gómez; M C Jaramillo-Elorza; D Rivera; K S Kosik; N J Schork; J M Swanson; F Lopera; M Arcos-Burgos
Journal:  Mol Psychiatry       Date:  2012-06-19       Impact factor: 15.992

6.  Large-scale multiple testing in genome-wide association studies via region-specific hidden Markov models.

Authors:  Jian Xiao; Wensheng Zhu; Jianhua Guo
Journal:  BMC Bioinformatics       Date:  2013-09-25       Impact factor: 3.169
