On correcting the overestimation of the permutation-based false discovery rate estimator.

Shuo Jiao, Shunpu Zhang.

Abstract

MOTIVATION: Recent attempts to account for multiple testing in the analysis of microarray data have focused on controlling the false discovery rate (FDR), defined as the expected proportion of false positives among the genes claimed to be significant. Controlling the FDR correctly therefore depends on the accuracy of the FDR estimator. Xie et al. found that the standard permutation method of estimating the FDR is biased, and for one-sample comparisons they proposed deleting the predicted differentially expressed (DE) genes before estimating the FDR. However, we note that the FDR formula used in their paper is incorrect, which makes the comparison results they report unconvincing. Their method has two further problems: the FDR estimate is biased by over- or under-deletion of DE genes, and it implicitly relies on an unreasonable estimator of the true proportion of equivalently expressed (EE) genes. Because accurate FDR estimation is so important in microarray data analysis, it is necessary to point out these problems and to propose improved methods.
RESULTS: Our results confirm that the standard permutation method overestimates the FDR. Using the correct FDR formula, we show that the method of Xie et al. always gives a biased estimate of the FDR: it overestimates when the number of claimed significant genes is small and underestimates when that number is large. To overcome these problems, we propose two modifications. Simulation results show that our estimator is more accurate.
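
For orientation, the kind of standard permutation estimator discussed above can be sketched as follows for a one-sample comparison. This is a minimal illustration and not the authors' code or the method of Xie et al.: the function name `permutation_fdr`, the use of a one-sample t-statistic, the sign-flipping permutation scheme and the `pi0` argument (an assumed proportion of EE genes, set to 1 by default) are all assumptions made for this example.

```python
import numpy as np

def permutation_fdr(data, threshold, n_perm=200, pi0=1.0, seed=0):
    """Sketch of a standard permutation FDR estimate (one-sample case).

    data      : (genes, replicates) array, e.g. log-ratios (assumed layout)
    threshold : genes with |t| >= threshold are claimed significant
    pi0       : assumed proportion of EE genes (hypothetical parameter)
    """
    rng = np.random.default_rng(seed)
    n_genes, n_rep = data.shape

    def tstat(x):
        # One-sample t-statistic per gene (row).
        return x.mean(axis=1) / (x.std(axis=1, ddof=1) / np.sqrt(n_rep))

    # Number of genes claimed significant in the observed data.
    n_called = int((np.abs(tstat(data)) >= threshold).sum())

    # Null counts: flip the sign of whole replicates (columns) at random
    # and count how many genes exceed the same threshold.
    null_counts = np.empty(n_perm)
    for b in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=n_rep)
        null_counts[b] = (np.abs(tstat(data * signs)) >= threshold).sum()

    # Estimated false positives divided by the number of claimed genes.
    expected_fp = pi0 * null_counts.mean()
    fdr = min(1.0, expected_fp / max(n_called, 1))
    return fdr, n_called


if __name__ == "__main__":
    # Simulated data: 500 genes, 6 replicates, the first 50 genes shifted.
    rng = np.random.default_rng(1)
    sim = rng.normal(size=(500, 6))
    sim[:50] += 2.0
    print(permutation_fdr(sim, threshold=3.0, n_perm=100))
```

Note that every gene, including the truly DE ones, enters the permuted data sets here. A deletion-based variant in the spirit of Xie et al. would drop the predicted DE genes before computing the null counts; that step is not implemented in this sketch.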

Year:  2008        PMID: 18573796      PMCID: PMC2638866          DOI: 10.1093/bioinformatics/btn310

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  14 in total

1.  Analysis of variance for gene expression microarray data.

Authors:  M K Kerr; M Martin; G A Churchill
Journal:  J Comput Biol       Date:  2000       Impact factor: 1.479

2.  On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles.

Authors:  C M Kendziorski; M A Newton; H Lan; M N Gould
Journal:  Stat Med       Date:  2003-12-30       Impact factor: 2.373

3.  On the use of permutation in and the performance of a class of nonparametric methods to detect differential gene expression.

Authors:  Wei Pan
Journal:  Bioinformatics       Date:  2003-07-22       Impact factor: 6.937

4.  A mixture model approach to detecting differentially expressed genes with microarray data.

Authors:  Wei Pan; Jizhen Lin; Chap T Le
Journal:  Funct Integr Genomics       Date:  2003-07-01       Impact factor: 3.410

5.  Statistical significance for genomewide studies.

Authors:  John D Storey; Robert Tibshirani
Journal:  Proc Natl Acad Sci U S A       Date:  2003-07-25       Impact factor: 11.205

6.  A note on using permutation-based false discovery rate estimates to compare different analysis methods for microarray data.

Authors:  Yang Xie; Wei Pan; Arkady B Khodursky
Journal:  Bioinformatics       Date:  2005-09-27       Impact factor: 6.937

7.  Using weighted permutation scores to detect differential gene expression with microarray data.

Authors:  Xu Guo; Wei Pan
Journal:  J Bioinform Comput Biol       Date:  2005-08       Impact factor: 1.122

8.  Linear models and empirical bayes methods for assessing differential expression in microarray experiments.

Authors:  Gordon K Smyth
Journal:  Stat Appl Genet Mol Biol       Date:  2004-02-12

9.  An improved nonparametric approach for detecting differentially expressed genes with replicated microarray data.

Authors:  Shunpu Zhang
Journal:  Stat Appl Genet Mol Biol       Date:  2007-01-02

10.  Evolutionary genomics of ecological specialization.

Authors:  Shaobin Zhong; Arkady Khodursky; Daniel E Dykhuizen; Antony M Dean
Journal:  Proc Natl Acad Sci U S A       Date:  2004-08-02       Impact factor: 11.205

  4 in total

1.  Bayesian hierarchical modeling and selection of differentially expressed genes for the EST data.

Authors:  Fang Yu; Ming-Hui Chen; Lynn Kuo; Peng Huang; Wanling Yang
Journal:  Biometrics       Date:  2011-03       Impact factor: 2.571

2.  Genomic regions identified by overlapping clusters of nominally-positive SNPs from genome-wide studies of alcohol and illegal substance dependence.

Authors:  Catherine Johnson; Tomas Drgon; Donna Walther; George R Uhl
Journal:  PLoS One       Date:  2011-07-27       Impact factor: 3.240

3.  MAP: model-based analysis of proteomic data to detect proteins with significant abundance changes.

Authors:  Mushan Li; Shiqi Tu; Zijia Li; Fengxiang Tan; Jian Liu; Qian Wang; Yuannyu Zhang; Jian Xu; Yijing Zhang; Feng Zhou; Zhen Shao
Journal:  Cell Discov       Date:  2019-08-13       Impact factor: 10.849

4.  Bioinformatics identification of lncRNA biomarkers associated with the progression of esophageal squamous cell carcinoma.

Authors:  Jun Yu; Xiaoliu Wu; Kaidan Huang; Ming Zhu; Xiaomei Zhang; Yuanying Zhang; Senqing Chen; Xinyu Xu; Qin Zhang
Journal:  Mol Med Rep       Date:  2019-05-02       Impact factor: 2.952
