Koji Kadota (1), Kentaro Shimizu. (1) Agricultural Bioinformatics Research Unit, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Yayoi, Bunkyo-ku, Japan. kadota@bi.a.u-tokyo.ac.jp
Abstract
BACKGROUND: Statistical methods for ranking differentially expressed genes (DEGs) from gene expression data should be evaluated for sensitivity, specificity, and reproducibility. In our previous studies, we evaluated eight gene ranking methods, but only on Affymetrix GeneChip data. A more general evaluation that also includes other microarray platforms, such as the Agilent or Illumina systems, is desirable for determining which methods suit each platform and which method has the best inter-platform reproducibility.

RESULTS: We compared the eight gene ranking methods using the MicroArray Quality Control (MAQC) datasets produced by five manufacturers: Affymetrix, Applied Biosystems, Agilent, GE Healthcare, and Illumina. The area under the receiver operating characteristic curve (AUC) was used as a combined measure of sensitivity and specificity. Although the method with the highest AUC value can vary with the definition of "true" DEGs, the best methods were, in most cases, the weighted average difference (WAD), rank products (RP), or the intensity-based moderated t statistic (ibmT). The percentage of overlapping genes (POG) across different test sites was the main measure of both intra- and inter-platform reproducibility. The POG values for WAD were the highest overall, irrespective of the choice of microarray platform. The high intra- and inter-platform reproducibility of WAD was also observed at the higher level of biological function.

CONCLUSION: These results for the five microarray platforms were consistent with our previous ones based on 36 real experimental datasets measured using the Affymetrix platform. Thus, recommendations made using the MAQC benchmark data might be universally applicable.
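The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes that POG is computed on the top-n genes of two ranked lists and that AUC is the usual rank-based (Mann-Whitney) estimate; the abstract itself does not spell out either formula. The function names `pog` and `auc`, the gene identifiers, and the toy scores are hypothetical and are not taken from the study.

```python
def pog(ranking_a, ranking_b, top_n):
    """Percentage of overlapping genes among the top_n entries of two ranked lists.

    Assumed definition: 100 * |top_n(A) & top_n(B)| / top_n.
    """
    if top_n <= 0:
        raise ValueError("top_n must be positive")
    top_a = set(ranking_a[:top_n])
    top_b = set(ranking_b[:top_n])
    return 100.0 * len(top_a & top_b) / top_n


def auc(scores, is_true_deg):
    """Rank-based AUC: the probability that a randomly chosen true DEG scores
    higher than a randomly chosen non-DEG, counting ties as one half."""
    pos = [s for s, t in zip(scores, is_true_deg) if t]
    neg = [s for s, t in zip(scores, is_true_deg) if not t]
    if not pos or not neg:
        raise ValueError("need at least one true DEG and one non-DEG")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy gene rankings from two hypothetical test sites (most significant first).
site1 = ["g1", "g2", "g3", "g4", "g5"]
site2 = ["g2", "g1", "g5", "g3", "g4"]
print(pog(site1, site2, top_n=3))

# Toy ranking scores (higher = stronger evidence) with assumed "true" DEG labels.
print(auc([5.0, 3.0, 4.0, 1.0], [True, True, False, False]))
```

In this sketch a higher POG at a given top-n indicates that two sites (or two platforms) agree on more of their top-ranked genes, while an AUC near 1 indicates that the "true" DEGs are ranked ahead of the remaining genes.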