
Investigation of reproducibility of differentially expressed genes in DNA microarrays through statistical simulation.

Xiaohui Fan, Leming Shi, Hong Fang, Stephen Harris, Roger Perkins, Weida Tong.

Abstract

Recent publications have raised concerns about the reliability of microarray technology because of the lack of reproducibility of differentially expressed genes (DEGs) across highly similar studies conducted in different laboratories and on different platforms. The rat toxicogenomics study of the MicroArray Quality Control (MAQC) project empirically showed that DEGs selected using a fold change (FC)-based criterion were more reproducible than those derived solely from statistical significance measures such as the P-value from a simple t-test. In this study, we generated a set of simulated microarray datasets to compare gene selection/ranking rules, including P-value, FC, and their combinations, using the percentage of overlapping genes (POG) between the DEG lists from two similar simulated datasets as the measure of reproducibility. The results support the MAQC conclusion that DEG lists are more reproducible across laboratories and platforms when genes are ranked by FC with a nonstringent P-value cutoff than when they are ranked by P-value alone. We conclude that the MAQC recommendation should be followed when reproducibility is an important study objective.


Year:  2009        PMID: 19278560      PMCID: PMC2654487          DOI: 10.1186/1753-6561-3-s2-s4

Source DB:  PubMed          Journal:  BMC Proc        ISSN: 1753-6561


Background

The utility of DNA microarrays has been demonstrated in clinical applications and risk/safety assessments [1-6]. With the wide variety of array platforms and analysis approaches, however, challenges remain in this field. For example, several recent publications [7-11] raised concerns about the reliability of microarray technology based on the lack of agreement in differentially expressed genes (DEGs) obtained by different laboratories and on different array platforms for highly similar study designs and experiments. By reanalyzing seven of the largest public DNA microarray datasets aimed at cancer prognosis, Michiels et al. found that the signature genes of the classifiers were extremely unstable [11]. The MicroArray Quality Control (MAQC) project, a large study using reference RNA samples and a toxicogenomics dataset [12,13], revealed that DEGs selected using a fold change (FC)-based criterion were more reproducible across laboratories and platforms than those derived solely from statistical significance measures such as the P-value from a simple t-test. The MAQC study prompted some to question whether its conclusion could be so broadly generalized. In response, this study sought to replicate the MAQC finding through statistical simulation. Specifically, we generated a set of simulated microarray datasets with varying amounts of noise, expression magnitudes, and sample sizes in order to systematically compare gene selection/ranking rules (i.e., P-value, FC, and their combinations) with respect to the reproducibility of DEGs.

Methods

Two groups of samples were simulated: a control group and a treatment group. Each group consisted of either 5 or 50 replicates (samples), with each replicate containing 12,000 genes. The gene intensities of the control samples were simulated as Signal + Noise, while the corresponding intensities of the treated samples were Signal + FC + Noise. Both Signal and Noise were normally distributed, while FC was exponentially distributed. The parameters used are summarized in Table 1. The CV (coefficient of variation) values of 2%, 10%, 30%, and 100%, corresponding to low, medium, high, and very high noise levels, respectively, were chosen to be similar to those observed in the MAQC study for the reference RNA samples and the rat toxicogenomics dataset. For each CV value, three expression magnitudes were considered, with mean FC values of 1.5, 0.6, and 0.2: the first two correspond to the MAQC main study and the MAQC rat toxicogenomics study, respectively, while the third is consistent with the range typically found in clinical microarray experiments.
Table 1

Summary of the parameters used in this study.

  CV (noise level):      ~2% (low)   ~10% (medium)   ~30% (high)   ~100% (very high)
  Magnitude (mean FC):   ~1.5 (MAQC main study)   ~0.6 (MAQC rat toxicogenomics)   ~0.2 (clinical application)
  Sample size:           5 per group   50 per group
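The data-generation scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the baseline signal level (8.0 units) and the exact way the CV sets the noise SD are not given in the paper and are assumed here.

```python
import numpy as np

def simulate_experiment(n_reps=50, cv=0.30, mean_fc=1.5, noise_seed=0,
                        n_genes=12000, gene_seed=42):
    """One simulated experiment per the Methods: control = Signal + Noise,
    treated = Signal + FC + Noise, with normally distributed Signal and
    Noise and exponentially distributed FC. The baseline signal level and
    the CV-to-SD conversion are assumptions. Keeping gene_seed fixed while
    varying noise_seed mimics rerunning the same experiment in a second
    lab: identical true effects, independent noise."""
    g = np.random.default_rng(gene_seed)               # shared per-gene truths
    signal = g.normal(loc=8.0, scale=1.0, size=n_genes)
    fc = g.exponential(scale=mean_fc, size=n_genes)    # exponential FC
    rng = np.random.default_rng(noise_seed)            # run-specific noise
    noise_sd = cv * signal.mean()                      # CV-scaled SD (assumed)
    control = signal[:, None] + rng.normal(0.0, noise_sd, (n_genes, n_reps))
    treated = (signal + fc)[:, None] + rng.normal(0.0, noise_sd, (n_genes, n_reps))
    return control, treated
```

Pairs such as `simulate_experiment(noise_seed=1)` and `simulate_experiment(noise_seed=2)` correspond to the two repeated runs of one permutation that are compared in the Results.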

Results and discussion

The study covered 24 simulated conditions (permutations), corresponding to two sample sizes, four CV values, and three mean FC values (Table 1). For each permutation, six gene selection methods were used to determine DEGs by comparing the treated group with the control group: (1) FC: genes are rank ordered by FC, and DEGs are determined by an FC cutoff only; (2-3) FC (P < 0.01) and FC (P < 0.05): genes are rank ordered first by FC, and DEGs are determined with a P-value cutoff of either 0.01 or 0.05; (4) P: genes are rank ordered by the P-value from a simple t-test, and DEGs are selected using a specified P-value cutoff; and (5-6) P (FC > 1.4) and P (FC > 2): genes are rank ordered first by P-value, and DEGs are then determined with an FC cutoff of either 1.4 or 2. Each permutation was run twice to mimic conducting the same experiment in two different laboratories or on two different platforms, and the resulting DEG lists from the two simulations were compared to assess cross-laboratory or cross-platform reproducibility using the percentage of overlapping genes (POG). Figure 1 compares the six gene selection methods on four datasets with different noise levels (CV = 2%, 10%, 30%, and 100%), where POG is shown as a function of the number of genes selected as differentially expressed between two simulations of the same permutation (magnitude = 1.5 and sample size = 50). In general, the FC-based gene selection methods outperformed the P-based methods in terms of DEG reproducibility as measured by POG. Specifically, the three FC-based methods, i.e., FC, FC (P < 0.01), and FC (P < 0.05), consistently gave the highest POG values regardless of the CV value. As expected, POG consistently decreased with increasing noise (CV). For the P-value-based selection methods, a higher FC cutoff resulted in a higher POG. All results are consistent with the MAQC observations.
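The ranking rules and the POG comparison can be sketched as follows. This is a toy re-creation under stated assumptions, not the authors' code: the true-FC distribution, the per-gene noise SDs, and the normal approximation used for the t-test P-value are all assumptions, and only three of the six rules are shown.

```python
import numpy as np
from math import erfc, sqrt

def two_lab_pog(n_genes=12000, n_reps=5, k=500):
    """Compare DEG-list reproducibility (POG) for three ranking rules on a
    toy two-lab setup: true effects are shared between labs, measurement
    noise is independent. Returns {rule: POG%} for the top-k DEG lists."""
    g = np.random.default_rng(42)
    true_fc = g.exponential(1.5, n_genes)          # shared true effects
    sd = g.uniform(0.8, 1.2, n_genes)              # per-gene noise SD (assumed)

    def one_lab(seed):
        rng = np.random.default_rng(seed)
        ctrl = rng.normal(0.0, sd[:, None], (n_genes, n_reps))
        trt = true_fc[:, None] + rng.normal(0.0, sd[:, None], (n_genes, n_reps))
        fc = trt.mean(1) - ctrl.mean(1)            # observed fold change
        se = np.sqrt(ctrl.var(1, ddof=1) / n_reps + trt.var(1, ddof=1) / n_reps)
        t = fc / se                                # simple two-sample t
        p = np.array([erfc(abs(x) / sqrt(2)) for x in t])  # approx 2-sided P
        return fc, p

    def top_k(fc, p, rule):
        if rule == "P":                            # rank by P-value only
            return set(np.argsort(p)[:k])
        idx = np.arange(n_genes)
        cand = idx if rule == "FC" else idx[p < 0.05]   # FC or FC(P<0.05)
        return set(cand[np.argsort(-np.abs(fc[cand]))][:k])

    (fc1, p1), (fc2, p2) = one_lab(1), one_lab(2)
    return {rule: 100.0 * len(top_k(fc1, p1, rule) & top_k(fc2, p2, rule)) / k
            for rule in ("FC", "FC(P<0.05)", "P")}
```

With these settings the FC-ranked lists overlap more across the two simulated labs than the P-ranked lists, in the direction the paper reports; the absolute POG values depend entirely on the assumed parameters.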
Figure 1

The relationship of POG with the noise level in the simulated datasets: (A) low noise (CV = 2%); (B) medium noise (CV = 10%); (C) high noise (CV = 30%); and (D) very high noise (CV = 100%). The simulated datasets had an expression magnitude difference of 1.5 between the treated and control groups and a sample size of 50. The x-axis represents the number of genes selected as differentially expressed, and the y-axis represents the POG (%) of two gene lists for a given number of differentially expressed genes. Each line on the graph represents the overlap of differentially expressed gene lists based on one of six different gene ranking/selection methods. The red and blue numbers give the POG (%) for 500 selected DEGs (red dashed line) from P rank ordering only and FC rank ordering with P < 0.05, respectively.

Figure 2 compares the six gene selection methods on three datasets with different expression magnitudes between the treated and control groups (FC = 1.5, 0.6, and 0.2). As in Figure 1, the FC-based methods gave greater reproducibility than the P-based methods. Furthermore, POG increased with increasing expression magnitude for the FC-based methods; for the P-value-based methods this trend was not prominent and appeared equivocal.
Figure 2

The relationship of POG with the degree of difference in expression magnitude between the treated and control groups: (A) Magnitude = 0.6; (B) Magnitude = 1.5; and (C) Magnitude = 0.2. The simulated datasets had CV = 30% and sample size = 50. The x-axis represents the number of genes selected as differentially expressed, and the y-axis represents the POG (%) of two gene lists for a given number of differentially expressed genes. Each line on the graph represents the overlap of differentially expressed gene lists based on one of six different gene ranking/selection methods. The red and blue numbers give the POG (%) when 500 genes (red dashed line) are selected as DEGs using P rank ordering only and FC rank ordering with P < 0.05, respectively.

Figure 3 compares the six gene selection methods on two datasets, one with a sample size of 50 per group and the other with 5 per group. The FC-based methods again gave higher POG than the P-value-based methods, and the larger sample size resulted in higher POG for either selection approach.
Figure 3

The relationship of POG with sample size: (A) 50 samples/group and (B) 5 samples/group. The simulated datasets had CV = 30% and magnitude = 1.5 (see Table 1). The x-axis represents the number of genes selected as differentially expressed, and the y-axis represents the POG (%) of two gene lists for a given number of differentially expressed genes. Each line on the graph represents the overlap of differentially expressed gene lists based on one of six different gene ranking/selection methods. The red and blue numbers give the POG (%) when 500 genes (red dashed line) are selected as DEGs using P rank ordering only and FC rank ordering with P < 0.05, respectively.

Although POG is affected by the noise level, expression magnitude, and sample size of the datasets, the above results clearly demonstrate that DEG lists become more reproducible, especially when fewer genes are selected, if FC is included as the ranking criterion for DEG identification. It is likely that the discordance among reported microarray results in the literature is in large part due to the widespread use of P-value-based gene ranking rather than FC-based ranking. Another related study of ours demonstrated that the tradeoff between reproducibility and specificity/sensitivity in the FC (P) approach can be balanced by weighting FC as the primary consideration in gene ranking: the FC criterion explicitly incorporates the measured quantity to ensure reproducibility, whereas the P criterion provides control of sensitivity and specificity [14].

Conclusion

Our simulation results show that the choice of gene selection method significantly affects the apparent reproducibility of DEG lists as measured by POG. Reproducibility between lists increases substantially when FC is the ranking criterion for identifying DEGs, especially for shorter gene lists. This observation holds across different noise levels, expression magnitudes, and sample sizes. Our simulations are consistent with the MAQC conclusion that, to generate more reproducible DEG lists across laboratories and platforms, FC ranking with a nonstringent P-value cutoff (the so-called FC (P) approach) should be considered when reproducibility is an important objective of a microarray study.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

XF performed the data analyses and wrote the first draft of the manuscript. WT and HF guided the analysis and helped write the manuscript. LS had the original idea for the statistical simulation. RP also helped with manuscript writing. SH helped with microarray data management. All authors participated in the preparation of the manuscript and approved its final version.
References (14 in total)

1.  Getting the noise out of gene arrays.

Authors:  Eliot Marshall
Journal:  Science       Date:  2004-10-22       Impact factor: 47.728

2.  Prediction of cancer outcome with microarrays: a multiple random validation strategy.

Authors:  Stefan Michiels; Serge Koscielny; Catherine Hill
Journal:  Lancet       Date:  2005 Feb 5-11       Impact factor: 79.321

3.  Thousands of samples are needed to generate a robust gene list for predicting outcome in cancer.

Authors:  Liat Ein-Dor; Or Zuk; Eytan Domany
Journal:  Proc Natl Acad Sci U S A       Date:  2006-04-03       Impact factor: 11.205

4.  Development and evaluation of therapeutically relevant predictive classifiers using gene expression profiling.

Authors:  Richard Simon
Journal:  J Natl Cancer Inst       Date:  2006-09-06       Impact factor: 13.506

5.  The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.

Authors:  Leming Shi; Laura H Reid; Wendell D Jones; Richard Shippy; Janet A Warrington; Shawn C Baker; Patrick J Collins; Francoise de Longueville; Ernest S Kawasaki; Kathleen Y Lee; Yuling Luo; Yongming Andrew Sun; James C Willey; Robert A Setterquist; Gavin M Fischer; Weida Tong; Yvonne P Dragan; David J Dix; Felix W Frueh; Frederico M Goodsaid; Damir Herman; Roderick V Jensen; Charles D Johnson; Edward K Lobenhofer; Raj K Puri; Uwe Schrf; Jean Thierry-Mieg; Charles Wang; Mike Wilson; Paul K Wolber; Lu Zhang; Shashi Amur; Wenjun Bao; Catalin C Barbacioru; Anne Bergstrom Lucas; Vincent Bertholet; Cecilie Boysen; Bud Bromley; Donna Brown; Alan Brunner; Roger Canales; Xiaoxi Megan Cao; Thomas A Cebula; James J Chen; Jing Cheng; Tzu-Ming Chu; Eugene Chudin; John Corson; J Christopher Corton; Lisa J Croner; Christopher Davies; Timothy S Davison; Glenda Delenstarr; Xutao Deng; David Dorris; Aron C Eklund; Xiao-hui Fan; Hong Fang; Stephanie Fulmer-Smentek; James C Fuscoe; Kathryn Gallagher; Weigong Ge; Lei Guo; Xu Guo; Janet Hager; Paul K Haje; Jing Han; Tao Han; Heather C Harbottle; Stephen C Harris; Eli Hatchwell; Craig A Hauser; Susan Hester; Huixiao Hong; Patrick Hurban; Scott A Jackson; Hanlee Ji; Charles R Knight; Winston P Kuo; J Eugene LeClerc; Shawn Levy; Quan-Zhen Li; Chunmei Liu; Ying Liu; Michael J Lombardi; Yunqing Ma; Scott R Magnuson; Botoul Maqsodi; Tim McDaniel; Nan Mei; Ola Myklebost; Baitang Ning; Natalia Novoradovskaya; Michael S Orr; Terry W Osborn; Adam Papallo; Tucker A Patterson; Roger G Perkins; Elizabeth H Peters; Ron Peterson; Kenneth L Philips; P Scott Pine; Lajos Pusztai; Feng Qian; Hongzu Ren; Mitch Rosen; Barry A Rosenzweig; Raymond R Samaha; Mark Schena; Gary P Schroth; Svetlana Shchegrova; Dave D Smith; Frank Staedtler; Zhenqiang Su; Hongmei Sun; Zoltan Szallasi; Zivana Tezak; Danielle Thierry-Mieg; Karol L Thompson; Irina Tikhonova; Yaron Turpaz; Beena Vallanat; Christophe Van; Stephen J Walker; Sue Jane Wang; Yonghong Wang; Russ 
Wolfinger; Alex Wong; Jie Wu; Chunlin Xiao; Qian Xie; Jun Xu; Wen Yang; Liang Zhang; Sheng Zhong; Yaping Zong; William Slikker
Journal:  Nat Biotechnol       Date:  2006-09       Impact factor: 54.908

6.  Concordance among gene-expression-based predictors for breast cancer.

Authors:  Cheng Fan; Daniel S Oh; Lodewyk Wessels; Britta Weigelt; Dimitry S A Nuyten; Andrew B Nobel; Laura J van't Veer; Charles M Perou
Journal:  N Engl J Med       Date:  2006-08-10       Impact factor: 91.245

7.  Rat toxicogenomic study reveals analytical consistency across microarray platforms.

Authors:  Lei Guo; Edward K Lobenhofer; Charles Wang; Richard Shippy; Stephen C Harris; Lu Zhang; Nan Mei; Tao Chen; Damir Herman; Federico M Goodsaid; Patrick Hurban; Kenneth L Phillips; Jun Xu; Xutao Deng; Yongming Andrew Sun; Weida Tong; Yvonne P Dragan; Leming Shi
Journal:  Nat Biotechnol       Date:  2006-09       Impact factor: 54.908

8.  Classification of human lung carcinomas by mRNA expression profiling reveals distinct adenocarcinoma subclasses.

Authors:  A Bhattacharjee; W G Richards; J Staunton; C Li; S Monti; P Vasa; C Ladd; J Beheshti; R Bueno; M Gillette; M Loda; G Weber; E J Mark; E S Lander; W Wong; B E Johnson; T R Golub; D J Sugarbaker; M Meyerson
Journal:  Proc Natl Acad Sci U S A       Date:  2001-11-13       Impact factor: 11.205

9.  Oligonucleotide microarray for prediction of early intrahepatic recurrence of hepatocellular carcinoma after curative resection.

Authors:  Norio Iizuka; Masaaki Oka; Hisafumi Yamada-Okabe; Minekatsu Nishida; Yoshitaka Maeda; Naohide Mori; Takashi Takao; Takao Tamesa; Akira Tangoku; Hisahiro Tabuchi; Kenji Hamada; Hironobu Nakayama; Hideo Ishitsuka; Takanobu Miyamoto; Akira Hirabayashi; Shunji Uchimura; Yoshihiko Hamamoto
Journal:  Lancet       Date:  2003-03-15       Impact factor: 79.321

10.  The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies.

Authors:  Leming Shi; Wendell D Jones; Roderick V Jensen; Stephen C Harris; Roger G Perkins; Federico M Goodsaid; Lei Guo; Lisa J Croner; Cecilie Boysen; Hong Fang; Feng Qian; Shashi Amur; Wenjun Bao; Catalin C Barbacioru; Vincent Bertholet; Xiaoxi Megan Cao; Tzu-Ming Chu; Patrick J Collins; Xiao-Hui Fan; Felix W Frueh; James C Fuscoe; Xu Guo; Jing Han; Damir Herman; Huixiao Hong; Ernest S Kawasaki; Quan-Zhen Li; Yuling Luo; Yunqing Ma; Nan Mei; Ron L Peterson; Raj K Puri; Richard Shippy; Zhenqiang Su; Yongming Andrew Sun; Hongmei Sun; Brett Thorn; Yaron Turpaz; Charles Wang; Sue Jane Wang; Janet A Warrington; James C Willey; Jie Wu; Qian Xie; Liang Zhang; Lu Zhang; Sheng Zhong; Russell D Wolfinger; Weida Tong
Journal:  BMC Bioinformatics       Date:  2008-08-12       Impact factor: 3.169

