Marine Jeanmougin, Aurelien de Reynies, Laetitia Marisa, Caroline Paccard, Gregory Nuel, Mickael Guedj.
Abstract
High-throughput post-genomic studies are now routinely conducted in biological and biomedical research. The main statistical approach to selecting genes differentially expressed between two groups is to apply a t-test, which has been criticized in the literature. Numerous alternatives have been developed, based on different and innovative variance modeling strategies. However, a critical issue is that selecting a different test usually leads to a different gene list. In this context, and given the current tendency to apply the t-test, identifying the most efficient approach in practice remains crucial. To help answer this question, we compare eight tests representative of variance modeling strategies in gene expression data: Welch's t-test, ANOVA [1], Wilcoxon's test, SAM [2], RVM [3], limma [4], VarMixt [5] and SMVar [6]. Our comparison process relies on four steps (gene list analysis, simulations, spike-in data and re-sampling) to reach comprehensive and robust conclusions about test performance in terms of statistical power, false-positive rate, execution time and ease of use. Our results raise concerns about the ability of some methods to control the expected number of false positives at a desirable level. In addition, two tests (limma and VarMixt) show significant improvement over the t-test, in particular for small sample sizes. Since limma also presents several practical advantages, we advocate its application to the analysis of gene expression data.
Year: 2010 PMID: 20838429 PMCID: PMC2933223 DOI: 10.1371/journal.pone.0012336
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Data sets used for the gene list analysis.
| Data-set | Groups | Sample size | Publication |
| Lymphoid tumors | Disease staging | | Lamant et al. 2007 |
| Liver tumors | TP53 mutation | | Boyault et al. 2007 |
| Head and neck tumors | Gender | | Rickman et al. 2008 |
| Leukemia | Gender | | Soulier et al. 2006 |
| Breast tumors | ESR1 expression | | Bertheau et al. 2007 |
The five data sets come from the Cartes d'Identité des Tumeurs (CIT, http://cit.ligue-cancer.net) program and are publicly available. All the microarrays are Affymetrix U133A microarrays with 22,283 genes.
Figure 1. Data matrix resulting from simulations.
Rows refer to genes simulated under the null hypothesis ($H_0$) and the alternative hypothesis ($H_1$); columns refer to the samples of the two groups being compared.
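As a rough illustration of the layout described in this caption, the sketch below builds a gene-by-sample matrix in which a fraction of the genes carry a mean shift between the two groups. The Gaussian noise, the shift `delta`, the proportion `prop_h1` and the function name `simulate_matrix` are illustrative assumptions, not the simulation settings of the study.

```python
import random


def simulate_matrix(n_genes=1000, n_per_group=5, prop_h1=0.1, delta=1.0, seed=0):
    """Return (matrix, is_h1).

    matrix: one row per gene; columns are the samples of group 1,
            followed by the samples of group 2.
    is_h1:  one boolean per gene, True if the gene was simulated with
            a mean difference between groups (alternative hypothesis).
    All default values are hypothetical.
    """
    rng = random.Random(seed)
    n_h1 = round(n_genes * prop_h1)  # number of differentially expressed genes
    matrix, is_h1 = [], []
    for g in range(n_genes):
        under_h1 = g < n_h1
        shift = delta if under_h1 else 0.0  # only H1 genes get a mean shift
        group1 = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        group2 = [rng.gauss(shift, 1.0) for _ in range(n_per_group)]
        matrix.append(group1 + group2)
        is_h1.append(under_h1)
    return matrix, is_h1
```

Calling `simulate_matrix(100, 5, 0.1)` gives 100 rows of 10 values each, with the first 10 rows simulated under the alternative.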
Figure 2. Gene list analysis.
PCAs and dendrograms are generated based on the gene lists resulting from the application of the eight tests of interest and the control-test. Here we show results for two data sets comparing ESR1 expression in breast cancer and gender in leukemia. Both outline five clusters of tests.
Figure 3. Power study from simulations (Gaussian model, M1).
Power values are calculated at the 5% level and displayed according to the sample size. Panels A and C show power values; panels B and D show power values relative to the t-test. Panels A and B use the actual false-positive rate, whereas panels C and D use the adjusted false-positive rate. Red arrows highlight the effect of the false-positive rate adjustment on power values.
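One way to read the distinction between power at the actual and at the adjusted false-positive rate is sketched below. The adjustment shown here, which replaces the nominal cutoff with an empirical quantile of the p-values of null genes, is an assumption made for illustration; the record does not describe the exact adjustment used in the study. The function names are hypothetical.

```python
import math


def rejection_rate(pvalues, threshold):
    """Fraction of p-values at or below the threshold.

    On null genes this is the false-positive rate;
    on truly differential genes it is the power.
    """
    return sum(p <= threshold for p in pvalues) / len(pvalues)


def adjusted_threshold(null_pvalues, alpha=0.05):
    """P-value cutoff at which at most a fraction alpha of null genes is rejected.

    Assumes continuous p-values without ties.
    """
    ordered = sorted(null_pvalues)
    # Number of null genes that may be rejected (small epsilon guards rounding).
    k = math.floor(alpha * len(ordered) + 1e-9)
    return ordered[k - 1] if k > 0 else 0.0
```

For a liberal test whose null p-values are too small, `rejection_rate(null_p, 0.05)` exceeds 5%, and computing power at `adjusted_threshold(null_p)` instead of 0.05 lowers the reported power accordingly.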
False-positive rate study from simulations.
| | M1 | M2 | M3 | M4 |
| Sample size | | | | |
| ANOVA | | | | |
| Wilcoxon ▾ | | | | |
| SAM | | | | |
| RVM ▴ | | | | |
| limma | | | | |
| SMVar ▴ | | | | |
| VarMixt | | | | |
For small and large samples, this table presents the confidence interval of the false-positive rate obtained by thresholding the p-values at the nominal level. Up triangles ▴ (resp. down triangles ▾) indicate an increase (resp. a decrease) of the false-positive rate compared to the expected level. Two triangles indicate a deviation for both small and large sample sizes.
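The triangle notation can be illustrated with a small sketch that computes a confidence interval for an observed false-positive rate and checks whether it excludes the expected level. The normal-approximation (Wald) interval, the 95% coverage and the default expected level of 0.05 (borrowed from the 5% level quoted in the Figure 3 caption) are assumptions; the record does not state how the intervals were computed.

```python
import math


def fpr_confidence_interval(n_false_positives, n_null_genes, z=1.96):
    """Normal-approximation interval for the false-positive rate.

    z=1.96 corresponds to roughly 95% coverage (an assumed choice).
    """
    p_hat = n_false_positives / n_null_genes
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_null_genes)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


def deviation(interval, expected=0.05):
    """Return 'up', 'down' or 'none' depending on where the interval lies."""
    low, high = interval
    if low > expected:
        return "up"    # false-positive rate inflated
    if high < expected:
        return "down"  # test is conservative
    return "none"
```

For example, 800 rejections among 10,000 null genes would be flagged as an increase, while 500 rejections would be compatible with a 5% level.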
Figure 4. Spike-in data set.
Power values are calculated at the 5% level and displayed according to six of the 13 pairwise comparisons.
Figure 5. Re-sampling approach.
Power values are calculated at a 0.1 FDR level and displayed according to the sample size.
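To make the notion of power at a 0.1 FDR level concrete, the sketch below applies the Benjamini-Hochberg step-up procedure and then measures the fraction of truly differential genes that are selected. The record only mentions the FDR level, so the use of Benjamini-Hochberg is an assumption, and the truth labels `is_h1` are hypothetical inputs.

```python
def benjamini_hochberg(pvalues, fdr=0.1):
    """Return one boolean per p-value: True where the gene is selected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by increasing p
    n_reject = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up rule: keep the largest rank whose p-value passes its cutoff.
        if pvalues[idx] <= rank * fdr / m:
            n_reject = rank
    rejected = [False] * m
    for idx in order[:n_reject]:
        rejected[idx] = True
    return rejected


def power(rejected, is_h1):
    """Fraction of truly differential genes (is_h1) that are selected."""
    n_h1 = sum(is_h1)
    return sum(r and h for r, h in zip(rejected, is_h1)) / n_h1
```

With p-values `[0.001, 0.02, 0.04, 0.5, 0.9]` and the first four genes labeled as truly differential, three genes are selected and the power is 0.75.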
Summary table.
| | False-positive rate | | Power | | In practice | |
| | Small samples | Large samples | Small samples | Large samples | Ease of use | Execution time |
This table summarizes the results of our study in terms of false-positive rate, power and practical criteria. The number of “+” signs indicates the performance, from weak (+) to very good (+++).