Claudio Isella, Tommaso Renzulli, Davide Corà, Enzo Medico.
Abstract
BACKGROUND: Many microarray experiments search for genes with differential expression between a common "reference" group and multiple "test" groups. In such cases, currently employed statistical approaches based on t-tests or close derivatives have limited efficacy, mainly because the standard error is estimated on only two groups at a time. Alternative approaches based on ANOVA correctly capture within-group variance from all the groups, but do not compare single test groups with the reference. Ideally, a t-test better suited for this type of data would compare each test group with the reference, but use within-group variance calculated from all the groups.
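The idea in the last sentence can be sketched in a few lines. This is only an illustration under stated assumptions, not the Mulcom implementation: the function names, the group labels and the toy expression values are all hypothetical, and the statistic shown is a plain pooled-variance t (the ANOVA within-group mean square replaces the two-group standard error).

```python
from math import sqrt
from statistics import mean


def pooled_within_variance(groups):
    """Within-group mean square (ANOVA MSE) computed across all groups."""
    sum_squares = 0.0
    degrees_of_freedom = 0
    for values in groups.values():
        mu = mean(values)
        sum_squares += sum((v - mu) ** 2 for v in values)
        degrees_of_freedom += len(values) - 1
    return sum_squares / degrees_of_freedom


def t_vs_reference(groups, reference):
    """t statistic of each test group against the reference group.

    The standard error uses the variance pooled over ALL groups,
    not only the two groups being compared.
    """
    mse = pooled_within_variance(groups)
    ref = groups[reference]
    stats = {}
    for name, values in groups.items():
        if name == reference:
            continue
        se = sqrt(mse * (1 / len(values) + 1 / len(ref)))
        stats[name] = (mean(values) - mean(ref)) / se
    return stats


# Toy, made-up log-expression values for a single gene (illustration only).
expr = {
    "reference": [1.0, 2.0, 3.0],
    "test_a": [4.0, 5.0, 6.0],
    "test_b": [1.0, 2.0, 3.0],
}
stats = t_vs_reference(expr, "reference")
```

With more test groups, the pooled variance is estimated on more degrees of freedom, which is what a two-groups-at-a-time t-test cannot exploit.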
Year: 2011 PMID: 21955789 PMCID: PMC3230912 DOI: 10.1186/1471-2105-12-382
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.307
Figure 1. Mulcom optimization plot. Optimization plot generated by Mulcom to choose test parameters (m and t). The heatmap highlights the number of significant genes for each combination of m (y-axis) and t (x-axis) with an FDR below the threshold defined by the user (five percent in this case). Given the FDR threshold of choice, the box with the colour closest to the top of the scale indicates the t and m values giving the maximum number of significant genes. Additional boxes to the right and top of the lightest one are shown to provide an estimate of the number of significant genes passing the test under more stringent t and m conditions. Whether such conditions further improve the FDR should be tested by setting a lower FDR threshold and repeating the analysis.
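The selection rule described in this caption can be sketched as follows. This is a hypothetical illustration, not Mulcom's own code or API: the function name, the record layout and every number in the grid are invented, and the sketch assumes that the gene count and FDR estimate for each (m, t) pair have already been computed elsewhere.

```python
def best_parameters(results, fdr_threshold=0.05):
    """Pick the (m, t) pair with the most significant genes among
    those whose estimated FDR is below the threshold."""
    admissible = [r for r in results if r["fdr"] < fdr_threshold]
    if not admissible:
        return None  # no combination satisfies the FDR threshold
    best = max(admissible, key=lambda r: r["n_significant"])
    return best["m"], best["t"], best["n_significant"]


# Hypothetical grid results; all values are invented for illustration.
grid = [
    {"m": 0.1, "t": 1.5, "n_significant": 900, "fdr": 0.12},
    {"m": 0.1, "t": 2.0, "n_significant": 640, "fdr": 0.04},
    {"m": 0.3, "t": 2.0, "n_significant": 410, "fdr": 0.02},
    {"m": 0.3, "t": 2.5, "n_significant": 250, "fdr": 0.01},
]
choice = best_parameters(grid)
```

In this toy grid the first combination yields the most genes but is excluded by its FDR, so the second one is selected; the remaining, more stringent combinations correspond to the additional boxes mentioned in the caption.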
Figure 2. Overlap between Mulcom and other tests. Venn diagram showing intersections between lists of significant probe sets defined by Mulcom, SAM and Limma (Affy dataset). The lists show a partial but significant overlap.
Validation across microarray platforms of Mulcom, Limma and SAM tests
| | Mulcom | Limma | SAM |
|---|---|---|---|
| Significant genes in Affy | 867 | 672 | 723 |
| Validated in Illumina | 150 | 0 | 48 |
| Significant genes in Affy | 681 | 518 | 561 |
| Validated in Illumina | 317 | 237 | 100 |
| Significant genes in Affy | 4 | 0 | 82 |
| Validated in Illumina | 1 | 0 | 1 |
| Significant genes in Affy | 26 | 6 | 75 |
| Validated in Illumina | 1 | 1 | 0 |
| Significant genes in Affy | 1249 | 956 | 1006 |
| Validated in Illumina | 487 | 246 | 151 |
| True positive rate | 39% | 26% | 15% |
Number of genes identified by Mulcom, Limma and SAM tests in the Affymetrix dataset, and the number of validated genes in the Illumina dataset for all the pair-wise comparisons.
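As a quick arithmetic check, the true positive rates in the table can be reproduced from its last pair of count rows. The sketch assumes that the rate is the number of genes validated in Illumina divided by the number significant in Affy, taken from that last pair; the variable names are illustrative.

```python
# Counts copied from the last "Significant" / "Validated" rows of the table.
significant_affy = {"Mulcom": 1249, "Limma": 956, "SAM": 1006}
validated_illumina = {"Mulcom": 487, "Limma": 246, "SAM": 151}

# Assumed definition: validated / significant, as a rounded percentage.
rates = {
    test: round(100 * validated_illumina[test] / significant_affy[test])
    for test in significant_affy
}
```

Under this assumption the rounded percentages match the reported 39%, 26% and 15%.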
Figure 3. Functional significance of Mulcom results. Enrichment in annotation to specific cellular functions for gene lists generated by Mulcom, SAM and Limma (p-value below 0.001) and analysed using the Ingenuity pathway.