Xutao Deng, Jun Xu, Charles Wang.
Abstract
BACKGROUND: In DNA microarray gene expression profiling studies, a fundamental task is to extract statistically significant genes that meet certain research hypotheses. Currently, the Venn diagram is a frequently used method for identifying overlapping genes that meet the investigator's research hypotheses. However, this simple operation of intersecting multiple gene lists, known as the Intersection-Union Test (IUT), is performed without knowing the incurred changes in Type 1 error rate and can lead to loss of discovery power.
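The Venn-diagram operation described in the abstract can be illustrated with a minimal sketch. It assumes each individual test yields one p-value per gene; the function names, gene IDs, and p-values below are hypothetical and are not taken from the paper.

```python
def significant_genes(p_values, alpha=0.05):
    """Return the set of gene IDs whose p-value is at most alpha."""
    return {gene for gene, p in p_values.items() if p <= alpha}


def venn_overlap(p_value_tables, alpha=0.05):
    """Intersect the per-test significant gene lists (the Venn-diagram overlap)."""
    gene_sets = [significant_genes(table, alpha) for table in p_value_tables]
    return set.intersection(*gene_sets) if gene_sets else set()


# Hypothetical p-values for two comparisons (IDs and numbers are made up).
test1 = {"geneA": 0.001, "geneB": 0.030, "geneC": 0.200}
test2 = {"geneA": 0.040, "geneB": 0.300, "geneC": 0.010}

overlap = venn_overlap([test1, test2])  # only geneA passes both tests
```

Keeping a gene only when every list contains it is the same as requiring its largest p-value across the tests to be at most α.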
Year: 2008 PMID: 18541049 PMCID: PMC2423437 DOI: 10.1186/1471-2105-9-S6-S14
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Monte-Carlo estimates of type 1 error rate α'(%)
| | RIUT | BIUT | Fisher | Stouffer |
| 0.0 | 0.047 | 0.003 | 0.050 | 0.050 |
| 0.5 | 0.044 | 0.006 | 0.090 | 0.085 |
| 1.0 | 0.035 | 0.015 | 0.210 | 0.188 |
| 1.5 | 0.035 | 0.027 | 0.405 | 0.327 |
| 2.0 | 0.040 | 0.039 | 0.632 | 0.476 |
| 2.5 | 0.049 | 0.048 | 0.821 | 0.621 |
| 3.0 | 0.048 | 0.048 | 0.931 | 0.729 |
| 3.5 | 0.047 | 0.047 | 0.982 | 0.808 |
| 4.0 | 0.051 | 0.051 | 0.995 | 0.875 |
| 4.5 | 0.048 | 0.048 | 0.999 | 0.907 |
| 5.0 | 0.046 | 0.046 | 1.000 | 0.930 |
m = 2, n = 5, and μ = 0.0; individual tests were performed at α = 0.05 and each estimate was based on 10000 runs.
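The Fisher and Stouffer columns climb toward 1 because both methods can reject on the strength of a single very small p-value, which counts as a type 1 error whenever the other test is truly null. The sketch below assumes the textbook definitions of the two combination methods and one-sided p-values strictly between 0 and 1; this section does not state how the authors computed them, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_combined(p_values):
    """Fisher's method: -2 * sum(ln p) is chi-square with 2m degrees of freedom."""
    m = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # half of the chi-square statistic
    # Closed-form chi-square survival function, valid for even degrees of freedom 2m.
    return math.exp(-half_x) * sum(half_x ** k / math.factorial(k) for k in range(m))


def stouffer_combined(p_values):
    """Stouffer's method: sum the z-scores of the p-values, rescale by sqrt(m)."""
    std_normal = NormalDist()
    z = sum(std_normal.inv_cdf(1.0 - p) for p in p_values) / math.sqrt(len(p_values))
    return 1.0 - std_normal.cdf(z)


def biut_reject(p_values, alpha=0.05):
    """Basic intersection rule: reject only if every p-value is at most alpha."""
    return max(p_values) <= alpha


# Hypothetical case: the first test looks null, the second is extremely significant.
p_values = [0.50, 1e-6]
fisher_p = fisher_combined(p_values)
stouffer_p = stouffer_combined(p_values)
intersection_rejects = biut_reject(p_values)
```

In this made-up case both combined p-values fall below 0.05 while the intersection rule does not reject, which is the behavior the table reflects.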
Figure 1. Distributions of the p-values of 10000 simulations with μ. a. The distribution of RIUT p-value p'. b. The distribution of unadjusted BIUT p-value p. The dashed line indicates the hypothetical uniform distribution of the p-values of the 10000 runs.
Monte-Carlo estimates of power (%)
| | RIUT (α = 0.05) | BIUT (α = 0.05) | RIUT (α = 0.01) | BIUT (α = 0.01) |
| 0.5 | 0.057 | 0.014 | 0.007 | 0.001 |
| 1.0 | 0.059 | 0.029 | 0.006 | 0.003 |
| 1.5 | 0.077 | 0.064 | 0.008 | 0.006 |
| 2.0 | 0.085 | 0.081 | 0.014 | 0.012 |
| 2.5 | 0.101 | 0.101 | 0.020 | 0.020 |
| 3.0 | 0.103 | 0.103 | 0.023 | 0.023 |
| 3.5 | 0.109 | 0.109 | 0.030 | 0.030 |
| 4.0 | 0.105 | 0.105 | 0.024 | 0.024 |
| 4.5 | 0.112 | 0.112 | 0.027 | 0.027 |
| 5.0 | 0.104 | 0.104 | 0.024 | 0.024 |
m = 2, n = 5, and μ = 0.5; individual tests were performed at α = 0.05 or α = 0.01 and each estimate was based on 10000 runs.
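Monte-Carlo rejection frequencies of this kind can be sketched for the intersection rule alone. The following is a simplified illustration, not the authors' simulation: it assumes n is the per-group sample size, uses a two-sided z-test with known unit variance rather than a t-test, treats the two tests as independent, and interprets each μ as the mean shift of the treatment group. RIUT is not implemented because this section does not define it. The function name and the seed are hypothetical, and the output is not expected to reproduce the tabulated values.

```python
import random
from statistics import NormalDist


def simulate_biut(mu1, mu2, n=5, alpha=0.05, runs=10000, seed=0):
    """Estimate how often both individual tests reject at level alpha."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    se = (2.0 / n) ** 0.5  # standard error of a difference of two means, unit variance
    rejections = 0
    for _ in range(runs):
        all_significant = True
        for mu in (mu1, mu2):
            # Treatment group ~ N(mu, 1), control group ~ N(0, 1), n samples each.
            treat = sum(rng.gauss(mu, 1.0) for _ in range(n)) / n
            ctrl = sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n
            z = (treat - ctrl) / se
            p = 2.0 * (1.0 - std_normal.cdf(abs(z)))  # two-sided p-value
            if p > alpha:
                all_significant = False
        rejections += all_significant
    return rejections / runs


rate_both_null = simulate_biut(0.0, 0.0)  # both tests null
rate_one_null = simulate_biut(0.0, 5.0)   # first test null, second strongly non-null
```

Under these assumptions the rejection rate is close to α² when both tests are null and approaches α when only one is null, which is the conservativeness of the plain intersection that the abstract refers to.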
Simulation results of pooled instances
| | | | RIUT | | BIUT | |
| 0.1 | 9899 | 101 | 482(0.049) | 12(0.119) | 24(0.002) | 1(0.010) |
| 0.2 | 9607 | 393 | 447(0.047) | 39(0.099) | 32(0.003) | 6(0.015) |
| 0.3 | 9087 | 913 | 394(0.043) | 98(0.107) | 29(0.003) | 8(0.009) |
| 0.4 | 8427 | 1573 | 363(0.043) | 126(0.080) | 33(0.004) | 21(0.013) |
| 0.5 | 7560 | 2440 | 310(0.041) | 202(0.083) | 30(0.004) | 28(0.011) |
| 0.6 | 6470 | 3530 | 270(0.042) | 264(0.075) | 22(0.003) | 35(0.010) |
| 0.7 | 5023 | 4977 | 209(0.042) | 317(0.064) | 23(0.005) | 59(0.012) |
| 0.8 | 3534 | 6466 | 119(0.034) | 410(0.063) | 10(0.003) | 83(0.013) |
| 0.9 | 1893 | 8107 | 75(0.040) | 483(0.060) | 10(0.005) | 84(0.010) |
m = 2, n = 5, and μ1 = μ2 = 0.5; individual tests were performed at α = 0.05 and each estimate was based on 10000 runs.
Number of differentially expressed genes in all drug treatment groups
| Platforms | IUT A | | IUT B | | IUT C | |
| | RIUT | BIUT | RIUT | BIUT | RIUT | BIUT |
| Applied Biosystems | 1160 | 1011 | 940 | 763 | ||
| Agilent | 362 | 262 | 525 | 413 | 452 | 159 |
| GE Healthcare | 697 | 528 | 862 | 723 | 639 | 415 |
| Affymetrix (Site 1) | 359 | 251 | 502 | 382 | ||
| Affymetrix (Site 2) | 524 | 375 | 724 | 556 | 521 | 289 |
α = 0.05 (Nominal # false positive < 230)
IUT A: t1: L_AA vs. L_CTL and t2: L_CFY vs. L_CTL
IUT B: t1: L_RDL vs. L_CTL and t2: L_CFY vs. L_CTL
IUT C: t1: L_AA vs. L_CTL and t2: L_RDL vs. L_CTL
Figure 2. The joint p-value distribution from . The IUT consists of two individual two-sample t tests (t1: 2.0 μM Cd vs. control at 3 h, t2: 2.0 μM Cd vs. control at 12 h).
KEGG pathways identified and the enrichment p-values
| Metabolism of xenobiotics by cytochrome P450 | 0.004 |
| MAPK signaling pathway | 0.028 |
| Porphyrin and chlorophyll metabolism | 0.030 |
(Fisher's Exact Test) based on the gene sets derived from RIUT and BIUT
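A pathway enrichment p-value from Fisher's exact test can be computed as a hypergeometric tail probability. The sketch below assumes the one-sided (over-representation) form of the test; whether the authors used a one- or two-sided test is not stated here, and the function and parameter names are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """One-sided Fisher's exact test: P(X >= overlap) under the hypergeometric null.

    total_genes    -- number of genes in the background
    pathway_genes  -- background genes annotated to the pathway
    selected_genes -- size of the gene list being tested
    overlap        -- selected genes that are annotated to the pathway
    """
    max_overlap = min(pathway_genes, selected_genes)
    denom = comb(total_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denom


# Toy example: 10 genes, 4 in the pathway, 3 selected, all 3 in the pathway.
p_toy = enrichment_p_value(10, 4, 3, 3)  # equals 4 / 120
```

A small p-value means the gene list contains more pathway members than random selection of the same number of genes would typically produce.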
Figure 3. Partition of the two-dimensional outcome space defined by . The two shaded areas are the true non-null areas for the two individual tests, respectively. The regions rejected by both tests are highlighted as bold rectangles at the four corners of the outcome space. Only the lower-left corner is the non-null region for both tests and therefore should be rejected by the IUT. The other three corners are type 1 error regions for the IUT because at least one individual test is making a type 1 error.