Jacob P L Brand, Lang Chen, Xiangqin Cui, Alfred A Bartolucci, Grier P Page, Kyoungmi Kim, Stephen Barnes, Vinodh Srinivasasainagendra, Mark T Beasley, David B Allison.
Abstract
The adaptive alpha-spending algorithm incorporates additional contextual evidence (including correlations among genes) about differential expression to adjust the initial p-values, yielding the alpha-spending adjusted p-values. The algorithm is so named because of its similarity to the alpha-spending approach used in interim analysis of clinical trials, in which stage-specific significance levels are assigned to each stage of the trial. We show that the Bonferroni correction applied to the alpha-spending adjusted p-values approximately controls the family-wise error rate (FWER) under the complete null hypothesis. Using simulations, we also show that the alpha-spending algorithm yields increased power over the unadjusted p-values while controlling the false discovery rate (FDR). The benefits of the alpha-spending algorithm grew with increasing sample size and with increasing correlation among genes. Use of the alpha-spending algorithm will result in microarray experiments that make more efficient use of their data and may help conserve resources.
Year: 2007 PMID: 17597927 PMCID: PMC1896052 DOI: 10.6026/97320630001384
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Observed per-comparison error rate (PCER) for the alpha-spending post-processed p-values, estimated for correlated genes, uncorrelated genes, and all genes under the complete null hypothesis that no gene is differentially expressed. Each simulation contained 700 genes, and nominal alpha levels of 0.01, 0.05, and 0.1 were used for identifying differential genes. For each simulation parameter setting (ρ, n), the observed PCER was estimated from 100 simulated data sets.
| ρ | n | Correlated genes | | | Uncorrelated genes | | | All genes | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | | α = 0.01 | α = 0.05 | α = 0.1 | α = 0.01 | α = 0.05 | α = 0.1 | α = 0.01 | α = 0.05 | α = 0.1 |
| 0.3 | 4 | 0.0092 | 0.0506 | 0.1044 | 0.0099 | 0.0483 | 0.0966 | 0.0098 | 0.0487 | 0.0982 |
| 0.3 | 6 | 0.0136 | 0.0689 | 0.1362 | 0.0095 | 0.0466 | 0.0938 | 0.0103 | 0.0510 | 0.1023 |
| 0.3 | 10 | 0.0117 | 0.0660 | 0.1316 | 0.0098 | 0.0463 | 0.0928 | 0.0102 | 0.0502 | 0.1006 |
| 0.5 | 4 | 0.0111 | 0.0663 | 0.1333 | 0.0091 | 0.0466 | 0.0932 | 0.0095 | 0.0505 | 0.1012 |
| 0.5 | 6 | 0.0175 | 0.0864 | 0.1664 | 0.0085 | 0.0421 | 0.0849 | 0.0103 | 0.0510 | 0.1012 |
| 0.5 | 10 | 0.0238 | 0.1006 | 0.1849 | 0.0081 | 0.0437 | 0.0875 | 0.0112 | 0.0551 | 0.1070 |
| 0.7 | 4 | 0.0326 | 0.1078 | 0.1908 | 0.0088 | 0.0450 | 0.0897 | 0.0136 | 0.0575 | 0.1099 |
| 0.7 | 6 | 0.0126 | 0.0794 | 0.1723 | 0.0088 | 0.0433 | 0.0864 | 0.0096 | 0.0505 | 0.1036 |
| 0.7 | 10 | 0.0353 | 0.1265 | 0.2249 | 0.0079 | 0.0389 | 0.0813 | 0.0134 | 0.0564 | 0.1101 |
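
The caption says that each observed PCER was estimated from 100 simulated data sets at a nominal alpha level. A minimal sketch of one way such an estimate can be computed is given below. It assumes the estimate is the average, over data sets, of the fraction of genes with a p-value at or below alpha (under the complete null every such gene is a false positive). The function name `observed_pcer` and the uniform toy p-values are illustrative assumptions; the toy p-values stand in for null p-values in general and are not the alpha-spending adjusted p-values.

```python
import random
from typing import Sequence


def observed_pcer(pvalue_sets: Sequence[Sequence[float]], alpha: float) -> float:
    """Average, over simulated data sets, of the fraction of genes with p <= alpha.

    Assumes every gene is truly null, so each rejection is a false positive.
    """
    if not pvalue_sets:
        raise ValueError("need at least one simulated data set")
    fractions = []
    for pvals in pvalue_sets:
        rejected = sum(1 for p in pvals if p <= alpha)
        fractions.append(rejected / len(pvals))
    return sum(fractions) / len(fractions)


# Toy stand-in (assumption): 100 data sets of 700 independent null p-values,
# drawn as Uniform(0, 1). Real input would be the post-processed p-values.
rng = random.Random(0)
sims = [[rng.random() for _ in range(700)] for _ in range(100)]

for alpha in (0.01, 0.05, 0.1):
    print(f"alpha={alpha}: observed PCER = {observed_pcer(sims, alpha):.4f}")
```

For independent, well-calibrated null p-values the estimate should sit close to the nominal alpha, which is the pattern seen in the "Uncorrelated genes" columns; values well above alpha, as in several "Correlated genes" cells, indicate inflation.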
Figure 1. Observed PCER and observed FDR of the alpha-spending algorithm as a function of the power of the ordinary t-test, for correlations ρ = 0.3, 0.5, 0.7 and group sizes n = 4, 6, 10, with k = 700 genes in each simulation. A nominal alpha level of 0.05 was used for identifying differential genes. A thin dashed black line, a solid blue line, and a thick red line refer to a correlation ρ of 0.3, 0.5, and 0.7, respectively. The group sizes of 4, 6, and 10 are represented by circles, squares, and triangles, respectively. For each simulation parameter setting (ρ, n), the observed PCER was estimated from 100 simulated data sets.
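
The observed FDR plotted in Figure 1 requires knowing which genes are truly null in each simulated data set. The sketch below shows one common way to estimate it: the average over data sets of the proportion of rejected genes that are truly null, with that proportion taken as 0 when nothing is rejected. This convention, the function name `observed_fdr`, and the toy inputs are assumptions for illustration; the source does not spell out the estimator.

```python
from typing import Sequence


def observed_fdr(
    pvalue_sets: Sequence[Sequence[float]],
    is_null: Sequence[bool],
    alpha: float,
) -> float:
    """Average over data sets of (false rejections / total rejections).

    is_null[i] is True when gene i is truly non-differentially expressed.
    A data set with no rejections contributes a proportion of 0.
    """
    proportions = []
    for pvals in pvalue_sets:
        rejected = [i for i, p in enumerate(pvals) if p <= alpha]
        false_pos = sum(1 for i in rejected if is_null[i])
        proportions.append(false_pos / max(len(rejected), 1))
    return sum(proportions) / len(proportions)


# Toy input (assumption): 4 genes, the first two truly differential.
truth = [False, False, True, True]
toy_sets = [
    [0.01, 0.20, 0.03, 0.90],  # rejects genes 0 and 2 -> 1 of 2 is false
    [0.50, 0.60, 0.70, 0.80],  # rejects nothing -> contributes 0
]
print(observed_fdr(toy_sets, truth, alpha=0.05))
```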
Figure 2. Power improvement of alpha-spending p-values with respect to the ordinary t-test. The results are from the partial null hypothesis simulations, in which 20% of the genes are differentially expressed and correlated with the same correlation coefficient ρ, and 80% of the genes are non-differentially expressed and uncorrelated. For k = 700, the 700 = 7 × 100 simulated data sets per plot were obtained by independently generating 100 data sets for each of seven values of the population mean differential expression Δ. These seven values, Δ = Δ(1 − β), were chosen so that the power of the ordinary t-test for detecting the differentially expressed genes took the values 1 − β = 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8. For k = 2000, the 30 simulated data sets correspond to 1 − β = 0.5 only. The case k = 2000 was simulated for n = 4, 6 but not for n = 10.
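
To make the mapping Δ = Δ(1 − β) concrete, the sketch below computes a rough effect size for each target power using the large-sample normal approximation for a two-sided, two-group comparison with equal group sizes. This is an illustration only. The source does not state how Δ was computed, and an exact calculation for the t-test would use the noncentral t distribution, which this sketch does not reproduce. The function name `approx_delta`, the unit standard deviation, and the alpha of 0.05 are all assumptions. For group sizes as small as n = 4 the approximation tends to understate the required Δ.

```python
from math import sqrt
from statistics import NormalDist


def approx_delta(power: float, n_per_group: int,
                 alpha: float = 0.05, sigma: float = 1.0) -> float:
    """Normal-approximation mean difference giving the target power.

    Uses delta = (z_{1-alpha/2} + z_{power}) * sigma * sqrt(2 / n),
    ignoring the negligible rejection probability in the opposite tail.
    """
    z = NormalDist()  # standard normal
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return (z_alpha + z_power) * sigma * sqrt(2.0 / n_per_group)


# The seven target powers listed in the caption.
POWERS = (0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8)
DATASETS_PER_POWER = 100

# Approximate effect sizes for one group size (n = 6 chosen arbitrarily).
deltas = {p: approx_delta(p, n_per_group=6) for p in POWERS}
n_datasets = len(POWERS) * DATASETS_PER_POWER  # 7 x 100 data sets per plot

for p, d in deltas.items():
    print(f"power={p:.1f}  approx delta={d:.3f}")
print("data sets per plot:", n_datasets)
```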