Abstract
Advances in high-throughput biology and computer science are driving an exponential increase in the number of hypothesis tests in genomics and other scientific disciplines. Studies using current genotyping platforms frequently include a million or more tests. In addition to the monetary cost, this increase imposes a statistical cost owing to the multiple testing corrections needed to avoid large numbers of false-positive results. To safeguard against the resulting loss of power, some have suggested sample sizes on the order of tens of thousands, which can be impractical for many diseases or may lower the quality of phenotypic measurements. This study examines the relationship between the number of tests on the one hand and power, detectable effect size or required sample size on the other. We show that once the number of tests is large, power can be maintained at a constant level with comparatively small increases in the effect size or sample size. For example, at the 0.05 significance level, a 13% increase in sample size is needed to maintain 80% power for ten million tests compared with one million tests, whereas a 70% increase in sample size is needed for 10 tests compared with a single test. Relative costs are less when measured by increases in the detectable effect size. We provide an interactive Excel calculator to compute power, effect size or sample size when comparing study designs or genome platforms involving different numbers of hypothesis tests. The results are reassuring in an era of extreme multiple testing.
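The two sample size figures quoted in the abstract can be approximated with a few lines of Python. This is a minimal sketch, not the authors' Excel calculator: it assumes a two-sided z-test with a Bonferroni correction (an assumption that happens to reproduce the critical values listed in the table), it ignores the negligible chance of rejecting in the wrong tail, and the function names are illustrative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def critical_value(n_tests, alpha=0.05):
    """Two-sided critical value, assuming a Bonferroni correction."""
    return -_STD_NORMAL.inv_cdf(alpha / (2 * n_tests))


def sample_size_multiplier(n_tests, power=0.80, alpha=0.05, baseline_tests=1):
    """Factor by which the sample size must grow to keep `power`
    when moving from `baseline_tests` tests to `n_tests` tests."""
    z_power = _STD_NORMAL.inv_cdf(power)
    ratio = (critical_value(n_tests, alpha) + z_power) / (
        critical_value(baseline_tests, alpha) + z_power
    )
    # The standard error scales with 1/sqrt(n), so the sample size
    # scales with the square of the required effect in SE units.
    return ratio ** 2


print(round(sample_size_multiplier(10), 2))  # roughly 1.70
print(round(sample_size_multiplier(10_000_000, baseline_tests=1_000_000), 2))  # roughly 1.13
```

Under these assumptions, going from 1 to 10 tests costs about 70% more samples, while going from one million to ten million tests costs about 13% more, in line with the abstract.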
Year: 2010 PMID: 21060308 PMCID: PMC3252610 DOI: 10.1038/mp.2010.117
Source DB: PubMed Journal: Mol Psychiatry ISSN: 1359-4184 Impact factor: 15.992
Figure 1. Distribution of B under the null and alternative hypotheses illustrating the power analysis for a single test (a) and two tests (b–d). (a) A single 0.05 significance test uses critical value 1.96 (circle on horizontal line). It has 80% power (area outlined in bold) to detect an effect of size b=2.80. (b) For two tests, the critical value is increased by Δ=0.28 standard errors to 2.24 (triangle on horizontal line). Distance, D, from the mean of the alternative distribution, is reduced by Δ and the power (outlined in bold) is reduced to 71% at the original effect size. (c) To accommodate the increased critical value, the alternative curve can also be shifted to the right by Δ standard errors to maintain 80% power at a larger effect size b=1.96+0.28+0.84=3.08. (d) Finally, the sample size can be increased to make the densities narrower and taller. This reduces the overlap between the null and the alternative, maintaining 80% power at original effect size 2.80.
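The arithmetic in the Figure 1 caption can be retraced step by step. The sketch below assumes that the two-test critical value comes from a Bonferroni-corrected two-sided z-test and that power is the upper-tail area beyond the critical value; the variable names are illustrative, and the final line uses the standard relation that the standard error shrinks with the square root of the sample size.

```python
from statistics import NormalDist

std_normal = NormalDist()
alpha, target_power = 0.05, 0.80

# Panel (a): one test.
crit_one = -std_normal.inv_cdf(alpha / 2)        # about 1.96
z_power = std_normal.inv_cdf(target_power)       # about 0.84
effect_one = crit_one + z_power                  # about 2.80 standard errors

# Panel (b): two tests, Bonferroni correction assumed.
crit_two = -std_normal.inv_cdf(alpha / (2 * 2))  # about 2.24
delta = crit_two - crit_one                      # about 0.28
power_two = 1 - std_normal.cdf(crit_two - effect_one)  # about 0.71

# Panel (c): shift the alternative by delta to restore 80% power.
effect_two = effect_one + delta                  # about 3.08

# Panel (d): alternatively keep the raw effect and enlarge the sample
# so that the effect measured in standard errors grows to effect_two.
n_multiplier = (effect_two / effect_one) ** 2

print(f"delta={delta:.2f} power={power_two:.2f} effect={effect_two:.2f}")
```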
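Cost of multiple tests for 80 and 90% power and α=0.05
| Number of tests | Critical value | Power (%) at original effect and sample size (80% single-test power) | Effect size multiplier (80% power) | Sample size multiplier (80% power) | Power (%) at original effect and sample size (90% single-test power) | Effect size multiplier (90% power) | Sample size multiplier (90% power) |
| --- | --- | --- | --- | --- | --- | --- | --- |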
| 1 | 1.96 | 80.0 | 1.00 | 1.000 | 90.0 | 1.00 | 1.000 |
| 5 | 2.58 | 58.8 | 1.22 | 1.488 | 74.6 | 1.19 | 1.416 |
| 10 | 2.81 | 49.7 | 1.30 | 1.690 | 66.7 | 1.26 | 1.588 |
| 50 | 3.29 | 31.3 | 1.47 | 2.161 | 48.1 | 1.41 | 1.988 |
| 100 | 3.48 | 24.9 | 1.54 | 2.372 | 40.6 | 1.47 | 2.161 |
| 500 | 3.89 | 13.8 | 1.69 | 2.856 | 25.8 | 1.60 | 2.560 |
| 1K | 4.06 | 10.4 | 1.75 | 3.062 | 20.7 | 1.65 | 2.722 |
| 5K | 4.42 | 5.3 | 1.88 | 3.534 | 11.9 | 1.76 | 3.098 |
| 10K | 4.56 | 3.9 | 1.93 | 3.725 | 9.4 | 1.80 | 3.240 |
| 50K | 4.89 | 1.8 | 2.05 | 4.202 | 5.0 | 1.90 | 3.610 |
| 100K | 5.03 | 1.3 | 2.10 | 4.410 | 3.7 | 1.95 | 3.802 |
| 300K | 5.23 | 0.8 | 2.17 | 4.709 | 2.3 | 2.01 | 4.040 |
| 500K | 5.33 | 0.6 | 2.20 | 4.840 | 1.8 | 2.04 | 4.162 |
| 560K | 5.35 | 0.5 | 2.21 | 4.884 | 1.7 | 2.05 | 4.202 |
| 1M | 5.45 | 0.4 | 2.25 | 5.062 | 1.4 | 2.08 | 4.326 |
| 1.2M | 5.48 | 0.4 | 2.26 | 5.108 | 1.3 | 2.09 | 4.368 |
| 1.8M | 5.55 | 0.3 | 2.28 | 5.198 | 1.0 | 2.11 | 4.452 |
| 2.5M | 5.61 | 0.2 | 2.30 | 5.290 | 0.9 | 2.13 | 4.537 |
| 5M | 5.73 | 0.2 | 2.35 | 5.523 | 0.6 | 2.16 | 4.666 |
| 10M | 5.85 | 0.1 | 2.39 | 5.712 | 0.5 | 2.20 | 4.840 |
| 50M | 6.11 | 0.0 | 2.48 | 6.150 | 0.2 | 2.28 | 5.198 |
| 100M | 6.22 | 0.0 | 2.52 | 6.350 | 0.1 | 2.31 | 5.336 |
| 500M | 6.47 | 0.0 | 2.61 | 6.812 | 0.1 | 2.39 | 5.712 |
| 1B | 6.57 | 0.0 | 2.65 | 7.022 | 0.0 | 2.42 | 5.856 |
| 1T | 7.53 | 0.0 | 2.99 | 8.940 | 0.0 | 2.72 | 7.398 |
Abbreviations: B, billion; K, thousand; M, million; T, trillion.
Existing or proposed GWAS genotyping platform.
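A single row of the table can be approximated as follows. This is a sketch under the same assumptions that reproduce the listed critical values (two-sided z-test, Bonferroni correction, upper-tail power only); `table_row` is a hypothetical helper, not part of the authors' calculator. The printed sample size multipliers appear to equal the squares of the rounded effect size multipliers (for example 1.30² = 1.690), so values computed at full precision can differ in the second or third decimal place.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def table_row(n_tests, target_power=0.80, alpha=0.05):
    """Return (critical value, power % at the original design,
    effect size multiplier, sample size multiplier) for n_tests."""
    z_power = STD_NORMAL.inv_cdf(target_power)
    crit_single = -STD_NORMAL.inv_cdf(alpha / 2)
    crit_multi = -STD_NORMAL.inv_cdf(alpha / (2 * n_tests))  # Bonferroni assumed

    # Effect (in standard errors) giving target_power for one test.
    effect_single = crit_single + z_power

    # Power left if the effect and sample size are not changed.
    power_at_original = 1 - STD_NORMAL.cdf(crit_multi - effect_single)

    effect_multiplier = (crit_multi + z_power) / effect_single
    sample_multiplier = effect_multiplier ** 2
    return crit_multi, 100 * power_at_original, effect_multiplier, sample_multiplier


for power in (0.80, 0.90):
    crit, pct, eff, n = table_row(1_000_000, target_power=power)
    print(f"{power:.0%}: crit={crit:.2f} power={pct:.1f}% effect x{eff:.2f} n x{n:.2f}")
```

For one million tests this gives a critical value near 5.45 and multipliers close to the 1M row for both the 80% and 90% columns.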
Figure 2. Power (a), effect size multiplier (b) and sample size multiplier (c) as a function of the number of tests on the log scale up to ten million tests, where power for a single test is 50, 80, 90 or 99%. The effect size multiplier is the number by which the effect size for a single test must be multiplied to maintain the same power for the same sample size at the specified number of tests. The sample size multiplier is the number by which the sample size for a single test must be multiplied to maintain the same power for the same effect size at the specified number of tests and is nearly linear with respect to the log of the number of tests. (d) Critical value, effect size multiplier and sample size multiplier for 80% power, with the number of tests on the raw (unlogged) scale. As the number of tests increases, the rate of increase in all three decreases dramatically.
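The near-linearity of the sample size multiplier in the log of the number of tests can be checked numerically by looking at how much the multiplier grows per tenfold increase in tests. The sketch below assumes a Bonferroni-corrected two-sided z-test at α=0.05 and 80% single-test power; the helper name is illustrative.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def multiplier_vs_single_test(n_tests, power=0.80, alpha=0.05):
    """Sample size multiplier relative to one test (Bonferroni assumed)."""
    z_power = STD_NORMAL.inv_cdf(power)
    crit_single = -STD_NORMAL.inv_cdf(alpha / 2)
    crit_multi = -STD_NORMAL.inv_cdf(alpha / (2 * n_tests))
    return ((crit_multi + z_power) / (crit_single + z_power)) ** 2


# One thousand to ten million tests, one point per decade.
test_counts = [10 ** k for k in range(3, 8)]
multipliers = [multiplier_vs_single_test(m) for m in test_counts]

# Growth in the multiplier for each tenfold increase in tests.
per_decade = [b - a for a, b in zip(multipliers, multipliers[1:])]

for m, step in zip(test_counts[1:], per_decade):
    print(f"{m:>10,d} tests: +{step:.3f} per decade")
```

Under these assumptions each additional decade of tests adds a nearly constant amount to the multiplier, which is what a straight line on a log axis looks like; on the raw scale the same increments are spread over ever larger ranges of tests, hence the flattening described for panel (d).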