Abstract
Genome-wide association studies (GWAS) are routinely conducted for both quantitative and binary (disease) traits. We present two analytical tools for use in the experimental design of GWAS. First, we present power calculations that quantify power in a unified framework for a range of scenarios. In this context, we consider the utility of quantitative scores (e.g. endophenotypes) that may be available on cases only or on both cases and controls. Second, we consider the accuracy of prediction of genetic risk from genome-wide SNPs and derive an expression for genomic prediction accuracy using a liability threshold model for disease traits in a case-control design. The expected values based on our derived equations for both power and prediction accuracy agree well with observed estimates from simulations.
Year: 2013 PMID: 23977056 PMCID: PMC3747270 DOI: 10.1371/journal.pone.0071494
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Expected power for an association study from the derived equations and observed averaged power from simulation.
| h2 | BT_POP Exp | BT_POP Obs (SE) | BT_CC^a Exp | BT_CC^a Obs (SE) | QT_POP Exp | QT_POP Obs (SE) | QT_CC^a Exp | QT_CC^a Obs (SE) |
|---|---|---|---|---|---|---|---|---|
| N = 2000, | | | | | | | | |
| 0.0001 | 0.058 | 0.053 (0.002) | 0.072 | 0.072 (0.003) | 0.073 | 0.075 (0.003) | 0.082 | 0.083 (0.003) |
| 0.0005 | 0.090 | 0.086 (0.003) | 0.164 | 0.163 (0.004) | 0.170 | 0.172 (0.004) | 0.218 | 0.221 (0.004) |
| 0.001 | 0.131 | 0.130 (0.003) | 0.281 | 0.286 (0.005) | 0.293 | 0.294 (0.005) | 0.386 | 0.386 (0.005) |
| N = 2000, | | | | | | | | |
| 0.0001 | 0.052 | 0.057 (0.002) | 0.092 | 0.092 (0.003) | 0.073 | 0.075 (0.003) | 0.105 | 0.102 (0.003) |
| 0.0005 | 0.058 | 0.057 (0.002) | 0.270 | 0.267 (0.004) | 0.170 | 0.169 (0.004) | 0.333 | 0.329 (0.005) |
| 0.001 | 0.067 | 0.066 (0.002) | 0.478 | 0.474 (0.005) | 0.293 | 0.295 (0.005) | 0.579 | 0.574 (0.005) |
| N = 2000, | | | | | | | | |
| 0.0001 | 0.050 | 0.042 (0.002) | 0.117 | 0.117 (0.003) | 0.073 | 0.075 (0.003) | 0.130 | 0.132 (0.003) |
| 0.0005 | 0.051 | 0.052 (0.002) | 0.392 | 0.387 (0.005) | 0.170 | 0.176 (0.004) | 0.451 | 0.451 (0.005) |
| 0.001 | 0.053 | 0.052 (0.002) | 0.664 | 0.657 (0.005) | 0.293 | 0.296 (0.005) | 0.738 | 0.733 (0.004) |
h2: variance explained by the locus.
a: in case-control samples, 50% of the sample are cases (P = 0.5).
Exp: Expected power based on the NCP derived from equations (1) to (4).
Obs: Averaged power over 10000 replicates of simulation.
SE: Empirical standard error over 10000 replicates.
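As a rough illustration of how an expected power follows from a non-centrality parameter (NCP), the sketch below evaluates the power of a 1-df chi-square test for the QT_POP design. Two things here are assumptions and are not stated in this record: the approximation NCP ≈ N × h2 for a quantitative trait in a population sample, and a significance level of 0.05. The function names are hypothetical. Under those assumptions the output agrees with the QT_POP Exp column to three decimals, but the equations (1) to (4) themselves are not reproduced in this record.

```python
from math import sqrt
from statistics import NormalDist


def power_from_ncp(ncp: float, alpha: float = 0.05) -> float:
    """Power of a 1-df chi-square test given its non-centrality parameter."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    shift = sqrt(ncp)  # mean of the z statistic under the alternative
    # A 1-df non-central chi-square variable is (Z + shift)^2 with Z ~ N(0, 1),
    # so the test rejects when |Z + shift| exceeds z_crit.
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


def power_qt_pop(n: int, h2: float, alpha: float = 0.05) -> float:
    """ASSUMPTION: NCP is approximated by N * h2 (quantitative trait, population sample)."""
    return power_from_ncp(n * h2, alpha)


for h2 in (0.0001, 0.0005, 0.001):
    print(f"h2 = {h2}: power = {power_qt_pop(2000, h2):.3f}")
```

With N = 2000 the three printed values round to 0.073, 0.170 and 0.293. Passing a different `alpha` (for example a genome-wide threshold) reuses the same calculation.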
Figure 1. Power derived for QT_POP (dotted line), BT_POP (solid line), BT_CC (dashed line) and QT_CC (dot-dashed line) when using population prevalence K = 0.1 (a), K = 0.01 (b) or K = 0.001 (c), assuming the same total sample size N = 2000 and a critical significance threshold of 5×10⁻⁸.
Prediction accuracy for a disease trait in population or case-control samples at different disease prevalences, when the true proportion of variance explained by the set of SNPs on the liability scale is 0.5 and τ = N/M = 1.
| Prevalence | Population Exp1 | Population Est (se) | Case-Control Exp2 | Case-Control Exp3 | Case-Control Est (se) |
|---|---|---|---|---|---|
| 0.001 | 0.075 | 0.063 (0.004) | 0.628 | 0.766 | 0.767 (0.002) |
| 0.01 | 0.186 | 0.183 (0.003) | 0.594 | 0.689 | 0.690 (0.002) |
| 0.1 | 0.382 | 0.377 (0.003) | 0.533 | 0.568 | 0.570 (0.002) |
| 0.2 | 0.444 | 0.438 (0.003) | 0.511 | 0.526 | 0.529 (0.003) |
| 0.5 | 0.491 | 0.487 (0.003) | 0.491 | 0.491 | 0.487 (0.003) |
Exp1: Expected value from equation (2) or equation (6) of Daetwyler et al. (2008).
Exp2: Expected value from equation (9) of Daetwyler et al. (2008).
Exp3: Expected value from equation (3).
Est: Average of estimates from 100 replicates.
se: Empirical standard error over 100 replicates.
Proportion of cases in case-control study is P = 0.5.
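The population-sample expectation (Exp1) can be sketched numerically. The form used below is an assumption, since the equations are not reproduced in this record: the liability-scale variance explained is converted to the observed 0/1 scale with the standard transformation h2_obs = h2_liab × z² / (K(1 − K)), where z is the standard normal density at the liability threshold for prevalence K, and the accuracy is then taken as sqrt(τ × h2_obs / (τ × h2_obs + 1)) with τ = N/M. The function names are hypothetical. Under these assumptions the values agree with the Exp1 column to three decimals.

```python
from math import sqrt
from statistics import NormalDist


def observed_scale_h2(h2_liability: float, prevalence: float) -> float:
    """Convert variance explained from the liability scale to the observed 0/1 scale."""
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1 - prevalence)  # liability threshold for prevalence K
    z = std_normal.pdf(threshold)  # normal density (height) at the threshold
    return h2_liability * z ** 2 / (prevalence * (1 - prevalence))


def expected_accuracy_population(h2_liability: float, prevalence: float, tau: float) -> float:
    """ASSUMPTION: accuracy = sqrt(tau * h2_obs / (tau * h2_obs + 1)), with tau = N / M."""
    h2_obs = observed_scale_h2(h2_liability, prevalence)
    return sqrt(tau * h2_obs / (tau * h2_obs + 1))


for k in (0.001, 0.01, 0.1, 0.2, 0.5):
    acc = expected_accuracy_population(h2_liability=0.5, prevalence=k, tau=1.0)
    print(f"K = {k}: expected accuracy = {acc:.3f}")
```

The sketch covers only the population design. The case-control expectations (Exp2 and Exp3) depend on equations that are not shown here, so they are left out rather than guessed.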
Prediction accuracy for a disease trait in population or case-control samples for different values of h2 (the proportion of variance explained by the set of SNPs on the liability scale), when prevalence is 0.01 and τ = N/M = 1.
| h2 | Population Exp1 | Population Est (se) | Case-Control Exp2 | Case-Control Exp3 | Case-Control Est (se) |
|---|---|---|---|---|---|
| 0.1 | 0.084 | 0.087 (0.004) | 0.371 | 0.392 | 0.395 (0.003) |
| 0.5 | 0.186 | 0.183 (0.003) | 0.594 | 0.689 | 0.690 (0.002) |
| 0.9 | 0.246 | 0.243 (0.003) | 0.653 | 0.787 | 0.787 (0.001) |
Exp1: Expected value from equation (2) or equation (6) of Daetwyler et al. (2008).
Exp2: Expected value from equation (9) of Daetwyler et al. (2008).
Exp3: Expected value from equation (3).
Est: Average of estimates from 100 replicates.
se: Empirical standard error over 100 replicates.
Proportion of cases in case-control study is P = 0.5.
Prediction accuracy for a disease trait in population or case-control samples for different values of τ = N/M, when the true proportion of variance explained by the set of SNPs on the liability scale is 0.5 and prevalence is 0.01.
| τ = N/M | Population Exp1 | Population Est (se) | Case-Control Exp2 | Case-Control Exp3 | Case-Control Est (se) |
|---|---|---|---|---|---|
| 0.02 | 0.027 | 0.028 (0.003) | 0.104 | 0.133 | 0.124 (0.004) |
| 1 | 0.186 | 0.183 (0.003) | 0.594 | 0.689 | 0.690 (0.002) |
| 5 | 0.390 | 0.389 (0.004) | 0.731 | 0.905 | 0.905 (0.001) |
Exp1: Expected value from equation (2) or equation (6) of Daetwyler et al. (2008).
Exp2: Expected value from equation (9) of Daetwyler et al. (2008).
Exp3: Expected value from equation (3).
Est: Average of estimates from 100 replicates.
se: Empirical standard error over 100 replicates.
Proportion of cases in case-control study is P = 0.5.