Amy Murphy, Scott T Weiss, Christoph Lange.
Abstract
For genome-wide association studies in family-based designs, we propose a powerful two-stage testing strategy that can be applied in situations in which parent-offspring trio data are available and all offspring are affected with the trait or disease under study. In the first step of the testing strategy, we construct estimators of genetic effect size in the completely ascertained sample of affected offspring and their parents that are statistically independent of the family-based association/transmission disequilibrium tests (FBATs/TDTs) that are calculated in the second step of the testing strategy. For each marker, the genetic effect is estimated (without requiring an estimate of the SNP allele frequency) and the conditional power of the corresponding FBAT/TDT is computed. Based on the power estimates, a weighted Bonferroni procedure assigns an individually adjusted significance level to each SNP. In the second stage, the SNPs are tested with the FBAT/TDT statistic at the individually adjusted significance levels. Using simulation studies for scenarios with up to 1,000,000 SNPs, varying allele frequencies and genetic effect sizes, the power of the strategy is compared with standard methodology (e.g., FBATs/TDTs with Bonferroni correction). In all considered situations, the proposed testing strategy demonstrates substantial power increases over the standard approach, even when the true genetic model is unknown and must be selected based on the conditional power estimates. The practical relevance of our methodology is illustrated by an application to a genome-wide association study for childhood asthma, in which we detect two markers meeting genome-wide significance that would not have been detected using standard methodology.
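As a point of reference for the second-stage test named in the abstract, the following is a minimal sketch of the classical TDT in its McNemar-type form, evaluated at a per-SNP significance level. The abstract does not spell out the statistic, so the formula used here is the textbook TDT rather than the authors' FBAT implementation. The function names, the transmission counts `b` and `c`, and the example level are hypothetical.

```python
import math


def tdt_statistic(b: int, c: int) -> float:
    """Classical TDT statistic (b - c)^2 / (b + c).

    b: times heterozygous parents transmitted the allele of interest
    c: times heterozygous parents did not transmit it
    """
    if b + c == 0:
        raise ValueError("no informative transmissions")
    return (b - c) ** 2 / (b + c)


def chi2_1df_pvalue(x: float) -> float:
    """Upper-tail p-value of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def tdt_test(b: int, c: int, alpha_snp: float):
    """Return (statistic, p-value, significant at the SNP's own level)."""
    stat = tdt_statistic(b, c)
    p_value = chi2_1df_pvalue(stat)
    return stat, p_value, p_value <= alpha_snp


# Hypothetical counts and a hypothetical individually adjusted level.
stat, p_value, significant = tdt_test(b=60, c=40, alpha_snp=0.005)
print(stat, p_value, significant)
```

The only difference from a standard analysis is the last argument: each SNP is compared against its own adjusted level instead of one common Bonferroni threshold.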
Year: 2008 PMID: 18802462 PMCID: PMC2529406 DOI: 10.1371/journal.pgen.1000197
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Power for 500–2000 trios and 500K markers, using mating type ratio equation R4, under an additive genetic model.
| Number of Trios | Odds Ratio | MAF | Independence scenario: Weighted | Independence scenario: Standard | LD scenario (DSL only): Weighted | LD scenario (DSL only): Standard | LD scenario (DSL+): Weighted | LD scenario (DSL+): Standard |
| | | | | | | | | |
| | 1.25 | 0.1 | 0.066 | 0.003 | 0.042 | 0.001 | 0.127 | 0.017 |
| | | 0.2 | 0.241 | 0.039 | 0.168 | 0.012 | 0.391 | 0.147 |
| | | 0.3 | 0.295 | 0.089 | 0.203 | 0.031 | 0.513 | 0.300 |
| | | 0.4 | 0.270 | 0.129 | 0.165 | 0.048 | 0.504 | 0.366 |
| | 1.375 | 0.1 | 0.226 | 0.078 | 0.154 | 0.033 | 0.371 | 0.195 |
| | | 0.2 | 0.591 | 0.388 | 0.454 | 0.212 | 0.800 | 0.665 |
| | | 0.3 | 0.744 | 0.591 | 0.590 | 0.397 | 0.921 | 0.857 |
| | | 0.4 | 0.764 | 0.666 | 0.591 | 0.465 | 0.930 | 0.893 |
| | 1.5 | 0.1 | 0.517 | 0.357 | 0.390 | 0.225 | 0.722 | 0.604 |
| | | 0.2 | 0.908 | 0.846 | 0.827 | 0.703 | 0.985 | 0.964 |
| | | 0.3 | 0.976 | 0.952 | 0.931 | 0.874 | 0.995 | 0.992 |
| | | 0.4 | 0.979 | 0.969 | 0.940 | 0.902 | 0.997 | 0.995 |
| | | | | | | | | |
| | 1.5 | 0.1 | 0.100 | 0.032 | 0.072 | 0.018 | 0.170 | 0.084 |
| | | 0.2 | 0.354 | 0.189 | 0.271 | 0.113 | 0.520 | 0.352 |
| | | 0.3 | 0.470 | 0.336 | 0.360 | 0.220 | 0.667 | 0.555 |
| | | 0.4 | 0.456 | 0.371 | 0.333 | 0.248 | 0.660 | 0.571 |
| | 1.75 | 0.1 | 0.438 | 0.324 | 0.345 | 0.236 | 0.581 | 0.488 |
| | | 0.2 | 0.859 | 0.777 | 0.770 | 0.658 | 0.940 | 0.901 |
| | | 0.3 | 0.932 | 0.896 | 0.881 | 0.819 | 0.976 | 0.960 |
| | | 0.4 | 0.936 | 0.904 | 0.881 | 0.839 | 0.976 | 0.964 |
| | 2 | 0.1 | 0.825 | 0.759 | 0.750 | 0.669 | 0.918 | 0.876 |
| | | 0.2 | 0.992 | 0.985 | 0.984 | 0.970 | 0.999 | 0.997 |
| | | 0.3 | 0.998 | 0.996 | 0.994 | 0.990 | 1.000 | 1.000 |
| | | 0.4 | 0.997 | 0.995 | 0.994 | 0.989 | 0.998 | 0.997 |
| | | | | | | | | |
| | 2 | 0.1 | 0.184 | 0.128 | 0.132 | 0.085 | 0.276 | 0.205 |
| | | 0.2 | 0.573 | 0.480 | 0.490 | 0.382 | 0.693 | 0.606 |
| | | 0.3 | 0.711 | 0.628 | 0.635 | 0.538 | 0.805 | 0.740 |
| | | 0.4 | 0.665 | 0.590 | 0.591 | 0.505 | 0.771 | 0.707 |
| | 2.25 | 0.1 | 0.447 | 0.350 | 0.367 | 0.278 | 0.551 | 0.473 |
| | | 0.2 | 0.849 | 0.787 | 0.797 | 0.720 | 0.916 | 0.878 |
| | | 0.3 | 0.905 | 0.868 | 0.869 | 0.811 | 0.954 | 0.928 |
| | | 0.4 | 0.894 | 0.856 | 0.849 | 0.805 | 0.934 | 0.900 |
| | 2.5 | 0.1 | 0.694 | 0.612 | 0.624 | 0.542 | 0.793 | 0.729 |
| | | 0.2 | 0.957 | 0.934 | 0.935 | 0.895 | 0.981 | 0.967 |
| | | 0.3 | 0.978 | 0.964 | 0.966 | 0.943 | 0.991 | 0.982 |
| | | 0.4 | 0.965 | 0.949 | 0.945 | 0.919 | 0.983 | 0.975 |
Estimated power levels to detect the disease susceptibility locus (DSL) using 500–2000 trios, assuming a 10% disease prevalence and additive mode of inheritance. The significance level is set to 5%. For the weighted Bonferroni method (Weighted), the partitioning parameters are K = 7 and r = 2. MAF denotes minor allele frequency. The power reflects the proportion of times the p-value of the DSL (Independence scenario and LD scenario (DSL only)) or a SNP in LD with the DSL (LD scenario (DSL+)) met the weighted Bonferroni (Weighted) or standard Bonferroni corrected (Standard) significance level. The standard Bonferroni correction adjusts for 500K comparisons.
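The caption gives the partitioning parameters K = 7 and r = 2 but not the allocation rule itself. The sketch below shows one geometric partitioning scheme that fits those two parameters: SNPs are ordered by decreasing conditional power, group j holds K·r^(j−1) SNPs, and group j shares α/2^j equally among its members. The halving of α across groups is an assumption made for illustration, as is the function name, so the resulting levels will not necessarily match the ones used in the source. What the sketch does guarantee is the defining property of any weighted Bonferroni procedure: the per-SNP levels sum to at most α.

```python
def weighted_bonferroni_levels(n_snps, alpha=0.05, K=7, r=2):
    """Per-SNP significance levels for SNPs sorted by decreasing power rank.

    Assumed scheme (not confirmed by the source): group j = 1, 2, ... holds
    K * r**(j - 1) SNPs and splits alpha / 2**j equally among them.
    """
    levels = []
    group_size, group_share = K, alpha / 2.0
    while len(levels) < n_snps:
        n_in_group = min(group_size, n_snps - len(levels))
        # Divide by the nominal group size so that a truncated final group
        # can never spend more than its share of alpha.
        levels.extend([group_share / group_size] * n_in_group)
        group_size *= r
        group_share /= 2.0
    return levels


levels = weighted_bonferroni_levels(500_000)
standard_level = 0.05 / 500_000  # standard Bonferroni level for comparison
print(levels[0], levels[7], standard_level)
print(sum(levels) <= 0.05)
```

Under this assumed scheme the top-ranked SNPs are tested at levels several orders of magnitude more lenient than the standard Bonferroni level, while low-ranked SNPs are tested more strictly.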
Power for 2000 trios and 500K markers, using mating type ratio equation R4, under an “unknown” genetic model.
| True Gen. Model | Odds Ratio | MAF | Independence scenario: Weighted | Independence scenario: Standard | LD scenario (DSL only): Weighted | LD scenario (DSL only): Standard | LD scenario (DSL+): Weighted | LD scenario (DSL+): Standard |
| Add. | 1.25 | 0.1 | 0.033 | 0.001 | 0.019 | 0.000 | 0.074 | 0.005 |
| | | 0.2 | 0.140 | 0.008 | 0.085 | 0.002 | 0.265 | 0.055 |
| | | 0.3 | 0.175 | 0.022 | 0.109 | 0.007 | 0.320 | 0.122 |
| | | 0.4 | 0.137 | 0.029 | 0.083 | 0.007 | 0.305 | 0.174 |
| | 1.375 | 0.1 | 0.140 | 0.026 | 0.098 | 0.008 | 0.256 | 0.092 |
| | | 0.2 | 0.414 | 0.171 | 0.316 | 0.085 | 0.623 | 0.430 |
| | | 0.3 | 0.537 | 0.332 | 0.373 | 0.166 | 0.777 | 0.644 |
| | | 0.4 | 0.532 | 0.404 | 0.376 | 0.241 | 0.793 | 0.711 |
| | 1.5 | 0.1 | 0.354 | 0.183 | 0.281 | 0.107 | 0.546 | 0.385 |
| | | 0.2 | 0.790 | 0.646 | 0.669 | 0.466 | 0.928 | 0.876 |
| | | 0.3 | 0.910 | 0.844 | 0.802 | 0.694 | 0.984 | 0.967 |
| | | 0.4 | 0.916 | 0.878 | 0.817 | 0.742 | 0.985 | 0.973 |
| Dom. | 1.5 | 0.1 | 0.207 | 0.099 | 0.135 | 0.053 | 0.360 | 0.230 |
| | | 0.2 | 0.376 | 0.257 | 0.271 | 0.154 | 0.597 | 0.490 |
| | | 0.3 | 0.306 | 0.218 | 0.200 | 0.129 | 0.522 | 0.443 |
| | | 0.4 | 0.145 | 0.104 | 0.072 | 0.046 | 0.263 | 0.204 |
| | 1.75 | 0.1 | 0.760 | 0.690 | 0.642 | 0.548 | 0.896 | 0.856 |
| | | 0.2 | 0.937 | 0.910 | 0.862 | 0.808 | 0.988 | 0.979 |
| | | 0.3 | 0.906 | 0.868 | 0.821 | 0.758 | 0.967 | 0.951 |
| | | 0.4 | 0.693 | 0.624 | 0.577 | 0.503 | 0.830 | 0.784 |
| | 2 | 0.1 | 0.989 | 0.984 | 0.970 | 0.959 | 1.000 | 0.999 |
| | | 0.2 | 1.000 | 0.999 | 0.999 | 0.999 | 1.000 | 1.000 |
| | | 0.3 | 0.997 | 0.995 | 0.993 | 0.992 | 0.999 | 0.998 |
| | | 0.4 | 0.965 | 0.950 | 0.935 | 0.911 | 0.987 | 0.982 |
| Rec. | 2 | 0.1 | 0.002 | 0.000 | 0.002 | 0.000 | 0.002 | 0.000 |
| | | 0.2 | 0.011 | 0.005 | 0.008 | 0.003 | 0.019 | 0.007 |
| | | 0.3 | 0.217 | 0.165 | 0.147 | 0.104 | 0.335 | 0.267 |
| | | 0.4 | 0.767 | 0.723 | 0.657 | 0.598 | 0.887 | 0.867 |
| | 2.25 | 0.1 | 0.003 | 0.000 | 0.002 | 0.000 | 0.006 | 0.000 |
| | | 0.2 | 0.039 | 0.014 | 0.029 | 0.010 | 0.057 | 0.029 |
| | | 0.3 | 0.562 | 0.463 | 0.450 | 0.373 | 0.704 | 0.620 |
| | | 0.4 | 0.971 | 0.959 | 0.949 | 0.927 | 0.991 | 0.985 |
| | 2.5 | 0.1 | 0.005 | 0.000 | 0.004 | 0.000 | 0.007 | 0.000 |
| | | 0.2 | 0.103 | 0.053 | 0.068 | 0.036 | 0.155 | 0.087 |
| | | 0.3 | 0.850 | 0.784 | 0.783 | 0.709 | 0.926 | 0.884 |
| | | 0.4 | 0.997 | 0.995 | 0.995 | 0.991 | 1.000 | 1.000 |
Estimated power levels to detect the DSL using 2000 trios, assuming a 10% disease prevalence. The significance level is set to 5%. For the weighted Bonferroni method (Weighted), the partitioning parameters are K = 7 and r = 2. Under “True Gen. Model”, Add. refers to the scenario where the true (but “unknown”) model is additive; the results are analyzed using all three genetic models. Similar scenarios are provided for the dominant (Dom.) and recessive (Rec.) genetic models. MAF denotes minor allele frequency. The power reflects the proportion of times the p-value of the DSL (Independence scenario and LD scenario (DSL only)) or a SNP in LD with the DSL (LD scenario (DSL+)) met the weighted Bonferroni (Weighted) or standard Bonferroni corrected (Standard) significance level. The standard Bonferroni correction adjusts for 1.5M comparisons (500K markers × 3 genetic models).
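The cost of analyzing all three genetic models shows up directly in the standard Bonferroni level. The short sketch below works through that arithmetic with the marker count, model count, and 5% level stated in the caption; the function name `bonferroni_level` is illustrative.

```python
def bonferroni_level(alpha, n_markers, n_models=1):
    """Per-test level when alpha is split equally over all comparisons."""
    return alpha / (n_markers * n_models)


single_model = bonferroni_level(0.05, 500_000)     # 500K comparisons
three_models = bonferroni_level(0.05, 500_000, 3)  # 1.5M comparisons

print(f"one model:    {single_model:.3e}")
print(f"three models: {three_models:.3e}")
```

Testing three models makes the per-test level three times stricter, which is consistent with the Standard columns here being lower than the corresponding 2000-trio additive results obtained with a 500K correction.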
CAMP results: SNPs meeting genome-wide significance at α = 0.05.
| Marker | MAF | H-W Equil. | Num. Info. Families | FBAT p-value | Power Rank | Required Significance Level |
| rs10863712 | 0.471 | 0.813 | 275 | 0.0032 | 1 | 0.005 |
| rs1294497 | 0.490 | 0.882 | 276 | 0.0047 | 2 | 0.005 |
Results of the CAMP analysis with 402 families and 534,290 SNPs, assuming an additive mode of inheritance. Num. Info. Families indicates the number of families that were informative (i.e., at least one parent was heterozygous) for the marker of interest, and MAF denotes minor allele frequency. Markers with fewer than 20 informative families were removed from the analysis, as the asymptotic properties required for the test statistic may not hold. The power ranks are obtained from the conditional power of the test, calculated using our new technique with mating type ratio equation R4. The required significance level is obtained using the Ionita-Laza method [9] with K = 7, r = 2, and α = 0.05.
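The decision rule behind the table can be checked with a few lines of arithmetic. The sketch below compares each reported FBAT p-value with its required significance level, and with a standard Bonferroni level computed over all 534,290 SNPs. Using the full SNP count as the divisor is an assumption, since the number of markers remaining after the informative-family filter is not given here. The variable names are illustrative.

```python
ALPHA = 0.05
N_SNPS = 534_290  # SNP count stated in the caption (assumed divisor)

# (marker, FBAT p-value, required significance level), copied from the table.
camp_hits = [
    ("rs10863712", 0.0032, 0.005),
    ("rs1294497", 0.0047, 0.005),
]

standard_level = ALPHA / N_SNPS  # standard Bonferroni level per SNP


def passes(p_value, level):
    """A marker is declared significant when its p-value is at or below the level."""
    return p_value <= level


summary = {
    marker: {
        "weighted": passes(p_value, required),
        "standard": passes(p_value, standard_level),
    }
    for marker, p_value, required in camp_hits
}

for marker, outcome in summary.items():
    print(marker, outcome)
```

Both markers meet their individually adjusted levels but fall well short of the standard Bonferroni level, in line with the abstract's statement that they would not have been detected using standard methodology.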