Sungho Won, Jemma B Wilk, Rasika A Mathias, Christopher J O'Donnell, Edwin K Silverman, Kathleen Barnes, George T O'Connor, Scott T Weiss, Christoph Lange.
Abstract
For genome-wide association studies in family-based designs, we propose a new, universally applicable approach. The new test statistic exploits all available information about the association, while, by virtue of its design, it maintains the same robustness against population admixture as traditional family-based approaches that are based exclusively on the within-family information. The approach is suitable for the analysis of almost any trait type, e.g. binary, continuous, time-to-onset, multivariate, etc., and combinations of those. We use simulation studies to verify all theoretically derived properties of the approach, estimate its power, and compare it with other standard approaches. We illustrate the practical implications of the new analysis method by an application to a lung-function phenotype, forced expiratory volume in one second (FEV1), in four genome-wide association studies.
Year: 2009 PMID: 19956679 PMCID: PMC2777973 DOI: 10.1371/journal.pgen.1000741
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Empirical type-1 error for 500K GWAS at genome-wide significance level 0.05.
| | Empirical error rate |
|---|---|
| 0.00 | 0.0505 |
| 0.05 | 0.0395 |
| 0.10 | 0.0425 |
| 0.20 | 0.0450 |
| 0.30 | 0.0445 |
The number of trios, N, is assumed to be 1,000. The empirical type-1 error of the minimum p-value in a 500K GWAS is calculated from 2,000 replicates.
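As a rough aid for reading the error rates above, the following sketch assumes (this is not stated for this table) that genome-wide significance is set by a Bonferroni threshold and that the 500,000 tests are independent. Under those assumptions it computes the expected family-wise error rate and the Monte Carlo standard error of an error rate estimated from 2,000 replicates. The function names are illustrative, not taken from the paper.

```python
import math


def bonferroni_fwer(alpha: float, n_tests: int) -> float:
    """Family-wise error rate when each of n_tests independent tests
    is run at the per-test level alpha / n_tests (assumed setup)."""
    per_test = alpha / n_tests
    # 1 - (1 - per_test) ** n_tests, computed stably for tiny per_test.
    return -math.expm1(n_tests * math.log1p(-per_test))


def monte_carlo_se(rate: float, n_replicates: int) -> float:
    """Binomial standard error of a rate estimated from n_replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_replicates)


fwer = bonferroni_fwer(0.05, 500_000)
se = monte_carlo_se(0.05, 2_000)
print(f"expected FWER: {fwer:.4f}, Monte Carlo SE: {se:.4f}")
```

Under these assumptions the expected rate is about 0.049 and the standard error is about 0.005, so an estimate such as 0.0395 lies roughly two standard errors below the nominal level.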
Average of empirical proportion at 500K GWAS.
| | | | | | |
|---|---|---|---|---|---|
| 0.00 | 5.00×10−2 | 9.97×10−3 | 9.91×10−4 | 9.86×10−5 | 9.66×10−6 |
| 0.05 | 5.00×10−2 | 9.97×10−3 | 9.91×10−4 | 9.85×10−5 | 9.76×10−6 |
| 0.10 | 5.00×10−2 | 9.96×10−3 | 9.88×10−4 | 9.78×10−5 | 9.79×10−6 |
| 0.20 | 4.99×10−2 | 9.95×10−3 | 9.87×10−4 | 9.76×10−5 | 9.60×10−6 |
| 0.30 | 4.98×10−2 | 9.92×10−3 | 9.82×10−4 | 9.68×10−5 | 9.40×10−6 |
The number of trios, N, is assumed to be 1,000. For each replicate of a 500K GWAS, the empirical proportion of SNPs whose p-values for Z are less than c is computed, and these proportions are averaged over 2,000 replicates.
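The averaging described in this note can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name and the toy p-values are hypothetical.

```python
from typing import List, Sequence


def average_empirical_proportions(
    replicates: Sequence[Sequence[float]],
    thresholds: Sequence[float],
) -> List[float]:
    """For each threshold c, compute the per-replicate proportion of
    p-values below c, then average those proportions over replicates."""
    totals = [0.0] * len(thresholds)
    for pvalues in replicates:
        n_snps = len(pvalues)
        for i, c in enumerate(thresholds):
            totals[i] += sum(p < c for p in pvalues) / n_snps
    return [total / len(replicates) for total in totals]


# Toy data: two replicates with four SNPs each (hypothetical values).
toy_replicates = [
    [0.001, 0.2, 0.04, 0.6],
    [0.03, 0.009, 0.5, 0.9],
]
proportions = average_empirical_proportions(toy_replicates, [0.05, 0.01])
print(proportions)
```

When the test is valid under the null hypothesis, each averaged proportion should be close to its threshold c.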
Average of empirical proportion at 100K GWAS.
| Method | | | | | | |
|---|---|---|---|---|---|---|
| EIGENSTRAT | 0.001 | 5.07×10−2 | 1.02×10−2 | 1.04×10−3 | 1.05×10−4 | 1.02×10−5 |
| | 0.005 | 5.44×10−2 | 1.17×10−2 | 1.36×10−3 | 1.72×10−4 | 2.45×10−5 |
| | 0.01 | 5.86×10−2 | 1.39×10−2 | 2.09×10−3 | 3.62×10−4 | 7.57×10−5 |
| | 0.05 | 8.20×10−2 | 3.24×10−2 | 1.32×10−2 | 6.58×10−3 | 3.39×10−3 |
| LIP | 0.001 | 5.00×10−2 | 9.99×10−3 | 9.93×10−4 | 9.89×10−5 | 9.70×10−6 |
| | 0.005 | 5.00×10−2 | 9.99×10−3 | 1.00×10−3 | 1.01×10−4 | 1.00×10−5 |
| | 0.01 | 5.00×10−2 | 9.99×10−3 | 9.97×10−4 | 9.96×10−5 | 9.99×10−6 |
| | 0.05 | 5.00×10−2 | 9.98×10−3 | 9.94×10−4 | 9.89×10−5 | 9.98×10−6 |
The number of trios, N, is assumed to be 1,000. The empirical proportions of SNPs whose p-values for Z are less than c in each replicate for 500K GWAS are averaged over 2,000 replicates when there is local population stratification. LIP stands for the proposed method, which uses the Liptak method to combine pFBAT and pT.
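For readers unfamiliar with the Liptak method, the sketch below shows the generic weighted Z-score combination of p-values. It assumes one-sided p-values and takes the weights as given; how the paper derives its weights and handles effect direction is not described in this excerpt, so the inputs here are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def liptak_combine(pvalues: Sequence[float], weights: Sequence[float]) -> float:
    """Combine one-sided p-values with Liptak's weighted Z-score method.

    Each p-value (strictly between 0 and 1) is mapped to a Z-score,
    the Z-scores are combined with the given weights, and the combined
    Z-score is mapped back to a one-sided p-value.
    """
    if not pvalues or len(pvalues) != len(weights):
        raise ValueError("pvalues and weights must be non-empty and equal length")
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in pvalues]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return 1.0 - _STD_NORMAL.cdf(numerator / denominator)


# Hypothetical example: two p-values of 0.05 with equal weights.
combined = liptak_combine([0.05, 0.05], [1.0, 1.0])
print(f"combined p-value: {combined:.4f}")
```

With equal weights, two independent p-values of 0.05 combine to a p-value of about 0.01, which illustrates how pooling two sources of evidence can be more powerful than either alone.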
Empirical power for GWAS under no population stratification.
| | | POP | FBAT | QTDT | LIP | VAN | ION |
|---|---|---|---|---|---|---|---|
| 2,000 | 0.001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 0.005 | 0.0200 | 0.0025 | 0.0010 | 0.0185 | 0.0080 | 0.0130 |
| | 0.01 | 0.2085 | 0.0125 | 0.0180 | 0.1955 | 0.0990 | 0.1505 |
| | 0.015 | 0.5725 | 0.0765 | 0.0150 | 0.5350 | 0.3045 | 0.4515 |
| 2,500 | 0.001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 0.005 | 0.0385 | 0.0030 | 0.0030 | 0.0370 | 0.0155 | 0.0210 |
| | 0.01 | 0.3970 | 0.0430 | 0.0430 | 0.3760 | 0.2025 | 0.2960 |
| | 0.015 | 0.8135 | 0.1420 | 0.1790 | 0.7995 | 0.5525 | 0.7380 |
| 3,000 | 0.001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 0.005 | 0.0740 | 0.0020 | 0.0070 | 0.0675 | 0.0325 | 0.0495 |
| | 0.01 | 0.5720 | 0.0810 | 0.0855 | 0.5495 | 0.3175 | 0.4710 |
| | 0.015 | 0.9175 | 0.2665 | 0.3265 | 0.8980 | 0.7055 | 0.8630 |
Empirical powers are calculated from 2,000 replicates at the genome-wide significance level 0.05, using the Bonferroni method, under no population stratification. LIP stands for the proposed method, which uses the Liptak method to combine pFBAT and pT. VAN and ION indicate the VanSteen approach screening the top 20 SNPs and the Ionita approach using an exponential weighting scheme with partitioning parameters K = 7 and r = 2, respectively. FBAT shows the results from the within-family component only, and POP is the standard population-based method.
Figure 1. Empirical power at the 0.001 significance level for additive disease.
POP is the empirical power of the standard population-based method. T is the empirical power of the Wald test based on the conditional mean model [22] for between-family components. LIP is the empirical power of the combined p-values with Liptak's method. In this figure, FBAT and T overlap completely.
Figure 2. Empirical power at the 0.001 significance level for dominant disease.
POP is the empirical power of the standard population-based method. T is the empirical power of the Wald test based on the conditional mean model [22] for between-family components. LIP is the empirical power of the combined p-values with Liptak's method. In this figure, FBAT and T overlap completely.
Figure 3. Empirical power at the 0.001 significance level for recessive disease.
POP is the empirical power of the standard population-based method. T is the empirical power of the Wald test based on the conditional mean model [22] for between-family components. LIP is the empirical power of the combined p-values with Liptak's method. In this figure, FBAT and T overlap completely.
Empirical power for GWAS under population stratification.
| | | FBAT | QTDT | LIP | VAN | ION | EIG |
|---|---|---|---|---|---|---|---|
| 0.001 | 0.005 | 0.0000 | 0.0010 | 0.0083 | 0.0000 | 0.0000 | 0.0000 |
| | 0.010 | 0.0000 | 0.0030 | 0.1157 | 0.0826 | 0.1157 | 0.0579 |
| | 0.015 | 0.0000 | 0.0085 | 0.3884 | 0.2975 | 0.3471 | 0.2562 |
| 0.005 | 0.005 | 0.0000 | 0.0000 | 0.0083 | 0.0083 | 0.0083 | 0.0083 |
| | 0.010 | 0.0000 | 0.0020 | 0.0909 | 0.0579 | 0.0661 | 0.0661 |
| | 0.015 | 0.0083 | 0.0080 | 0.3223 | 0.2810 | 0.3140 | 0.1901 |
| 0.01 | 0.005 | 0.0000 | 0.0015 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 0.010 | 0.0000 | 0.0010 | 0.0909 | 0.0826 | 0.0579 | 0.0331 |
| | 0.015 | 0.0083 | 0.0135 | 0.3636 | 0.2975 | 0.3388 | 0.2645 |
| 0.05 | 0.005 | 0.0000 | 0.0000 | 0.01653 | 0.0330 | 0.0248 | 0.0000 |
| | 0.010 | 0.0083 | 0.0035 | 0.0992 | 0.0744 | 0.0826 | 0.0165 |
| | 0.015 | 0.0165 | 0.0080 | 0.3140 | 0.2645 | 0.2727 | 0.2066 |
The number of trios, N, is assumed to be 1,000. Empirical powers are calculated from 2,000 replicates at the genome-wide significance level 0.05, using the Bonferroni method, under population stratification. LIP stands for the proposed method, which uses the Liptak method to combine pFBAT and pT. VAN and ION indicate the VanSteen approach selecting the top 20 SNPs and the Ionita approach using an exponential weighting scheme with partitioning parameters K = 5 and r = 2, respectively. FBAT indicates the empirical power from FBAT alone, and EIG indicates the empirical power from EIGENSTRAT.
Application to forced expiratory volume in one second (FEV1) in the Framingham Heart Study.
| SNP | Chrom | Position | MAF | Num. Info. Fam. | | | |
|---|---|---|---|---|---|---|---|
| rs805294 | 6 | 31796196 | 0.340 | 918 | 4.300×10−3 | 2.073×10−5 | 5.929×10−7 |
| rs10863838 | 1 | 208750806 | 0.450 | 1016 | 7.408×10−5 | 2.535×10−3 | 2.553×10−6 |
| rs6794842 | 3 | 119308208 | 0.331 | 950 | 3.226×10−2 | 2.400×10−5 | 6.654×10−6 |
| rs804963 | 14 | 85918211 | 0.460 | 1031 | 9.786×10−2 | 2.775×10−6 | 7.060×10−6 |
| rs525914 | 11 | 119200660 | 0.187 | 711 | 9.204×10−4 | 1.888×10−3 | 2.081×10−5 |
| rs1886280 | 10 | 89347496 | 0.362 | 971 | 1.797×10−2 | 2.297×10−4 | 2.511×10−5 |
| rs710469 | 3 | 188467212 | 0.491 | 1058 | 3.202×10−3 | 1.388×10−3 | 2.639×10−5 |
| rs10799746 | 1 | 22497833 | 0.168 | 651 | 1.388×10−2 | 3.538×10−4 | 2.748×10−5 |
| rs1225888 | 20 | 15972225 | 0.449 | 1007 | 7.518×10−5 | 1.736×10−2 | 2.994×10−5 |
| rs4638547 | 15 | 71122046 | 0.377 | 999 | 3.454×10−5 | 2.760×10−2 | 3.549×10−5 |
The number of markers is 306,264 and the genome-wide significance level at 0.05 is 1.636 × 10−7. The top 10 SNPs from Z are selected, assuming additive disease mode of inheritance. For pT, the estimated powers are used and the weights for LIP are calculated with the number of informative trios.
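The threshold quoted in this note can be checked with a short calculation. The sketch assumes the threshold is a plain Bonferroni correction, which gives 0.05 / 306,264 ≈ 1.63 × 10−7, close to the quoted 1.636 × 10−7. It then compares the smallest p-value in the table's last column (5.929 × 10−7, for rs805294) against that threshold. The function name is illustrative.

```python
def bonferroni_threshold(alpha: float, n_markers: int) -> float:
    """Per-marker significance threshold under a Bonferroni correction."""
    return alpha / n_markers


# Marker count taken from the table note above.
threshold = bonferroni_threshold(0.05, 306_264)

# Smallest p-value in the last column of the table (rs805294).
top_p_value = 5.929e-7
is_genome_wide_significant = top_p_value < threshold

print(f"threshold: {threshold:.4e}")
print(f"rs805294 passes threshold: {is_genome_wide_significant}")
```

Under this assumption the top SNP does not reach genome-wide significance in the Framingham Heart Study alone.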
Descriptive statistics and results of rs805294 in different studies.
| | FHS | British Cohort (Affy) | British Cohort (Illumina) | CAMP | BAR |
|---|---|---|---|---|---|
| Num. Info. Fam. | 918 | - | - | 488 | 33 |
| Sample Size | - | 1372 | 1323 | - | - |
| MAF | 0.34 | 0.36 | 0.36 | 0.33 | 0.22 |
| P-values | | | | | 7.84×10−1 |
The negative sign of the P-values indicates that the minor alleles are under-expressed in cases.