Suyeon Park, Sungyoung Lee, Young Lee, Christine Herold, Basavaraj Hooli, Kristina Mullin, Taesung Park, Changsoon Park, Lars Bertram, Christoph Lange, Rudolph Tanzi, Sungho Won.
Abstract
BACKGROUND: In family-based association analysis, each family is typically ascertained from a single proband, which renders the effects of ascertainment bias heterogeneous among family members. This contrasts with case-control studies and may introduce sampling or ascertainment bias. Ascertainment bias affects statistical efficiency, and careful adjustment can lead to substantial improvements in statistical power. However, genetic association analyses using family-based designs have often been conducted without accounting for the strong influence that the proband has on each family member's probability of being affected.
Year: 2015 PMID: 26286599 PMCID: PMC4593209 DOI: 10.1186/s12881-015-0198-6
Source DB: PubMed Journal: BMC Med Genet ISSN: 1471-2350 Impact factor: 2.103
Fig. 1 Two sample pedigree structures from the AD data: (a) individual 9 of family 1 was selected as the "proband" and individuals 3–8 of family 1 were selected as "non-probands"; (b) individual 3 of family 2 was selected as the "proband" and individuals 4–6 of family 2 were selected as "non-probands"
Fig. 2 Family tree. Two types of family structure were considered in our simulation study: (a) nuclear family and (b) extended family
Fig. 3 QQ plots for FQLS 1 and FQLS 2 under the null hypothesis. QQ plots for FQLS 1 and FQLS 2 were obtained when h² is 0.2 ((a), (b)), 0.5 ((c), (d)), or 0.8 ((e), (f)). P-values were calculated based on 5000 replicates when the number of families was 900. The genetic effect β was assumed to be 0, and the minor allele frequency was 0.2
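The coordinates behind a QQ plot of null p-values such as those in Fig. 3 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `qq_points` is hypothetical, and the plotting position (rank − 0.5)/n is one common convention assumed here, not one stated in the source.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) pairs on the -log10 scale for a QQ plot.

    Assumes all p-values lie in (0, 1]. Under the null hypothesis the
    p-values are uniform, so the points should fall near the diagonal.
    """
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        # Assumed plotting position for the expected uniform quantile.
        expected_p = (rank - 0.5) / n
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points

# Hypothetical p-values that coincide with the expected quantiles for n = 5,
# so every point lies exactly on the diagonal.
example_points = qq_points([0.5, 0.1, 0.9, 0.3, 0.7])
```

Systematic departure of the observed values above the diagonal would suggest an inflated type-1 error rate.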
Empirical type-1 error estimates. The empirical type-1 error rates and their 95 % confidence intervals were estimated with 5000 replicates at the 0.01 and 0.05 significance levels for h² = 0.2, 0.5, and 0.8. The number of families was assumed to be 900, and the disease allele frequency was 0.2
| Significance level | h² | Statistics | Type-1 error estimates | 95 % confidence interval (lower) | 95 % confidence interval (upper) |
|---|---|---|---|---|---|
| 0.01 | 0.2 | | 0.011 | 0.008 | 0.014 |
| 0.01 | 0.2 | | 0.011 | 0.008 | 0.013 |
| 0.01 | 0.5 | | 0.010 | 0.007 | 0.013 |
| 0.01 | 0.5 | | 0.009 | 0.007 | 0.012 |
| 0.01 | 0.8 | | 0.009 | 0.006 | 0.011 |
| 0.01 | 0.8 | | 0.009 | 0.007 | 0.012 |
| 0.05 | 0.2 | | 0.049 | 0.043 | 0.055 |
| 0.05 | 0.2 | | 0.049 | 0.043 | 0.055 |
| 0.05 | 0.5 | | 0.050 | 0.044 | 0.056 |
| 0.05 | 0.5 | | 0.053 | 0.047 | 0.059 |
| 0.05 | 0.8 | | 0.047 | 0.042 | 0.053 |
| 0.05 | 0.8 | | 0.053 | 0.047 | 0.059 |
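As a rough check on how such intervals can arise, the sketch below computes a normal-approximation (Wald) 95 % interval for an empirical rejection rate. The source does not state which interval method was used, so the Wald form is an assumption, and the function name `wald_interval` is hypothetical.

```python
import math

def wald_interval(p_hat, n_replicates, z=1.959963984540054):
    """Normal-approximation confidence interval for a proportion.

    p_hat        : empirical rejection rate (e.g. type-1 error estimate)
    n_replicates : number of simulation replicates
    z            : standard normal quantile (default gives a 95 % interval)
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n_replicates)
    return p_hat - half_width, p_hat + half_width

# An estimate of 0.011 from 5000 replicates, as in the first table row.
lower, upper = wald_interval(0.011, 5000)
print(f"{lower:.3f} {upper:.3f}")  # prints "0.008 0.014"
```

Under this assumption the result agrees with the first row of the table (0.008 to 0.014). Small differences in other rows may simply reflect rounding of the reported estimate.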
Empirical power estimates for scenario 1 when h² is 0.2. The empirical power estimates for scenario 1 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||||
|---|---|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | 1200 | 1400 | |||
| 0.01 | 1 |
| 0.027 | 0.062 | 0.129 | 0.219 | 0.293 | 0.369 |
|
|
|
| 0.122 | 0.220 | 0.304 | 0.372 | ||
|
|
| 0.059 |
|
|
|
| ||
| 2 |
| 0.033 |
| 0.165 | 0.295 | 0.415 | 0.456 | |
|
|
| 0.073 | 0.166 | 0.309 | 0.418 | 0.461 | ||
|
| 0.035 |
|
|
|
|
| ||
| 3 |
|
| 0.103 |
| 0.398 | 0.519 | 0.609 | |
|
| 0.039 |
|
|
| 0.526 |
| ||
|
| 0.036 | 0.110 | 0.253 | 0.407 |
| 0.621 | ||
| 4 |
| 0.041 | 0.127 | 0.297 | 0.497 | 0.626 | 0.720 | |
|
| 0.044 |
|
|
| 0.626 | 0.719 | ||
|
|
| 0.126 |
| 0.493 |
|
| ||
| 0.001 | 1 |
|
|
|
| 0.075 | 0.113 | 0.153 |
|
|
|
|
|
|
| 0.157 | ||
|
| 0.003 | 0.010 |
|
| 0.106 |
| ||
| 2 |
| 0.006 |
| 0.060 | 0.098 | 0.193 | 0.224 | |
|
|
| 0.018 | 0.059 | 0.099 |
| 0.217 | ||
|
| 0.006 |
|
|
| 0.197 |
| ||
| 3 |
| 0.004 | 0.018 | 0.086 | 0.164 | 0.267 | 0.337 | |
|
| 0.007 | 0.020 |
| 0.162 |
| 0.333 | ||
|
|
|
| 0.087 |
|
|
| ||
| 4 |
|
| 0.029 | 0.116 | 0.231 | 0.309 | 0.451 | |
|
| 0.009 | 0.029 | 0.116 | 0.228 |
| 0.449 | ||
|
| 0.009 |
|
|
| 0.363 |
| ||
The bold text indicates the highest empirical estimate of the power for each situation
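Empirical power in a simulation of this kind is usually the fraction of replicates whose p-value falls below the significance level. The sketch below illustrates that calculation together with its Monte Carlo standard error. It is not the authors' code: the function name, the strict inequality, and the example p-values are all assumptions made for illustration.

```python
import math

def empirical_power(p_values, alpha):
    """Fraction of replicates rejected at level alpha, with its Monte Carlo SE."""
    n = len(p_values)
    rejections = sum(1 for p in p_values if p < alpha)  # assumed rule: reject if p < alpha
    power = rejections / n
    mc_se = math.sqrt(power * (1.0 - power) / n)  # binomial standard error
    return power, mc_se

# Hypothetical p-values from 10 replicates (illustration only, not study data).
example_p_values = [0.0004, 0.003, 0.02, 0.008, 0.3,
                    0.0009, 0.04, 0.0002, 0.6, 0.009]
power_01, se_01 = empirical_power(example_p_values, 0.01)
power_001, se_001 = empirical_power(example_p_values, 0.001)
```

With 1000 replicates, an estimated power of 0.219 has a Monte Carlo standard error of about 0.013, so differences in the third decimal place between statistics are within simulation noise.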
Empirical power estimates for scenario 1 when h² is 0.5. The empirical power estimates for scenario 1 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||||
|---|---|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | 1200 | 1400 | |||
| 0.01 | 1 |
| 0.052 | 0.142 | 0.369 | 0.518 | 0.682 | 0.765 |
|
| 0.053 | 0.145 | 0.352 | 0.523 | 0.681 | 0.766 | ||
|
|
|
|
|
|
|
| ||
| 2 |
| 0.053 | 0.174 | 0.396 | 0.616 | 0.761 | 0.829 | |
|
|
| 0.183 | 0.400 | 0.619 | 0.780 | 0.834 | ||
|
| 0.053 |
|
|
|
|
| ||
| 3 |
| 0.068 | 0.118 | 0.470 | 0.692 | 0.808 | 0.818 | |
|
|
| 0.216 | 0.489 | 0.705 |
|
| ||
|
| 0.067 |
|
|
|
| 0.904 | ||
| 4 |
| 0.066 | 0.222 | 0.528 | 0.755 | 0.860 | 0.938 | |
|
| 0.068 |
|
|
|
|
| ||
|
|
| 0.244 | 0.544 | 0.766 | 0.870 | 0.935 | ||
| 0.001 | 1 |
| 0.012 | 0.034 | 0.143 | 0.247 | 0.376 | 0.505 |
|
|
|
| 0.142 | 0.245 | 0.387 | 0.496 | ||
|
| 0.013 |
|
|
|
|
| ||
| 2 |
| 0.008 | 0.060 | 0.151 | 0.342 | 0.494 | 0.612 | |
|
|
|
| 0.163 | 0.357 | 0.512 | 0.640 | ||
|
| 0.012 |
|
|
|
|
| ||
| 3 |
| 0.005 | 0.033 | 0.223 | 0.404 | 0.579 | 0.595 | |
|
| 0.008 | 0.076 | 0.235 | 0.432 | 0.604 |
| ||
|
|
|
|
|
|
| 0.699 | ||
| 4 |
| 0.010 | 0.079 | 0.274 | 0.484 | 0.655 | 0.763 | |
|
| 0.008 |
|
|
|
|
| ||
|
|
| 0.084 | 0.280 | 0.474 | 0.671 | 0.769 | ||
The bold text indicates the highest empirical estimate of the power for each situation
Empirical power estimates for scenario 1 when h² is 0.8. The empirical power estimates for scenario 1 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||||
|---|---|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | 1200 | 1400 | |||
| 0.01 | 1 |
| 0.068 | 0.233 | 0.470 | 0.699 | 0.819 | 0.903 |
|
| 0.071 | 0.238 | 0.471 | 0.717 | 0.841 | 0.896 | ||
|
|
|
|
|
|
|
| ||
| 2 |
| 0.071 | 0.222 | 0.521 | 0.708 | 0.881 | 0.937 | |
|
| 0.080 | 0.304 | 0.568 | 0.788 | 0.907 | 0.942 | ||
|
|
|
|
|
|
|
| ||
| 3 |
| 0.075 | 0.253 | 0.555 | 0.786 | 0.911 | 0.931 | |
|
|
| 0.298 | 0.592 | 0.813 | 0.921 | 0.972 | ||
|
| 0.078 |
|
|
|
|
| ||
| 4 |
| 0.081 | 0.307 | 0.585 | 0.800 | 0.917 | 0.951 | |
|
| 0.088 |
|
| 0.828 | 0.928 | 0.957 | ||
|
|
| 0.318 | 0.614 |
|
|
| ||
| 0.001 | 1 |
| 0.016 | 0.074 | 0.229 | 0.444 | 0.602 | 0.710 |
|
| 0.017 | 0.072 | 0.221 | 0.436 | 0.643 | 0.693 | ||
|
|
|
|
|
|
|
| ||
| 2 |
| 0.020 | 0.074 | 0.251 | 0.436 | 0.676 | 0.778 | |
|
| 0.017 | 0.103 | 0.313 | 0.540 | 0.740 | 0.798 | ||
|
|
|
|
|
|
|
| ||
| 3 |
| 0.024 | 0.081 | 0.278 | 0.513 | 0.734 | 0.810 | |
|
| 0.016 | 0.124 | 0.341 | 0.581 | 0.769 | 0.864 | ||
|
|
|
|
|
|
|
| ||
| 4 |
| 0.015 | 0.109 | 0.310 | 0.542 | 0.756 | 0.830 | |
|
|
| 0.118 | 0.335 | 0.588 | 0.783 | 0.867 | ||
|
|
|
|
|
|
|
| ||
The bold text indicates the highest empirical estimate of the power for each situation
Empirical power estimates for scenario 2 when h² is 0.2. The empirical power estimates for scenario 2 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||
|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | |||
| 0.01 | 1 |
| 0.072 | 0.149 | 0.304 | 0.409 |
|
| 0.072 | 0.147 | 0.305 | 0.415 | ||
|
|
|
|
|
| ||
| 2 |
| 0.042 | 0.137 | 0.300 | 0.448 | |
|
| 0.039 | 0.136 |
| 0.455 | ||
|
|
|
| 0.295 |
| ||
| 3 |
|
| 0.188 | 0.410 | 0.608 | |
|
| 0.054 |
|
|
| ||
|
| 0.055 | 0.190 | 0.423 | 0.615 | ||
| 0.001 | 1 |
|
| 0.059 | 0.147 | 0.211 |
|
| 0.023 |
|
| 0.197 | ||
|
| 0.022 | 0.055 |
|
| ||
| 2 |
| 0.010 |
| 0.123 | 0.229 | |
|
|
| 0.036 | 0.123 |
| ||
|
| 0.010 | 0.036 |
| 0.227 | ||
| 3 |
| 0.006 | 0.055 |
| 0.342 | |
|
| 0.007 | 0.055 |
| 0.355 | ||
|
|
|
|
|
| ||
The bold text indicates the highest empirical estimate of the power for each situation
Empirical power estimates for scenario 2 when h² is 0.5. The empirical power estimates for scenario 2 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||
|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | |||
| 0.01 | 1 |
| 0.130 | 0.293 | 0.567 | 0.787 |
|
| 0.130 | 0.295 | 0.568 | 0.773 | ||
|
|
|
|
|
| ||
| 2 |
| 0.093 | 0.332 | 0.645 | 0.864 | |
|
| 0.094 |
| 0.654 | 0.871 | ||
|
|
| 0.354 |
|
| ||
| 3 |
| 0.100 | 0.382 | 0.735 | 0.904 | |
|
| 0.108 | 0.406 | 0.751 | 0.915 | ||
|
|
|
|
|
| ||
| 0.001 | 1 |
| 0.046 | 0.148 | 0.341 | 0.560 |
|
|
| 0.149 | 0.353 | 0.559 | ||
|
| 0.047 |
|
|
| ||
| 2 |
| 0.019 | 0.127 | 0.387 | 0.634 | |
|
|
| 0.119 | 0.394 | 0.648 | ||
|
| 0.017 |
|
|
| ||
| 3 |
| 0.023 | 0.166 | 0.481 | 0.749 | |
|
| 0.026 | 0.183 | 0.511 | 0.772 | ||
|
|
|
|
|
| ||
The bold text indicates the highest empirical estimate of the power for each situation
Empirical power estimates for scenario 2 when h² is 0.8. The empirical power estimates for scenario 2 were calculated with 1000 replicates at both the 0.01 and 0.001 significance levels. The disease allele frequency was assumed to be 0.2, and the prevalence was assumed to be 0.2. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
| Statistic |
| ||||
|---|---|---|---|---|---|---|
| 100 | 300 | 600 | 900 | |||
| 0.01 | 1 |
| 0.164 | 0.445 | 0.749 | 0.906 |
|
| 0.156 | 0.441 | 0.751 | 0.905 | ||
|
|
|
|
|
| ||
| 2 |
| 0.132 | 0.473 | 0.823 | 0.970 | |
|
| 0.131 | 0.505 | 0.861 | 0.969 | ||
|
|
|
|
|
| ||
| 3 |
| 0.140 | 0.475 | 0.835 | 0.958 | |
|
| 0.134 | 0.520 | 0.867 | 0.970 | ||
|
|
|
|
|
| ||
| 0.001 | 1 |
| 0.059 | 0.230 | 0.519 | 0.759 |
|
| 0.053 | 0.236 | 0.519 | 0.757 | ||
|
|
|
|
|
| ||
| 2 |
| 0.039 | 0.239 | 0.561 | 0.858 | |
|
| 0.033 | 0.250 | 0.594 | 0.884 | ||
|
|
|
|
|
| ||
| 3 |
| 0.033 | 0.215 | 0.629 | 0.865 | |
|
|
| 0.247 | 0.671 | 0.900 | ||
|
| 0.044 |
|
|
| ||
The bold text indicates the highest empirical estimate of the power for each situation
Empirical power estimates for three situations when h² is 0.8. The empirical power estimates for the three situations were calculated with 1000 replicates at the 0.01 significance level. Phenotypes were generated under the assumption that the prevalence was 0.2. The prevalence was set to 0.1, 0.2, or 0.3 when calculating the proposed statistics. The relative phenotypic variance attributable to the main disease gene was assumed to be 0.005
|
|
| Statistic | Prevalence to be set for statistics | ||
|---|---|---|---|---|---|
| 0.1 | 0.2 | 0.3 | |||
| 0 | 1 |
| 0.486 |
|
|
|
|
|
| 0.525 | ||
| 2 |
| 0.562 |
| 0.532 | |
|
|
|
| 0.624 | ||
| 3 |
| 0.632 |
| 0.634 | |
|
| 0.644 |
| 0.654 | ||
| 1 | 1 |
| 0.432 |
|
|
|
| 0.468 |
| 0.462 | ||
| 2 |
| 0.522 |
| 0.519 | |
|
| 0.547 |
| 0.559 | ||
| 3 |
| 0.580 |
| 0.576 | |
|
| 0.575 |
| 0.585 | ||
| 2 | 1 |
| 0.415 |
| 0.399 |
|
| 0.428 |
| 0.412 | ||
| 2 |
| 0.482 |
|
| |
|
| 0.483 |
| 0.492 | ||
| 3 |
| 0.487 |
| 0.490 | |
|
|
|
| 0.511 | ||
The bold text indicates the reference for the proposed statistics to compare with the misspecified prevalence
Fig. 4 Multidimensional scaling plots from samples for the GWAS for AD. Founders were selectively used, and multidimensional scaling plots were obtained with the first and second PC scores
Fig. 5 QQ plots of results from GWAS for AD. QQ plots are provided with the results from (a) WL and (b) FQLS 2
Top significant results of GWAS for AD. P-values are given for the genome-wide significant SNPs from FQLS 2 and WL
| SNP | | | |
|---|---|---|---|
| SNP1 | 3.53 × 10⁻⁷ | 1.94 × 10⁻⁸ | 4.23 × 10⁻⁴ |
| SNP2 | 5.74 × 10⁻⁷ | 4.45 × 10⁻⁸ | 4.9 × 10⁻⁵ |
| SNP3 | 1.69 × 10⁻⁷ | 8.36 × 10⁻⁹ | 8.6 × 10⁻⁵ |
| SNP4 | 2.79 × 10⁻⁹ | 2.86 × 10⁻¹⁰ | 6.94 × 10⁻¹² |