Yuriko Katsumata, David W Fardo.
Abstract
Several statistical group-based approaches have been proposed to detect effects of variation within a gene for each of the population- and family-based designs. However, unified tests to combine gene-phenotype associations obtained from these 2 study designs are not yet well established. In this study, we investigated the efficient combination of population-based and family-based sequencing data to evaluate best practices using the Genetic Analysis Workshop 19 (GAW19) data set. Because one design employed whole genome sequencing and the other whole exome sequencing, we examined only the variants present in both data sets. We used the family-based sequence kernel association test (famSKAT) to analyze the family- and population-based data sets separately as well as a combined data set. These analyses were compared against a meta-analysis. Using the combined data, we showed that famSKAT has high power to detect associations between diastolic and/or systolic blood pressures and the genes that have causal variants with large effect sizes, such as MAP4, TNN, and CGN. However, when there was a considerable difference in power between the family- and population-based data, famSKAT with the combined data had lower power than famSKAT with the population-based data alone. The famSKAT test statistic for the combined data can be influenced by sample imbalance between the 2 designs. This underscores the importance of foresight in study design because, in this situation, the much smaller sample size of the family-based data essentially serves to dilute the signal. We observed inflated type I errors in our simulation study, largely when using population-based data, which might be a result of principal components failing to completely account for population admixture in this cohort.
Year: 2016 PMID: 27980632 PMCID: PMC5133531 DOI: 10.1186/s12919-016-0026-9
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
The number of variants in each gene in family-based, population-based, and combined data sets
| Gene | Number of variants: Total (a) | Number of causal variants: Family (b) | Number of causal variants: Population (c) | Number of causal variants: Combined (d) |
|---|---|---|---|---|
| DBP | | | | |
| | 41 | 5 | 8 | 8 |
| | 52 | 11 | 15 | 15 |
| | 17 | 0 | 0 | 0 |
| | 43 | 3 | 7 | 7 |
| | 46 | 1 | 2 | 2 |
| | 18 | 1 | 5 | 5 |
| | 56 | 9 | 16 | 16 |
| | 62 | 6 | 10 | 10 |
| | 29 | 0 | 0 | 0 |
| | 20 | 1 | 5 | 5 |
| | 55 | 3 | 7 | 7 |
| | 47 | 0 | 0 | 0 |
| | 37 | 1 | 0 | 1 |
| | 34 | 3 | 3 | 3 |
| | 32 | 3 | 4 | 4 |
| SBP | | | | |
| | 41 | 5 | 9 | 9 |
| | 52 | 12 | 15 | 15 |
| | 17 | 0 | 0 | 0 |
| | 43 | 3 | 7 | 7 |
| | 46 | 1 | 2 | 2 |
| | 20 | 1 | 5 | 5 |
| | 37 | 1 | 0 | 1 |
| | 34 | 0 | 0 | 0 |
| | 81 | 5 | 7 | 7 |
| | 39 | 2 | 7 | 7 |
| | 35 | 1 | 2 | 2 |
| | 33 | 2 | 4 | 4 |
| | 77 | 1 | 2 | 2 |
| | 42 | 1 | 1 | 1 |
| | 4 | 0 | 0 | 0 |
(a) The number of variants that have the same position in the family- and population-based data
(b) The number of causal variants among the intersected variants in the family-based data
(c) The number of causal variants among the intersected variants in the population-based data
(d) The number of causal variants among the intersected variants in the combined family- and population-based data
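The counts defined in the footnotes above can be illustrated with a minimal Python sketch. It assumes each data set is available as a mapping from a variant position to a causal flag; the `Variant` type, the function name `summarize_gene`, and the toy positions are all hypothetical and are not taken from the GAW19 data. Treating the combined count as the union of the two causal sets is also an assumption, although it is consistent with rows such as 37 | 1 | 0 | 1.

```python
from typing import Dict, Tuple

# Hypothetical variant key: (chromosome, base-pair position).
Variant = Tuple[str, int]


def summarize_gene(
    family: Dict[Variant, bool],
    population: Dict[Variant, bool],
) -> Dict[str, int]:
    """Count shared variants and causal variants for one gene.

    Each input maps a variant position to True if the variant is
    flagged as causal in that data set (an assumed input format).
    """
    # Variants with the same position in both data sets (footnote a).
    shared = family.keys() & population.keys()
    # Causal variants among the shared ones, per data set (footnotes b, c).
    fam_causal = {v for v in shared if family[v]}
    pop_causal = {v for v in shared if population[v]}
    return {
        "total": len(shared),
        "family": len(fam_causal),
        "population": len(pop_causal),
        # Assumption: causal in the combined data if causal in either set.
        "combined": len(fam_causal | pop_causal),
    }


# Toy input with made-up positions, for illustration only.
toy_family = {("3", 100): True, ("3", 200): False, ("3", 300): False}
toy_population = {("3", 100): True, ("3", 200): True, ("3", 400): True}
toy_counts = summarize_gene(toy_family, toy_population)
print(toy_counts)
```

In this toy input, position 300 appears only in the family data and position 400 only in the population data, so neither contributes to any count.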
Type I errors (95% confidence intervals) of the family-based sequence kernel association test in the family-based, population-based, and combined data, and of the meta-analytic approach
| Gene | Family | Population | Combined | Meta-analysis |
|---|---|---|---|---|
| | 0.050 (0.024–0.090) | 0.100 (0.062–0.150) | 0.085 (0.050–0.133) | 0.095 (0.058–0.144) |
| | 0.030 (0.011–0.064) | 0.080 (0.046–0.127) | 0.085 (0.050–0.133) | 0.125 (0.083–0.179) |
| | 0.050 (0.024–0.090) | 0.070 (0.039–0.115) | 0.080 (0.046–0.127) | 0.065 (0.035–0.109) |
| | 0.040 (0.017–0.077) | 0.045 (0.021–0.084) | 0.080 (0.046–0.127) | 0.055 (0.028–0.096) |
| | 0.090 (0.054–0.139) | 0.060 (0.031–0.102) | 0.060 (0.031–0.102) | 0.070 (0.039–0.115) |
| | 0.045 (0.021–0.084) | 0.045 (0.021–0.084) | 0.060 (0.031–0.102) | 0.110 (0.070–0.162) |
| | 0.040 (0.017–0.077) | 0.060 (0.031–0.102) | 0.070 (0.039–0.115) | 0.105 (0.066–0.156) |
| | 0.035 (0.014–0.071) | 0.030 (0.011–0.064) | 0.065 (0.035–0.109) | 0.075 (0.043–0.121) |
| | 0.085 (0.050–0.133) | 0.065 (0.035–0.109) | 0.090 (0.054–0.139) | 0.100 (0.062–0.150) |
| | 0.075 (0.043–0.121) | 0.060 (0.031–0.102) | 0.050 (0.024–0.090) | 0.070 (0.039–0.115) |
| | 0.035 (0.014–0.071) | 0.090 (0.054–0.139) | 0.080 (0.046–0.127) | 0.095 (0.058–0.144) |
| | 0.070 (0.039–0.115) | 0.080 (0.046–0.127) | 0.095 (0.058–0.144) | 0.110 (0.070–0.162) |
| | 0.055 (0.028–0.096) | 0.090 (0.054–0.139) | 0.090 (0.054–0.139) | 0.130 (0.087–0.185) |
| | 0.045 (0.021–0.084) | 0.080 (0.046–0.127) | 0.090 (0.054–0.139) | 0.095 (0.058–0.144) |
| | 0.065 (0.035–0.109) | 0.050 (0.024–0.090) | 0.075 (0.043–0.121) | 0.080 (0.046–0.127) |
| | 0.080 (0.046–0.127) | 0.075 (0.043–0.121) | 0.060 (0.031–0.102) | 0.080 (0.046–0.127) |
| | 0.050 (0.024–0.090) | 0.070 (0.039–0.115) | 0.095 (0.058–0.144) | 0.120 (0.078–0.173) |
| | 0.065 (0.035–0.109) | 0.055 (0.028–0.096) | 0.055 (0.028–0.096) | 0.105 (0.066–0.156) |
| | 0.065 (0.035–0.109) | 0.050 (0.024–0.090) | 0.135 (0.091–0.190) | 0.105 (0.066–0.156) |
| | 0.025 (0.008–0.057) | 0.080 (0.046–0.127) | 0.080 (0.046–0.127) | 0.090 (0.054–0.139) |
| | 0.030 (0.011–0.064) | 0.065 (0.035–0.109) | 0.065 (0.035–0.109) | 0.105 (0.066–0.156) |
| | 0.075 (0.043–0.121) | 0.075 (0.043–0.121) | 0.100 (0.062–0.150) | 0.125 (0.083–0.179) |
| | 0.070 (0.039–0.115) | 0.055 (0.028–0.096) | 0.050 (0.024–0.090) | 0.055 (0.028–0.096) |
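As a reading aid for the table above, an empirical type I error is the fraction of null replicates in which the test rejects, and its interval can be computed as an exact binomial (Clopper-Pearson) interval. The record does not state the number of replicates or the interval method. The sketch below assumes 200 simulated replicates and an exact interval, an inference from the tabulated values rather than something confirmed here. The function names are hypothetical, and the pure-Python binomial sum is only suitable for modest `n`.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo: float = 0.0, hi: float = 1.0, iters: int = 60) -> float:
    """Root of an increasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion x/n."""
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else _bisect(
        lambda p: (1 - binom_cdf(x - 1, n, p)) - alpha / 2
    )
    # Upper limit: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else _bisect(
        lambda p: alpha / 2 - binom_cdf(x, n, p)
    )
    return lower, upper


# Assumed setting: 10 rejections at the 0.05 level in 200 null replicates.
rejections, replicates = 10, 200
low, high = clopper_pearson(rejections, replicates)
print(f"{rejections / replicates:.3f} ({low:.3f}, {high:.3f})")

# An entry looks inflated when the nominal level lies below the interval.
inflated = low > 0.05
```

If the assumption of 200 replicates holds, 10 rejections should land close to the 0.050 (0.024–0.090) entries in the table. An entry such as 0.135 (0.091–0.190), whose lower limit exceeds 0.05, would be flagged as inflated by the final check.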
Fig. 1 Powers of family-based sequence kernel association test (famSKAT)