Arnd Gross, Anke Tönjes, Markus Scholz.
Abstract
BACKGROUND: When testing for SNP (single nucleotide polymorphism) associations in related individuals, observations are not independent. Simple linear regression assuming independent, normally distributed residuals results in an increased type I error, and the power of the test is also affected in a more complicated manner. Inflation of type I error is often successfully corrected by genomic control. However, this reduces the power of the test when relatedness is of concern. In the present paper, we derive explicit formulae to investigate how heritability and strength of relatedness contribute to variance inflation of the effect estimate of the linear model. Further, we study the consequences of variance inflation on hypothesis testing and compare the results with those of genomic control correction. We apply the developed theory to the publicly available HapMap trio data (N=129), the Sorbs (a self-contained population with N=977 characterised by a cryptic relatedness structure) and synthetic family studies with different sample sizes (ranging from N=129 to N=999) and different degrees of relatedness.
Keywords: Heritability; Linear regression; Relatedness; SNP association analysis
Year: 2017 PMID: 29212447 PMCID: PMC5719591 DOI: 10.1186/s12863-017-0571-x
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Estimated variance inflation under relatedness
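| Study | n | Mean inflation, all SNPs (SD) | Mean inflation, SNPs with MAF > 10% (SD) | Expected inflation λ′ (estimated relationships) | Expected inflation λ′_{f;m;c} (true relationships) | Mean relatedness | Heritability at λ′_t = 1.05 |
|---|---|---|---|---|---|---|---|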
| HapMap | 129 | 1.288 (0.074) | 1.295 (0.051) | 1.297 | - | 0.006 | 0.152 |
| SFS1 | 129 | 1.284 (0.087) | 1.293 (0.051) | 1.294 | 1.295 | 0.007 | 0.153 |
| SFS2 | 999 | 1.306 (0.050) | 1.313 (0.020) | 1.314 | 1.299 | 0.001 | 0.143 |
| Sorbs | 977 | 1.410 (0.135) | 1.448 (0.071) | 1.449 | - | 0.001 | 0.100 |
| SFS3 | 999 | 2.006 (0.139) | 2.022 (0.083) | 2.021 | 2.002 | 0.002 | 0.044 |
Variance inflation and related measures are compared between the data sets HapMap, SFS1 (synthetic family study 1), SFS2, Sorbs and SFS3 assuming […]. Provided are the sample size n, the average inflation over all SNPs, the average inflation estimated for SNPs with minor allele frequencies > 10%, the expected (theoretical) inflation λ′ obtained from estimated relationships, the expected inflation λ′_{f;m;c} obtained from true relationships (synthetic family studies only), the mean relatedness, and the heritability corresponding to inflation λ′_t = 1.05. Standard deviations are given in parentheses.
Fig. 1. Expected variance inflation for synthetic family studies. The figure presents the expected variance inflation λ′_{f;m;c} for heritability […] and family studies with varying numbers of mothers m and children c, each between 1 and 10, and with a total of about n = 1000 individuals. The background colour corresponds to the values presented and ranges from white for the minimum to black for the maximum inflation.
Simulation results for the test statistic T under the null hypothesis
| Study | Mean test statistic T (SD) | Mean empirical variance (SD) | Deflation factor ν |
|---|---|---|---|
| HapMap | 0.002 (0.037) | 1.330 (0.096) | 0.992 |
| SFS1 | -0.000 (0.037) | 1.321 (0.107) | 0.992 |
| SFS2 | -0.001 (0.037) | 1.309 (0.076) | 0.999 |
| Sorbs | -0.001 (0.037) | 1.412 (0.144) | 0.999 |
| SFS3 | 0.001 (0.043) | 2.015 (0.166) | 0.997 |
The test statistics averaged over replicates and SNPs and the average of the empirical variances are compared between HapMap, SFS1 (synthetic family study 1), SFS2, Sorbs and SFS3 assuming the null hypothesis and […]. Standard deviations are presented in parentheses. We further provide an estimate of the deflation factor ν for the empirical variance of the effect estimate.
Simulation results for the test statistic T under the alternative hypothesis
| Study | Mean test statistic T (SD) | Mean empirical variance (SD) | Expected value μ of T |
|---|---|---|---|
| HapMap | 1.619 (0.037) | 1.343 (0.095) | 1.600 |
| SFS1 | 1.619 (0.036) | 1.336 (0.112) | 1.600 |
| SFS2 | 4.472 (0.036) | 1.330 (0.076) | 4.468 |
| Sorbs | 4.420 (0.039) | 1.432 (0.148) | 4.418 |
| SFS3 | 4.479 (0.046) | 2.030 (0.162) | 4.468 |
The test statistics averaged over replicates and SNPs and the average of the empirical variances are compared between HapMap, SFS1 (synthetic family study 1), SFS2, Sorbs and SFS3 assuming the alternative hypothesis with […] and heritability […]. Standard deviations are presented in parentheses. We further provide the expected value μ of the test statistic T.
Fig. 2. Comparison of type I errors with respect to different degrees of variance inflation. The figure provides a comparison of type I errors dependent on the significance level α without variance inflation (λ = 1) and with variance inflation (λ = 1.05, 1.3 and 2). The negative common logarithm is presented for α as well as for the type I error. The grey vertical line corresponds to a significance level of α = 0.05.
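The dependence of the type I error on λ can be sketched numerically. The following Python snippet is only an illustration under stated assumptions: the uncorrected test statistic T is taken to be normally distributed with mean 0 and variance λ under the null hypothesis, and it is compared against standard normal critical values at a two-sided level α. The function name `inflated_type1_error` is hypothetical and does not come from the paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def inflated_type1_error(alpha: float, lam: float) -> float:
    """Two-sided type I error at nominal level alpha.

    Assumption (not from the source): T ~ N(0, lam) under the null,
    while the critical value is taken from N(0, 1).
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    # P(|T| > z_crit) with T = sqrt(lam) * Z and Z ~ N(0, 1)
    return 2.0 * (1.0 - STD_NORMAL.cdf(z_crit / lam ** 0.5))


if __name__ == "__main__":
    # Inflation values used in the figure caption
    for lam in (1.0, 1.05, 1.3, 2.0):
        print(f"lambda = {lam}: type I error = {inflated_type1_error(0.05, lam):.4f}")
```

With λ = 1 the function returns the nominal level α, and the error grows monotonically with λ, which matches the qualitative behaviour the caption describes.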
Fig. 3. Comparison of power with respect to different degrees of variance inflation. Both panels provide a comparison of power in percent dependent on the significance level α without variance inflation (λ = 1) and with variance inflation (λ = 1.05, 1.3 and 2). Panel a corresponds to the uncorrected test statistic, whereas panel b refers to the test statistic after genomic control. The negative common logarithm is presented for α. The grey vertical line corresponds to a significance level of α = 0.05. An explained variance of […] was assumed. Sample size was set to n = 1000.
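The contrast between the uncorrected and the genomic-control-corrected test can be illustrated with a small calculation. This is a sketch under assumptions that are not stated in the source: the uncorrected statistic T is taken as N(μ, λ), genomic control is modelled as dividing T by √λ with λ treated as known rather than estimated, and μ = 4.468 is borrowed from the expected value reported for n = 999 in the simulation table purely as an example. The function name `power_two_sided` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def power_two_sided(alpha: float, mu: float, lam: float,
                    genomic_control: bool = False) -> float:
    """Probability of rejecting at two-sided level alpha.

    Assumptions (not from the source): T ~ N(mu, lam); genomic control
    rescales T to T / sqrt(lam) using a known inflation factor lam.
    """
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    if genomic_control:
        # Corrected statistic: mean mu / sqrt(lam), unit variance
        dist = NormalDist(mu / sqrt(lam), 1.0)
    else:
        # Uncorrected statistic: mean mu, standard deviation sqrt(lam)
        dist = NormalDist(mu, sqrt(lam))
    # P(X > z_crit) + P(X < -z_crit)
    return (1.0 - dist.cdf(z_crit)) + dist.cdf(-z_crit)


if __name__ == "__main__":
    mu = 4.468  # illustrative value taken from the simulation table (n = 999)
    for lam in (1.0, 1.05, 1.3, 2.0):
        raw = power_two_sided(0.05, mu, lam)
        gc = power_two_sided(0.05, mu, lam, genomic_control=True)
        print(f"lambda = {lam}: uncorrected = {raw:.3f}, genomic control = {gc:.3f}")
```

Under these assumptions the corrected test keeps the nominal level when μ = 0 but loses power as λ increases, whereas the uncorrected test exceeds the nominal level under the null.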