Jason Westra, Nicholas Hartman, Bethany Lake, Gregory Shearer, Nathan Tintle.
Abstract
Standard approaches to evaluate the impact of single nucleotide polymorphisms (SNPs) on quantitative phenotypes use linear models. However, these normal-based approaches may not optimally model phenotypes that are better represented by Gaussian mixture distributions (e.g., some metabolomics data). We develop a likelihood ratio test on the mixing proportions of two-component Gaussian mixture distributions and consider more restrictive models to increase power in light of a priori biological knowledge. Data were simulated to validate the improved power of the likelihood ratio test and the restricted likelihood ratio test over a linear model and a log-transformed linear model. Then, using real data from the Framingham Heart Study, we analyzed 20,315 SNPs on chromosome 11, demonstrating that the proposed likelihood ratio test identifies SNPs well known to participate in the desaturation of certain fatty acids. Our study both validates the approach of increasing power by using a likelihood ratio test that leverages Gaussian mixture models, and creates a model with improved sensitivity and interpretability.
Year: 2018 PMID: 29218908 PMCID: PMC5757879
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928
Figure 1 visually illustrates the null and alternative models. The black, light grey, and dark grey two-component mixture distributions are the phenotype distributions for the less common homozygote, the heterozygote, and the more common homozygote, respectively. In the null model, 75% of the observations in each genotype are in the component with the smaller mean. In the alternative model, the mixing proportion for the component density with the smaller mean varies across genotypes.
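The comparison between a null and an alternative model of this kind can be sketched in code. The following is a minimal illustration, not the authors' implementation: it evaluates the two-component mixture log-likelihood at fixed parameter values and converts the likelihood ratio statistic to a p-value. The function names, the standard deviation shared by both components, and the chi-square reference distribution with 1 degree of freedom are assumptions made for the example. A full analysis would maximize both likelihoods over the unknown parameters rather than plug in fixed values.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_loglik(ys, genotypes, weights, mean_low, mean_high, sd):
    """Log-likelihood of a two-component Gaussian mixture.

    weights[g] is the mixing proportion of the component with the
    smaller mean for genotype g (0, 1, or 2). A common sd for both
    components is an assumption of this sketch.
    """
    total = 0.0
    for y, g in zip(ys, genotypes):
        w = weights[g]
        density = (w * normal_pdf(y, mean_low, sd)
                   + (1.0 - w) * normal_pdf(y, mean_high, sd))
        total += math.log(density)
    return total


def lrt_pvalue_1df(loglik_null, loglik_alt):
    """Likelihood ratio statistic and its chi-square (1 df) p-value.

    For 1 degree of freedom the chi-square survival function is
    erfc(sqrt(x / 2)), so no external library is needed.
    """
    stat = max(0.0, 2.0 * (loglik_alt - loglik_null))
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Toy data: phenotype values and genotype codes (hypothetical).
ys = [0.0, 1.0, 1.2, -0.1]
genotypes = [0, 2, 2, 0]

# Null: the same mixing proportion in every genotype.
ll_null = mixture_loglik(ys, genotypes, [0.75, 0.75, 0.75], 0.0, 1.0, 0.75)
# Alternative: the mixing proportion varies across genotypes.
ll_alt = mixture_loglik(ys, genotypes, [0.75, 0.65, 0.55], 0.0, 1.0, 0.75)

stat, p_value = lrt_pvalue_1df(ll_null, ll_alt)
print(f"LRT statistic = {stat:.4f}, p-value = {p_value:.4f}")
```

Evaluating the likelihood at fixed values keeps the example short and deterministic; replacing the fixed weights with maximum likelihood estimates would require a numerical optimizer or an EM routine.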
LRTpro
| Genotype | Component 1 of Mixture Distribution | Component 2 of Mixture Distribution |
|---|---|---|
| 0 | 1 − p | p |
| 1 | 1 − (pq) | pq |
| 2 | 1 − (pq²) | pq² |
LRTadd
| Genotype | Component 1 of Mixture Distribution | Component 2 of Mixture Distribution |
|---|---|---|
| 0 | 1 − p | p |
| 1 | 1 − (p − q) | p − q |
| 2 | 1 − (p − 2q) | p − 2q |
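The genotype-specific mixing proportions implied by the two restricted models can be computed directly. The sketch below assumes that p denotes the mixing proportion for genotype 0 and q the per-allele effect, multiplicative for LRTpro and subtractive for LRTadd. That parameterization is inferred from the true values reported in the mixing-proportion estimates table (for example 0.75, 0.5625, 0.4219 and 0.75, 0.65, 0.55), and the function name `mixing_proportions` is hypothetical.

```python
def mixing_proportions(p, q, model):
    """Return the mixing proportion for genotypes 0, 1, and 2.

    model="pro": p, p*q, p*q**2   (proportional change per allele)
    model="add": p, p-q, p-2*q    (additive change per allele)
    This parameterization is an assumption inferred from the tables.
    """
    if model == "pro":
        props = [p * q ** g for g in range(3)]
    elif model == "add":
        props = [p - g * q for g in range(3)]
    else:
        raise ValueError("model must be 'pro' or 'add'")
    for w in props:
        if not 0.0 <= w <= 1.0:
            raise ValueError("p and q give a proportion outside [0, 1]")
    return props


# Values matching two simulation settings reported in the tables.
print(mixing_proportions(0.75, 0.75, "pro"))
print(mixing_proportions(0.75, 0.1, "add"))
```

Under this parameterization, setting q = 1 in the proportional model or q = 0 in the additive model gives the same proportion for every genotype, which corresponds to the null model described for Figure 1.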
Type I Error Estimates
| Method | SD | Nominal significance level 0.05 | Nominal significance level 0.01 | Nominal significance level 0.001 | Kolmogorov-Smirnov test p-value |
|---|---|---|---|---|---|
| LRTpro | 0.5 | 0.0497 | 0.011 | 0.0012 | 0.6846 |
| LRTpro | 0.75 | 0.0515 | 0.0097 | 0.0010 | 0.8832 |
| LRTadd | 0.5 | 0.0472 | 0.0108 | 0.0012 | 0.7277 |
| LRTadd | 0.75 | 0.0495 | 0.0085 | 0.0008 | 0.7091 |
| LRT | 0.5 | 0.0557 | 0.0108 | 0.0012 | 0.2269 |
| LRT | 0.75 | 0.0478 | 0.0078 | 0.0013 | 0.7435 |
| Linear Model | 0.5 | 0.0538 | 0.0107 | 0.0007 | |
| Linear Model | 0.75 | 0.0458 | 0.0070 | 0.0005 | |
| Log Linear Model | 0.5 | 0.0523 | 0.0108 | 0.0007 | |
| Log Linear Model | 0.75 | 0.0460 | 0.0083 | 0.0007 | |

Kolmogorov-Smirnov test p-value: as compared to a chi-square distribution.
Power Estimates
| Model | q | MAF | p | Linear Model | Log Linear Model | LRTpro | LRTadd | LRT |
|---|---|---|---|---|---|---|---|---|
| add | 0.1 | 0.05 | 0.75 | 0.343 | 0.26 | 0.403 | 0.39 | 0.295 |
| add | 0.1 | 0.05 | 0.9 | 0.44 | 0.316 | 0.631 | 0.624 | 0.508 |
| add | 0.1 | 0.1 | 0.75 | 0.824 | 0.736 | 0.879 | 0.871 | 0.798 |
| add | 0.1 | 0.1 | 0.9 | 0.898 | 0.793 | 0.967 | 0.966 | 0.938 |
| add | 0.1 | 0.25 | 0.75 | 0.999 | 0.997 | 0.999 | 0.999 | 0.999 |
| add | 0.1 | 0.25 | 0.9 | 0.999 | 0.999 | 1 | 1 | 1 |
| pro | 0.9 | 0.05 | 0.75 | 0.12 | 0.095 | 0.156 | 0.153 | 0.105 |
| pro | 0.9 | 0.05 | 0.9 | 0.325 | 0.212 | 0.478 | 0.467 | 0.362 |
| pro | 0.9 | 0.1 | 0.75 | 0.388 | 0.31 | 0.46 | 0.451 | 0.342 |
| pro | 0.9 | 0.1 | 0.9 | 0.75 | 0.622 | 0.891 | 0.887 | 0.831 |
| pro | 0.9 | 0.25 | 0.75 | 0.904 | 0.844 | 0.936 | 0.932 | 0.892 |
| pro | 0.9 | 0.25 | 0.9 | 0.998 | 0.975 | 1 | 1 | 1 |

Power estimates for a standard deviation of 0.75 and alpha = 0.0001.
Figure 2. P-value comparison between LRTpro and the linear model.
Estimates of Means for LRTpro
| True model | True p | q | Estimate of μ1 | Standard deviation of estimate of μ1 | Estimate of μ2 | Standard deviation of estimate of μ2 |
|---|---|---|---|---|---|---|
| Add | 0.75 | 0.1 | 0.0005 | 0.02936 | 1.0022 | 0.0401 |
| Add | 0.9 | 0.1 | −0.0021 | 0.0240 | 1.0036 | 0.0740 |
| Add | 0.75 | 0.2 | −0.0007 | 0.0240 | 1.0011 | 0.0349 |
| Add | 0.9 | 0.2 | −0.0020 | 0.0206 | 1.0005 | 0.0547 |
| Pro | 0.75 | 0.75 | 0.0005 | 0.02678 | 1.0005 | 0.0356 |
| Pro | 0.9 | 0.75 | −0.0014 | 0.0110 | 1.0008 | 0.0518 |
| Pro | 0.75 | 0.9 | 0.0002 | 0.0293 | 1.0030 | 0.0400 |
| Pro | 0.9 | 0.9 | −0.0025 | 0.02496 | 1.0028 | 0.0781 |

Estimates aggregated across all settings with these parameters and all simulations within each setting, with the true values μ1 = 0 and μ2 = 1.
Estimates of Mixing Proportions for LRTpro
| Model | True p (genotype 0) | q | Estimate (genotype 0) | SD | True (genotype 1) | Estimate (genotype 1) | SD | True (genotype 2) | Estimate (genotype 2) | SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Add | 0.75 | 0.1 | 0.7509 | 0.0280 | 0.65 | 0.5197 | 0.1537 | 0.55 | 0.5290 | 0.0625 |
| Add | 0.9 | 0.1 | 0.8973 | 0.0265 | 0.8 | 0.6134 | 0.2786 | 0.7 | 0.6059 | 0.1627 |
| Add | 0.75 | 0.2 | 0.7496 | 0.0247 | 0.55 | 0.5097 | 0.0582 | 0.35 | 0.5072 | 0.1731 |
| Add | 0.9 | 0.2 | 0.8985 | 0.0220 | 0.7 | 0.6559 | 0.1220 | 0.5 | 0.5523 | 0.0744 |
| Pro | 0.75 | 0.75 | 0.7504 | 0.0248 | 0.5625 | 0.5390 | 0.0597 | 0.4219 | 0.4707 | 0.1188 |
| Pro | 0.9 | 0.75 | 0.8983 | 0.0205 | 0.6750 | 0.6627 | 0.0691 | 0.5063 | 0.5139 | 0.0677 |
| Pro | 0.75 | 0.9 | 0.7507 | 0.0284 | 0.6750 | 0.5385 | 0.1754 | 0.6075 | 0.5358 | 0.1037 |
| Pro | 0.9 | 0.9 | 0.8966 | 0.0278 | 0.8100 | 0.6290 | 0.2823 | 0.729 | 0.6170 | 0.1814 |
Estimates aggregated across all settings with these parameters and all simulations within each setting.
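A simulation of the kind summarized in these tables can be sketched as follows. This is an illustration under stated assumptions, not the authors' simulation code: genotypes are coded as the number of minor alleles and drawn as two independent alleles (Hardy-Weinberg proportions), both components share one standard deviation, and the genotype-specific weight is placed on the component with the smaller mean, following the Figure 1 description. The function name and default arguments are hypothetical. The defaults μ1 = 0, μ2 = 1 and a standard deviation of 0.75 mirror values reported in the tables.

```python
import random


def simulate_phenotypes(n, maf, weights, mean_low=0.0, mean_high=1.0,
                        sd=0.75, seed=0):
    """Simulate genotypes and two-component mixture phenotypes.

    weights[g] is the probability that an individual with genotype g
    is drawn from the component with the smaller mean (assumption).
    """
    rng = random.Random(seed)
    genotypes, phenotypes = [], []
    for _ in range(n):
        # Two independent allele draws: Hardy-Weinberg assumption.
        g = int(rng.random() < maf) + int(rng.random() < maf)
        # Pick the mixture component, then draw the phenotype from it.
        mean = mean_low if rng.random() < weights[g] else mean_high
        genotypes.append(g)
        phenotypes.append(rng.gauss(mean, sd))
    return genotypes, phenotypes


# Example: additive setting with genotype-0 proportion 0.75 and q = 0.1.
genotypes, phenotypes = simulate_phenotypes(2000, 0.25, [0.75, 0.65, 0.55])
print(len(genotypes), sum(phenotypes) / len(phenotypes))
```

Fixing the seed makes each run reproducible, which is convenient when comparing type I error or power across methods on the same simulated data sets.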
Most significant SNPs in each region
| rs# | # of SNPs | MAF | Pos | Gene | LRTpro p-value | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs10751124 | | 0.346 | 85432084 | DLG2 | 2.50 × 10^−8 | 0.062 | 0.114 | 0.162 | 0.174 | 0.100 | 0.023 |
| rs11220658 | 1 | 0.350 | 99618283 | CNTN5 | 4.52 × 10^−7 | 0.110 | 0.075 | 0.051 | 0.179 | 0.101 | 0.024 |
| rs7129015 | 5 | 0.198 | 110772485 | | 1.86 × 10^−7 | 0.105 | 0.059 | 0.034 | 0.179 | 0.101 | 0.024 |
| rs11217753 | 1 | 0.167 | 120181415 | | 2.94 × 10^−9 | 0.108 | 0.052 | 0.025 | 0.180 | 0.101 | 0.024 |
| rs174549 | 19 | 0.290 | 61803910 | FADS1 | 5.32 × 10^−312 | 0.036 | 0.183 | 0.937 | 0.160 | 0.097 | 0.024 |