Stephen Burgess, Adam Butterworth, Simon G Thompson.
Abstract
Genome-wide association studies, which typically report regression coefficients summarizing the associations of many genetic variants with various traits, are potentially a powerful source of data for Mendelian randomization investigations. We demonstrate how such coefficients from multiple variants can be combined in a Mendelian randomization analysis to estimate the causal effect of a risk factor on an outcome. The bias and efficiency of estimates based on summarized data are compared to those based on individual-level data in simulation studies. We investigate the impact of gene-gene interactions, linkage disequilibrium, and 'weak instruments' on these estimates. Both an inverse-variance weighted average of variant-specific associations and a likelihood-based approach for summarized data give similar estimates and precision to the two-stage least squares method for individual-level data, even when there are gene-gene interactions. However, these summarized data methods overstate precision when variants are in linkage disequilibrium. If the P-value in a linear regression of the risk factor for each variant is less than 1×10⁻⁵, then weak instrument bias will be small. We use these methods to estimate the causal association of low-density lipoprotein cholesterol (LDL-C) on coronary artery disease using published data on five genetic variants. A 30% reduction in LDL-C is estimated to reduce coronary artery disease risk by 67% (95% CI: 54% to 76%). We conclude that Mendelian randomization investigations using summarized data from uncorrelated variants are similarly efficient to those using individual-level data, although the necessary assumptions cannot be so fully assessed.
Keywords: Mendelian randomization; causal inference; genome-wide association study; instrumental variables; weak instruments
Year: 2013 PMID: 24114802 PMCID: PMC4377079 DOI: 10.1002/gepi.21758
Source DB: PubMed Journal: Genet Epidemiol ISSN: 0741-0395 Impact factor: 2.135
Results from simulation study with independently distributed variants
| α12 | α13 | α23 | Mean F | Method | Mean | Median | SD | Mean SE | Coverage (%) | Power (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 47.3 | 2SLS | 0.196 | 0.192 | 0.085 | 0.085 | 94.8 | 65.2 |
| | | | | IVW | 0.196 | 0.192 | 0.085 | 0.078 | 92.6 | 70.1 |
| | | | | Likelihood | 0.200 | 0.197 | 0.087 | 0.082 | 94.2 | 69.1 |
| +0.08 | +0.1 | +0.12 | 126.9 | 2SLS | 0.199 | 0.197 | 0.052 | 0.051 | 95.0 | 98.3 |
| | | | | IVW | 0.199 | 0.197 | 0.052 | 0.047 | 92.6 | 98.6 |
| | | | | Likelihood | 0.200 | 0.199 | 0.052 | 0.050 | 94.0 | 98.6 |
| −0.08 | +0.1 | +0.12 | 88.1 | 2SLS | 0.198 | 0.197 | 0.061 | 0.062 | 95.1 | 78.1 |
| | | | | IVW | 0.198 | 0.197 | 0.061 | 0.057 | 93.0 | 81.6 |
| | | | | Likelihood | 0.201 | 0.199 | 0.062 | 0.060 | 94.2 | 80.9 |
| +0.08 | −0.1 | +0.12 | 74.4 | 2SLS | 0.198 | 0.196 | 0.068 | 0.067 | 95.0 | 86.6 |
| | | | | IVW | 0.198 | 0.196 | 0.068 | 0.062 | 92.9 | 88.9 |
| | | | | Likelihood | 0.201 | 0.199 | 0.068 | 0.065 | 94.2 | 88.6 |
| +0.08 | +0.1 | −0.12 | 59.6 | 2SLS | 0.197 | 0.194 | 0.075 | 0.075 | 94.8 | 92.4 |
| | | | | IVW | 0.197 | 0.194 | 0.075 | 0.069 | 92.8 | 93.6 |
| | | | | Likelihood | 0.201 | 0.198 | 0.076 | 0.073 | 94.2 | 93.4 |
| −0.08 | −0.1 | +0.12 | 45.8 | 2SLS | 0.197 | 0.193 | 0.085 | 0.086 | 95.3 | 28.9 |
| | | | | IVW | 0.197 | 0.193 | 0.085 | 0.079 | 93.3 | 37.9 |
| | | | | Likelihood | 0.202 | 0.197 | 0.087 | 0.084 | 94.6 | 35.9 |
| −0.08 | +0.1 | −0.12 | 33.1 | 2SLS | 0.196 | 0.191 | 0.102 | 0.102 | 94.9 | 46.7 |
| | | | | IVW | 0.196 | 0.191 | 0.102 | 0.093 | 92.7 | 53.8 |
| | | | | Likelihood | 0.203 | 0.197 | 0.105 | 0.100 | 94.4 | 52.2 |
| +0.08 | −0.1 | −0.12 | 23.0 | 2SLS | 0.190 | 0.183 | 0.123 | 0.124 | 94.7 | 64.3 |
| | | | | IVW | 0.190 | 0.183 | 0.123 | 0.113 | 92.8 | 69.7 |
| | | | | Likelihood | 0.201 | 0.192 | 0.129 | 0.121 | 94.2 | 68.7 |
| −0.08 | −0.1 | −0.12 | 6.6 | 2SLS | 0.172 | 0.148 | 0.249 | 0.244 | 93.4 | 2.8 |
| | | | | IVW | 0.172 | 0.148 | 0.249 | 0.217 | 92.1 | 10.9 |
| | | | | Likelihood | 0.221 | 0.180 | 0.357 | 0.285 | 94.6 | 5.7 |
Instrumental variable estimates of causal effect +0.2 from simulated data with and without gene–gene interactions using individual-level data (two-stage least squares method, 2SLS) and summarized data (inverse-variance weighted, IVW, and likelihood-based methods) with mean F statistic, mean and median estimates across 10,000 simulations, SD of estimates, mean SE of estimates, coverage (%) of 95% confidence interval, and power (%) at a 5% significance level
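The inverse-variance weighted (IVW) method named in the caption can be sketched as follows. This is a minimal illustration using the commonly cited form of the estimator, in which each variant's ratio estimate is weighted by its approximate precision; the function name `ivw_estimate` and the input numbers are hypothetical and are not taken from the paper. The standard error in this simple form treats the variant-risk factor associations as known and uses only the uncertainty in the variant-outcome associations.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Inverse-variance weighted causal estimate from summarized data.

    beta_x: per-allele associations of each variant with the risk factor
    beta_y: per-allele associations of each variant with the outcome
    se_y:   standard errors of the variant-outcome associations
    Returns (estimate, standard_error).
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")

    # Sum of beta_x * beta_y / se_y^2 over variants
    numerator = sum(bx * by / sy**2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    # Sum of beta_x^2 / se_y^2 over variants (total weight)
    denominator = sum(bx**2 / sy**2 for bx, sy in zip(beta_x, se_y))

    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error


# Hypothetical summarized data for three uncorrelated variants,
# constructed so that beta_y = 0.2 * beta_x for every variant.
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.02, 0.04, 0.03]
se_y = [0.01, 0.01, 0.01]

est, se = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate = {est:.3f}, SE = {se:.4f}")
```

Because the sketch assumes uncorrelated variants, it should not be applied unchanged to variants in linkage disequilibrium.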
Results from simulation study with correlated variants
| r² | Mean F | Method | Mean | Median | SD | Mean SE | Coverage (%) |
|---|---|---|---|---|---|---|---|
| 0.00 | 42.6 | 2SLS | 0.195 | 0.191 | 0.090 | 0.090 | 94.8 |
| | | IVW | 0.195 | 0.191 | 0.090 | 0.082 | 92.8 |
| | | Likelihood | 0.200 | 0.196 | 0.092 | 0.087 | 94.1 |
| 0.06 | 47.8 | 2SLS | 0.196 | 0.193 | 0.086 | 0.085 | 94.5 |
| | | IVW | 0.197 | 0.194 | 0.086 | 0.073 | 90.3 |
| | | Likelihood | 0.201 | 0.198 | 0.087 | 0.077 | 92.0 |
| 0.13 | 52.6 | 2SLS | 0.197 | 0.194 | 0.080 | 0.081 | 95.0 |
| | | IVW | 0.199 | 0.196 | 0.080 | 0.066 | 89.5 |
| | | Likelihood | 0.202 | 0.199 | 0.081 | 0.070 | 91.2 |
| 0.26 | 63.3 | 2SLS | 0.197 | 0.193 | 0.074 | 0.073 | 94.6 |
| | | IVW | 0.199 | 0.196 | 0.074 | 0.055 | 85.1 |
| | | Likelihood | 0.201 | 0.198 | 0.074 | 0.058 | 87.1 |
| 0.41 | 74.8 | 2SLS | 0.198 | 0.196 | 0.067 | 0.067 | 95.0 |
| | | IVW | 0.201 | 0.199 | 0.068 | 0.046 | 82.1 |
| | | Likelihood | 0.202 | 0.200 | 0.068 | 0.048 | 84.2 |
Instrumental variable estimates of causal effect +0.2 from simulated data with correlated variants (correlation measured by r², the average squared correlation between variants) using individual-level data (two-stage least squares method, 2SLS) and summarized data (inverse-variance weighted, IVW, and likelihood-based methods) with mean F statistic, mean and median estimates across simulations, SD of estimates, mean SE of estimates, coverage (%) of 95% confidence interval
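One way to read the correlated-variant results is to compare the mean reported SE with the empirical SD of the estimates: a ratio below 1 means the method reports more precision than the simulations support. The short sketch below computes that ratio for the IVW rows, using the SD and mean SE values copied from the table; the variable names are illustrative only.

```python
# (average squared correlation, SD of estimates, mean SE) for the IVW rows
# of the correlated-variant simulation table.
ivw_rows = [
    (0.00, 0.090, 0.082),
    (0.06, 0.086, 0.073),
    (0.13, 0.080, 0.066),
    (0.26, 0.074, 0.055),
    (0.41, 0.068, 0.046),
]

# Ratio of reported to empirical uncertainty; values well below 1 indicate
# standard errors that are too small (overstated precision).
se_to_sd = [(r2, mean_se / sd) for r2, sd, mean_se in ivw_rows]

for r2, ratio in se_to_sd:
    print(f"r2 = {r2:.2f}: mean SE / SD = {ratio:.2f}")
```

The ratio falls steadily as the correlation increases, which matches the drop in coverage from 92.8% to 82.1% in the same rows.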
Figure 1. Per allele associations of five genetic variants with low-density lipoprotein cholesterol (LDL-C) and risk of coronary artery disease (CAD) taken from Waterworth et al. [2010], with causal estimate (and 95% confidence interval) of effect of LDL-C on CAD risk (likelihood-based method assuming zero correlation).
Causal odds ratios of CAD per 30% reduction in LDL-C
| Method | Correlation (ρ) | Estimate | 95% confidence interval |
|---|---|---|---|
| IVW | – | 0.33 | 0.25, 0.45 |
| Likelihood-based | 0 | 0.33 | 0.24, 0.46 |
| Likelihood-based | −0.4 | 0.33 | 0.23, 0.48 |
| Likelihood-based | −0.2 | 0.33 | 0.24, 0.47 |
| Likelihood-based | −0.1 | 0.33 | 0.24, 0.46 |
| Likelihood-based | 0.1 | 0.33 | 0.25, 0.45 |
| Likelihood-based | 0.2 | 0.33 | 0.25, 0.45 |
| Likelihood-based | 0.4 | 0.34 | 0.26, 0.44 |
Instrumental variable estimates of causal effect of low-density lipoprotein cholesterol (LDL-C) on risk of coronary artery disease (CAD) using inverse-variance weighted (IVW) method and likelihood-based method for different values of the correlation parameter (ρ)
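The odds ratios in this table correspond to the percentage risk reductions quoted in the abstract. The sketch below shows the arithmetic for the likelihood-based estimate with ρ = 0, reading the odds ratio as an approximate relative risk, as the abstract's wording does; the helper name `percent_risk_reduction` is hypothetical.

```python
def percent_risk_reduction(odds_ratio):
    """Convert an odds ratio into an approximate percentage risk reduction."""
    return 100.0 * (1.0 - odds_ratio)


# Likelihood-based estimate with zero correlation, taken from the table.
estimate, ci_lower, ci_upper = 0.33, 0.24, 0.46

reduction = round(percent_risk_reduction(estimate))
# The upper limit of the odds ratio gives the smaller reduction, and vice versa.
reduction_low = round(percent_risk_reduction(ci_upper))
reduction_high = round(percent_risk_reduction(ci_lower))

print(f"{reduction}% reduction (95% CI: {reduction_low}% to {reduction_high}%)")
```

Note that the confidence limits swap order under this conversion, because a smaller odds ratio means a larger reduction.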