Yoonsu Cho, Philip C Haycock, Eleanor Sanderson, Tom R Gaunt, Jie Zheng, Andrew P Morris, George Davey Smith, Gibran Hemani.
Abstract
In Mendelian randomization (MR) analysis, variants that exert horizontal pleiotropy are typically treated as a nuisance. However, they could be valuable in identifying alternative pathways to the traits under investigation. Here, we develop MR-TRYX, a framework that exploits horizontal pleiotropy to discover putative risk factors for disease. We begin by detecting outliers in a single exposure-outcome MR analysis, hypothesising they are due to horizontal pleiotropy. We search across hundreds of complete GWAS summary datasets to systematically identify other (candidate) traits that associate with the outliers. We develop a multi-trait pleiotropy model of the heterogeneity in the exposure-outcome analysis due to pathways through candidate traits. Through detailed investigation of several causal relationships, many pleiotropic pathways are uncovered with already established causal effects, validating the approach, but also alternative putative causal pathways. Adjustment for pleiotropic pathways reduces the heterogeneity across the analyses.
Year: 2020 PMID: 32081875 PMCID: PMC7035387 DOI: 10.1038/s41467-020-14452-4
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Conceptual framework of the study.
Illustration of identifying putative factors that influence the original observations. a Where (gx) is the SNP–exposure effect, (xy) is the exposure–outcome effect as estimated through MR analysis from the non-outlier SNPs, (gp) is the SNP–candidate trait effect and (py) is the causal effect of the candidate trait on the outcome. b The open circles represent valid instruments and the slope of the dotted line represents the causal effect estimate of the exposure on the outcome. The closed red circle represents an outlier SNP which influences the outcome through two independent pathways, P and X. c One way in which the red SNP can exhibit a larger influence on the outcome than expected given its effect on the exposure is if it influences the outcome additionally through another pathway (horizontal pleiotropy). d Using the MR-Base database of GWAS summary data for hundreds of traits we can search for ‘candidate traits’ with which the outlier SNP has an association. e Instruments excluding the original outlier SNP are obtained for each candidate trait, LASSO-based multivariable MR is used to prune the candidate traits to avoid redundancy, and the causal influence of each of those independent candidate traits on the outcome can subsequently be estimated. This allows us to identify alternative traits that putatively influence the outcome and adjust the SNP–outcome associations for pleiotropic pathways in the original exposure–outcome model.
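The adjustment described in panels a and e can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the pleiotropic pathway is additive (observed SNP–outcome effect ≈ gx·xy + gp·py), uses `gy` as a hypothetical name for the observed SNP–outcome effect, and substitutes a first-order delta-method standard error that treats the estimates as independent. All numbers are made up.

```python
import math

def adjust_snp_outcome(gy, gy_se, gp, gp_se, py, py_se):
    """Subtract the candidate-trait pathway (gp * py) from a SNP-outcome effect.

    Assumptions (not taken from the source): additive pathways and
    independent estimates, with a delta-method variance for gp * py.
    """
    indirect = gp * py
    indirect_var = (gp ** 2) * (py_se ** 2) + (py ** 2) * (gp_se ** 2)
    adjusted = gy - indirect
    adjusted_se = math.sqrt(gy_se ** 2 + indirect_var)
    return adjusted, adjusted_se

# Hypothetical outlier SNP: part of its outcome effect runs through trait P.
gy_adj, gy_adj_se = adjust_snp_outcome(
    gy=0.050, gy_se=0.010,   # observed SNP-outcome effect
    gp=0.100, gp_se=0.020,   # SNP-candidate trait effect
    py=0.300, py_se=0.050,   # candidate trait-outcome causal effect
)
print(gy_adj, gy_adj_se)
```

Under these assumptions the adjusted effect is smaller than the observed one and its standard error is larger, because the uncertainty in gp and py is carried into the adjusted estimate.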
Fig. 2 Simulations comparing methods across different scenarios.
We evaluated three scenarios: confounding pleiotropy, horizontal pleiotropy and mediated pleiotropy (one column of graphs per scenario, with DAGs illustrating each; see Methods for full details). The x-axis of each graph represents the proportion of variants used to instrument x that were simulated to exhibit pleiotropic effects. Typically, 30 instruments were simulated directly for x, but this varies across scenarios where necessary. The y-axis of the first row of graphs represents the proportion of simulations that lead to unbiased effect estimates of x on y. The y-axis of the second row of graphs represents the sensitivity and specificity of the analysis across the simulations, where the area under the receiver operating characteristic curve (AUROC) represents the ability of the method to distinguish between simulations in which the causal effect of x on y is either null or not null. For all graphs, higher y-axis values are better. Seven methods are evaluated at each simulation. Raw = IVW random effects estimates applied to all detected instruments; Removed = either all outliers are removed, or only outliers detected to associate with a candidate trait; MVMR = multivariable MR using either candidate traits detected to associate with any instrument or using only candidate traits associated with outlier instruments; Adjusted = adjusting SNP–outcome associations for candidate traits applied either only to variants detected to be outliers, or all variants regardless of outlier status.
Candidate traits associated with both exposure and outcome.
| Outlier SNPs | Nearest gene | Category | Phenotypes (a) | N SNPs (b) | Beta (95% CI) (c) |
|---|---|---|---|---|---|
| | | Early development | Birth weight of first child | 40 | −0.312 (−0.498, −0.126) |
| | | Anthropometric measures | Standing height | 577 | −0.208 (−0.264, −0.152) |
| | | Lipid | LDL cholesterol | 78 | 0.393 (0.290, 0.497) |
| | | | HDL cholesterol | 86 | −0.172 (−0.288, −0.055) |
| | | | Total cholesterol | 86 | 0.378 (0.271, 0.484) |
| | | Medications | Self-reported status of ibuprofen intake | 2 | −16.726 (−37.262, −3.811) |
| rs653178 | | Early development | Birth weight of first child | 31 | 0.347 (0.065, 0.628) |
| | | | Birth weight | 40 | −0.312 (−0.498, −0.126) |
| | | Anthropometric measures | Comparative height size at age 10 | 357 | −0.248 (−0.342, −0.154) |
| | | | Hip circumference | 275 | 0.131 (0.030, 0.231) |
| | | | Impedance of arm (left) | 305 | −0.263 (−0.380, −0.145) |
| | | | Standing height | 577 | −0.208 (−0.264, −0.152) |
| | | Lipid | HDL cholesterol | 78 | 0.393 (0.290, 0.497) |
| | | | LDL cholesterol | 86 | −0.172 (−0.288, −0.055) |
| | | | Total cholesterol | 86 | 0.378 (0.271, 0.484) |
| | | Disease | Hypothyroidism/myxoedema (self-reported) | 77 | 0.847 (0.211, 1.483) |
| | | Smoking | Past tobacco smoking | 41 | −0.265 (−0.500, −0.029) |
| | | Medications | Treatment/medication: levothyroxine sodium | 51 | 1.231 (0.270, 2.191) |
| rs642803 | | Anthropometric measures | Waist circumference | 218 | 0.458 (0.352, 0.563) |
| | | Disease | Malabsorption/coeliac disease (self-reported) | 11 | −8.401 (−12.842, −3.961) |
| rs13107325 | | Anthropometric measures | Impedance of leg (left) | 282 | 0.179 (0.047, 0.311) |
| | | Memory | Prospective memory result | 2 | 4.493 (1.851, 7.135) |
| | | Drinking | Alcohol intake frequency | 31 | 0.347 (0.065, 0.628) |
| | | Exercise | Usual walking pace | 22 | −1.595 (−2.364, −0.825) |
SNP single-nucleotide polymorphism, VLDL very low-density lipoprotein, HDLC high-density lipoprotein cholesterol, LDLC low-density lipoprotein cholesterol, N SNPs number of SNPs, CI confidence interval.
a Candidate traits that are associated with outliers (p < 5 × 10⁻⁸) and with both exposure and outcome are listed. The listed traits were used in the adjusted model to investigate whether they are associated with the hypothesised outcome.
b The number of SNPs used for two-sample MR analysis of candidate traits on the outcome.
c Results are presented as IVW beta coefficients (95% CI), derived from two-sample MR analyses. Empirical analysis 1: systolic blood pressure (mmHg) and coronary heart disease (log odds); Empirical analysis 2: urate (mg/dl) and coronary heart disease (log odds); Empirical analysis 3: sleep duration (hour/night) and schizophrenia (log odds); Empirical analysis 4: years of schooling (years) and body mass index (kg/m²).
Results of empirical analyses with different IV estimators derived from various MR methods. Values are estimates (95% CIs).
| Methods | All variants | Removing outliers | Removing candidate outliers | Adjustment for candidate outliers |
|---|---|---|---|---|
| Heterogeneity ( | 682.7 ( | 312.1 ( | 448.7 ( | 567.6 ( |
| IVW random effects | 1.761 (1.474, 2.104) | 1.876 (1.655, 2.125) | 1.797 (0.558, 5.789) | 1.706 (1.449, 2.008) |
| Egger random effects | 2.641 (1.490, 4.679) | 2.951 (1.970, 4.419) | 2.206 (0.314, 15.472) | – |
| Intercept | 0.980 (0.969, 0.992) | 0.990 (0.982, 0.998) | 0.996 (0.988, 1.004) | – |
| Weighted median | 1.770 (1.528, 2.050) | 1.782 (1.539, 2.065) | 1.765 (0.576, 5.403) | – |
| Weighted mode | 1.770 (1.264, 2.479) | 1.726 (1.218, 2.447) | 1.740 (0.600, 5.043) | – |
| Heterogeneity ( | 81.6 ( | 20.7 ( | 33.4 ( | 44.1 ( |
| IVW random effects | 1.081 (0.996, 1.174) | 1.054 (1.008, 1.103) | 1.062 (1.057, 1.122) | 1.070 (0.992, 1.155) |
| Egger random effects | 0.952 (0.846, 1.071) | 1.008 (0.937, 1.084) | 0.990 (0.910, 1.077) | – |
| Intercept | 1.015 (1.003, 1.027) | 1.006 (0.998, 1.014) | 0.992 (0.984, 1.000) | – |
| Weighted median | 1.019 (0.961, 1.081) | 1.016 (0.958, 1.078) | 1.017 (0.961, 1.077) | – |
| Weighted mode | 1.028 (0.975, 1.084) | 1.022 (0.966, 1.082) | 1.025 (0.970, 1.083) | – |
| Heterogeneity ( | 204.8 ( | 54.1 ( | 121.4 ( | 147.7 ( |
| IVW random effects | 1.184 (0.573, 2.445) | 1.289 (0.828, 2.008) | 1.215 (0.674, 2.192) | 1.181 (0.634, 2.197) |
| Egger random effects | 0.866 (0.056, 13.383) | 2.428 (0.485, 12.158) | 2.363 (0.254, 21.955) | – |
| Intercept | 1.004 (0.968, 1.042) | 0.991 (0.969, 1.013) | 0.991 (0.963, 1.020) | – |
| Weighted median | 1.276 (0.774, 2.104) | 1.249 (0.746, 2.090) | 1.250 (0.761, 2.052) | – |
| Weighted mode | 1.327 (0.679, 2.593) | 1.504 (0.728, 3.105) | 1.428 (0.702, 2.904) | – |
| Heterogeneity ( | 211.9 ( | 101.9 ( | 101.9 ( | 197.8 ( |
| IVW random effects | −0.272 (−0.386, −0.158) | −0.232 (−0.314, −0.150) | −0.232 (−0.314, −0.150) | −0.265 (−0.377, −0.153) |
| Egger random effects | 0.013 (−0.677, 0.703) | −0.404 (−0.910, 0.102) | −0.404 (−0.910, 0.102) | – |
| Intercept | −0.005 (−0.017, 0.007) | 0.003 (−0.005, 0.011) | 0.003 (−0.005, 0.011) | – |
| Weighted median | −0.209 (−0.307, −0.111) | −0.217 (−0.315, −0.119) | −0.217 (−0.315, −0.119) | – |
| Weighted mode | −0.141 (−0.413, 0.131) | −0.127 (−0.405, 0.151) | −0.127 (−0.405, 0.151) | – |
N SNPs number of single nucleotide polymorphisms, 95% CIs 95% confidence intervals, IVW inverse variance weighted. Empirical analysis 1: systolic blood pressure (mmHg) and coronary heart disease (log odds); Empirical analysis 2: urate (mg/dl) and coronary heart disease (log odds); Empirical analysis 3: sleep duration (hour/night) and schizophrenia (log odds); Empirical analysis 4: years of schooling (years) and body mass index (kg/m²).
a Heterogeneity amongst the estimates was assessed based on the contribution of each individual variant to Cochran's Q statistic.
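The per-variant contribution to Cochran's Q mentioned in the footnote can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: it uses Wald ratios with first-order inverse-variance weights (gx²/se(gy)²), and the function name and summary statistics are hypothetical.

```python
def ivw_and_q(gx, gy, gy_se):
    """Return the IVW estimate, each variant's Q contribution, and total Q.

    Assumption: first-order weights that ignore uncertainty in gx.
    """
    ratios = [y / x for x, y in zip(gx, gy)]                # Wald ratio per SNP
    weights = [(x / se) ** 2 for x, se in zip(gx, gy_se)]   # inverse-variance weights
    ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q_i = [w * (r - ivw) ** 2 for w, r in zip(weights, ratios)]
    return ivw, q_i, sum(q_i)

# Hypothetical summary statistics; the last SNP has an inflated outcome effect.
gx = [0.10, 0.12, 0.08, 0.11]
gy = [0.020, 0.025, 0.015, 0.060]
gy_se = [0.005, 0.005, 0.005, 0.005]

ivw, q_i, q_total = ivw_and_q(gx, gy, gy_se)
largest = q_i.index(max(q_i))  # index of the variant contributing most to Q
print(ivw, q_total, largest)
```

A variant whose contribution is large relative to a chi-square distribution with one degree of freedom is a candidate outlier; the choice of threshold is left to the analyst.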
Fig. 3 Causal associations between candidate exposures and hypothesised outcome.
Each candidate trait related to an outlier from an analysis is represented by a point in these plots. Along the x-axis, different phenotype groups are shown in different colours. The y-axis presents the log-transformed P value for each trait, multiplied by the sign of the causal effect estimate on the outcome. Filled circles in each category indicate evidence of association between candidate traits and exposure or outcome (using an FDR < 0.05 threshold; see Methods for discussion of this). a Empirical analysis 1: systolic blood pressure (mmHg) and coronary heart disease (log odds). b Empirical analysis 2: urate (mg/dl) and coronary heart disease (log odds). c Empirical analysis 3: sleep duration (hour/night) and schizophrenia (log odds). d Empirical analysis 4: years of schooling (years) and body mass index (kg/m²).
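The y-axis quantity can be computed as in the short sketch below. It assumes the transformation is −log10(P), which the caption does not state explicitly, and the function name is hypothetical.

```python
import math

def signed_log_p(beta, p_value):
    """Return -log10(p) carrying the sign of the causal effect estimate.

    Assumption: base-10 logarithm. A beta of exactly zero is treated as positive.
    """
    return math.copysign(-math.log10(p_value), beta)

# Hypothetical traits: one risk-increasing, one protective.
up = signed_log_p(beta=0.393, p_value=1e-8)
down = signed_log_p(beta=-0.172, p_value=1e-3)
print(up, down)
```

Plotting this value places risk-increasing traits above zero and protective traits below, with distance from zero reflecting the strength of evidence.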
Fig. 4 Exposure–outcome association adjusting the SNP effects on the candidate traits.
Radial plots of MR associations. The x-axis represents the weight (w) that each SNP contributes to the overall estimate, and the y-axis represents the product of the causal effect and weight of each SNP. The slopes represent causal effect estimates from different models (linetype). The arrows in this radial scatter plot indicate changes in the SNP's contribution to the overall causal effect estimate after conditioning on the effect of candidate traits on the outcome. The candidate traits that influence the association of the original exposure and the original outcome are listed in the box. a Empirical analysis 1: systolic blood pressure (mmHg) and coronary heart disease (log odds). b Empirical analysis 2: urate (mg/dl) and coronary heart disease (log odds). c Empirical analysis 3: sleep duration (hour/night) and schizophrenia (log odds). d Empirical analysis 4: years of schooling (years) and body mass index (kg/m²). Note that we use radial plots here as they explicitly show that one consequence of SNP–outcome effect adjustment is that the standard errors get larger (lower values on the x-axis). This leads to the adjusted variant contributing less weight to the causal effect and heterogeneity estimates, a process that acts in concert with the intention of attenuating the pleiotropic effect.
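The radial coordinates can be sketched as below to show why a larger standard error moves a variant left on the x-axis. This is an illustration only: it assumes the plotted weight is the square root of the first-order inverse-variance weight, |gx|/se(gy), which the caption does not specify, and the function name and numbers are hypothetical.

```python
def radial_point(gx, gy, gy_se):
    """Return (x, y) for one SNP on a radial MR plot.

    Assumption: x is the square-root first-order weight |gx| / se(gy),
    and y is the Wald ratio multiplied by that weight.
    """
    w = abs(gx) / gy_se
    ratio = gy / gx
    return w, ratio * w

# Hypothetical SNP before and after adjusting its outcome effect:
# the adjusted effect is smaller and its standard error is larger.
before = radial_point(gx=0.10, gy=0.050, gy_se=0.0100)
after = radial_point(gx=0.10, gy=0.020, gy_se=0.0127)
print(before, after)
```

With these inputs the adjusted point sits at a smaller x value and closer to the origin, so it pulls less on the fitted slope and contributes less to the heterogeneity estimate.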