Jie Zheng, Denis Baird, Maria-Carolina Borges, Jack Bowden, Gibran Hemani, Philip Haycock, David M Evans, George Davey Smith.
Abstract
PURPOSE OF REVIEW: Mendelian randomization (MR) is a strategy for evaluating causality in observational epidemiological studies. MR exploits the fact that genotypes are not generally susceptible to reverse causation and confounding, due to their fixed nature and Mendel's First and Second Laws of Inheritance. MR has the potential to provide information on causality in many situations where randomized controlled trials are not possible, but the results of MR studies must be interpreted carefully to avoid drawing erroneous conclusions.
Keywords: Databases and automation tools for causal inference; Disease progression; Drug development; Hypothesis-free causality; Mendelian randomization
Year: 2017 PMID: 29226067 PMCID: PMC5711966 DOI: 10.1007/s40471-017-0128-6
Source DB: PubMed Journal: Curr Epidemiol Rep
Fig. 1 Design strategies for Mendelian randomization. a Standard MR: The causal relationship between an exposure variable (X) and an outcome (Y) is estimated using genetic variants (Z) as an instrument, regardless of the presence of variables (C) that may confound the observational association between the exposure and outcome. One method of estimation involves calculation of the Wald ratio [3] (see the Burgess review paper for a description of the various instrumental variable (IV) estimators available), where the causal estimate ($\hat{\beta}_{IV}$) is derived by dividing the estimated regression coefficient of the outcome on the single nucleotide polymorphism (SNP) ($\hat{\beta}_{ZY}$) by the estimated regression coefficient of the exposure on the SNP ($\hat{\beta}_{ZX}$). b Two-sample MR. c Bidirectional MR. d Mediation and two-step MR. e Multivariable MR. f Factorial MR
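The Wald ratio described in the caption can be illustrated with a short worked sketch. The function name `wald_ratio`, its argument names, and the numbers in the usage note are illustrative assumptions, not taken from the review. The standard error uses a delta-method approximation that assumes the SNP-exposure and SNP-outcome estimates are uncorrelated, as would be expected when they come from separate samples.

```python
import math


def wald_ratio(beta_zy, se_zy, beta_zx, se_zx=0.0):
    """Wald ratio estimate for a single SNP, with an approximate standard error.

    beta_zy, se_zy: SNP-outcome regression coefficient and its standard error.
    beta_zx, se_zx: SNP-exposure regression coefficient and its standard error.
    Assumption: the two coefficient estimates are uncorrelated. With se_zx=0
    the formula reduces to the first-order approximation se_zy / |beta_zx|.
    """
    if beta_zx == 0:
        raise ValueError("SNP-exposure coefficient must be non-zero")
    estimate = beta_zy / beta_zx
    # Delta-method variance of a ratio of two independent estimates.
    variance = se_zy ** 2 / beta_zx ** 2 + (beta_zy ** 2 * se_zx ** 2) / beta_zx ** 4
    return estimate, math.sqrt(variance)
```

For example, `wald_ratio(0.02, 0.005, 0.1, 0.01)` divides a SNP-outcome coefficient of 0.02 by a SNP-exposure coefficient of 0.1, giving a causal estimate of about 0.2 per unit of exposure.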
Recent Mendelian randomization studies
| Type | Exposure | Outcome | Importance | References |
|---|---|---|---|---|
| Drug target validation | | Metabolites | MR study suggests that genetic polymorphisms in the | [ |
| Drug target validation | | CHD | MR study suggests that lowering LDL cholesterol level via inhibition of | [ |
| Drug target validation | PCSK9 (evolocumab) | CHD | Genetic evidence suggests that polymorphisms in the | [ |
| Drug target validation | CRP | CHD | Despite serum levels of C-reactive protein having marked clinical utility as a biomarker of inflammation, MR studies have consistently failed to find evidence for a causal effect of serum C-reactive protein on risk of cardio-metabolic disease. These MR studies have potentially saved pharmaceutical companies from developing agents that would be destined to fail in RCTs | [ |
| Drug target validation | Lp-PLA2 (darapladib) | CHD | Many resources were spent on trials that showed therapeutic lowering of Lp-PLA2 level does not lower risk of CVD; some MR studies were published before the reporting of RCT results | [ |
| Drug target repurposing | IL6 (tocilizumab) | CHD | Repurposing blockade of the interleukin-6 receptor (tocilizumab) as therapeutic approach to prevention of coronary heart disease | [ |
| Predicting side effects of drug targets | | Type 2 diabetes | MR studies indicated that the increased incidence of type 2 diabetes among statin users in RCTs was likely to be due to an on-target effect of statins on | [ |
| Predicting side effects of drug targets | | Type 2 diabetes | Suggests | [ |
| Public health/clinical practice | Adiposity (BMI and waist–hip ratio) | CHD | Suggests adiposity causes CHD, heart failure and ischemic stroke. No trial has yet shown this causal relationship | [ |
| Public health/clinical practice | LDL cholesterol | Diabetes | Suggests LDL cholesterol lowering might generally lead to increased risk of diabetes mellitus, and has potential ramifications for drugs that lower LDL cholesterol level | [ |
| Public health/clinical practice | Triglycerides | CHD | MR studies suggest that lowering triglycerides will reduce risk of CHD. | [ |
| Public health/clinical practice | Educational attainment | CHD | MR findings suggest that policy interventions that increase education may reduce the burden of cardiovascular disease | [ |
| Public health/clinical practice | Vitamin D | Multiple Sclerosis | MR studies suggest that lowered vitamin D level is causally associated with increased susceptibility to MS. | [ |
| Public health/clinical practice | Alcohol | CVDs (including blood pressure, coronary artery calcification, and CHD) | Suggests alcohol is harmful to cardiovascular health at all doses of consumption, contrary to decades of observational data | [ |
| Public health/clinical practice | Telomere length | Cancers, cardiovascular diseases, and other diseases | Large-scale MR study suggests that longer telomeres increase risk for some cancers but reduce risk for some non-neoplastic diseases, including cardiovascular diseases, which highlights the value of MR at a phenome-wide scale | [ |
| Public health/clinical practice | Obesity, type 2 diabetes, Metabolic factors | Pancreatic Cancer | MR study suggests a causal role of BMI and fasting insulin in pancreatic cancer etiology | [ |
| Public health/clinical practice | Blood lipids | Prostate cancer | MR study shows evidence that higher LDL-C and triglycerides levels increase aggressive prostate cancer risk | [ |
| Public health/clinical practice | Childhood adiposity | Type 1 diabetes | MR study provides genetic evidence for childhood adiposity as a risk factor for Type 1 diabetes | [ |
| Public health/clinical practice | Vitamin D | Cancer | MR study suggests that vitamin D supplementation should not currently be recommended as a strategy for primary cancer prevention. | [ |
| Public health/clinical practice | Cannabis use | Schizophrenia | MR study suggests cannabis use to be associated with increased risk of schizophrenia | [ |
| Complex molecular traits | Blood cell traits | Complex human traits | Large-scale MR as a follow-up of GWAS of complex molecular traits, which suggests causal relationships between blood cell indices and autoimmune diseases, schizophrenia, and coronary heart disease | [ |
| Complex molecular traits | Methylation QTLs | Cardiovascular and complex traits | Systematically explores the genetic influences on complex disease mediated by DNA methylation using MR and fine mapping. MR suggests causal relationships between methylation levels and 14 cardiovascular disease traits | [ |
| Complex molecular traits | Expression QTLs | Complex human traits | Integrated expression QTLs with summary GWAS results of human diseases using SMR (more details of the method in Table | [ |
| Complex molecular traits | Protein QTLs | Vascular, neoplastic, and autoimmune diseases | Identified causal roles for protein biomarkers in disease, which suggests a causal relationship for IL1RL1-IL18R1 loci on atopic dermatitis as well as MMP-12 on CHD | [ |
Potential pitfalls in the interpretation of MR Studies and suggestions for dealing with these
| Limitation | Description | Solution |
|---|---|---|
| Weak instrument bias | Weakly associated variants (F-statistic < 10) can bias causal estimates towards the null in two-sample MR and towards the observational estimate in one-sample MR | Increase sample sizes by using large publicly available GWAS datasets (e.g., UK Biobank) or summary GWAS results data. |
| Lack of reliable genetic instruments for exposure of interest | Genetic instruments are not available for some exposures | Conduct MR on a similar exposure (proxy phenotype) for which GWAS data are available; for example, BMI is often used as a proxy for overall adiposity [
| Population stratification | Spurious associations may arise in MR where the genetic variant and the outcome are both associated with ancestral background in an admixed or stratified sample | Use genetic associations derived from within homogeneous populations only. |
| Low power | Causal estimates are imprecise (wide confidence intervals) and the MR analysis lacks power (1 − probability of a type II error) to detect a causal effect. Power is a function of sample size, variance explained in the exposure by the SNP, causal effect size, strength of confounding, and type I error rate. Approximate power can be determined using a freely available web application [ | Same as above for weak instruments |
| Horizontal pleiotropy | The genetic instrument is associated with the outcome via a pathway that does not pass through the exposure of interest [ | Better understanding of the underlying biological function of genetic variants/genes (for example, using selected candidate loci). |
| Linkage disequilibrium (LD) | Confounding can be re-introduced in the analysis through a variant in LD with the instrument that exerts an effect on the outcome through a pathway other than through the exposure of interest | As above for pleiotropy |
| Canalization/developmental compensation | An individual adapts in response to a genetic change so that the effect of that genetic change is reduced or absent. MR may produce causal estimates that are not representative of effects that would be produced by modifying the exposure | Extent of the impact of canalization on MR is currently unclear. Greater understanding of the patterns of gene expression/regulation during development is required to evaluate the plausibility and consequences of canalization |
| Complexity of biology | Because biological pathways are complex, overly simplistic interpretations can be misleading. For example, a genetic variant in the IL-6 receptor ( | Improved understanding of underlying molecular biology and exact pathways involved |
| Winner’s curse | In single-sample MR, using the same sample both to discover the genetic instruments and to estimate the causal effect should be avoided, because estimates of the SNP-exposure association will be biased upwards. | In the case of single-sample MR, using an unweighted allelic score of several variants may provide a sensitivity analysis. |
| Trait heterogeneity | Genetic instruments are sometimes associated with multiple aspects of traits (exposures). Such heterogeneity does not preclude causal inference, but it does undermine the ability to infer causality for particular dimensions of heterogeneous exposures and makes interpretation of MR analyses more difficult | At present, only biological knowledge can resolve which dimension of the biological pathway is causally relevant to a given outcome (e.g., a disease). Further research is required in this area |
| Collider bias | Collider bias occurs when the exposure and outcome of interest independently influence a third risk factor, and this third risk factor is conditioned upon. | Given the influence of selection and attrition on a study is known, this could lead to biased estimates of both phenotype and genetic association. For example, having DNA available on most participants in a birth cohort study offers the possibility of investigating the extent to which polygenic scores predict subsequent participation, which in turn would enable sensitivity analyses of the extent to which bias might distort estimates |
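The F-statistic threshold in the weak instrument row can be checked directly from GWAS summary statistics. The sketch below is a minimal illustration under a stated assumption: it uses the common per-SNP approximation F ≈ (β/SE)², which the table itself does not specify. The function names and input values are hypothetical.

```python
def f_statistics(beta_zx, se_zx):
    """Approximate per-SNP F-statistics from SNP-exposure summary data.

    Assumption: F is approximated by the squared ratio of the SNP-exposure
    coefficient to its standard error (a squared z-score).
    """
    return [(b / s) ** 2 for b, s in zip(beta_zx, se_zx)]


def flag_weak_instruments(beta_zx, se_zx, threshold=10.0):
    """Return the mean F-statistic and the indices of SNPs with F below the threshold."""
    f_values = f_statistics(beta_zx, se_zx)
    mean_f = sum(f_values) / len(f_values)
    weak = [i for i, f in enumerate(f_values) if f < threshold]
    return mean_f, weak
```

A call such as `flag_weak_instruments([0.1, 0.05], [0.01, 0.025])` flags the second SNP, whose approximate F-statistic of about 4 falls below the conventional threshold of 10.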
Databases and bioinformatic toolkits for performing MR
| Name | Note | Web link | Ref |
|---|---|---|---|
| MR-Base | GWAS summary database of more than 1100 GWAS traits and online platform to automate MR | | [ |
| MR-PRESSO | R package that allows for the evaluation of pleiotropy in multi-instrument Mendelian randomization | | [ |
| Two-sample MR | R package for MR analysis, directly links to MR-Base database via API | | [ |
| Mendelian randomization | R package for MR analysis, links to Phenoscanner database | | [ |
| MR robust | Stata package for MR analysis | | [ |
| Summary-data-based Mendelian randomization (SMR) | Linux package for MR analysis for testing expression QTL on complex diseases | | [ |
| PHESANT | R package for performing phenome scans in UK Biobank, including MR phenome-wide association studies (MR-pheWAS) | | [ |
| PhenoSpD | R scripts to estimate multiple testing correction for hypothesis-free MR | | [ |
Methods for dealing with limitations of MR
| Category | Method | Description | Reference |
|---|---|---|---|
| Estimation of causal effect | Inverse variance weighted (IVW) | Traditional MR method which uses a meta-analysis approach to combine the Wald ratio estimates of the causal effect obtained from different SNPs. The point estimates obtained from IVW MR are equivalent to a weighted linear regression of SNP-outcome associations on SNP-exposure associations with the intercept constrained to zero | [ |
| | MR-Egger | Unlike IVW, MR-Egger regression does not constrain the intercept to zero; its causal estimate (the slope) therefore represents a genotype-outcome dose-response relationship which takes pleiotropic effects into account. It requires the InSIDE assumption to hold, which means the strength of the gene-exposure association should not correlate with the strength of bias due to pleiotropy | [ |
| | Weighted median | Defined as the median of a weighted empirical density function of the ratio estimates. Can consistently estimate the causal effect if at least 50% of the information in the analysis comes from valid instruments | [ |
| | Mode-based estimate (MBE) | The MBE provides a consistent estimate of the causal effect if the most common pleiotropy value across instruments is zero. This is termed the Zero Modal Pleiotropy Assumption (ZEMPA) | [ |
| | Pleiotropy-robust MR (PRMR) | Provides an unbiased causal estimate in the presence of pleiotropy by subtracting the pleiotropic effect (of an instrument) estimated in a subgroup of the population for whom the instrument is not associated with the exposure. It assumes that such a sub-population exists, and that the measure of genetic pleiotropy in this sub-population is the same as in the general population | [ |
| | Generalized gene-environment interaction models | These methods provide unbiased estimates of the causal effect in the presence of horizontal pleiotropy by using the gene-environment interaction term in a linear interaction model as a valid instrument. They require that the strength of the gene-exposure association varies across subgroups of the environmental covariate | [ |
| | Likelihood-based methods | Assume a linear relationship between the exposure and outcome and a bivariate normal distribution for the genetic estimates used. Naturally incorporate uncertainty in, and correlation between, the SNP-exposure and SNP-outcome parameter estimates. Yield efficient estimates when their specific modelling assumptions are met (and are therefore robust to weak instrument bias in this case), but are sensitive to model misspecification (for example, due to invalid instruments). | [ |
| | Bayesian model averaging | Assumes prior probabilities for three pleiotropy models: (1) no pleiotropy; (2) pleiotropy with average value zero satisfying the InSIDE assumption; (3) non-zero average pleiotropy satisfying the InSIDE assumption. Bayesian model averaging is then used for inference | [ |
| Testing for pleiotropy | MR-Egger intercept | Intercept of the MR-Egger regression captures the average pleiotropic effect across all genetic variants | [ |
| | Cochran’s Q (IVW), Rucker’s Q’ (MR-Egger) | Measures of heterogeneity between the genetic instruments used, which could indicate pleiotropic effects | [ |
| | Cook’s distance/studentized residuals | Standard regression diagnostics that can be used to detect outliers that may distort the causal effect estimation | [ |
| | Leave-one-out analysis | Systematic removal of each genetic instrument in turn from the MR analysis to identify influential outliers | [ |
| Assessment of instrument strength | Mean F-statistic (IVW) | Used to measure the strength of genetic instruments in IVW and hence assess the influence of weak instrument bias. F < 10 is considered problematic | [ |
| | I² (MR-Egger) | Adaptation of the I² heterogeneity statistic in meta-analysis which measures the degree of regression dilution bias in the MR-Egger estimate in the two-sample setting | [ |
| Data visualization | Funnel plot | Plot of instrument strength versus causal effect estimate. Used to detect evidence of directional pleiotropy in MR-Egger analyses | [ |
| | Scatter plot | Scatter of genotype-outcome associations versus genotype-exposure associations. Used to detect departures from MR assumptions, and to compare regression slopes from different MR analyses | [ |
| | Radial plot | Plots the square root of instrument strength times the causal estimate on the vertical axis versus the square root of instrument strength on the horizontal axis. Facilitates a more straightforward detection of outliers in an IVW and MR-Egger analysis. It can also be used as a basis for a generalized form of MR-Egger regression, “radial MR-Egger” | [ |
| | Q-contribution plots | Depict the contribution of individual genetic variants towards the overall heterogeneity observed in Cochran’s Q statistic (for IVW) and Rucker’s Q’ statistic (for MR-Egger). Used to identify pleiotropic variants by assessing each SNP’s contribution against a chi-squared distribution on 1 degree of freedom | [ |
| | Forest plot | Compares the MR estimates for each genetic instrument to detect pleiotropic effects | [ |
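Several of the estimators and diagnostics in the table above reduce to weighted regressions on summary statistics. The following is a minimal sketch of four of them: IVW, MR-Egger, Cochran's Q, and leave-one-out analysis. It assumes first-order weights of 1/SE² taken from the SNP-outcome standard errors and treats the SNP-exposure estimates as measured without error. All function and variable names are hypothetical, and it is not the implementation used by any of the packages listed earlier.

```python
def _weights(se_zy):
    """Inverse-variance weights from SNP-outcome standard errors (assumed first-order weights)."""
    return [1.0 / s ** 2 for s in se_zy]


def ivw_estimate(beta_zx, beta_zy, se_zy):
    """IVW estimate: weighted regression of SNP-outcome on SNP-exposure with intercept fixed at zero."""
    w = _weights(se_zy)
    num = sum(wi * bx * by for wi, bx, by in zip(w, beta_zx, beta_zy))
    den = sum(wi * bx ** 2 for wi, bx in zip(w, beta_zx))
    return num / den


def mr_egger(beta_zx, beta_zy, se_zy):
    """MR-Egger: the same weighted regression with the intercept estimated.

    Returns (intercept, slope). SNPs are first oriented so that every
    SNP-exposure association is positive.
    """
    bx = [abs(x) for x in beta_zx]
    by = [y if x >= 0 else -y for x, y in zip(beta_zx, beta_zy)]
    w = _weights(se_zy)
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, bx))
    swy = sum(wi * y for wi, y in zip(w, by))
    swxx = sum(wi * x * x for wi, x in zip(w, bx))
    swxy = sum(wi * x * y for wi, x, y in zip(w, bx, by))
    slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    intercept = (swy - slope * swx) / sw
    return intercept, slope


def cochran_q(beta_zx, beta_zy, se_zy):
    """Cochran's Q: weighted sum of squared residuals around the IVW fit."""
    est = ivw_estimate(beta_zx, beta_zy, se_zy)
    w = _weights(se_zy)
    return sum(wi * (by - est * bx) ** 2 for wi, bx, by in zip(w, beta_zx, beta_zy))


def leave_one_out(beta_zx, beta_zy, se_zy):
    """IVW estimates obtained by dropping each SNP in turn (inputs must be lists)."""
    n = len(beta_zx)
    return [
        ivw_estimate(
            beta_zx[:i] + beta_zx[i + 1:],
            beta_zy[:i] + beta_zy[i + 1:],
            se_zy[:i] + se_zy[i + 1:],
        )
        for i in range(n)
    ]
```

In this sketch, an MR-Egger intercept far from zero, or a Cochran's Q that is large relative to a chi-squared distribution with one fewer degrees of freedom than the number of SNPs, would point towards pleiotropy. Formal inference would also need standard errors, which are omitted here for brevity.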