Lijuan Lin1, Ruyang Zhang1,2,3,4, Hui Huang1, Ying Zhu1, Yi Li5, Xuesi Dong1,6, Sipeng Shen1, Liangmin Wei1, Xin Chen1, David C Christiani2,7,8, Yongyue Wei1,2,3,4, Feng Chen1,2,3,4.
Abstract
Mendelian randomization (MR) can estimate the causal effect of a risk factor on a complex disease using genetic variants as instrumental variables (IVs). A variety of generalized MR methods have been proposed to integrate results arising from multiple IVs in order to increase power. One of these methods constructs a genetic score (GS) as a linear combination of the multiple IVs using a multiple regression model, and it has been applied broadly in medical research. However, GS-based MR requires individual-level data, which greatly limits its application in clinical research. We propose an alternative method called Mendelian Randomization with Refined Instrumental Variable from Genetic Score (MR-RIVER) to construct a genetic IV by integrating multiple genetic variants based on summarized results, rather than individual data. Compared with inverse-variance weighted (IVW) and generalized summary-data-based Mendelian randomization (GSMR), MR-RIVER maintained the type I error while possessing more statistical power. MR-RIVER also showed smaller biases and mean squared errors than IVW and GSMR. We further applied the proposed method to estimate the effects of blood metabolites on educational attainment by integrating results from several publicly available resources. MR-RIVER provided robust results under different LD pruning criteria and identified three metabolites associated with years of schooling and an additional 15 metabolites with indirect mediation effects through butyrylcarnitine. MR-RIVER, which extends score-based MR to summarized results in lieu of individual data and incorporates multiple correlated IVs, provides a more accurate and powerful means for the discovery of novel risk factors.
Keywords: Mendelian randomization; educational attainment; genetic score; metabolomics; multiple correlated instrumental variables
Year: 2021 PMID: 33868364 PMCID: PMC8044958 DOI: 10.3389/fgene.2021.618829
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
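The abstract uses inverse-variance weighted (IVW) MR as one of its comparators. As a point of reference only, the following is a minimal sketch of the standard fixed-effect IVW estimator for uncorrelated variants, computed from per-variant summary statistics. It is not MR-RIVER, and the function name `ivw_estimate` and all input values are hypothetical, chosen purely for illustration.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW causal estimate from summary statistics.

    beta_x: variant-exposure effect estimates
    beta_y: variant-outcome effect estimates
    se_y:   standard errors of beta_y
    Assumes the variants are uncorrelated (no LD between IVs).
    """
    if not beta_x or not (len(beta_x) == len(beta_y) == len(se_y)):
        raise ValueError("inputs must be non-empty and of equal length")
    weights = [1.0 / s ** 2 for s in se_y]  # inverse variance of each beta_y
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_x))
    return num / den, math.sqrt(1.0 / den)


# Hypothetical summary statistics (not taken from the paper).
beta_x = [0.10, 0.20, 0.15]
beta_y = [0.05, 0.10, 0.075]
se_y = [0.01, 0.02, 0.015]

estimate, se = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate = {estimate:.4f}, SE = {se:.4f}")
```

Because this form assumes uncorrelated variants, it would not be appropriate as written for the correlated IVs that the abstract says MR-RIVER is designed to handle.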
FIGURE 1. Diagram of Mendelian randomization and flowchart of the proposed MR-RIVER method. (A) Mendelian randomization inferring the causal association of the exposure and outcome: (i) IVs are associated with the exposure X; (ii) IVs and outcome Y are independent, conditional on exposure X and unmeasured confounders U; (iii) IVs and confounders U are independent. (B) Flowchart of the proposed MR-RIVER method for multiple genetic variants in causal inference.
FIGURE 2. Comparison of refined and traditional coefficients under different correlation structures. Expected values are the regression coefficients obtained from the multivariable regression model with all the variables used to generate dependent variable Y plotted against predicted values obtained from the refined method (refined coefficients) and traditional single-locus analyses (traditional coefficients). Refined and traditional coefficients were compared with the bias from expected coefficients under different correlation structures through a regression model. Red equation represents the relationship between expected coefficients and refined coefficients, and green equation represents traditional coefficients.
FIGURE 3. Comparison of refined and traditional coefficients under different effect sizes. Expected values are the regression coefficients obtained from the multivariable regression model with all the variables used to generate dependent variable Y plotted against predicted values obtained from the refined method (refined coefficients) and traditional single-locus analyses (traditional coefficients). Refined and traditional coefficients were compared with the bias from expected coefficients under different effect sizes through a regression model. Red equation represents the relationship between expected coefficients and refined coefficients, and green equation represents traditional coefficients.
FIGURE 4. Statistical properties of MR methods under different correlations. Correlation between IVs plotted against: (A) type I error under the null hypothesis; (B) performance of power under the alternative hypothesis with b = 1; (C) bias under the alternative hypothesis; and (D) mean square error.
FIGURE 5. Study workflow for educational attainment MR analysis.
FIGURE 6. MR-RIVER and GSMR analysis for causal association between metabolites and educational attainment. Relationship between individual metabolites with –log10 (P-value) of the association. Upper yellow values represent MR-RIVER results, and lower blue values represent GSMR results. Associated metabolites are annotated.
Relative bias of imputed datasets with three imputation methods.
| Method | Metabolite | Estimate | SE | P-value |
| --- | --- | --- | --- | --- |
| MR-RIVER | Butyrylcarnitine | −0.0430 | 0.0081 | 1.08 × 10^−07 |
| MR-RIVER | 1,5-Anhydroglucitol (1,5-AG) | −0.1916 | 0.0367 | 1.77 × 10^−07 |
| MR-RIVER | Homocitrulline | −0.2687 | 0.0708 | 1.47 × 10^−04 |
| GSMR | Biliverdin | −0.0284 | 0.0036 | 2.92 × 10^−15 |
| GSMR | 1,5-Anhydroglucitol (1,5-AG) | −0.1838 | 0.0339 | 5.83 × 10^−08 |
| GSMR | X-12092 | 0.0283 | 0.0056 | 3.85 × 10^−07 |
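As a quick consistency check on the table, the sketch below recomputes a P-value for each row. It rests on two assumptions that the record does not state explicitly: that the three numeric columns are the effect estimate, its standard error, and the P-value, and that the P-value comes from a two-sided normal (Wald) test. The helper name `two_sided_p` is hypothetical.

```python
import math


def two_sided_p(estimate, se):
    """Two-sided P-value of a Wald z-test with z = estimate / se."""
    z = abs(estimate / se)
    # For a standard normal, 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# (method, metabolite, estimate, SE, reported P) copied from the table.
rows = [
    ("MR-RIVER", "Butyrylcarnitine", -0.0430, 0.0081, 1.08e-07),
    ("MR-RIVER", "1,5-Anhydroglucitol (1,5-AG)", -0.1916, 0.0367, 1.77e-07),
    ("MR-RIVER", "Homocitrulline", -0.2687, 0.0708, 1.47e-04),
    ("GSMR", "Biliverdin", -0.0284, 0.0036, 2.92e-15),
    ("GSMR", "1,5-Anhydroglucitol (1,5-AG)", -0.1838, 0.0339, 5.83e-08),
    ("GSMR", "X-12092", 0.0283, 0.0056, 3.85e-07),
]

for method, metabolite, est, se, p_reported in rows:
    p = two_sided_p(est, se)
    print(f"{method:9s} {metabolite:30s} z={est / se:7.3f} "
          f"p={p:.2e} reported={p_reported:.2e}")
```

Under those assumptions the recomputed values should land close to the reported ones. Small differences are expected because the printed estimates and standard errors are rounded to four decimal places.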
FIGURE 7. Diagram of MR analysis between metabolites and mediation analysis. (A) MR inferring the causal association of the remaining metabolites (X) on previously identified metabolites (Y), and mediation analysis of the remaining metabolites on education years through the identified metabolites. (B) Metabolites that indirectly mediate the effect on education years through butyrylcarnitine in mediation analysis. b_CI represents the effect of metabolites on butyrylcarnitine and its 95% confidence interval (95% CI). IE_CI represents the indirect effect of metabolites on education years and its 95% CI, and IE_pval represents the corresponding P-values. (C) Causal network of blood metabolites on education years. Blue circles indicate metabolites that are directly identified, while yellow circles indicate metabolites that have an indirect effect through the blue metabolites. Red lines represent positive effects, and blue lines indicate negative effects.
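The record does not say how the indirect effect and its confidence interval in Figure 7 were computed. One common approach, shown here only as an illustrative assumption, is the product-of-coefficients estimate with a first-order delta-method (Sobel) standard error. The function name `indirect_effect` and the metabolite-to-mediator effect used below are hypothetical; the mediator-to-outcome effect reuses the MR-RIVER row for butyrylcarnitine from the table, read as estimate and standard error.

```python
import math


def indirect_effect(a, se_a, b, se_b, z_crit=1.96):
    """Product-of-coefficients indirect effect with a Sobel standard error.

    a, se_a: effect of the upstream exposure on the mediator, and its SE
    b, se_b: effect of the mediator on the outcome, and its SE
    Returns (indirect effect, SE, (CI lower, CI upper), two-sided P).
    """
    ie = a * b
    # First-order delta method, treating the two estimates as independent.
    se = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    ci = (ie - z_crit * se, ie + z_crit * se)
    p = math.erfc(abs(ie / se) / math.sqrt(2.0))
    return ie, se, ci, p


# Hypothetical effect of some metabolite on butyrylcarnitine (not from the paper).
a, se_a = 0.5, 0.1
# Butyrylcarnitine -> education years, MR-RIVER row of the table.
b, se_b = -0.0430, 0.0081

ie, se, ci, p = indirect_effect(a, se_a, b, se_b)
print(f"IE = {ie:.4f}, SE = {se:.4f}, "
      f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f}), P = {p:.2e}")
```

The Sobel interval is symmetric and relies on a normal approximation for the product; resampling-based intervals are a frequent alternative when that approximation is doubtful.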