Susana Conde1, Xiaoguang Xu1, Hui Guo1, Markus Perola2, Teresa Fazia3, Luisa Bernardinelli3, Carlo Berzuini4.
Abstract
BACKGROUND: Recent advances in data analysis methods based on principles of Mendelian Randomisation, such as Egger regression and the weighted median estimator, add to the researcher's ability to infer cause-effect links from observational data. Now is the time to gauge the potential of these methods within specific areas of biomedical research. In this paper, we choose a study in metabolomics as an illustrative testbed. We apply Mendelian Randomisation methods in the analysis of data from the DILGOM (Dietary, Lifestyle and Genetic determinants of Obesity and Metabolic syndrome) study, in the context of an effort to identify molecular pathways of cardiovascular disease. In particular, our illustrative analysis addresses the question of whether body mass, as measured by body mass index (BMI), exerts a causal effect on the concentrations of a collection of 137 cardiometabolic markers with different degrees of atherogenic power, such as the (highly atherogenic) very low density lipoprotein (VLDL) metabolites and the (protective) high density lipoprotein metabolites.
Keywords: Cardiovascular disease; Causal inference; Egger regression; Metabolomics; Multiple instrumental variables; Weighted median estimator
Year: 2018 PMID: 30066639 PMCID: PMC6069804 DOI: 10.1186/s12859-018-2178-2
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Conditional independence graph representation for a class of Mendelian randomisation problems that does not violate Conditions (a), (b) and (c) in the main text. Here Z represents the instrumental variable(s), X represents an (intermediate) phenotype or exposure, Y represents the outcome, and U represents a set of imperfectly observed confounders of the relationship between exposure and outcome.
Percentage allele frequencies for the 18 instrumental SNPs in our DILGOM analysis, computed over 688 individuals
| SNP label (dbSNP) | Genotype 0 (%) | Genotype 1 (%) | Genotype 2 (%) |
|---|---|---|---|
| RS143298427 | 97.1 | 2.3 | - |
| RS148966272 | 96.5 | 2.8 | - |
| RS34364548 | 93.5 | 2 | - |
| RS13413025 | 92.9 | 3.2 | 0.1 |
| RS1502591 | 49.7 | 39.2 | 6.5 |
| RS1504056 | 41.3 | 44.3 | 14.4 |
| RS76755887 | 93.2 | 5.5 | 0.4 |
| RS1109179 | 10.6 | 43.5 | 42.6 |
| RS17052428 | 96.2 | 3.8 | - |
| RS142181699 | 97.1 | 2 | - |
| RS10269617 | 92.4 | 6.1 | - |
| RS17159014 | 94 | 2.8 | - |
| RS17109797 | 73.1 | 24.9 | 1.7 |
| RS10459315 | 52.8 | 36.3 | 7.8 |
| RS4782306 | - | 3.3 | 94 |
| RS141336523 | 94.0 | 5.2 | 0.1 |
| RS4816160 | 71.8 | 26.0 | 2.2 |
| RS116920478 | 91.7 | 7.0 | 0.4 |
Allele coding: 0 stands for homozygous major, 1 for heterozygous, and 2 for homozygous minor, except for SNPs RS1109179 and RS4782306, where the homozygous categories are inverted
Fig. 2 Heatmap representation of the Pearson correlations and clustering of the log-transformed metabolite concentrations.
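The computation behind a heatmap of this kind can be sketched as follows. This is only an illustration: the caption does not say which clustering method was used, so the average-linkage hierarchical clustering on the distance 1 − correlation, the function name `cluster_metabolites`, and the input layout (rows are individuals, columns are metabolites) are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_metabolites(concentrations, n_clusters):
    """Pearson correlations of log concentrations plus cluster labels.

    concentrations: 2-D array, rows = individuals, columns = metabolites
    (strictly positive values). Hypothetical helper, not the study's code.
    """
    log_conc = np.log(np.asarray(concentrations, dtype=float))
    # Pearson correlation between metabolites (columns).
    corr = np.corrcoef(log_conc, rowvar=False)
    # Assumed dissimilarity: 1 - correlation, cleaned up numerically.
    dist = np.clip(1.0 - corr, 0.0, 2.0)
    dist = (dist + dist.T) / 2.0
    np.fill_diagonal(dist, 0.0)
    # Assumed method: average-linkage hierarchical clustering.
    tree = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return corr, labels
```

Reordering the rows and columns of `corr` by cluster label before plotting gives a block-structured heatmap; the plotting step itself is omitted here.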
Fig. 3 Illustration of the Egger regression approach to Mendelian randomisation. Each point in the scatter plot represents an instrumental SNP. The vertical axis represents the estimated slope of the regression of Y (serum total triglycerides) on the SNP. The horizontal axis represents the estimated slope of the regression of X on the SNP. The black and red lines correspond to the Egger regression and weighted median estimates of the causal effect of X on Y, respectively. In this particular example, the intercept of the Egger regression is not significantly different from zero (p=0.574), which we interpret as little evidence of directional pleiotropy. The estimates of the causal effect produced by Egger regression and by the weighted median estimator are both significant and in reasonable concordance. Grey dashed lines represent the 95% confidence intervals for the slopes in the regressions.
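The two estimators shown in the figure can be sketched from per-SNP summary statistics as below. This follows the commonly used summary-data formulations (inverse-variance weighted Egger regression with an intercept, and an interpolated weighted median of per-SNP ratio estimates); the function names, the choice of weights, and the step that orients each SNP to a positive SNP-exposure slope are assumptions, not necessarily what was done in the study. Standard errors and p-values are omitted.

```python
import numpy as np


def egger_regression(beta_x, beta_y, se_y):
    """Weighted regression of SNP-outcome slopes on SNP-exposure slopes.

    Returns (intercept, slope). The slope estimates the causal effect of X
    on Y; a non-zero intercept suggests directional pleiotropy.
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    # Assumed convention: orient every SNP so its exposure slope is positive.
    sign = np.where(beta_x < 0, -1.0, 1.0)
    bx, by = beta_x * sign, beta_y * sign
    # Inverse-variance weights, applied as square-root row scaling.
    sw = 1.0 / se_y
    design = np.column_stack([np.ones_like(bx), bx])
    coef, *_ = np.linalg.lstsq(design * sw[:, None], by * sw, rcond=None)
    return float(coef[0]), float(coef[1])


def weighted_median(beta_x, beta_y, se_y):
    """Weighted median of the per-SNP ratio estimates beta_y / beta_x."""
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    ratio = beta_y / beta_x
    # Assumed weights: inverse variance of each ratio (first-order).
    weights = (beta_x / se_y) ** 2
    order = np.argsort(ratio)
    ratio, weights = ratio[order], weights[order]
    # Cumulative weight at the midpoint of each SNP's share, scaled to [0, 1].
    position = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    # Linear interpolation at the 50% point of the weight distribution.
    return float(np.interp(0.5, position, ratio))
```

When every SNP is a valid instrument, both functions should return roughly the same slope; a marked disagreement, or an Egger intercept far from zero, is the kind of signal the caption interprets as evidence of pleiotropy.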
Fig. 4 Scatter plots of the $-\log_{10}$ p-values for the weighted median estimator (WME) estimates of the causal effects of log BMI on each studied metabolite in each of ten clusters of metabolites. Each plot in the figure refers to the metabolites in a particular cluster, with cluster 1 represented by the top left subplot, and cluster 10 represented by the bottom right subplot. In the plots, each transformed p-value carries the sign of the corresponding causal effect. The horizontal axes index the metabolites in each cluster, their order depending on the clustering method. The black dashed lines indicate thresholds for significance ($\pm \log_{10} 0.05$). The blue lines are smoothing splines.
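The signed transformation described in this caption can be written in a few lines. This is a sketch under the assumption that the plotted quantity is the sign of the effect estimate multiplied by the negative base-10 logarithm of the p-value; the function name and the example numbers are hypothetical.

```python
import numpy as np


def signed_neg_log10_p(effects, p_values):
    """Return sign(effect) * -log10(p) for each metabolite."""
    effects = np.asarray(effects, dtype=float)
    p_values = np.asarray(p_values, dtype=float)
    return np.sign(effects) * -np.log10(p_values)


# Magnitude of the significance threshold at p = 0.05 on this scale.
THRESHOLD = -np.log10(0.05)

# Hypothetical effect estimates and p-values for three metabolites.
signed = signed_neg_log10_p([0.2, -0.1, 0.05], [0.01, 0.05, 0.5])
significant = np.abs(signed) >= THRESHOLD
```

Points above the upper threshold or below the lower one correspond to nominally significant positive or negative effects, respectively.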