Carl Beuchel, Susen Becker, Julia Dittrich, Holger Kirsten, Anke Toenjes, Michael Stumvoll, Markus Loeffler, Holger Thiele, Frank Beutner, Joachim Thiery, Uta Ceglarek, Markus Scholz.
Abstract
OBJECTIVE: Human blood metabolites are influenced by a number of lifestyle and environmental factors. Identifying these factors and properly quantifying their relevance provides insights into human biological and metabolic disease processes, is key for the standardized translation of metabolite biomarkers into clinical applications, and is a prerequisite for the comparability of data between studies. However, so far only limited data exist from large and well-phenotyped human cohorts, and current analysis methods do not fully account for the characteristics of these data. The primary aim of this study was to identify, quantify, and compare the impact of a comprehensive set of clinical and lifestyle-related factors on metabolite levels in three large human cohorts. To achieve this goal, we improved current methodology by developing a principled analysis approach that could be translated to other cohorts and metabolite panels.
Keywords: Acylcarnitines; Amino acids; Clinical factors; Lifestyle factors; Metabolomics; Network analysis
Year: 2019 PMID: 31668394 PMCID: PMC6734104 DOI: 10.1016/j.molmet.2019.08.010
Source DB: PubMed Journal: Mol Metab ISSN: 2212-8778 Impact factor: 7.422
Subject characteristics of the three cohorts considered. For continuous variables, median and IQR are shown. For binary variables, total numbers and percentages are provided.
| Characteristic | LIFE-Adult | LIFE-Heart | Sorbs |
|---|---|---|---|
| Area of collection | Leipzig, Germany | Leipzig, Germany | Upper Lusatia |
| N | 9481 | 5767 | 974 |
| Sex (female/male) | 4952 (52.2%) | 1712 (29.7%) | 574 (58.9%) |
| age (years) | 57.91 [47.7–68.2] | 63.11 [54.4–71.7] | 48.7 [35.6–60.9] |
| WHR | 0.93 [0.863–0.994] | 0.98 [0.909–1.04] | 0.87 [0.804–0.949] |
| BMI (kg/m²) | 26.58 [23.9–29.9] | 28.41 [25.7–31.8] | 26.5 [23.3–29.7] |
| fasting hours (hours) | 12 [11–14] | 3 [1.67–12.3] | >8 |
| Lipid modifying agents (yes/no) | 1272 (13.4%) | 2066 (35.8%) | 176 (18.1%) |
| sex hormones (yes/no) | 751 (7.9%) | 52 (0.9%) | 111 (11.4%) |
| diabetes status (yes/no) | 1090 (11.5%) | 1720 (29.8%) | 86 (8.8%) |
| HbA1c (%) | 5.32 [5.08–5.59] | 5.7 [5.38–6.18] | 5.4 [5.1–5.7] |
| self-reported diabetes (yes/no) | 996 (10.5%) | 1547 (26.8%) | 71 (7.3%) |
| diabetes medication (yes/no) | 840 (8.9%) | 1258 (21.8%) | 57 (5.9%) |
| smoking status (current, previous, never) | 2034 (21.5%)/2706 (28.5%)/4483 (47.3%) | 1581 (27.4%)/2108 (36.6%)/2063 (35.8%) | 150 (15.4%)/195 (20%)/616 (63.2%) |
| Blood pressure (systolic) | 127 [117–138] | 136 [125–150] | 132 [121–145] |
| Blood pressure (diastolic) | 75 [68.5–81.5] | 83.5 [76–90.5] | 80 [73–87] |
| Pulse pressure | 51 [44–60] | 53 [44–63] | 52 [44–61] |
| Cholesterol (mmol/l) | 5.52 [4.85–6.26] | 5.18 [4.4–6.01] | 5.25 [4.63–5.94] |
| LDL-Cholesterol (mmol/l) | 3.45 [2.84–4.11] | 3.15 [2.48–3.87] | 3.32 [2.71–3.98] |
| HDL-Cholesterol (mmol/l) | 1.57 [1.28–1.9] | 1.22 [1.01–1.48] | 1.57 [1.33–1.89] |
| Blood hemoglobin (mmol/l) | 14 [13.2–15] | 14.3 [13.2–15.3] | 8.8 [8.3–9.3] |
| Erythrocytes (10^12/l) | 4.66 [4.38–4.94] | 4.67 [4.34–4.97] | 4.73 [4.47–4.98] |
| Reticulocytes (per 1000) | 12.1 [9.6–14.8] | 12.9 [10.5–16.1] | 10.6 [8.4–13] |
| Hematocrit (%) | 41 [39.2–43.6] | 42 [39–44] | 42 [39.2–43.8] |
| Platelets (10^9/l) | 237 [204–275] | 230 [194–271] | 229 [201–263] |
| Leucocytes (10^9/l) | 5.94 [5–7.1] | 7.9 [6.4–9.9] | 5.25 [4.4–6.23] |
| Neutrophils (%) | 57.6 [51.9–63.2] | 66.5 [59.9–72.8] | 54.65 [48.7–60.5] |
| Lymphocytes (%) | 30.2 [25.1–35.5] | 22.3 [16.8–28.2] | 33.3 [27.9–38.6] |
| Monocytes (%) | 8 [6.8–9.4] | 8.5 [7.1–10.1] | 8.1 [6.9–9.5] |
| Basophils (%) | 0.6 [0.4–0.8] | 0.3 [0.2–0.5] | 0.03 [0.02–0.04] |
| Eosinophils (%) | 2.5 [1.6–3.6] | 1.4 [0.7–2.5] | 0.14 [0.09–0.21] |
Figure 1. Comparison of the INT-LR method with alternatives: selected results of the simulation study. Shown is the distribution of p-values from the simulation study comparing the INT-LR approach (linear regression with inverse-normal transformation) with other methodological approaches (binary/ordinal logistic regression for binary/categorical data, asinh transformation followed by linear regression, and Spearman's correlation coefficient for rank data). Results from nine simulated scenarios are presented, differing in the simulated effect β (no effect: β = 0, small effect: β = 0.02, large effect: β = 0.1) and in the proportion of measurements below the detection limit (0%, 20%, and 80%). The percentage of hypotheses with nominal significance (i.e. p < 0.05) is shown (based on 1000 replications). For scenarios with β = 0, this proportion should equal 0.05 (false-positive control); for scenarios with β > 0, it should be as large as possible (good power). The binary model is only applicable when zeros are present. Overall, the INT-LR method performed best. Results of additional scenarios are reported in Supplementary Figure 6 and Supplementary Table 5.
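As an illustration of the INT-LR idea named in this caption, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the rank offset `c = 0.5`, the use of average ranks for ties (for example, several values at the detection limit), and the function names `inverse_normal_transform` and `ols_slope` are all assumptions made for this example, as are the toy data.

```python
from statistics import NormalDist


def inverse_normal_transform(values, c=0.5):
    """Rank-based inverse-normal transformation (assumed variant).

    Ties share their average rank; rank r maps to the normal quantile
    of (r - c) / (n - 2c + 1).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    normal = NormalDist()
    return [normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]


def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


# Toy data (hypothetical): two metabolite values sit at the detection limit (0.0).
metabolite = [0.0, 0.0, 1.2, 3.4, 2.2, 8.9]
factor = [21.0, 23.5, 24.0, 29.5, 27.0, 33.0]

transformed = inverse_normal_transform(metabolite)
slope = ols_slope(factor, transformed)
print(transformed)
print(slope)
```

The transformation depends only on ranks, so the skewness of the raw metabolite distribution does not affect the regression; how ties at the detection limit should be ranked is a design choice that this sketch resolves by averaging.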
Figure 2. Heat map of univariable associations between metabolite levels and clinical or lifestyle-related factors. The variance explained by each single factor is color-coded (1 ≙ 100%) together with the direction of effect (red = positive correlation, blue = negative correlation). Maximum values across the three cohorts are presented. Stars indicate associations that are significant after adjusting for multiple testing. Rows and columns are ordered according to a hierarchical clustering. Cohort-specific plots can be found in Supplementary Figures 8–10.
Figure 3. Heat map of multivariable association results between clinical and lifestyle-related factors and metabolite levels. Partial explained variance (1 ≙ 100%) is color-coded according to the direction of the effect (positive = red, negative = blue). Maximum values across the three cohorts are presented. Rows and columns are ordered according to a hierarchical clustering. To avoid collinearity, strongly correlated factors were pruned (see methods). Cohort-specific plots can be found in Supplementary Figures 11–13.
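For orientation, one common definition of partial explained variance is given below. The record does not state which definition the authors used, so this is an assumption; the symbols $\mathrm{RSS}_{\text{full}}$ (residual sum of squares of the multivariable model containing all factors) and $\mathrm{RSS}_{-j}$ (the same model with factor $j$ removed) are introduced here for illustration only.

$$
r^2_{\text{partial},\,j} \;=\; \frac{\mathrm{RSS}_{-j} - \mathrm{RSS}_{\text{full}}}{\mathrm{RSS}_{-j}}
$$

Under this definition, a value of 0.01 means that adding factor $j$ removes 1% of the variance left unexplained by the other factors.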
Figure 4. Bipartite network of metabolites (yellow) and factors (blue) based on multivariable associations explaining at least 1% of variance. Thickness of edges corresponds to the maximum partial explained variance over the three cohorts. An interactive version of this plot is available at https://cfbeuchel.shinyapps.io/interactivefig4/.
Figure 5. Distributions of uni- and multivariable explained variances of clinical and lifestyle-related factors and comparison between cohorts. Boxplots show the distribution of explained variances (or partial explained variances for multivariable models) across the different metabolites. The dashed line represents an exemplary r² cutoff (1%) to mark strong effects.
Figure 6. Heatmap of partial r² of the interaction effects between study and the 22 factors for the 63 metabolites. Significance is indicated by an asterisk and was computed via a likelihood-ratio test of multivariable linear regression models. The full model includes main effects for each factor and for study, together with their interactions. It is compared with a reduced model that does not contain the interaction effect under consideration. Correction for multiple testing was applied by a hierarchical Bonferroni procedure (see methods).
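The model comparison described in this caption can be sketched as follows, using only the Python standard library. This is a simplified illustration, not the authors' code: it tests a single factor-by-study interaction rather than the full 22-factor model, assumes three studies coded 0, 1, and 2, uses the Gaussian likelihood-ratio statistic n·ln(RSS_reduced / RSS_full) with a chi-square reference distribution on 2 degrees of freedom, and omits the hierarchical Bonferroni correction. The function names and toy data are hypothetical.

```python
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit via the normal equations."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(p)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum(
        (yi - sum(b * x for b, x in zip(beta, row))) ** 2 for row, yi in zip(X, y)
    )


def interaction_lrt(factor, study, y):
    """Likelihood-ratio test of a factor-by-study interaction (studies 0, 1, 2)."""
    reduced, full = [], []
    for f, s in zip(factor, study):
        d1, d2 = float(s == 1), float(s == 2)  # dummy coding, study 0 = reference
        main_effects = [1.0, f, d1, d2]
        reduced.append(main_effects)
        full.append(main_effects + [f * d1, f * d2])
    n = len(y)
    stat = max(0.0, n * math.log(ols_rss(reduced, y) / ols_rss(full, y)))
    p_value = math.exp(-stat / 2)  # chi-square survival function for 2 df
    return stat, p_value


# Toy data (hypothetical): six subjects per study, fixed noise values.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
noise = [
    [0.2, -0.1, 0.3, -0.3, 0.1, -0.2],
    [-0.3, 0.2, 0.1, -0.1, 0.3, -0.2],
    [0.1, 0.3, -0.2, -0.1, -0.3, 0.2],
]
factor = x * 3
study = [s for s in range(3) for _ in x]
slopes = [0.5, 1.5, -0.5]  # study-specific slopes for the interaction scenario

y_same = [1.0 + 0.5 * f + 0.3 * s + noise[s][int(f)] for f, s in zip(factor, study)]
y_diff = [1.0 + slopes[s] * f + 0.3 * s + noise[s][int(f)] for f, s in zip(factor, study)]

stat_same, p_same = interaction_lrt(factor, study, y_same)
stat_diff, p_diff = interaction_lrt(factor, study, y_diff)
print(stat_same, p_same)
print(stat_diff, p_diff)
```

With a shared slope across studies the statistic stays small, whereas study-specific slopes inflate the residuals of the reduced model and drive the p-value towards zero, which is the pattern an asterisk in the heatmap would flag.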