Nikolaos G Bliziotis, Leo A J Kluijtmans, Gerjen H Tinnevelt, Parminder Reel, Smarti Reel, Katharina Langton, Mercedes Robledo, Christina Pamporaki, Alessio Pecori, Josie Van Kralingen, Martina Tetti, Udo F H Engelke, Zoran Erlic, Jasper Engel, Timo Deutschbein, Svenja Nölting, Aleksander Prejbisz, Susan Richter, Jerzy Adamski, Andrzej Januszewicz, Filippo Ceccato, Carla Scaroni, Michael C Dennedy, Tracy A Williams, Livia Lenzini, Anne-Paule Gimenez-Roqueplo, Eleanor Davies, Martin Fassnacht, Hanna Remde, Graeme Eisenhofer, Felix Beuschlein, Matthias Kroiss, Emily Jefferson, Maria-Christina Zennaro, Ron A Wevers, Jeroen J Jansen, Jaap Deinum, Henri J L M Timmers.
Abstract
Despite considerable morbidity and mortality, numerous cases of endocrine hypertension (EHT), including primary aldosteronism (PA), pheochromocytoma and functional paraganglioma (PPGL), and Cushing's syndrome (CS), remain undetected. We aimed to establish signatures for the different forms of EHT, investigate potentially confounding effects and establish unbiased disease biomarkers. Plasma samples were obtained from 13 biobanks across seven countries and analyzed using untargeted NMR metabolomics. We compared unstratified samples of 106 primary hypertension (PHT) patients to 231 EHT patients, including 104 PA, 94 PPGL and 33 CS patients. Spectra were subjected to a multivariate statistical comparison of PHT to EHT forms and the associated signatures were obtained. Three approaches were applied to investigate and correct confounding effects. Though we found signatures that could separate PHT from EHT forms, there were also key similarities with the signatures of sample center of origin and sample age. The study design restricted the applicability of the corrections employed. With the samples that were available, no biomarkers for PHT vs. EHT could be identified. The complexity of the confounding effects, evidenced by their robustness to correction approaches, highlighted the need for a consensus on how to deal with variability probably attributable to preanalytical factors in retrospective, multicenter metabolomics studies.
Keywords: confounders; metabolomics; multicenter; plasma NMR; preanalytical conditions
Year: 2022 PMID: 35893246 PMCID: PMC9394285 DOI: 10.3390/metabo12080679
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Patient and sample characteristics. Abbreviations are explained in the main text.
| | PHT (n = 106) | EHT (n = 231) | PA (n = 104) | PPGL (n = 94) | CS (n = 33) |
|---|---|---|---|---|---|
| PATIENT CHARACTERISTICS | |||||
| PATIENT AGE | 55 [ | 49 [ | 48 [ | 50 [ | 47 [ |
|
|
| 0.1 |
| ||
| PATIENT SEX (F/M) | 61/45 | 135/96 | 47/57 | 58/36 | 30/3 |
| 0.9 | 0.1 | 0.6 |
| ||
| PREANALYTICAL SAMPLE CHARACTERISTICS | |||||
| SAMPLE AGE (days) | 2393 [127–6418] * | 1125 [11–3442] * | 535 [52–2280] * | 1548 [11–3442] * | 162 [19–1186] * |
|
|
|
|
| ||
| SAMPLE CENTER OF ORIGIN | |||||
| FRPA1 | 17 (16%) | 66 (29%) | 40 (38%) | 26 (28%) | 0 |
| FRPA2 | 0 | 11 (4.8%) | 0 | 0 | 11 (33%) |
| GBGL2 | 49 (46%) | 0 | 0 | 0 | 0 |
| GYDR | 20 (19%) | 28 (12%) | 8 (7.7%) | 19 (20%) | 1 (3.0%) |
| GYLU | 0 | 1 (0.4%) | 0 | 1 (1.1%) | 0 |
| GYMU | 0 | 4 (1.7%) | 0 | 4 (4.3%) | 0 |
| GYWU | 0 | 1 (0.4%) | 0 | 1 (1.1%) | 0 |
| IRGA | 0 | 3 (1.3%) | 0 | 0 | 3 (9.1%) |
| ITPD | 0 | 20 (8.7%) | 2 (1.9%) | 4 (4.3%) | 14 (42%) |
| ITPD3 | 0 | 9 (3.9%) | 8 (7.7%) | 0 | 1 (3.0%) |
| ITTU3 | 20 (19%) | 51 (22%) | 46 (44%) | 2 (2.1%) | 3 (9.1%) |
| NLNI | 0 | 6 (2.6%) | 0 | 6 (6.4%) | 0 |
| PLWW | 0 | 31 (13%) | 0 | 31 (33%) | 0 |
|
|
|
|
| ||
| ANALYTICAL SAMPLE CHARACTERISTICS | |||||
| BATCH | 22 [ | 25 [ | 22 [ | 31 [ | 24 [ |
| 0.1 | 0.8 |
| 0.9 | ||
| RUN ORDER | 7 [ | 7 [ | 7 [ | 9 [ | 7 [ |
| 0.6 | 0.7 | 0.5 | 0.9 | ||
* Each continuous variable, such as patient age, is presented as a mean or median (depending on normality, indicated with the asterisk) and a range. ** For continuous variables, a p-value was obtained from a t/Wilcoxon test (depending on normality), comparing each disease group (Endocrine Hypertension (EHT), Primary Aldosteronism (PA), Pheochromocytoma/Paraganglioma (PPGL) or Cushing’s Syndrome (CS)) to the control group (Primary Hypertension, PHT). *** For categorical variables, such as patient sex, a p-value was obtained from a Fisher test, comparing each disease group (EHT, PA, PPGL or CS) to the control group (PHT).
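The categorical comparison described in the footnote can be illustrated with a minimal sketch of a two-sided Fisher exact test, applied to the patient sex counts (F/M) from the table above. The function name `fisher_exact_two_sided` and the two-sided convention used here (summing the probabilities of all tables no more likely than the observed one) are assumptions for illustration; the source only states that "a Fisher test" was used, so the exact variant and software may differ.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumed convention: sum the hypergeometric probabilities of every
    table (with the same margins) that is no more likely than the
    observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)


# Sex counts (F, M) taken from the table: PHT 61/45, EHT 135/96, CS 30/3.
p_eht_vs_pht = fisher_exact_two_sided(135, 96, 61, 45)
p_cs_vs_pht = fisher_exact_two_sided(30, 3, 61, 45)
print(f"EHT vs PHT: p = {p_eht_vs_pht:.2g}")
print(f"CS vs PHT:  p = {p_cs_vs_pht:.2g}")
```

Only the counts are taken from the table; the p-values printed by this sketch are not guaranteed to match the published ones to the last digit, since the original test settings are not given.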
Figure 1. Results obtained after peak picking, as well as after grouping and filling of the NMR spectra. Only the first half of the samples analyzed is depicted.
Figure 2. PCA plots of the first two principal components, calculated from all samples and all 86 peaks. In score plot (a), samples are colored according to disease group (CS, PA, PHT or PPGL); in score plot (b), according to the center in which they were collected; and in score plot (c), according to sample age, with the median value as the cutoff. In all score plots, the percentage of explained variance per component is shown on the plot axes. Though PC2 scores are slightly higher for PHT samples compared to EHT, samples were strikingly different from center to center. The 95% ellipses in plot (b) were calculated based on a score plot colored according to the two main clusters along PC1, i.e., cluster 1 (orange) and cluster 2 (blue), and were included here to highlight what seems to be the most important source of variation in these data. Plot (d) is the loadings plot of the same PCA, with NMR peaks in blue and the corresponding metabolite names in red. Only peaks with a correlation above the 0.5 cutoff are shown, as these arise from the metabolites that most affect sample distribution in the score plots (a–c).
Summary of accuracies from sPLSDA models via each approach for each scenario.
| Scenario | Metric | Initial Approach | Approach A | Approach B | Approach C |
|---|---|---|---|---|---|
| EHT-PHT | Balanced Accuracy | 79 (78–79) | 79 (79–79) | 67 (66–67) | 58 (57–59) |
| | Sensitivity ** | 87 (87–87) | 84 (83–84) | 69 (69–70) | 62 (61–63) |
| | Specificity *** | 70 (70–71) | 74 (73–74) | 64 (63–65) | 53 (51–55) |
| PA-PHT | Balanced Accuracy | 83 (83–84) | 83 (83–83) | 69 (69–70) | 69 (68–70) |
| | Sensitivity ** | 90 (89–90) | 89 (89–90) | 70 (69–71) | 77 (76–79) |
| | Specificity *** | 77 (77–78) | 77 (77–77) | 69 (68–70) | 61 (60–62) |
| PPGL-PHT | Balanced Accuracy | 79 (78–79) | 81 (80–81) | 68 (68–69) | 68 (67–69) |
| | Sensitivity ** | 88 (87–88) | 86 (85–87) | 69 (68–70) | 69 (68–70) |
| | Specificity *** | 70 (69–70) | 75 (75–76) | 67 (66–68) | 66 (65–68) |
| CS-PHT | Balanced Accuracy | 85 (84–85) | - | 82 (81–82) | - |
| | Sensitivity ** | 71 (71–72) | - | 79 (78–80) | - |
| | Specificity *** | 98 (98–99) | - | 84 (84–85) | - |
| ALL-ALL | Balanced Accuracy | 65 (64–65) | - | 53 (52–53) | 57 (57–58) |
| | CS TP * Rate | 73 (72–74) | - | 72 (71–73) | - |
| | PA TP * Rate | 65 (64–65) | - | 53 (52–54) | 69 (67–70) |
| | PHT TP * Rate | 72 (72–72) | - | 45 (45–46) | 35 (33–36) |
| | PPGL TP * Rate | 50 (49–51) | - | 42 (41–43) | 69 (68–70) |
* TP stands for True Positive. ** Sensitivity is the TP rate of the disease group (EHT, PA, PPGL or CS). *** Specificity is the TP rate of the control group (PHT). All metrics are given as means, with the 95% confidence interval (in brackets).
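As a hedged illustration of how the metrics in this table relate to each other, the sketch below computes balanced accuracy as the unweighted mean of the per-class true positive rates; in the two-class scenarios this is the mean of sensitivity and specificity. The function name `balanced_accuracy` is an assumption for illustration, and the inputs are the rounded means reported in the table, so the result can differ slightly from the tabulated balanced accuracy, which was averaged over repeated models.

```python
def balanced_accuracy(tp_rates: list[float]) -> float:
    """Unweighted mean of per-class true positive rates (in percent)."""
    if not tp_rates:
        raise ValueError("at least one per-class TP rate is required")
    return sum(tp_rates) / len(tp_rates)


# Two-class case (EHT-PHT, Initial Approach): sensitivity 87, specificity 70.
print(balanced_accuracy([87, 70]))          # 78.5, reported as 79 (78–79)

# Multi-class case (ALL-ALL, Initial Approach): CS, PA, PHT and PPGL TP rates.
print(balanced_accuracy([73, 65, 72, 50]))  # 65.0, reported as 65 (64–65)
```

Because every class contributes equally regardless of its size, this metric is not inflated by the larger groups, which matters here given that group sizes range from 33 (CS) to 106 (PHT).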
Regression coefficients for peaks representative of each metabolite, obtained via sPLSDA on the EHT vs. PHT scenario, from each approach. Most peaks that were selected as predictors via the Initial Approach were not selected via Approach C. Lactate, which was selected by both, has a negative coefficient in the Approach C model, in contrast to its positive coefficient in the Initial Approach. Metabolites highlighted in bold were found to have a strong relationship with a confounder (Table 4).
| Metabolite | NMR Signal (ppm) | Initial Approach | Approach A | Approach B | Approach C |
|---|---|---|---|---|---|
| Alanine | 1.457 | −0.19975 | −0.07187 | 0 | 0 |
| **Creatine** | 3.917 | 0 | 0 | | 0.019157 |
| Creatinine | 4.041 | 0 | 0 | 0.177419 | 0 |
| **Dimethyl sulfone** | 3.137 | 0 | 0 | | 0.109629 |
| Dimethylamine | 2.695 | 0 | 0 | −0.03439 | 0 |
| Dimethylglycine | 2.91 | 0 | 0 | 0.04487 | 0.027703 |
| Formate | 8.441 | −0.01988 | −0.01755 | 0.023133 | 0 |
| **Glutamine** | 2.433 | 0.148614 | 0.135957 | | 0 |
| **Glutamate** | 2.325 | −0.12554 | −0.14981 | | 0 |
| **Glucose** | 5.22 | 0.039396 | 0.0097 | | 0.147391 |
| **Glycine** | 3.548 | −0.0108 | 0 | | 0 |
| **Glycerol** | 3.555 | 0 | 0 | | 0.091654 |
| **Lactate** | 4.108 | 0.025885 | 0 | | −0.08734 |
| Lysine | 2.997 | 0 | 0 | 0.03347 | 0 |
| **Methionine** | 2.122 | 0.052659 | 0.02404 | | 0 |
| **Methanol** | 3.346 | 0.062726 | 0.050343 | | 0.04658 |
| Proline | 1.996 | −0.02291 | −0.00628 | −0.13954 | −0.01636 |
| **Pyruvate** | 2.356 | 0.312859 | 0.32791 | | 0.197295 |
| Threonine | 4.24 | 0 | 0 | 0.040696 | 0 |
| Tyrosine | 7.168 | 0 | 0 | −0.0194 | 0 |
| Valine | 0.981 | 0 | 0 | 0 | −0.00058 |
| Unknown Metabolites | 3.162 | 0.009448 | 0.017788 | 0.236056 | 0 |
| | 3.262 | 0 | 0 | −0.15528 | −0.05692 |
| | **3.284** | −0.03909 | −0.02878 | | 0 |
| | 3.612 | 0 | 0 | −0.11482 | 0 |
| | 3.67 | 0 | 0 | −0.12957 | 0 |
Figure 3. The identification of metabolites from NMR signals. After significant differences were detected and observed directly in spectra (a), spiking experiments were carried out (b) to validate assignments made by means of 2D NMR experiments, namely J-resolved (c) and correlation spectroscopy (d).
Metabolites selected for exclusion due to a strong relationship with a confounder.
| Metabolite | NMR Peaks (ppm) | Dataset | Reason * | FRPA1 PHT/Cluster 2/High Sample Age |
|---|---|---|---|---|
| Acetylcarnitine | 3.177 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↓ |
| Creatine | 3.021, 3.917 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↑ |
| Dimethyl sulfone | 3.137 | PA-PHT, PPGL-PHT | PLSDA SAMPLE AGE | ↑ |
| Glucose | 5.220, 5.227 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↓ |
| Glutamate | 2.047, 2.060, 2.075, 2.095, 2.103, 2.108, 2.113, 2.122, 2.132, 2.140, 2.145, 2.325, 2.332, 2.341, 2.356 | PA-PHT, PPGL-PHT | FRPA1 PHT, PLSDA CLUSTER, SAMPLE AGE | ↑ |
| Glutamine | 2.095, 2.103, 2.108, 2.113, 2.122, 2.132, 2.140, 2.145, 2.418, 2.428, 2.433, 2.444, 2.449, 2.460 | PA-PHT, PPGL-PHT | FRPA1 PHT, PLSDA CLUSTER, SAMPLE AGE | ↓ |
| Glycerol | 3.555, 3.567 | PA-PHT | PLSDA CLUSTER, SAMPLE AGE | ↑ |
| Glycine | 3.548 | PA-PHT, PPGL-PHT | PLSDA SAMPLE AGE | ↓ |
| Lactate | 1.321, 1.307, 4.080, 4.094, 4.108, 4.121 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↑ |
| Methanol | 3.346 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↓ |
| Methionine | 2.122 | PA-PHT, PPGL-PHT | FRPA1 PHT, PLSDA SAMPLE AGE | ↓ |
| Ornithine | 3.041, 3.057 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↑ |
| Pyruvate | 2.356 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↓ |
| Unknown metabolite | 3.284 | PA-PHT, PPGL-PHT | PLSDA CLUSTER, SAMPLE AGE | ↑ |
* Peaks were excluded either because they were found to be important in discriminating samples in a Partial Least Squares Discriminant Analysis (PLSDA) of center cluster, i.e., the separation of centers according to the first dimension in the PCA score plot of Figure 2b, in a PLSDA of sample age (with the median sample age as a cutoff), or because they were found to be higher or lower in the FRPA1 PHT group of samples.
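The median cutoff mentioned in the footnote, used to turn sample age into two classes before the PLSDA, could be sketched as follows. The function name `split_by_median`, the example ages and the handling of ties (samples exactly at the median are labeled "low") are all assumptions for illustration; the source does not specify how ties were treated.

```python
from statistics import median


def split_by_median(sample_ages_days: list[float]) -> tuple[float, list[str]]:
    """Label each sample as 'high' or 'low' sample age relative to the median.

    Assumption: samples exactly at the median are labeled 'low'.
    """
    cutoff = median(sample_ages_days)
    labels = ["high" if age > cutoff else "low" for age in sample_ages_days]
    return cutoff, labels


# Hypothetical sample ages in days (not taken from the study data).
ages = [30, 200, 450, 900, 1500, 3000]
cutoff, labels = split_by_median(ages)
print(cutoff)  # 675.0
print(labels)  # ['low', 'low', 'low', 'high', 'high', 'high']
```

The resulting labels would then serve as the class variable in a two-class discriminant model, analogous to the center cluster labels derived from PC1.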
Patient and sample characteristics, after whole center exclusions (Approach C). Abbreviations are explained in the main text.
| | PHT (n = 40) | EHT (n = 118) | PA (n = 54) | PPGL (n = 64) |
|---|---|---|---|---|
| PATIENT CHARACTERISTICS | ||||
| PATIENT AGE | 44 [ | 49 [ | 48 [ | 50 [ |
|
| 0.05 | |||
| PATIENT SEX (F/M) | 15/25 | 65/53 | 24/30 | 41/23 |
| 0.07 | 0.5 |
| ||
| PREANALYTICAL SAMPLE CHARACTERISTICS | ||||
| SAMPLE AGE (days) | 366 [127–1307] * | 748 [83–2841] * | 380 [83–1598] * | 1419 [121–2841] |
|
| 1 |
| ||
| SAMPLE CENTER OF ORIGIN | ||||
| GYDR | 20 (50%) | 27 (23%) | 8 (15%) | 19 (30%) |
| GYLU | 0 | 1 (0.8%) | 0 | 1 (1.6%) |
| GYMU | 0 | 4 (3.4%) | 0 | 4 (6.3%) |
| GYWU | 0 | 1 (0.8%) | 0 | 1 (1.6%) |
| ITTU3 | 20 (50%) | 48 (41%) | 46 (85%) | 2 (3.1%) |
| NLNI | 0 | 6 (5.1%) | 0 | 6 (9.4%) |
| PLWW | 0 | 31 (26%) | 0 | 31 (48%) |
|
|
|
| ||
| ANALYTICAL SAMPLE CHARACTERISTICS | ||||
| BATCH | 19 [ | 27 [ | 22 [ | 37 [ |
|
| 0.1 |
| ||
| RUN ORDER | 7 [ | 9 [ | 8 [ | 9 [ |
| 0.3 | 0.4 | 0.4/1 | ||
* Each continuous variable, such as patient age, is presented as a mean or median (depending on normality, indicated with the asterisk) and a range. ** For continuous variables, a p-value was obtained from a t/Wilcoxon test (depending on normality), comparing each disease group (EHT, PA, or PPGL) to the control group (PHT). The PPGL column has an additional p-value obtained from the comparison of PA to PPGL. *** For categorical variables, such as patient sex, a p-value was obtained from a Fisher test, comparing each disease group (EHT, PA, or PPGL) to the control group (PHT). The PPGL column has an additional p-value obtained from the comparison of PA to PPGL.