Jesse Fest, Lisanne S Vijfhuizen, Jelle J Goeman, Olga Veth, Anni Joensuu, Markus Perola, Satu Männistö, Eivind Ness-Jensen, Kristian Hveem, Toomas Haller, Neeme Tonisson, Kairit Mikkel, Andres Metspalu, Cornelia M van Duijn, Arfan Ikram, Bruno H Stricker, Rikje Ruiter, Casper H J van Eijck, Gert-Jan B van Ommen, Peter A C ʼt Hoen.
Abstract
Most patients with pancreatic cancer present with advanced disease and die within the first year after diagnosis. Predictive biomarkers that signal the presence of pancreatic cancer in an early stage are desperately needed. We aimed to identify new and validate previously found plasma metabolomic biomarkers associated with early stages of pancreatic cancer. Prediagnostic blood samples from individuals who were to receive a diagnosis of pancreatic cancer between 1 month and 17 years after sampling (N = 356) and age- and sex-matched controls (N = 887) were collected from five large population cohorts (HUNT2, HUNT3, FINRISK, Estonian Biobank, Rotterdam Study). We applied proton nuclear magnetic resonance-based metabolomics on the Nightingale platform. Logistic regression identified two interesting hits: glutamine (P = 0.011) and histidine (P = 0.012), with Westfall-Young family-wise error rate adjusted P values of 0.43 for both. Stratification in quintiles showed a 1.5-fold elevated risk for the lowest 20% of glutamine and a 2.2-fold increased risk for the lowest 20% of histidine. Stratification by time to diagnosis suggested glutamine to be involved in an earlier process (2 to 5 years before diagnosis), and histidine in a process closer to the actual onset (<2 years). Our data did not support the branched-chain amino acids identified earlier in several US cohorts as potential biomarkers for pancreatic cancer. Thus, although we identified glutamine and histidine as potential biomarkers of biological interest, our results imply that a study at this scale does not yield metabolomic biomarkers with sufficient predictive value to be clinically useful per se as prognostic biomarkers.
Year: 2019 PMID: 31125048 PMCID: PMC6594461 DOI: 10.1210/en.2019-00165
Source DB: PubMed Journal: Endocrinology ISSN: 0013-7227 Impact factor: 4.736
Figure 1. Schematic overview of the sample set used for data analysis and the different data analysis approaches performed in the current study. (a) Missing values in metabolomics measurements or phenotypical information were assumed to be missing at random, and individuals with such missing values were removed from the data set. (b) Individuals with missing values in phenotypical information were removed from the data set. PC, pancreatic cancer.
Baseline Characteristics
| Characteristic | HUNT2 (n = 590), Cases | HUNT2, Controls | HUNT3 (n = 194), Cases | HUNT3, Controls | EGCUT (n = 227), Cases | EGCUT, Controls | FR (n = 272), Cases | FR, Controls | RS (n = 173), Cases | RS, Controls | ALL (n = 1456), Cases | ALL, Controls |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total, n (%) | 158 (26.8%) | 432 (73.2%) | 64 (33%) | 130 (67%) | 76 (33.5%) | 151 (66.5%) | 57 (21%) | 215 (79%) | 89 (51.4%) | 84 (48.6%) | 444 (30.5%) | 1012 (69.5%) |
| Female, n (%) | 80 (50.6%) | 220 (50.9%) | 37 (57.8%) | 76 (58.5%) | 44 (57.9%) | 88 (58.3%) | 22 (38.6%) | 82 (38.1%) | 54 (60.7%) | 45 (53.6%) | 237 (53.4%) | 511 (50.5%) |
| Age, y, mean (SD) | 66.8 ± 11.1 | 64.8 ± 11.2 | 69.5 ± 9.5 | 69.5 ± 9.3 | 63.4 ± 10.6 | 63.3 ± 10.5 | 59.7 ± 8.8 | 59.8 ± 8.9 | 71.6 ± 8.5 | 70.9 ± 9.2 | 66.6 ± 10.7 | 64.6 ± 10.8 |
| BMI, kg/m2, mean (SD) | 27.3 ± 4.1 | 26.9 ± 3.7 | 28.3 ± 3.6 | 27 ± 4.2 | 29.1 ± 5.8 | 28.5 ± 5.4 | 27.2 ± 3.9 | 27.7 ± 4.3 | 27.4 ± 4.1 | 27.3 ± 4.1 | 27.8 ± 4.4 | 27.3 ± 4.3 |
| DM, n (%) | 11 (7%) | 18 (4.2%) | 10 (15.6%) | 8 (6.2%) | 33 (43.4%) | 32 (21.2%) | 6 (10.5%) | 21 (9.8%) | 12 (13.5%) | 2 (2.4%) | 72 (16.2%) | 81 (8%) |
| Smoking = 0, n (%) | 49 (31%) | 187 (43.3%) | 22 (34.4%) | 58 (44.6%) | 38 (50%) | 100 (66.2%) | 21 (36.8%) | 106 (49.3%) | 22 (24.7%) | 20 (23.8%) | 152 (34.2%) | 471 (46.5%) |
| Smoking = 1, n (%) | 48 (30.4%) | 140 (32.4%) | 19 (29.7%) | 55 (42.3%) | 18 (23.7%) | 20 (13.2%) | 14 (24.6%) | 66 (30.7%) | 37 (41.6%) | 50 (59.5%) | 136 (30.6%) | 331 (32.7%) |
| Smoking = 2, n (%) | 59 (37.3%) | 93 (21.5%) | 20 (31.3%) | 13 (10%) | 20 (26.3%) | 31 (20.5%) | 19 (33.3%) | 41 (19.1%) | 27 (30.3%) | 12 (14.3%) | 145 (32.7%) | 190 (18.8%) |
| Smoking = NA, n (%) | 2 (1.3%) | 12 (2.8%) | 3 (4.7%) | 4 (3.1%) | 0 (0%) | 0 (0%) | 3 (5.3%) | 2 (0.9%) | 3 (3.4%) | 2 (2.4%) | 11 (2.5%) | 20 (2%) |
| Fasted = 0 (≤4 h), n (%) | 134 (84.9%) | 363 (84%) | 46 (71.9%) | 97 (74.6%) | 35 (46.1%) | 97 (64.2%) | 4 (7%) | 4 (1.9%) | 26 (29.2%) | 21 (25%) | 245 (55.2%) | 582 (57.5%) |
| Fasted = 1 (4 h to 8 h), n (%) | 19 (12%) | 56 (13%) | 9 (14.1%) | 18 (13.8%) | 7 (9.2%) | 24 (15.9%) | 38 (66.7%) | 171 (79.5%) | 1 (1.1%) | 0 (0%) | 74 (16.7%) | 269 (26.6%) |
| Fasted = 2 (>8 h), n (%) | 3 (1.9%) | 9 (2.1%) | 6 (9.4%) | 5 (3.8%) | 6 (7.9%) | 15 (9.9%) | 14 (24.6%) | 40 (18.6%) | 56 (62.9%) | 57 (67.8%) | 85 (19.1%) | 126 (12.5%) |
| Fasted = NA, n (%) | 2 (1.3%) | 4 (0.9%) | 3 (4.7%) | 10 (7.7%) | 28 (36.8%) | 15 (9.9%) | 1 (1.8%) | 0 (0%) | 6 (6.7%) | 6 (7.1%) | 40 (9%) | 35 (3.5%) |
Values are number counts (percentages) or mean ± SD. P values are from χ2 test (categorical variables) or Student t test (continuous variables) comparing cases and controls.
Abbreviation: NA, not available.
P < 0.05.
Top Hits From Logistic Regression Analysis
| Metabolite | Estimate | SE | z Value | P Value | Adjusted P Value |
|---|---|---|---|---|---|
| Histidine | −0.188 | 0.074 | −2.529 | 0.011 | 0.4274 |
| Glutamine | −0.175 | 0.069 | −2.525 | 0.012 | 0.4274 |
| DHA.FA | 0.195 | 0.081 | 2.393 | 0.017 | 0.5214 |
| FAw3.FA | 0.170 | 0.075 | 2.272 | 0.023 | 0.6203 |
| M.HDL.P | −0.151 | 0.072 | −2.085 | 0.037 | 0.7646 |
| M.HDL.L | −0.149 | 0.072 | −2.076 | 0.038 | 0.7695 |
| DHA | 0.153 | 0.075 | 2.029 | 0.043 | 0.7975 |
| M.HDL.PL | −0.145 | 0.072 | −2.020 | 0.043 | 0.8016 |
| M.HDL.C | −0.139 | 0.071 | −1.941 | 0.052 | 0.8513 |
| M.HDL.CE | −0.138 | 0.072 | −1.929 | 0.054 | 0.8589 |
| M.HDL.PL | 0.141 | 0.074 | 1.898 | 0.058 | 0.8756 |
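Abbreviations: DHA, docosahexaenoic acid; DHA.FA, ratio of docosahexaenoic acid to total fatty acids; FAw3.FA, ratio of ω-3 fatty acids to total fatty acids; HDL, high-density lipoprotein; M.HDL.C, total cholesterol in medium-sized HDL particles; M.HDL.CE, cholesterol esters in medium-sized HDL particles; M.HDL.L, total lipids in medium-sized HDL particles; M.HDL.P, concentration of medium-sized HDL particles; M.HDL.PL, phospholipids in medium-sized HDL particles.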
Logistic regression with single metabolite measure, sex, age, BMI, smoking status, T2DM status, fasting status, and cohort as covariates.
The estimates are the fitted β coefficients from the logistic regression model. As the input metabolite data were scaled, the estimates can be interpreted as follows: the OR for developing pancreatic cancer in a case with a typical low metabolite score of 1 SD below the average z score (= −1) would amount to 1.22 for β of −0.1 and 1.49 for β of −0.2. The z value mentioned in the table is the test statistic from the logistic regression models.
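
The conversion from a β coefficient to an odds ratio described in this footnote can be checked with a short calculation. The sketch below assumes the standard logistic-regression relation OR = exp(β × Δz), where Δz is the difference in z score between the two individuals being compared; the helper name `odds_ratio` is illustrative and not taken from the paper. Under that assumption, the quoted values of 1.22 and 1.49 are obtained when an individual at z = −1 is compared with one at z = +1, whereas a comparison with the average (z = 0) gives 1.11 and 1.22. Which contrast the authors intended is not stated here.

```python
import math


def odds_ratio(beta: float, z_from: float, z_to: float) -> float:
    """Odds ratio for moving a standardized metabolite from z_from to z_to.

    Assumes a logistic model in which the log-odds change by beta per
    1 SD of the (scaled) metabolite concentration.
    """
    return math.exp(beta * (z_to - z_from))


for beta in (-0.1, -0.2):
    vs_mean = odds_ratio(beta, 0.0, -1.0)      # z = -1 compared with the average
    vs_plus_one = odds_ratio(beta, 1.0, -1.0)  # z = -1 compared with z = +1
    print(f"beta={beta:+.1f}  vs mean: {vs_mean:.2f}  vs z=+1: {vs_plus_one:.2f}")
```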
Figure 2. Concentrations (logarithmic scale) of (A–D) glutamine and (E–H) histidine in the blood circulation in controls and cases, that is, those individuals who developed pancreatic cancer within a time window after blood sampling. (A and E) Distribution of the concentrations of controls (light blue) and cases (dark blue) in the different cohorts analyzed (EGCUT, FR, HUNT2, HUNT3, RS). (B and F) Distribution of concentrations in nondiabetics (light blue) and individuals diagnosed with T2DM (dark blue). (C and G) Distribution of concentrations in controls and cases sampled within 2 y before diagnosis, between 2 and 5 y before diagnosis, and >5 y before diagnosis. (D and H) Distribution of concentrations in nonfasting individuals (light blue), individuals who had a meal between 4 and 8 h before blood draw (dark blue), and fasting individuals (green, last meal was >8 h before blood draw). Box plots show the distribution of the concentrations in individual samples: the boxes span the middle quartiles (25th to 75th percentile of the data points), the horizontal band marks the median value, the lower whiskers extend to the data points up to 1.5 × the interquartile range (IQR) below the 25th percentile, the upper whiskers extend to the data points up to 1.5 × IQR above the 75th percentile, and data points outside these ranges are plotted individually.
Top Hits From Meta-Analysis
| Metabolite | β | CI | Unadjusted P Value | Adjusted P Value | I² |
|---|---|---|---|---|---|
| Glutamine | −0.19538 | −0.33:−0.06 | 0.004 | 0.9037087 | 0 |
| DHA.FA | 0.17259 | 0.04:0.3 | 0.0083 | 1 | 0 |
| M.HDL.PL | −0.17905 | −0.32:−0.04 | 0.0103 | 1 | 0 |
| M.HDL.P | −0.17856 | −0.32:−0.04 | 0.0104 | 1 | 0 |
| M.HDL.L | −0.17732 | −0.31:−0.04 | 0.0104 | 1 | 0 |
| FAw3.FA | 0.16222 | 0.03:0.29 | 0.0126 | 1 | 0 |
| Histidine | −0.25164 | −0.46:−0.05 | 0.0156 | 1 | 0.53 |
| M.HDL.FC | −0.15636 | −0.29:−0.02 | 0.0251 | 1 | 0 |
| M.HDL.C | −0.15174 | −0.29:−0.02 | 0.0267 | 1 | 0 |
| M.HDL.CE | −0.14723 | −0.28:−0.01 | 0.0306 | 1 | 0 |
| DHA | 0.13222 | 0:0.26 | 0.0438 | 1 | 0 |
Abbreviations: DHA, docosahexaenoic acid; DHA.FA, ratio of docosahexaenoic acid to total fatty acids; FAw3.FA, ratio of ω-3 fatty acids to total fatty acids; HDL, high-density lipoprotein; M.HDL.C, total cholesterol in medium-sized HDL particles; M.HDL.CE, cholesterol esters in medium-sized HDL particles; M.HDL.FC, free cholesterol in medium-sized HDL particles; M.HDL.L, total lipids in medium-sized HDL particles; M.HDL.P, concentration of medium-sized HDL particles; M.HDL.PL, phospholipids in medium-sized HDL particles.
Meta-analysis across the five cohorts of logistic regression results with single metabolite measure, sex, age, BMI, smoking status, T2DM status, and fasting status as covariates. β is the effect size and can be interpreted as detailed in footnote b to Table 2. The adjusted P value is the Bonferroni–Holm-adjusted P value. I² is the statistic used for heterogeneity between cohorts.
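
The Bonferroni–Holm adjustment named in this footnote can be sketched as follows. This is a generic implementation of the step-down procedure, not the authors' code; the function name `holm_adjust` and the example P values are hypothetical, and the number of metabolites actually tested is not given in this section.

```python
def holm_adjust(p_values):
    """Return Bonferroni-Holm adjusted P values in the original order.

    The smallest P value is multiplied by m, the next by m - 1, and so on;
    adjusted values are forced to be non-decreasing and capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted P values for four tests.
print(holm_adjust([0.01, 0.04, 0.03, 0.005]))
```

With several hundred correlated metabolite measures, a step-down correction of this kind pushes most adjusted values to 1, which matches the pattern in the table above.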
Figure 3. Forest plots from random effects meta-analysis across different cohorts for (A) glutamine and (B) histidine. The meta-analysis was performed on the β coefficients and SD from the logistic regressions run for each cohort separately. In the logistic regression, pancreatic cancer status was modeled as a function of log-transformed and standardized metabolite concentration, sex, age, BMI, smoking status, T2DM, and fasting status. Shown are the estimated effect size, the SE on this estimate, the estimated OR and the CI on this ratio, the weight of the individual cohort on the calculation of the final estimate, the heterogeneity measure (modeling differences between cohorts), and the unadjusted and Bonferroni–Holm-corrected P values for the respective metabolites.
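
A random effects meta-analysis of per-cohort β coefficients can be sketched as below. The caption does not say which estimator was used, so the DerSimonian-Laird method here is an assumption, and the per-cohort inputs in the usage lines are hypothetical values, not the study's estimates.

```python
import math


def random_effects_meta(betas, ses):
    """Pool per-cohort estimates with a DerSimonian-Laird random effects model.

    Returns (pooled beta, pooled SE, tau^2, I^2), with I^2 as a fraction.
    """
    k = len(betas)
    if k < 2:
        raise ValueError("at least two cohorts are required")
    w = [1.0 / se ** 2 for se in ses]            # inverse-variance weights
    sw = sum(w)
    beta_fixed = sum(wi * b for wi, b in zip(w, betas)) / sw
    # Cochran's Q measures the spread of cohort estimates around the pooled one.
    q = sum(wi * (b - beta_fixed) ** 2 for wi, b in zip(w, betas))
    df = k - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)                # between-cohort variance
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in ses]
    beta_re = sum(wi * b for wi, b in zip(w_star, betas)) / sum(w_star)
    se_re = math.sqrt(1.0 / sum(w_star))
    return beta_re, se_re, tau2, i2


# Hypothetical log-odds estimates and standard errors for five cohorts.
beta, se, tau2, i2 = random_effects_meta(
    [-0.25, -0.10, -0.30, -0.15, -0.20], [0.12, 0.20, 0.18, 0.19, 0.21]
)
print(f"pooled beta={beta:.3f}, OR={math.exp(beta):.2f}, I2={i2:.2f}")
```

When the cohort estimates agree closely, τ² and I² are truncated at 0 and the result coincides with a fixed effect analysis, which is the situation an I² of 0 in the meta-analysis table describes.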
ORs for Developing Pancreatic Cancer in Different Glutamine and Histidine Strata
| Quantile | Value (Based on Control Data) | Controls, n | Cases, n | OR | 5% CI | 95% CI | P Value |
|---|---|---|---|---|---|---|---|
| Glutamine | |||||||
| 0% | 0.269 | 180 | 94 | 1 | — | — | |
| 20% | 0.4538 | 176 | 66 | 0.72 | 0.49 | 1.05 | 0.0852 |
| 40% | 0.487 | 177 | 62 | 0.67 | 0.46 | 0.98 | 0.0404 |
| 60% | 0.5157 | 176 | 71 | 0.77 | 0.53 | 1.12 | 0.1734 |
| 80% | 0.55358 | 178 | 62 | 0.66 | 0.46 | 0.98 | 0.0376 |
| Histidine | |||||||
| 0% | 0.03927 | 178 | 110 | 1 | — | — | |
| 20% | 0.060498 | 177 | 71 | 0.65 | 0.45 | 0.93 | 0.0199 |
| 40% | 0.064778 | 177 | 58 | 0.53 | 0.36 | 0.78 | 0.0011 |
| 60% | 0.068174 | 177 | 66 | 0.6 | 0.42 | 0.87 | 0.0073 |
| 80% | 0.072638 | 178 | 51 | 0.46 | 0.31 | 0.69 | 0.0001 |
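
The odds ratios in this table can be compared with a crude calculation from the tabulated counts. The sketch below assumes an unadjusted 2 × 2 odds ratio against the lowest stratum (the 0% row) with a Wald interval on the log scale at a critical value of 1.96; the function name `crude_or` is illustrative. For the glutamine 20% row (66 cases, 176 controls, against 94 cases and 180 controls in the reference stratum) this reproduces the tabulated OR, interval bounds, and P value to the precision shown, which suggests, without confirming, that the table reports unadjusted estimates with 95% Wald intervals.

```python
import math


def crude_or(cases, controls, ref_cases, ref_controls, z_crit=1.96):
    """Unadjusted odds ratio versus a reference stratum with a Wald interval.

    Returns (OR, lower bound, upper bound, two-sided P value).
    """
    odds_ratio = (cases / controls) / (ref_cases / ref_controls)
    log_or = math.log(odds_ratio)
    # Standard error of the log odds ratio from the four cell counts.
    se = math.sqrt(1 / cases + 1 / controls + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(log_or - z_crit * se)
    upper = math.exp(log_or + z_crit * se)
    # Two-sided P value from the standard normal distribution.
    p_value = math.erfc(abs(log_or / se) / math.sqrt(2))
    return odds_ratio, lower, upper, p_value


# Glutamine, 20% stratum versus the 0% reference stratum (counts from the table).
or_, lo, hi, p = crude_or(cases=66, controls=176, ref_cases=94, ref_controls=180)
print(f"OR={or_:.2f}, CI={lo:.2f} to {hi:.2f}, P={p:.4f}")
```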
Variables Selected by the LASSO Regression
| Cohort | Time Condition | λ | Selected Variables | Significance |
|---|---|---|---|---|
| Full data | 2 y | 6.06 | M.VLDL.FC_, UnSat, SFA.FA | 0.175 |
| Full data | 5 y | 28.3 | S.VLDL.FC_, Gln | 0.0114 |
| Full data | Max( | 2.02 | XL.VLDL.TG, XL.HDL.TG, M.HDL.PL, XXL.VLDL.PL_, XXL.VLDL.CE_, L.VLDL.PL_, L.VLDL.FC_, M.LDL.TG_, XL.HDL.CE_, XL.HDL.FC_, L.HDL.FC_, FreeC, SMs, LA, DHA.FA, LA.FA, Glc, Cit, Ala, Gln, His, Val, Phe, AcAce, bOHBut, Crea | 0.102 |
The results of the cross-validated LASSO-penalized logistic regression for the full dataset are shown. For each regression the penalty parameter (λ) and the selected covariates (separated by commas) are given. For every model where metabolites were selected, the significance of the presence of all of the selected metabolites in the model, compared with the model without presence of metabolites, is tested in a global test, and its P value is given here. Note that the P value is only for the metabolites, not for the clinical covariates.
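
How an L1 (LASSO) penalty selects variables in a logistic regression can be illustrated with a small sketch. This is not the authors' pipeline: it fits the penalized model by proximal gradient descent (ISTA), omits the cross-validation used to choose λ, and runs on a tiny hypothetical data set. The function names, step size, and iteration count are all assumptions made for illustration.

```python
import math


def soft_threshold(value, threshold):
    """Shrink a value toward zero and set it to exactly zero inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_logistic(X, y, lam, step=0.1, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is left unpenalized; coefficients shrunk to exactly 0.0
    correspond to variables that are not selected.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        probs = [
            1.0 / (1.0 + math.exp(-(intercept + sum(c * x for c, x in zip(coefs, row)))))
            for row in X
        ]
        resid = [pi - yi for pi, yi in zip(probs, y)]
        intercept -= step * sum(resid) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            coefs[j] = soft_threshold(coefs[j] - step * grad_j, step * lam)
    return intercept, coefs


# Hypothetical data: the first column is informative, the second is noise.
X = [[-2, 0.1], [-1, -0.1], [-0.5, 0.2], [0.5, -0.2],
     [1, 0.1], [2, -0.1], [-1.5, 0.0], [1.5, 0.0]]
y = [0, 0, 1, 0, 1, 1, 0, 1]
for lam in (10.0, 0.1):
    _, coefs = lasso_logistic(X, y, lam)
    selected = [j for j, c in enumerate(coefs) if c != 0.0]
    print(f"lambda={lam}: selected columns {selected}")
```

A large penalty removes every variable, and lowering it lets the most informative variables enter first. Cross-validation, as described in the footnote, would then pick the λ with the best out-of-sample fit.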
Abbreviations: AcAce, acetoacetate; Ala, alanine; bOHBut, 3-hydroxybutyrate; Cit, citrate; Crea, creatinine; DHA.FA, ratio of docosahexaenoic acid to total fatty acids; FreeC, free cholesterol; Glc, glucose; Gln, glutamine; His, histidine; LA, linoleic acid; LA.FA, ratio of linoleic acid to total fatty acids; L.HDL.FC_, free cholesterol to total lipids ratio in large HDLs; L.VLDL.FC_, free cholesterol to total lipids ratio in large VLDLs; L.VLDL.PL_, phospholipids to total lipids ratio in large VLDLs; M.HDL.PL, phospholipids in medium-sized HDLs; M.LDL.TG_, triglycerides to total lipids ratio in medium LDLs; M.VLDL.FC_, free cholesterol to total lipids ratio in medium VLDLs; Phe, phenylalanine; SFA.FA, ratio of saturated fatty acids to total fatty acids; SM, sphingomyelin; S.VLDL.FC_, free cholesterol to total lipids ratio in small VLDLs; UnSat, estimated degree of unsaturation; Val, valine; XL.HDL.CE_, cholesterol ester to total lipids ratio in very large HDLs; XL.HDL.FC_, free cholesterol to total lipids ratio in very large HDLs; XL.HDL.TG, triglycerides in very large HDLs; XXL.VLDL.CE_, cholesterol esters to total lipids ratio in chylomicrons and extremely large VLDLs; XL.VLDL.TG, triglycerides in extra-large VLDL particles; XXL.VLDL.PL_, phospholipids to total lipids ratio in chylomicrons and extremely large VLDLs.
Figure 4. Receiver operating characteristic (ROC) curves for classification of pancreatic cancer cases (sampled up to 5 y before diagnosis) and controls for (A) training set (70% of all individuals) and (B) performance testing set (30% of all individuals unseen during the variable selection). In red, the null model is shown in which only the clinical covariates (sex, age, BMI, smoking status, T2DM, and fasting status) were included in the regression. In blue, the alternative model is shown where the metabolites selected by the LASSO regression were included in addition to the clinical covariates. The AUCs are indicated, as well as the specificity (1 − false-positive rate) at 70% sensitivity.
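
The two summary measures named in this caption, the AUC and the specificity at 70% sensitivity, can be computed from predicted risk scores as sketched below. The scores are hypothetical and the function names are illustrative. The AUC is computed as the fraction of case-control pairs in which the case receives the higher score, which is equivalent to the area under the ROC curve.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve as the probability that a case outranks a control."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5   # ties count as half
    return wins / (len(case_scores) * len(control_scores))


def specificity_at_sensitivity(case_scores, control_scores, target=0.70):
    """Find the highest threshold reaching the target sensitivity.

    Individuals with score >= threshold are classified as cases.
    Returns (threshold, sensitivity, specificity).
    """
    for threshold in sorted(set(case_scores), reverse=True):
        sensitivity = sum(s >= threshold for s in case_scores) / len(case_scores)
        if sensitivity >= target:
            specificity = sum(s < threshold for s in control_scores) / len(control_scores)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity not reachable")


# Hypothetical predicted risks for five cases and five controls.
cases = [0.9, 0.8, 0.7, 0.6, 0.4]
controls = [0.5, 0.3, 0.2, 0.1, 0.65]
print(auc(cases, controls))
print(specificity_at_sensitivity(cases, controls))
```

Comparing these measures between the null model and the model with metabolites added, on individuals not used for variable selection, is what shows whether the metabolites add predictive value beyond the clinical covariates.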