S Pamela K Shiao, James Grayson, Chong Ho Yu.
Abstract
For personalized healthcare, the purpose of this study was to examine the key genes and metabolites in the one-carbon metabolism (OCM) pathway and their interactions as predictors of colorectal cancer (CRC) in multi-ethnic families. In this proof-of-concept study, we included a total of 30 participants, 15 CRC cases and 15 matched family/friends representing major ethnic groups in southern California. Analytics based on supervised machine learning were applied, with the target variable being specified as cancer, including the ensemble method and generalized regression (GR) prediction. Elastic Net with Akaike's Information Criterion with correction (AICc) and Leave-One-Out cross validation GR methods were used to validate the results for enhanced optimality, prediction, and reproducibility. The results revealed that despite some family members sharing genetic heritage, the CRC group had greater combined gene polymorphism-mutations than the family controls (p < 0.1) for five genes including MTHFR C677T, MTHFR A1298C, MTR A2756G, MTRR A66G, and DHFR 19bp. Blood metabolites including homocysteine (7 µmol/L), methyl-folate (40 nmol/L) with total gene mutations (≥4); age (51 years) and vegetable intake (2 cups), and interactions of gene mutations and methylmalonic acid (MMA) (400 nmol/L) were significant predictors (all p < 0.0001) using the AICc. The results were validated by a 3% misclassification rate, AICc of 26, and >99% area under the receiver operating characteristic curve. These results point to the important roles of blood metabolites as potential markers in the prevention of CRC. Future intervention studies can be designed to target the ways to mitigate the enzyme-metabolite deficiencies in the OCM pathway to prevent cancer.
Keywords: colorectal cancer; diverse ethnic groups; generalized regression with validation; metabolites and genes; one carbon metabolism pathways
Year: 2018 PMID: 30082654 PMCID: PMC6164460 DOI: 10.3390/jpm8030026
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Comparison of demographic factors between the control and cancer groups.
| Factors | Control: 1-Healthy (n = 4) | Control: 2-Chronic Diseases (n = 11) | Cancer: 3-Cancer (n = 5) | Cancer: 4-Advanced (n = 10) | p |
|---|---|---|---|---|---|
| Gender | |||||
| Male | 0 (0%) | 4 (36.4%) | 5 (100%) | 2 (20%) | 0.008 |
| Female | 4 (100%) | 7 (63.6%) | 0 (0%) | 8 (80%) | |
| Age (Years) | 34 ± 14 | 43 ± 12 | 50 ± 11 | 60 ± 9 | 0.006 |
| | (19–51) | (21–58) | (38–62) | (44–72) | |
| Posthoc | <4 ( | <4 ( | |||
| BMI | 24 ± 3.2 | 28 ± 8.5 | 24 ± 2.2 | 31 | 0.24 |
| | (17–28) | (21–49) | (19–29) | (19–51) | |
| Weight (Kg) | 63 ± 6.8 | 77 ± 26 | 72 ± 11 | 79 | 0.59 |
| | (57–71) | (52–141) | (59–88) | (45–138) | |
| Vegetable intake | 2.3 ± 0.0 | 2 ± 0.8 | 2.6 ± 0.6 | 1.6 ± 0.7 | 0.087 |
| Cup Servings | (2–3) | (1–3) | (2–3) | (1–3) | |
| Posthoc | <3 ( | ||||
| Fruit | 1.3 ± 1.0 | 1.5 ± 0.7 | 1.8 ± 0.5 | 0.9 ± 0.7 | 0.073 |
| Cup Servings | (0–2) | (0–2) | (1–2) | (0–2) | |
| Posthoc | <3 ( | ||||
| Whole grain cups | 1.5 ± 0.6 | 1.7 ± 0.7 | 1.8 ± 0.8 | 1.8 ± 0.8 | 0.92 |
| | (1–2) | (1–3) | (1–2) | (0–2) | |
| Liquid cups | 5.8 ± 1.5 | 5.5 ± 1.6 | 6.2 ± 1.6 | 5.3 ± 1.5 | 0.56 |
| | (5–8) | (4–8) | (5–8) | (4–8) | |
| Race | |||||
| White (10) | 1 (25%) | 3 (27.3%) | 2 (20%) | 4 (40%) | 0.68 |
| Asian (9) | 2 (50%) | 3 (27.3%) | 3 (30%) | 1 (10%) | |
| Hispanic (9) | 1 (25%) | 4 (36.4%) | 0 (0%) | 4 (40%) | |
| African (2) | 0 (0%) | 1 (9.1%) | 0 (0%) | 1 (10%) | |
Nonparametric test, Posthoc by Wilcoxon test. 4 groups: Inflammation status indicated by chronic health diseases (Group 2) or advanced cancer stage (Group 4); M: median; SD: standard deviation; BMI: body mass index.
Comparison of gene polymorphisms between the control and cancer groups.
| Genotype (Enzyme Deficiency) | Control: 1-Healthy (n = 4) | Control: 2-Chronic Disease (n = 11) | Cancer: 3-Cancer (n = 5) | Cancer: 4-Advanced (n = 10) | p |
|---|---|---|---|---|---|
| MTHFR C677T | |||||
| 0 (CC) | 2 (50%) | 5 (45.4%) | 2 (40%) | 2 (20%) | 0.70 |
| 1 (CT) | 1 (25%) | 5 (45.4%) | 2 (40%) | 7 (70%) | |
| 2 (TT) | 1 (25%) | 1 (9.1%) | 1 (20%) | 1 (10%) | |
| MTHFR A1298C | |||||
| 0 (AA) | 2 (50%) | 7 (63.6%) | 4 (80%) | 7 (70%) | 0.82 |
| 1 (AC) | 2 (50%) | 4 (36.4%) | 1 (20%) | 2 (20%) | |
| 2 (CC) | 0 (0%) | 0 (0%) | 0 (0%) | 1 (10%) | |
| MTR A2756G | |||||
| 0 (AA) | 2 (50%) | 7 (63.6%) | 4 (80%) | 3 (30%) | 0.40 |
| 1 (AG) | 2 (50%) | 2 (18.2%) | 1 (20%) | 6 (60%) | |
| 2 (GG) | 0 (0%) | 2 (18.2%) | 0 (0%) | 1 (10%) | |
| MTRR A66G | |||||
| 0 (AA) | 2 (66.7%) | 6 (54.5%) | 3 (60%) | 4 (40%) | 0.93 |
| 1 (AG) | 0 (0%) | 3 (27.3%) | 1 (20%) | 4 (40%) | |
| 2 (GG) | 1 (33.3%) | 2 (18.2%) | 1 (20%) | 2 (20%) | |
| DHFR 19bp | |||||
| 00 (++) | 1 (25%) | 5 (45.4%) | 0 (0%) | 3 (30%) | 0.69 |
| 01 (+−) | 2 (50%) | 4 (36.4%) | 2 (40%) | 4 (40%) | |
| 11 (−−) | 1 (25%) | 2 (18.2%) | 3 (60%) | 3 (30%) | |
| Total Mutation | |||||
| ≥4 | 1 (25%) | 4 (36.4%) | 1 (20%) | 8 (80%) | 0.077 |
| M ± SD | 3.25 ± 0.50 | 3.36 ± 1.57 | 2.20 ± 1.30 | 3.90 ± 1.45 | 0.16 |
| | (3–4) | (1–6) | (1–4) | (1–6) | |
| Posthoc | <4 ( | ||||
Nonparametric test, Posthoc by Wilcoxon test. 4 groups: Inflammation status indicated by chronic health diseases (Group 2) or advanced cancer stage (Group 4). MTHFR: methylenetetrahydrofolate reductase; MTR: methionine synthase; MTRR: methionine synthase reductase; DHFR: dihydrofolate reductase.
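The 0/1/2 genotype coding and the 1 to 6 range of the total score suggest that the total mutation score is the number of variant copies summed over the five loci. The sketch below illustrates that reading in Python. The scoring rule is an assumption, and the function names, dictionary keys, and example participant are hypothetical rather than taken from the study.

```python
# Hypothetical illustration: genotype codes follow the table (0, 1, or 2 variant copies).
GENES = ("MTHFR_C677T", "MTHFR_A1298C", "MTR_A2756G", "MTRR_A66G", "DHFR_19bp")


def total_mutation_score(genotypes):
    """Sum the 0/1/2 genotype codes over the five loci (assumed scoring rule)."""
    for gene in GENES:
        if genotypes[gene] not in (0, 1, 2):
            raise ValueError(f"{gene}: genotype code must be 0, 1, or 2")
    return sum(genotypes[gene] for gene in GENES)


def high_mutation_load(genotypes, cutoff=4):
    """Flag a participant whose total score meets the >=4 cutoff shown in the table."""
    return total_mutation_score(genotypes) >= cutoff


# Made-up participant with genotypes CT, AC, AA, GG, and +-.
example = {
    "MTHFR_C677T": 1,
    "MTHFR_A1298C": 1,
    "MTR_A2756G": 0,
    "MTRR_A66G": 2,
    "DHFR_19bp": 1,
}
print(total_mutation_score(example), high_mutation_load(example))
```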
Comparison of blood plasma metabolites among the control and cancer groups.
| Metabolites (M ± SD) | Control: 1-Healthy | Control: 2-Chronic Disease | Cancer: 3-Cancer | Cancer: 4-Advanced | p |
|---|---|---|---|---|---|
| Homocysteine (µmol/L) | 4.5 ± 1.8 | 5.1 ± 1.0 | 8.6 ± 3.8 | 9.1 ± 4.2 | 0.014 |
| Posthoc | <4 ( | <3 ( | |||
| SAM (nmol/L) | 85 ± 24 | 89 ± 17 | 129 ± 61 | 102 ± 21 | 0.12 |
| SAH (nmol/L) | 25 ± 14 | 23 ± 7.2 | 52 ± 51 | 29 ± 13 | 0.25 |
| Posthoc | <3 ( | ||||
| SAM/SAH Ratio | 4.1 ± 1.9 | 4.2 ± 1.1 | 3.2 ± 1.1 | 3.9 ± 1.1 | 0.56 |
| ADMA (nmol/L) | 573 ± 198 | 519 ± 110 | 666 ± 223 | 557 ± 110 | 0.77 |
| SDMA (nmol/L) | 488 ± 130 | 466 ± 78 | 885 ± 671 | 516 ± 109 | 0.44 |
| Methionine (nmol/L) | 37 ± 10 | 30 ± 7.3 | 32 ± 4.8 | 26 ± 6.2 | 0.14 |
| Posthoc | <3 ( | ||||
| MMA (nmol/L) | 249 ± 48 | 285 ± 229 | 359 ± 72 | 274 ± 97 | 0.025 |
| Posthoc | <3 ( | <3 ( | |||
| Cystathionine (nmol/L) | 423 ± 267 | 243 ± 147 | 470 ± 221 | 244 ± 102 | 0.043 |
| Posthoc | <3 ( | <3 ( | |||
| Betaine (nmol/L) | 71 ± 18 | 63 ± 20 | 61 ± 24 | 53 ± 11 | 0.45 |
| Vitamin B-6 (nmol/L) | 50 ± 16 | 60 ± 42 | 64 ± 52 | 46 ± 24 | 0.95 |
| 5-MTHF (nmol/L) | 30 ± 10 | 48 ± 19 | 36 ± 5.3 | 36 ± 16 | 0.063 |
| Posthoc | <2 ( | ||||
| Choline (nmol/L) | 12 ± 5.7 | 9.7 ± 2.8 | 14 ± 7.5 | 10 ± 3.1 | 0.50 |
Nonparametric test, Posthoc by Wilcoxon test; 4 groups: inflammation status indicated by chronic health diseases (Group 2) or advanced cancer stage (Group 4); SAM: S-adenosylmethionine; SAH: S-adenosylhomocysteine; ADMA: Asymmetric dimethylarginine; SDMA: Symmetric dimethylarginine; MMA: Methylmalonic acid; 5-MTHF: 5-methyltetrahydrofolate or methyl-folate.
Baseline logistic regression model and generalized regression elastic net models on the prediction of colorectal cancer from gene-metabolite interaction, with one interaction term.
| Parameters | Logistic Regression (Original Model): Estimate | p | Elastic Net GR, AICc Validation: Estimate | p | Elastic Net GR, Leave-One-Out Validation: Estimate | p |
|---|---|---|---|---|---|---|
| (Intercept) | −5.6 | 0.93 | 0.4 | 0.78 | 1.1 | 0.45 |
| MMA * Gene mutations | −42 | 0.68 | −30 | <0.0001 | −11 | <0.0001 |
| Homocysteine | −15 | 0.77 | −12 | <0.0001 | −5.7 | <0.0001 |
| Methyl-folate | 14 | 0.69 | 9.1 | <0.0001 | 3.4 | 0.0019 |
| Gene mutations | 14 | 0.86 | 11 | <0.0001 | 4.0 | 0.0188 |
| Vegetable intake | 28 | 0.62 | 17 | <0.0001 | 5.6 | 0.0005 |
| Age | −14 | 0.63 | −8.7 | <0.0001 | −2.9 | 0.0024 |
| MMA | −0.4 | 0.996 | −1.7 | 0.28 | 0 | 1.0 |
| Misclassification Rate | 0.2 | – | 0.03 | – | 0.04 | – |
| AICc | 27 | – | 26 | – | – | – |
| Area under the curve | 1.0 | – | 0.998 | – | 0.997 | – |
MMA: Methylmalonic acid; *: Interaction; –: Not available; AICc: Akaike’s information criterion with corrections; AUC: Area under the curve.
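As a reminder of what the AICc row measures, the standard small-sample correction adds 2k(k + 1)/(n − k − 1) to the usual AIC, where k is the number of estimated parameters and n the sample size. The Python sketch below shows the arithmetic. Only n = 30 comes from the study; the log-likelihood and k are made-up inputs, and the statistical software used by the authors may count parameters differently.

```python
def aicc(log_likelihood, k, n):
    """Akaike's information criterion with small-sample correction.

    log_likelihood: maximized log-likelihood of the fitted model
    k: number of estimated parameters
    n: number of observations
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


# Illustrative inputs only: n = 30 matches the study size, while the
# log-likelihood and k are invented and not taken from the fitted models.
print(round(aicc(log_likelihood=-5.0, k=7, n=30), 2))
```

With a sample this small the correction term is several units, so each extra parameter is penalized more heavily than under plain AIC.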
Figure 1. Receiver operating characteristic curve and area under the curve (AUC) for baseline logistic regression model (a) and generalized regression Elastic Net with Akaike’s information criterion with corrections (AICc) validation model (b) and leave-one-out validation model (c) on the predictors of colorectal cancer from gene-metabolite interaction, with one interaction term.
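The two summary metrics reported alongside these curves can be computed directly from predicted probabilities. The Python sketch below uses made-up labels and probabilities rather than study data. It assumes a 0.5 classification threshold, which the source does not state, and computes the AUC from its pairwise definition: the probability that a randomly chosen case receives a higher score than a randomly chosen control.

```python
def misclassification_rate(y_true, y_prob, threshold=0.5):
    """Share of observations whose thresholded prediction differs from the label."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, y_prob))
    return wrong / len(y_true)


def auc(y_true, y_prob):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = cancer) and predicted probabilities, not study data.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
print(misclassification_rate(labels, probs), auc(labels, probs))
```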
Figure 2. Prediction profiler (a) for significant predictors of colorectal cancer, and (b) interaction profiles of included parameters. Note. Non-parallel lines denote interactions between parameters in association with probability of cancer status (p (GroupCa = 1)), predictive parameters coded in 2 levels by median values; MTHF 40: Methyl folate level 40 nmol/L; tHCY 7: Total homocysteine 7 µmol/L; totmu4: total gene mutation score ≥4; MMA 300: Methylmalonic acid 300 nmol/L; vegtbl 2: Vegetable intake 2 cups.
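The two-level coding described in the note can be sketched in Python as follows. The cutoffs are the ones listed in the note. The direction of each split (values at or above the cutoff coded 1), the variable names, and the example participant are assumptions made for illustration only.

```python
# Cutoffs as listed in the figure note; coding ">= cutoff" as level 1 is an assumption.
CUTOFFS = {"mthf": 40.0, "thcy": 7.0, "totmu": 4, "mma": 300.0, "vegtbl": 2.0}


def code_two_levels(record):
    """Recode each raw value to 0 or 1 relative to its cutoff."""
    return {name: int(record[name] >= cut) for name, cut in CUTOFFS.items()}


def add_interaction(coded, a="mma", b="totmu"):
    """Append the product term a*b, which is 1 only when both factors are at level 1."""
    out = dict(coded)
    out[f"{a}*{b}"] = coded[a] * coded[b]
    return out


# Made-up participant values, not study data.
person = {"mthf": 35.0, "thcy": 8.2, "totmu": 5, "mma": 320.0, "vegtbl": 1.5}
print(add_interaction(code_two_levels(person)))
```

In an interaction plot, a nonzero coefficient on the product term shows up as non-parallel lines, because the effect of one factor then depends on the level of the other.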
Baseline logistic regression model and generalized regression Elastic Net models on the prediction of colorectal cancer from gene-metabolite interactions, with two interaction terms.
| Parameters | Logistic Regression (Original Model): Estimate | p | Elastic Net GR, AICc Validation: Estimate | p | Elastic Net GR, Leave-One-Out Validation: Estimate | p |
|---|---|---|---|---|---|---|
| (Intercept) | −0.4 | 0.997 | −0.36 | 0.79 | 1.2 | 0.38 |
| MMA * Gene mutations | −35 | 0.77 | −29 | <0.0001 | −9.2 | <0.0001 |
| Homocysteine | −13 | 0.63 | −12 | <0.0001 | −4.9 | <0.0001 |
| Methyl-folate (MTHF) | 10 | 0.48 | 8.7 | <0.0001 | 2.8 | 0.0093 |
| Gene mutations ≥4 | 17 | 0.92 | 12 | 0.0007 | 3.2 | 0.0496 |
| Vegetable intake | 20 | 0.35 | 16 | <0.0001 | 4.4 | 0.0033 |
| Age | −10 | 0.45 | −8.1 | <0.0001 | −2.5 | 0.0096 |
| MMA | −1.9 | 0.99 | −0.7 | 0.64 | 0 | 1.0 |
| MTHF * Gene mutations | −4.0 | 0.98 | −0.2 | 0.92 | 0 | 1.0 |
| Misclassification Rate | 0.03 | – | 0.03 | – | 0.04 | – |
| AICc | 30 | – | 30 | – | – | – |
| Area under the curve | 0.998 | – | 0.998 | – | 0.997 | – |
MMA: Methylmalonic acid; *: Interaction; –: Not available; AICc: Akaike’s information criterion with corrections; AUC: Area under the curve.
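For orientation, one common way to write the elastic net objective for a logistic model is given below. This is a generic textbook form, not the authors' exact specification: the mixing weight α and the tuning parameter λ are generic symbols whose values the source does not report, and the software used in the study may parameterize the penalty differently.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n}\Bigl[\,y_i\,\eta_i - \log\bigl(1 + e^{\eta_i}\bigr)\Bigr]
  + \lambda\Bigl[\alpha\,\lVert\beta\rVert_1 + \tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\Bigr],
\qquad \eta_i = \beta_0 + x_i^{\top}\beta
```

The L1 part of the penalty can shrink a coefficient exactly to zero, which is consistent with the estimates of 0 reported in the leave-one-out column.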
Figure 3. Receiver operating characteristic curve and AUC for baseline logistic regression model (a) and generalized regression Elastic Net AICc validation model (b) and leave-one-out validation model (c) on the predictors of colorectal cancer from gene-metabolite interactions, with two interaction terms.
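Leave-one-out validation, as used for model (c), holds out each participant once, refits the model on the remaining observations, and scores the held-out prediction. The Python sketch below shows the resampling loop only. The classifier is a deliberately simple stand-in (a threshold at the midpoint of the two class means), not the study's elastic net, and the data are made up.

```python
def leave_one_out_error(xs, ys, fit, predict):
    """Hold out each observation once, refit on the rest, and return the error rate."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        errors += predict(model, xs[i]) != ys[i]
    return errors / len(xs)


# Stand-in classifier (NOT the study's elastic net): cut at the midpoint of the class means.
def fit_midpoint(train_x, train_y):
    mean1 = sum(x for x, y in zip(train_x, train_y) if y == 1) / train_y.count(1)
    mean0 = sum(x for x, y in zip(train_x, train_y) if y == 0) / train_y.count(0)
    return (mean0 + mean1) / 2.0, mean1 >= mean0


def predict_midpoint(model, x):
    cut, higher_is_case = model
    return int((x >= cut) == higher_is_case)


# Made-up single-predictor data (1 = case, 0 = control), not study data.
x = [9.0, 8.5, 7.5, 5.5, 6.5, 5.0, 4.5, 4.0]
y = [1, 1, 1, 1, 0, 0, 0, 0]
print(leave_one_out_error(x, y, fit_midpoint, predict_midpoint))
```

Because every prediction is made for an observation the model did not see during fitting, the resulting error rate is a less optimistic estimate than one computed on the training data, which matters with only 30 participants.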