Rachel S Kelly, Michael J McGeachie, Kathleen A Lee-Sarwar, Priyadarshini Kachroo, Su H Chu, Yamini V Virkud, Mengna Huang, Augusto A Litonjua, Scott T Weiss, Jessica Lasky-Su.
Abstract
To explore novel methods for the analysis of metabolomics data, we compared the ability of Partial Least Squares Discriminant Analysis (PLS-DA) and Bayesian networks (BN) to build predictive plasma metabolite models of asthma status at age three in 411 three-year-olds (n = 59 cases and 352 controls) from the Vitamin D Antenatal Asthma Reduction Trial (VDAART) study. The standard PLS-DA approach predicted age-three asthma with an Area Under the Curve Convex Hull (AUCCH) of 81%. However, a permutation test indicated the possibility of overfitting. In contrast, a predictive Bayesian network including 42 metabolites had a significantly higher AUCCH of 92.1% (p for difference < 0.001), with no evidence that this accuracy was due to overfitting. Both models provided biologically informative insights into asthma; in particular, a role for dysregulated arginine metabolism and several exogenous metabolites that deserve further investigation as potential causative agents. Because the BN model outperformed the PLS-DA model in accuracy and showed a lower risk of overfitting, it may represent a viable alternative to typical analytical approaches for the investigation of metabolomics data.
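The abstract does not describe how the permutation test was run. As a minimal sketch of the general idea only, the code below refits a model on label-shuffled data and asks how often the shuffled fit scores at least as well as the real one. Everything here is an assumption for illustration: the names `permutation_p_value` and `best_threshold_accuracy` are hypothetical, the one-feature threshold classifier is a toy stand-in for PLS-DA, and the data are invented.

```python
import random
from typing import Callable, List, Sequence


def best_threshold_accuracy(values: Sequence[float], labels: Sequence[int]) -> float:
    """Toy stand-in for a fitted model: pick the cut-point (in either
    direction) that maximises in-sample accuracy and return that accuracy."""
    n = len(values)
    best = 0.0
    for cut in values:
        hits = sum((1 if v >= cut else 0) == y for v, y in zip(values, labels))
        best = max(best, hits / n, 1 - hits / n)
    return best


def permutation_p_value(
    score_fn: Callable[[Sequence[float], Sequence[int]], float],
    features: Sequence[float],
    labels: Sequence[int],
    n_permutations: int = 199,
    seed: int = 0,
) -> float:
    """Fraction of label permutations whose refitted score is at least as
    high as the score obtained with the true labels (with +1 correction)."""
    rng = random.Random(seed)
    observed = score_fn(features, labels)
    shuffled: List[int] = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any real feature-label association
        if score_fn(features, shuffled) >= observed:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (n_permutations + 1)


# Invented, clearly separated toy data (not VDAART measurements).
toy_values = [0.1, 0.2, 0.3, 0.4, 0.5, 1.1, 1.2, 1.3, 1.4, 1.5]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p_value = permutation_p_value(best_threshold_accuracy, toy_values, toy_labels)
print(f"permutation p-value: {p_value:.3f}")
```

A small p-value suggests the observed score is unlikely to arise from chance label assignments; a large one suggests the model can score equally well on meaningless labels, which is the pattern usually read as overfitting.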
Keywords: Bayesian networks; Partial Least-Squares Discriminant analysis; arginine metabolism; asthma; overfitting
Year: 2018 PMID: 30360514 PMCID: PMC6316795 DOI: 10.3390/metabo8040068
Source DB: PubMed Journal: Metabolites ISSN: 2218-1989
Figure 1. Study Schematic.
Characteristics of the asthma cases and controls at age three.
| Variable | Category | Controls (n = 352), n | Controls, % | Cases (n = 59), n | Cases, % | p-value |
|---|---|---|---|---|---|---|
| Gender | Male | 183 | 52.0% | 36 | 61.0% | 0.208 |
| | Female | 169 | 48.0% | 23 | 39.0% | |
| Race | White | 119 | 33.8% | 15 | 25.4% | 0.003 |
| | Black | 159 | 45.2% | 40 | 67.8% | |
| | Other | 74 | 21.0% | 4 | 6.8% | |
| Treatment Group | Placebo | 182 | 51.7% | 28 | 47.5% | 0.576 |
| | Intervention | 170 | 48.3% | 31 | 52.5% | |
| BMI | Mean (SD) | 16.6 (2.1) | | 17.1 (1.9) | | 0.063 |
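As a rough check on the race comparison in the table above, the sketch below computes a Pearson chi-square statistic from the reported counts. The source does not state which test produced its p-values, so the choice of test is an assumption, and the function name `pearson_chi_square` is hypothetical. For a 2 x 3 table the statistic has 2 degrees of freedom, in which case the chi-square survival function reduces to exp(-x/2), so no statistics library is needed.

```python
import math
from typing import List, Tuple


def pearson_chi_square(table: List[List[int]]) -> Tuple[float, int]:
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Counts from the table: columns are White, Black, Other.
race_counts = [
    [119, 159, 74],  # controls
    [15, 40, 4],     # cases
]
chi2, dof = pearson_chi_square(race_counts)
# Closed form for the upper tail, valid only when dof == 2.
p_value = math.exp(-chi2 / 2)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")
```

Under this assumed test the p-value comes out close to the 0.003 reported for race. The closed-form tail probability does not apply to the 2 x 2 comparisons, which have 1 degree of freedom.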
Figure 2. Discrimination of asthma at age three based on plasma metabolomic profiles by PLS-DA (AUCCH = 0.810) and by Bayesian network (AUCCH = 0.921) on the full dataset. AUCCH: Area under the Convex Hull of the Receiver Operator Characteristic curve; BN: Bayesian Network; ROC: Receiver Operator Characteristic curve; ROCCH: ROC Convex Hull.
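To make the AUCCH definition in the caption concrete, here is a minimal sketch that builds an empirical ROC curve, takes its upper convex hull, and integrates under it with the trapezoid rule. It is not the authors' implementation; the function names (`roc_points`, `upper_hull`, `area_under`) and the four-sample toy data are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def roc_points(labels: Sequence[int], scores: Sequence[float]) -> List[Point]:
    """Empirical ROC curve: one point per distinct score threshold."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one case and one control")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points: List[Point] = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all samples tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def _cross(o: Point, a: Point, b: Point) -> float:
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points: Sequence[Point]) -> List[Point]:
    """Upper convex hull, left to right (monotone chain)."""
    hull: List[Point] = []
    for p in sorted(set(points)):
        # Drop the last point while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under(points: Sequence[Point]) -> float:
    """Trapezoid-rule area under a curve given as left-to-right points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Invented toy data: 1 = case, 0 = control.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
curve = roc_points(toy_labels, toy_scores)
auc = area_under(curve)
aucch = area_under(upper_hull(curve))
print(f"AUC = {auc}, AUCCH = {aucch}")
```

Because the hull lies on or above the empirical curve, the AUCCH is never smaller than the plain AUC computed from the same scores.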
Metabolites with an influential loading in the first PLS-DA component.
| Metabolite | Super Pathway | Sub Pathway | HMDB ID | Loading |
|---|---|---|---|---|
| glycochenodeoxycholate sulfate | Lipid | Primary Bile Acid Metabolism | | −9.12 |
| stachydrine | Xenobiotics | Food Component/Plant | HMDB04827 | −6.41 |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | | −6.31 |
| glycolithocholate sulfate | Lipid | Secondary Bile Acid Metabolism | HMDB02639 | −4.87 |
| methyl glucopyranoside (alpha + beta) | Xenobiotics | Food Component/Plant | | −4.54 |
| theobromine | Xenobiotics | Xanthine Metabolism | HMDB02825 | −4.44 |
| cysteine s-sulfate | Amino Acid | Methionine, Cysteine, SAM and Taurine Metabolism | HMDB00731 | −4.23 |
| 4-vinylguaiacol sulfate | Xenobiotics | Food Component/Plant | | −4.14 |
| taurolithocholate 3-sulfate | Lipid | Secondary Bile Acid Metabolism | HMDB02580 | −4.12 |
| 3-hydroxyhippurate | Xenobiotics | Benzoate Metabolism | HMDB06116 | −3.72 |
| 2,3-dihydroxyisovalerate | Xenobiotics | Food Component/Plant | HMDB12141 | −3.34 |
| 4-methylcatechol sulfate | Xenobiotics | Benzoate Metabolism | | −3.33 |
| vanillic alcohol sulfate | Amino Acid | Tyrosine Metabolism | | −3.25 |
| 3-(3-hydroxyphenyl)propionate | Xenobiotics | Benzoate Metabolism | HMDB00375 | −3.21 |
| p-cresol-glucuronide | Amino Acid | Tyrosine Metabolism | HMDB11686 | −3.2 |
| CMP | Nucleotide | Pyrimidine Metabolism, Cytidine containing | HMDB00095 | −3.16 |
| indolepropionate | Amino Acid | Tryptophan Metabolism | HMDB02302 | −3.11 |
| beta-cryptoxanthin | Xenobiotics | Food Component/Plant | HMDB33844 | −3.08 |
| xylose | Carbohydrate | Pentose Metabolism | HMDB00098 | −3.05 |
| tauro-beta-muricholate | Lipid | Primary Bile Acid Metabolism | HMDB00932 | −3.04 |
| 5-hydroxyindoleacetate | Amino Acid | Tryptophan Metabolism | HMDB00763 | −2.93 |
| gamma-glutamylglutamate | Peptide | Gamma-glutamyl Amino Acid | HMDB11737 | −2.87 |
| ferulic acid 4-sulfate | Xenobiotics | Food Component/Plant | HMDB29200 | −2.8 |
| cinnamoylglycine | Xenobiotics | Food Component/Plant | HMDB11621 | −2.71 |
| tryptophan betaine | Amino Acid | Tryptophan Metabolism | HMDB61115 | −2.7 |
| 1,2,3-benzenetriol sulfate (2) | Xenobiotics | Chemical | | −2.66 |
| catechol sulfate | Xenobiotics | Benzoate Metabolism | HMDB59724 | −2.65 |
| quinate | Xenobiotics | Food Component/Plant | HMDB03072 | −2.62 |
| inosine 5’-monophosphate (IMP) | Nucleotide | Purine Metabolism, (Hypo)Xanthine/Inosine containing | HMDB00175 | −2.59 |
| gamma-glutamylvaline | Peptide | Gamma-glutamyl Amino Acid | HMDB11172 | −2.58 |
| ergothioneine | Xenobiotics | Food Component/Plant | HMDB03045 | −2.49 |
| ribitol | Carbohydrate | Pentose Metabolism | HMDB00508 | −2.49 |
| glycerophosphoinositol | Lipid | Phospholipid Metabolism | | −2.49 |
| umbelliferone sulfate | Xenobiotics | Food Component/Plant | | −2.43 |
| pyrraline | Xenobiotics | Food Component/Plant | HMDB33143 | −2.4 |
| 4-acetylphenyl sulfate | Xenobiotics | Drug | | −2.37 |
| gamma-glutamylisoleucine | Peptide | Gamma-glutamyl Amino Acid | HMDB11170 | −2.36 |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | | −2.34 |
| 3-methoxycatechol sulfate (1) | Xenobiotics | Benzoate Metabolism | | −2.32 |
| 4-vinylphenol sulfate | Xenobiotics | Benzoate Metabolism | HMDB04072 | −2.29 |
| sphinganine | Lipid | Sphingolipid Metabolism | HMDB00269 | −2.29 |
| hydantoin-5-propionic acid | Amino Acid | Histidine Metabolism | HMDB01212 | −2.23 |
| trigonelline ( | Cofactors and Vitamins | Nicotinate and Nicotinamide Metabolism | HMDB00875 | −2.19 |
| | Xenobiotics | Benzoate Metabolism | HMDB60013 | −2.18 |
| 4-allylphenol sulfate | Xenobiotics | Food Component/Plant | | −2.16 |
| 2-aminophenol sulfate | Xenobiotics | Chemical | HMDB61116 | −2.14 |
| citrulline | Amino Acid | Urea cycle; Arginine and Proline Metabolism | HMDB00904 | −2.13 |
| uridine 3’-monophosphate (3’-UMP) | Nucleotide | Pyrimidine Metabolism, Uracil containing | | −2.09 |
| 3-methyl catechol sulfate (1) | Xenobiotics | Benzoate Metabolism | | −2.09 |
| 3-hydroxypyridine sulfate | Xenobiotics | Chemical | | −2.06 |
| isoursodeoxycholate | Lipid | Secondary Bile Acid Metabolism | HMDB00686 | −2.05 |
| propyl 4-hydroxybenzoate sulfate | Xenobiotics | Benzoate Metabolism | | 2.18 |
| 2’- | Nucleotide | Pyrimidine Metabolism, Cytidine containing | | 2.33 |
| methyl-4-hydroxybenzoate sulfate | Xenobiotics | Benzoate Metabolism | | 2.37 |
Metabolites included in the Bayesian network.
| Metabolite | Super Pathway | Sub Pathway | HMDB ID |
|---|---|---|---|
| 1-linoleoylglycerol (18:2) | Lipid | Monoacylglycerol | |
| 1-methylhistidine | Amino Acid | Histidine Metabolism | HMDB00001 |
| 1-methylnicotinamide | Cofactors and Vitamins | Nicotinate and Nicotinamide Metabolism | HMDB00699 |
| 2,3-dihydroxyisovalerate (X) | Xenobiotics | Food Component/Plant | HMDB12141 |
| 3-(3-hydroxyphenyl)propionate (X) | Xenobiotics | Benzoate Metabolism | HMDB00375 |
| 3-carboxy-4-methyl-5-propyl-2-furanpropanoate (CMPF) | Lipid | Fatty Acid, Dicarboxylate | HMDB61112 |
| 3-hydroxypyridine sulfate (X) | Xenobiotics | Chemical | |
| 3-methylhistidine | Amino Acid | Histidine Metabolism | HMDB00479 |
| 3,4-methyleneheptanoate (X) | Xenobiotics | Food Component/Plant | |
| 4-guanidinobutanoate | Amino Acid | Guanidino and Acetamido Metabolism | HMDB03464 |
| 4-hydroxyhippurate (X) | Xenobiotics | Benzoate Metabolism | HMDB13678 |
| aspartate | Amino Acid | Alanine and Aspartate Metabolism | HMDB00191 |
| beta-cryptoxanthin (X) | Xenobiotics | Food Component/Plant | HMDB33844 |
| catechol sulfate (X) | Xenobiotics | Benzoate Metabolism | HMDB59724 |
| cis-4-decenoylcarnitine (C10:1) | Lipid | Fatty Acid Metabolism (Acyl Carnitine) | |
| citrulline | Amino Acid | Urea cycle; Arginine and Proline Metabolism | HMDB00904 |
| CMP | Nucleotide | Pyrimidine Metabolism, Cytidine containing | HMDB00095 |
| ergothioneine (X) | Xenobiotics | Food Component/Plant | HMDB03045 |
| eugenol sulfate (X) | Xenobiotics | Food Component/Plant | |
| ferulic acid 4-sulfate (X) | Xenobiotics | Food Component/Plant | HMDB29200 |
| fructose | Carbohydrate | Fructose, Mannose and Galactose Metabolism | HMDB00660 |
| glucose | Carbohydrate | Glycolysis, Gluconeogenesis, and Pyruvate Metabolism | HMDB00122 |
| glycerophosphoinositol | Lipid | Phospholipid Metabolism | |
| guanosine | Nucleotide | Purine Metabolism, Guanine containing | HMDB00133 |
| isobutyrylcarnitine (C4) | Amino Acid | Leucine, Isoleucine and Valine Metabolism | HMDB00736 |
| maltose | Carbohydrate | Glycogen Metabolism | HMDB00163 |
| methyl glucopyranoside (alpha + beta) (X) | Xenobiotics | Food Component/Plant | |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | |
| | Amino Acid | Lysine Metabolism | HMDB01325 |
| o-cresol sulfate (X) | Xenobiotics | Benzoate Metabolism | |
| perfluorooctanesulfonic acid (PFOS) (X) | Xenobiotics | Chemical | HMDB59586 |
| pyrraline (X) | Xenobiotics | Food Component/Plant | HMDB33143 |
| S-allylcysteine (X) | Xenobiotics | Food Component/Plant | HMDB34323 |
| succinylcarnitine (C4) | Energy | TCA Cycle | HMDB61717 |
| theobromine (X) | Xenobiotics | Xanthine Metabolism | HMDB02825 |
| thymol sulfate (X) | Xenobiotics | Food Component/Plant | HMDB01878 |
| trigonelline ( | Cofactors and Vitamins | Nicotinate and Nicotinamide Metabolism | HMDB00875 |
| umbelliferone sulfate (X) | Xenobiotics | Food Component/Plant | |
| vanillic alcohol sulfate | Amino Acid | Tyrosine Metabolism | |
| xylose | Carbohydrate | Pentose Metabolism | HMDB00098 |
Figure 3. Markov neighborhood of the metabolic Bayesian network for the identification of asthma at age three in VDAART. Pictured is the Markov neighborhood of the conditional Gaussian Bayesian network (CGBN) predictive of asthma. Node size and color are proportional to degree. Directed edges represent statistical conditional dependence of the target node on the source node. Edge thickness is proportional to the statistical evidence for the edge (log Bayes Factor). Metabolite names are appended with “(X)” for those metabolites indicated as xenobiotic by Metabolon.
Metabolites that were identified as influential in the PLS-DA first component and were included in the Bayesian Network.
| Metabolite | Super Pathway | Sub Pathway | HMDB ID |
|---|---|---|---|
| vanillic alcohol sulfate | Amino Acid | Tyrosine Metabolism | |
| citrulline | Amino Acid | Urea cycle; Arginine and Proline Metabolism | HMDB00904 |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | |
| | Amino Acid | Urea cycle; Arginine and Proline Metabolism | |
| xylose | Carbohydrate | Pentose Metabolism | HMDB00098 |
| trigonelline ( | Cofactors and Vitamins | Nicotinate and Nicotinamide Metabolism | HMDB00875 |
| glycerophosphoinositol | Lipid | Phospholipid Metabolism | |
| CMP | Nucleotide | Pyrimidine Metabolism, Cytidine containing | HMDB00095 |
| 3-(3-hydroxyphenyl)propionate | Xenobiotics | Benzoate Metabolism | HMDB00375 |
| catechol sulfate | Xenobiotics | Benzoate Metabolism | HMDB59724 |
| 2,3-dihydroxyisovalerate | Xenobiotics | Food Component/Plant | HMDB12141 |
| beta-cryptoxanthin | Xenobiotics | Food Component/Plant | HMDB33844 |
| ergothioneine | Xenobiotics | Food Component/Plant | HMDB03045 |
| ferulic acid 4-sulfate | Xenobiotics | Food Component/Plant | HMDB29200 |
| methyl glucopyranoside (alpha + beta) | Xenobiotics | Food Component/Plant | |
| pyrraline | Xenobiotics | Food Component/Plant | HMDB33143 |
| umbelliferone sulfate | Xenobiotics | Food Component/Plant | |
| theobromine | Xenobiotics | Xanthine Metabolism | HMDB02825 |
| 3-hydroxypyridine sulfate | Xenobiotics | Chemical |
Figure 4. Super pathways of metabolites identified as constituents of the Bayesian network and as influential in the first component of the PLS-DA model, and those which are common to both methods.