Abstract
A new paradigm is proposed for assessing confidence in the identification of known metabolites in metabonomics studies using NMR spectroscopy approaches. This new paradigm is based upon the analysis of the amount of metabolite identification information retrieved from NMR spectra relative to the molecular size of the metabolite. Several new indices are proposed, including metabolite identification efficiency (MIE) and metabolite identification carbon efficiency (MICE), both of which can be easily calculated. These indices, together with some guidelines, can be used to provide a better indication of known metabolite identification confidence in metabonomics studies than existing methods. Since known metabolite identification in untargeted metabonomics studies is currently one of the key bottlenecks facing the science, it is hoped that these concepts, based on molecular spectroscopic informatics, will find utility in the field.
Keywords: Metabolite identification carbon efficiency (MICE); Metabolite identification efficiency (MIE); Metabolomics; Metabonomics; Molecular spectroscopic information; NMR spectroscopy
Year: 2015 PMID: 25750701 PMCID: PMC4348432 DOI: 10.1016/j.csbj.2015.01.002
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Metabolite identification parameters calculated for the 75 metabolites.
| Parameter | Calculation |
|---|---|
| A. Total number of heavy atoms | Sum of features 2 to 5 in |
| B. Total number of spectroscopic information bits available from 1D 1H NMR | Sum of features 8 to 11 |
| C. Total number of spectroscopic information bits available from 1D 1H and 2D 1H COSY NMR | Sum of features 8 to 12 |
| D. Total number of spectroscopic information bits available from 1D 1H and 2D 1H COSY and HSQC NMR | Sum of features 8 to 13 |
| E. Total number of spectroscopic information bits available from 1D 1H and 2D 1H COSY, HSQC and HMBC NMR | Sum of features 8 to 14 |
| F. Theoretical metabolite identification carbon efficiency (MICE) for 1D 1H NMR | (Sum of features 8 to 11)/number of carbon atoms |
| G. Theoretical metabolite identification carbon efficiency (MICE) for 1D 1H and 2D 1H COSY NMR | (Sum of features 8 to 12)/number of carbon atoms |
| H. Theoretical metabolite identification carbon efficiency (MICE) for 1D 1H and 2D 1H COSY and HSQC NMR | (Sum of features 8 to 13)/number of carbon atoms |
| I. Theoretical metabolite identification carbon efficiency (MICE) for 1D 1H and 2D 1H COSY, HSQC and HMBC NMR | (Sum of features 8 to 14)/number of carbon atoms |
| J. Theoretical metabolite identification efficiency (MIE) for 1D 1H NMR | (Sum of features 8 to 11)/number of heavy atoms |
| K. Theoretical metabolite identification efficiency (MIE) for 1D 1H and 2D 1H COSY NMR | (Sum of features 8 to 12)/number of heavy atoms |
| L. Theoretical metabolite identification efficiency (MIE) for 1D 1H and 2D 1H COSY and HSQC NMR | (Sum of features 8 to 13)/number of heavy atoms |
| M. Theoretical metabolite identification efficiency (MIE) for 1D 1H and 2D 1H COSY, HSQC and HMBC NMR | (Sum of features 8 to 14)/number of heavy atoms |
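The calculations in the table above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the `MetaboliteFeatures` record, its field names, the `LEVELS` labels and the example values are all assumptions made for illustration, and only the feature numbers in the comments come from the table.

```python
from dataclasses import dataclass

# Cumulative NMR approaches, in the order used by parameters B to E.
# The labels are hypothetical shorthand, not names from the paper.
LEVELS = ("1H", "COSY", "HSQC", "HMBC")


@dataclass
class MetaboliteFeatures:
    """Hypothetical record; comments give the feature numbers used in the table."""
    carbons: int         # feature 2
    oxygens: int         # feature 3
    nitrogens: int       # feature 4
    sulphurs: int        # feature 5
    shifts: int          # feature 8
    multiplicities: int  # feature 9
    couplings: int       # feature 10
    second_order: int    # feature 11 (0 or 1)
    cosy: int            # feature 12
    hsqc: int            # feature 13
    hmbc: int            # feature 14


def heavy_atoms(m: MetaboliteFeatures) -> int:
    """Parameter A: sum of features 2 to 5."""
    return m.carbons + m.oxygens + m.nitrogens + m.sulphurs


def info_bits(m: MetaboliteFeatures, level: str) -> int:
    """Parameters B to E: features 8 to 11 plus the 2D features up to `level`."""
    bits = m.shifts + m.multiplicities + m.couplings + m.second_order
    extras = (m.cosy, m.hsqc, m.hmbc)
    return bits + sum(extras[: LEVELS.index(level)])


def mice(m: MetaboliteFeatures, level: str) -> float:
    """Parameters F to I: information bits per carbon atom."""
    return info_bits(m, level) / m.carbons


def mie(m: MetaboliteFeatures, level: str) -> float:
    """Parameters J to M: information bits per heavy atom."""
    return info_bits(m, level) / heavy_atoms(m)


# Illustrative values only (assumed, not taken from the paper's data).
example = MetaboliteFeatures(
    carbons=3, oxygens=3, nitrogens=0, sulphurs=0,
    shifts=2, multiplicities=2, couplings=1, second_order=0,
    cosy=1, hsqc=2, hmbc=4,
)

for level in LEVELS:
    print(level, info_bits(example, level),
          round(mice(example, level), 2), round(mie(example, level), 2))
```

Each approach is cumulative, so the bit count for a given level includes all bits from the levels before it; MICE and MIE then differ only in the denominator.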
The 75 metabolites identified by NMR spectroscopy in recent metabonomics studies on human and mouse urine.
| Metabolite class | Common name | IUPAC name |
|---|---|---|
| Carboxylic acids | Formic acid | Formic acid |
| | Acetic acid | Acetic acid |
| | Propionic acid | Propanoic acid |
| | Butyric acid | Butanoic acid |
| | Isobutyric acid | 2-Methylpropanoic acid |
| | Isovaleric acid | 3-Methylbutanoic acid |
| | Ketoleucine | 4-Methyl-2-oxopentanoic acid |
| | Benzoic acid | Benzoic acid |
| | Phenylacetic acid | 2-Phenylacetic acid |
| | Para-hydroxy-phenylacetic acid | 2-(4-Hydroxyphenyl)acetic acid |
| | Hydrocinnamic acid | 3-Phenylpropanoic acid |
| Hydroxycarboxylic acids | Glycolic acid | 2-Hydroxyacetic acid |
| | Lactic acid | (2S)-2-hydroxypropanoic acid |
| | 2-Hydroxyisobutyric acid | 2-Hydroxy-2-methylpropanoic acid |
| | 3-Hydroxyisobutyric acid | (2S)-3-hydroxy-2-methylpropanoic acid |
| Dicarboxylic acids | Succinic acid | Butanedioic acid |
| | | (2S)-2-hydroxybutanedioic acid |
| | Tartaric acid | (2R,3R)-2,3-Dihydroxybutanedioic acid |
| | Methylsuccinic acid | 2-Methylbutanedioic acid |
| | Glutaric acid | Pentanedioic acid |
| | 2-Hydroxyglutaric acid | (2S)-2-hydroxypentanedioic acid |
| | 2-Ketoglutaric acid | 2-Oxopentanedioic acid |
| | 2-Isopropylmalic acid | (2S)-2-hydroxy-2-(propan-2-yl)butanedioic acid |
| Tricarboxylic acid | Citric acid | 2-Hydroxypropane-1,2,3-tricarboxylic acid |
| | Isocitric acid | 1-Hydroxypropane-1,2,3-tricarboxylic acid |
| | cis-Aconitic acid | (1Z)-Prop-1-ene-1,2,3-tricarboxylic acid |
| | Trans-aconitic acid | (1E)-Prop-1-ene-1,2,3-tricarboxylic acid |
| Small alcohols | Ethanol | Ethanol |
| | Chiral 2,3-butanediol | (2R,3R)-butane-2,3-diol or (2S,3S)-butane-2,3-diol |
| | Meso-2,3-butanediol | (2R,3S)-2,3-butanediol |
| Ketones | Butanone | Butan-2-one |
| | Acetoin | 3-Hydroxybutan-2-one |
| Sugars and sugar acids | | (3R,4S,5R)-oxane-2,3,4,5-tetrol |
| | | (3S,4R,5S,6S)-6-methyloxane-2,3,4,5-tetrol |
| | | (3R,4S,5S,6R)-6-(hydroxymethyl)oxane-2,3,4,5-tetrol |
| | Mannitol | (2R,3R,4R,5R)-hexane-1,2,3,4,5,6-hexol |
| | | (2R,3S,4S,5S)-2,3,4,5-tetrahydroxyhexanedioic acid |
| | | (2S,3S,4S,5R,6S)-3,4,5,6-tetrahydroxyoxane-2-carboxylic acid |
| | Para-cresol glucuronide | (2S,3S,4S,5R,6S)-3,4,5-trihydroxy-6-(4-methylphenoxy)oxane-2-carboxylic acid |
| Amines | Methylamine | Methanamine |
| | Dimethylamine | Dimethylamine |
| | Trimethylamine | Trimethylamine |
| | | Trimethylamine |
| | Ethanolamine | 2-Aminoethan-1-ol |
| | Choline | (2-Hydroxyethyl)trimethylazanium |
| | 3-Methylhistamine | 2-(1-Methyl-1H-imidazol-5-yl)ethan-1-amine |
| | Hypotaurine | 2-Aminoethane-1-sulfinic acid |
| | Taurine | 2-Aminoethane-1-sulfonic acid |
| | 3-Indoxyl sulphate | 1H-indol-3-yloxidanesulfonic acid |
| | Putrescine | Butane-1,4-diamine |
| | Creatinine | 2-Imino-1-methylimidazolidin-4-one |
| | Creatine | 2-(1-Methylcarbamimidamido)acetic acid |
| | | (3R)-3-hydroxy-4-(trimethylazaniumyl)butanoate |
| Amino acids and amides | Glycine | 2-Aminoacetic acid |
| | | 2-(Methylamino)acetic acid |
| | Dimethylglycine | 2-(Dimethylamino)acetic acid |
| | | 2-(Trimethylazaniumyl)acetate |
| | | 2-Acetamidoacetic acid |
| | | 2-Propanamidoacetic acid |
| | | 2-Butanamidoacetic acid |
| | | 2-(3-Methylbutanamido)acetic acid |
| | Hippuric acid, benzoylglycine | 2-(Phenylformamido)acetic acid |
| | Phenylacetylglycine | 2-(2-Phenylacetamido)acetic acid |
| | Guanidoacetic acid | 2-Carbamimidamidoacetic acid |
| | Ureidopropionic acid | 3-(Carbamoylamino)propanoic acid |
| | | (2S)-2-aminopropanoic acid |
| | Beta-alanine | 3-Aminopropanoic acid |
| | Pyroglutamic acid | (2S)-5-oxopyrrolidine-2-carboxylic acid |
| | | (2S)-2-amino-3-(1H-imidazol-4-yl)propanoic acid |
| | 1-Methylhistidine | (2S)-2-amino-3-(1-methyl-1H-imidazol-4-yl)propanoic acid |
| | Allantoin | (2,5-Dioxoimidazolidin-4-yl)urea |
| | Trigonelline | 1-Methylpyridin-1-ium-3-carboxylate |
| | 1-Methylnicotinamide | 3-Carbamoyl-1-methylpyridin-1-ium |
| | Cytosine | 6-Amino-1,2-dihydropyrimidin-2-one |
| Other metabolites | Para-cresol sulphate | (4-Methylphenyl)oxidanesulfonic acid |
The 14 molecular and spectroscopic features calculated for the 75 metabolites.
| Feature number | Feature |
|---|---|
| 1 | Number of hydrogen atoms |
| 2 | Number of carbon atoms |
| 3 | Number of oxygen atoms |
| 4 | Number of nitrogen atoms |
| 5 | Number of sulphur atoms |
| 6 | Nominal mass in Da |
| 7 | Number of chiral centres |
| 8 | Number of 1H NMR chemical shifts |
| 9 | Number of multiplicities |
| 10 | Number of 2- or 3-bond H, H coupling constants |
| 11 | Second order flag = 0 or 1 |
| 12 | Number of 2D 1H COSY cross-peaks |
| 13 | Number of 2D 1H, 13C HSQC cross-peaks |
| 14 | Number of 2D 1H, 13C HMBC cross-peaks |
Fig. 1. The 600 MHz 1H NMR spectrum of the urine from a C57BL/6 mouse and an expansion in the region of the methyl signals from lactic acid and the two anomers of L-fucose. The spectrum is moderately resolution-enhanced by Gaussian multiplication.
Fig. 2. An expansion of the 600 MHz 2D 1H J-resolved NMR spectrum of the urine from a C57BL/6 mouse in the region of the methyl signals from lactic acid and the two anomers of L-fucose, underneath the corresponding region of the 1D 1H NMR spectrum.
The four levels of known metabolite identification from the CAWG 2007 [36].
| Level 1 | |
| Level 2 | |
| Level 3 | |
| Level 4 |
The number of bits of spectroscopic information per metabolite theoretically contained in the group of 75 metabolites, from four NMR-based metabonomics approaches: each bit corresponds to a bit of metabolite identification information.
| Feature/methodology | 1D 1H NMR | 1D 1H NMR plus 2D COSY | 1D 1H NMR plus 2D COSY and HSQC | 1D 1H NMR plus 2D COSY, HSQC and HMBC |
|---|---|---|---|---|
| Minimum number of bits | 2 | 2 | 3 | 3 |
| Maximum number of bits | 42 | 56 | 70 | 106 |
| Average number of bits | 9.2 | 11.3 | 14.7 | 24.3 |
| Median number of bits | 7 | 8 | 11 | 16 |
| Standard deviation | 7.9 | 10.6 | 13.1 | 21.8 |
Fig. 3. The distribution of the theoretical number of bits of metabolite identification (ID) information available from three different NMR approaches across the 75 metabolites. The number of bits is calculated in bins ranging from 0 to 4, 5 to 8 etc. up to 105 to 108 bits. Each bit represents a 1H NMR chemical shift, multiplicity, coupling constant, 2nd order flag, COSY cross-peak, HSQC cross-peak or HMBC cross-peak that theoretically should be observed for the metabolite in question. Data for the approach using 1H plus COSY data are not shown, for clarity of presentation.
Fig. 4. The number of metabolite identification (ID) information bits theoretically available from 1D 1H NMR plotted against the number of carbon atoms for all 75 metabolites. Three outliers due to xylose, fucose and glucose are highlighted with filled, as opposed to open, diamonds.
Fig. 5. The theoretical metabolite identification carbon efficiency (MICE) for all 75 metabolites and for four separate metabolic profiling approaches: 1D 1H NMR alone, 1D 1H plus COSY, 1D 1H plus COSY and HSQC data and 1D 1H plus COSY, HSQC and HMBC data. The histogram shows the number of metabolites for each approach with MICE values in bins of 0 to 1, > 1 to 2, > 2 to 3 etc. up to > 17 to 18.
A theoretical analysis of the total number of metabolite identification information bits, metabolite identification efficiency (MIE) and metabolite identification carbon efficiency (MICE) for chiral (n = 24) vs non-chiral (n = 51) metabolites in this study (all analyses at the level of data from 1D 1H and 2D COSY and HSQC NMR).
| Feature/parameter | Average value | Standard deviation |
|---|---|---|
| Total number of metabolite identification information bits for chiral metabolites | 24.58 | 17.98 |
| Total number of metabolite identification information bits for non-chiral metabolites | 9.98 | 6.03 |
| Metabolite identification efficiency MIE, chiral | 2.36 | 1.52 |
| Metabolite identification efficiency MIE, non-chiral | 1.27 | 0.56 |
| Metabolite identification carbon efficiency (MICE), chiral | 4.45 | 3.03 |
| Metabolite identification carbon efficiency (MICE), non-chiral | 2.19 | 0.88 |
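As a consistency check on the table above, the two group averages for the total number of information bits can be pooled into a single average over all 75 metabolites. The sketch below assumes only the group sizes (24 and 51) and averages shown in the table; the helper name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Combine (group size, group mean) pairs into the mean over all members."""
    total_n = sum(n for n, _ in groups)
    weighted_sum = sum(n * mean for n, mean in groups)
    return weighted_sum / total_n


# (number of metabolites, average number of information bits) from the table:
# chiral first, then non-chiral.
groups = [(24, 24.58), (51, 9.98)]

overall = pooled_mean(groups)
total_bits = overall * sum(n for n, _ in groups)
print(f"{overall:.2f} bits per metabolite, {total_bits:.1f} bits in total")
```

The pooled values come out at roughly 14.7 bits per metabolite and roughly 1099 bits in total, which agrees with the theoretical figures reported elsewhere in this document for the 1D 1H plus COSY and HSQC approach.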
The bits of metabolite identification information per metabolite actually obtained from four NMR-based metabonomics approaches in the group of 75 metabolites.
| Feature/methodology | 1D 1H NMR | 1D 1H NMR plus 2D COSY | 1D 1H NMR plus 2D COSY and HSQC | 1D 1H NMR plus 2D COSY, HSQC and HMBC |
|---|---|---|---|---|
| Minimum number of bits | 2 | 2 | 2 | 2 |
| Maximum number of bits | 22 | 28 | 31 | 31 |
| Average number of bits | 6.2 | 7.5 | 9.2 | 10.3 |
| Median number of bits | 5 | 6 | 8 | 9 |
| Standard deviation | 4.5 | 5.8 | 6.5 | 6.8 |
A comparison of the total amount of metabolite identification information actually obtained versus that theoretically available from four NMR-based metabonomics approaches across the group of 75 metabolites as a whole.
| Feature/methodology | 1D 1H NMR | 1D 1H NMR plus 2D COSY | 1D 1H NMR plus 2D COSY and HSQC | 1D 1H NMR plus 2D COSY, HSQC and HMBC |
|---|---|---|---|---|
| Theoretical total number of metabolite identification bits available | 688 | 849 | 1099 | 1824 |
| Actual total number of metabolite identification bits observed | 467 | 560 | 689 | 771 |
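The comparison in the table above is easier to read as the fraction of the theoretically available information that was actually observed. The following is a minimal sketch using only the totals from the table; the dictionary keys and the function name `recovered_fraction` are illustrative assumptions.

```python
# Totals across all 75 metabolites, copied from the table above.
theoretical = {
    "1D 1H": 688,
    "1D 1H + COSY": 849,
    "1D 1H + COSY + HSQC": 1099,
    "1D 1H + COSY + HSQC + HMBC": 1824,
}
actual = {
    "1D 1H": 467,
    "1D 1H + COSY": 560,
    "1D 1H + COSY + HSQC": 689,
    "1D 1H + COSY + HSQC + HMBC": 771,
}


def recovered_fraction(observed_bits: int, available_bits: int) -> float:
    """Fraction of theoretically available bits that were observed."""
    return observed_bits / available_bits


fractions = {
    approach: recovered_fraction(actual[approach], theoretical[approach])
    for approach in theoretical
}

for approach, fraction in fractions.items():
    print(f"{approach}: {fraction:.1%}")
```

On these totals the recovered fraction falls from about two thirds for 1D 1H NMR alone to well under half once HMBC data are included, because the theoretical total grows much faster than the observed total.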
Fig. 6. The actual experimental metabolite identification carbon efficiency (MICE) for all 75 metabolites and for four separate metabolic profiling approaches: 1D 1H NMR alone, 1D 1H plus COSY, 1D 1H plus COSY and HSQC data and 1D 1H plus COSY, HSQC and HMBC data. The histogram shows the number of metabolites for each approach with MICE values in bins of 0 to 1, > 1 to 2, > 2 to 3 etc. up to > 17 to 18.
Fig. 7. The actual experimental metabolite identification efficiency (MIE) for all 75 metabolites and for four separate metabolic profiling approaches: 1D 1H NMR alone, 1D 1H plus COSY, 1D 1H plus COSY and HSQC data and 1D 1H plus COSY, HSQC and HMBC data. The histogram shows the number of metabolites for each approach with MIE values in bins of 0 to 0.2, > 0.2 to 0.4, > 0.4 to 0.6 etc. up to > 3.0 to 3.2.
Fig. 8. A histogram of the number of metabolites in the collection of 75 metabolites analysed here against the metabolite identification hydrogen fraction (MIHF) in buckets of 0.1 from 0 to 1. The analysis is shown for four separate NMR approaches to metabolite identification: use of 1D 1H NMR data alone and the additional uses of COSY, HSQC and HMBC data.
The actual NMR-based metabolite identification information available from 1D 1H plus COSY and HSQC NMR experiments on eight metabolites with MICE scores of < 1.0.
| Common name | Number of carbon atoms | Number of 1D 1H chemical shifts (δH) | Number of multiplicities | Number of nJH,H coupling constants | Actual 2nd order flag | Number of COSY cross-peaks | Number of HSQC cross-peaks | Actual total information bits (1D 1H, COSY and HSQC) | Actual MICE (1D 1H, COSY and HSQC) |
|---|---|---|---|---|---|---|---|---|---|
| Phenylacetic acid | 8 | 1 | 1 | 0 | 1 | 0 | 4 | 7 | 0.9 |
| Methylsuccinic acid | 5 | 1 | 1 | 1 | 0 | 1 | 0 | 4 | 0.8 |
| Trans-aconitic acid | 6 | 2 | 2 | 0 | 0 | 0 | 1 | 5 | 0.8 |
| Choline | 5 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 0.6 |
| | 7 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 0.4 |
| Dimethylglycine | 4 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 0.8 |
| | 5 | 1 | 1 | 0 | 0 | 0 | 1 | 3 | 0.6 |
| | 5 | 1 | 1 | 1 | 0 | 1 | 0 | 4 | 0.8 |
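The totals and MICE values in the table above follow directly from the per-row counts: the total is the sum of the six information columns, and MICE is that total divided by the number of carbon atoms. The sketch below recomputes both for the five rows that carry a common name; the tuple layout and the function name `check_row` are assumptions made for illustration.

```python
# (name, carbons, shifts, multiplicities, couplings, second-order flag,
#  COSY cross-peaks, HSQC cross-peaks, reported total, reported MICE)
rows = [
    ("Phenylacetic acid",   8, 1, 1, 0, 1, 0, 4, 7, 0.9),
    ("Methylsuccinic acid", 5, 1, 1, 1, 0, 1, 0, 4, 0.8),
    ("Trans-aconitic acid", 6, 2, 2, 0, 0, 0, 1, 5, 0.8),
    ("Choline",             5, 1, 1, 0, 0, 0, 1, 3, 0.6),
    ("Dimethylglycine",     4, 1, 1, 0, 0, 0, 1, 3, 0.8),
]


def check_row(row):
    """Return (total bits, MICE, agrees with the reported values)."""
    name, carbons, *counts, reported_total, reported_mice = row
    total = sum(counts)
    mice = total / carbons
    # The table reports MICE to one decimal place, so allow half a unit
    # in the last place (plus a small float tolerance).
    agrees = total == reported_total and abs(mice - reported_mice) <= 0.05 + 1e-9
    return total, mice, agrees


results = {row[0]: check_row(row) for row in rows}

for name, (total, mice, agrees) in results.items():
    print(f"{name}: total={total}, MICE={mice:.3f}, consistent={agrees}")
```

All five rows reproduce the reported totals, and every recomputed MICE value lies below the 1.0 threshold named in the table caption.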
| Term | Meaning |
|---|---|
| 1D | One-dimensional |
| 2D | Two-dimensional |
| CAWG | Chemical Analysis Working Group |
| CE–MS | Capillary electrophoresis mass spectrometry |
| COSY | COrrelated SpectroscopY |
| δH | Hydrogen-1 or proton NMR chemical shift |
| δC | Carbon-13 NMR chemical shift |
| GC–MS | Gas chromatography mass spectrometry |
| HMBC | Heteronuclear multiple bond correlation spectroscopy |
| HMDB | Human Metabolome Database |
| HSQC | Heteronuclear single quantum correlation spectroscopy |
| ID | Identification |
| 3JH,H | Three-bond spin–spin coupling between two hydrogens etc |
| JRES | J-resolved spectroscopy |
| LC–MS | Liquid chromatography mass spectrometry |
| MIE | Metabolite identification efficiency |
| MICE | Metabolite identification carbon efficiency |
| MIHF | Metabolite Identification Hydrogen Fraction |
| MS | Mass spectrometry |
| MSI | Metabolomics Standards Initiative |
| NOESY | Nuclear Overhauser spectroscopy |
| NMR | Nuclear magnetic resonance |
| TOCSY | TOtal Correlation SpectroscopY |
| TSP | Sodium 3-(trimethylsilyl)propionate-2,2,3,3-d4 |
| UPLC–MS | Ultra-performance liquid chromatography mass spectrometry |