Kerem Bingol, Lei Bruschweiler-Li, Da-Wei Li, Rafael Brüschweiler.
Abstract
A customized metabolomics NMR database, termed 1H(13C)-TOCCATA, is introduced, which contains complete 1H and 13C chemical shift information on individual spin systems and isomeric states of common metabolites. Since this information directly corresponds to cross sections of 2D 1H–1H TOCSY and 2D 13C–1H HSQC-TOCSY spectra, it allows the straightforward and unambiguous identification of metabolites of complex metabolic mixtures at 13C natural abundance from these types of experiments. The 1H(13C)-TOCCATA database, which is complementary to the previously introduced TOCCATA database for the analysis of uniformly 13C-labeled compounds, currently contains 455 metabolites, and it can be used through a publicly accessible web portal. We demonstrate its performance by applying it to 2D 1H–1H TOCSY and 2D 13C–1H HSQC-TOCSY spectra of a cell lysate from E. coli, which yields a substantial improvement over other databases, as well as 1D NMR-based approaches, in the number of compounds that can be correctly identified with high confidence.
Year: 2014 PMID: 24773139 PMCID: PMC4051244 DOI: 10.1021/ac500979g
Source DB: PubMed Journal: Anal Chem ISSN: 0003-2700 Impact factor: 6.986
Figure 1. Identification of metabolites by 1D 1H NMR spectral matching with the Chenomx NMR software, using E. coli cell lysate as an example. Overlay of 1D 1H NMR spectra of metabolites from the Chenomx database (blue) on the 1D 1H NMR spectrum of E. coli cell lysate (black). Putrescine (A) and alanine (B) each possess at least one (partially) isolated peak in the lysate spectrum that matches a peak in the corresponding database spectrum. In contrast, every peak of lysine (C) and uridine (D) overlaps with other peaks in the lysate spectrum, which makes their unambiguous identification impossible.
Table 1. Metabolites Identified in the 2D 1H–1H TOCSY Spectrum of E. coli Cell Lysate by Querying against the 1H(13C)-TOCCATA Database
| compound | RMSD (ppm) | M | shift (ppm) | compound | RMSD (ppm) | M | shift (ppm) |
|---|---|---|---|---|---|---|---|
| valine (4) | 0.002 | 0 | –0.016 | p-toluic acid (2) | 0.003 | 0 | –0.013 |
| lysine (5) | 0.004 | 0 | –0.018 | cytosine (2) | 0.000 | 0 | –0.015 |
| isoleucine (6) | 0.002 | 0 | –0.017 | propionic acid (2) | 0.000 | 0 | –0.015 |
| leucine (3) | 0.003 | 0 | –0.017 | ethanolamine (2) | 0.003 | 0 | –0.019 |
| proline (6) | 0.006 | 0 | –0.018 | n-acetyl-glutamate (4) | 0.008 | 0 | –0.015 |
| alanine (2) | 0.001 | 0 | –0.020 | citrulline (4) | 0.003 | 0 | –0.016 |
| ethanol (2) | 0.000 | 0 | –0.016 | cytidine (2) | 0.005 | 0 | –0.024 |
| arginine (5) | 0.003 | 0 | –0.013 | spermidine (2) | 0.001 | 0 | –0.015 |
| β-alanine (2) | 0.003 | 0 | –0.018 | 2-aminobutyrate (3) | 0.002 | 0 | –0.018 |
| γ-aminobutyrate (3) | 0.004 | 0 | –0.017 | threonine (3) | 0.002 | 0 | –0.020 |
| nicotinic acid (4) | 0.002 | 0 | –0.018 | uridine (6) | 0.008 | 0 | –0.016 |
| tyrosine (2) | 0.003 | 0 | –0.015 | N-α-acetyl-ornithine (4) | 0.005 | 0 | –0.004 |
| phenylalanine (3) | 0.002 | 0 | –0.009 | N-acetyl-glutamine (4) | 0.010 | 0 | –0.006 |
| uracil (2) | 0.001 | 0 | –0.009 | methionine-sulfoxide 1 (3) | 0.008 | 0 | –0.055 |
| lactate (2) | 0.002 | 0 | –0.019 | methionine-sulfoxide 2 (4) | 0.015 | 0 | –0.056 |
| phosphoenolpyruvate (2) | 0.005 | 0 | –0.029 | coenzyme A 1 (2) | 0.001 | 0 | –0.012 |
| putrescine (2) | 0.000 | 0 | –0.011 | coenzyme A 2 (2) | 0.001 | 0 | –0.007 |
| thymidine 1 (6) | 0.002 | 0 | –0.011 | pantothenate (2) | 0.001 | 0 | –0.016 |
| thymidine 2 (2) | 0.004 | 0 | –0.005 | glutamate (3) | 0.001 | 0 | –0.016 |
| 2-deoxycytidine 1 (2) | 0.001 | 0 | –0.013 | adenosine (6) | 0.008 | 0 | –0.010 |
| 2-deoxycytidine 2 (7) | 0.005 | 0 | –0.011 | adenosine-3-monophosphate (5) | 0.004 | 0 | –0.010 |
| NADP+ (4) | 0.003 | 0 | –0.018 | inosine (6) | 0.012 | 0 | –0.009 |
| tryptophan (4) | 0.003 | 0 | 0.008 | | | | |
The numbers that follow certain compound names outside the parentheses are used only when more than one spin system of a metabolite is observed in the table; they denote the different spin systems of that metabolite.
RMSD: chemical shift root-mean-square difference (in ppm) between the input and database chemical shifts.
M: integer mismatch parameter, which is the absolute value of the difference between the number of input and database chemical shifts.
shift: amount by which the input chemical shifts were uniformly shifted (in ppm) so that the RMSD with respect to the database chemical shifts is minimized.
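The three quantities defined in these footnotes can be written out explicitly. The following is a sketch under stated assumptions, not the formula documented for the database: it assumes that the input shifts $x_i$ and database shifts $d_i$ have already been paired one-to-one ($i = 1, \dots, N$), that the uniform shift $s$ is added to the input shifts, and the symbols $x_i$, $d_i$, $s$, $N_\text{input}$, and $N_\text{db}$ are introduced here only for illustration.

$$
\mathrm{RMSD}(s) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i + s - d_i\right)^2},
\qquad
M = \left|N_\text{input} - N_\text{db}\right|
$$

Setting the derivative of the sum of squares with respect to $s$ to zero gives the shift that minimizes the RMSD:

$$
\frac{\partial}{\partial s}\sum_{i=1}^{N}\left(x_i + s - d_i\right)^2
= 2\sum_{i=1}^{N}\left(x_i + s - d_i\right) = 0
\quad\Longrightarrow\quad
s^{*} = \frac{1}{N}\sum_{i=1}^{N}\left(d_i - x_i\right)
$$

Under these assumptions, a negative value in the shift column means that the input peaks sit, on average, slightly downfield of the database values and are moved to lower ppm before the RMSD is evaluated.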
Figure 2. Overlay of reconstructions of 1H–1H TOCSY spectra from databases (orange) with the experimental 1H–1H TOCSY spectrum of E. coli cell lysate (black). (A) The reconstruction of the TOCSY spectrum (orange) is based on spin-system information from the 1H(13C)-TOCCATA database. (B) The reconstruction of the TOCSY spectrum (orange) is based on entire 1D 1H NMR spectra from the BMRB database. A list with all 41 metabolites used for reconstruction in both panels is given in Table 1.
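To illustrate why spin-system information is sufficient to reconstruct a TOCSY spectrum, the sketch below lists idealized peak positions for a single spin system. It assumes complete magnetization transfer, so that every pair of 1H shifts within the spin system gives a cross peak. The function name `tocsy_cross_peaks` and the example shifts are hypothetical placeholders, not database values, and this is not presented as the procedure used to produce the figure.

```python
from itertools import product


def tocsy_cross_peaks(spin_system, include_diagonal=True):
    """Return idealized (F1, F2) peak positions in ppm for one spin system.

    Assumption: magnetization is transferred across the whole spin system,
    so every ordered pair of 1H shifts produces a peak.
    """
    indexed = list(enumerate(spin_system))
    peaks = []
    for (i, f1), (j, f2) in product(indexed, repeat=2):
        # Compare indices rather than values so degenerate shifts are kept.
        if i == j and not include_diagonal:
            continue
        peaks.append((f1, f2))
    return peaks


# Hypothetical three-proton spin system (placeholder shifts in ppm).
example_spin_system = [1.20, 2.35, 3.80]
for f1, f2 in tocsy_cross_peaks(example_spin_system, include_diagonal=False):
    print(f"F1 = {f1:.2f} ppm, F2 = {f2:.2f} ppm")
```

A spin system with n resolved shifts yields n × n positions including the diagonal, which is why a list of per-spin-system shifts maps directly onto rows (cross sections) of the 2D spectrum.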
Figure 3. Screenshot of the 1H(13C)-TOCCATA web server. The peak list of a 1H HSQC-TOCSY trace from the 2D 13C–1H HSQC-TOCSY spectrum is queried against the database. The query returns the best matching compound (in this case, the ribose ring of inosine) together with the chemical shift root-mean-square difference (RMSD) before and after a uniform shift of −0.004 ppm was applied. A mismatch number M = 0 indicates that the number of query peaks and the number of database peaks for inosine were the same.
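The kind of query described in this caption can be sketched as follows. This is a minimal illustration and not the web server's implementation: pairing peaks in sorted order, truncating to the shorter list, and ranking hits by mismatch first and RMSD second are all assumptions, as are the function names `match_trace` and `query_database` and the placeholder entries `compound_A` and `compound_B`.

```python
import math


def match_trace(query, entry):
    """Compare a query peak list with one database spin system (ppm values).

    Returns (rmsd_before, rmsd_after, shift, mismatch).
    Assumption: peaks are paired in sorted order, up to the shorter list.
    """
    q, d = sorted(query), sorted(entry)
    mismatch = abs(len(q) - len(d))
    n = min(len(q), len(d))
    if n == 0:
        return math.inf, math.inf, 0.0, mismatch
    diffs = [di - qi for qi, di in zip(q, d)]
    # The mean difference is the uniform shift that minimizes the RMSD.
    shift = sum(diffs) / n
    rmsd_before = math.sqrt(sum(x * x for x in diffs) / n)
    rmsd_after = math.sqrt(sum((x - shift) ** 2 for x in diffs) / n)
    return rmsd_before, rmsd_after, shift, mismatch


def query_database(query, database):
    """Rank entries by mismatch, then by RMSD after the uniform shift."""
    hits = []
    for name, shifts in database.items():
        before, after, shift, mismatch = match_trace(query, shifts)
        hits.append((mismatch, after, name, shift, before))
    return sorted(hits)


# Hypothetical database with placeholder shifts (not real metabolite data).
toy_database = {
    "compound_A": [1.00, 2.00, 3.00],
    "compound_B": [1.50, 2.70],
}
best = query_database([1.01, 2.01, 3.01], toy_database)[0]
print("best match:", best[2], "M =", best[0])
```

Ranking by M before RMSD reflects the idea that a hit with the wrong number of peaks is less trustworthy than one with a slightly larger RMSD; a real service may weigh these differently.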
Table 2. Metabolites Identified in the 2D 13C–1H HSQC-TOCSY Spectrum of E. coli Cell Lysate by Querying against the 1H(13C)-TOCCATA Database
| compound | RMSD (ppm) | M | shift (ppm) | compound | RMSD (ppm) | M | shift (ppm) |
|---|---|---|---|---|---|---|---|
| valine 1H (4) | 0.002 | 0 | –0.018 | phosphoenolpyruvate 1H (2) | 0.007 | 0 | –0.033 |
| valine 13C (4) | 0.015 | 0 | –0.090 | phosphoenolpyruvate 13C (1) | 0.000 | 0 | –0.562 |
| lysine 1H (5) | 0.002 | 0 | –0.016 | serine 1H (2) | 0.001 | 0 | –0.019 |
| lysine 13C (5) | 0.110 | 0 | –0.162 | serine 13C (2) | 0.018 | 0 | –0.102 |
| malate 1H (3) | 0.002 | 0 | –0.020 | methanol 1H (1) | 0.000 | 0 | –0.014 |
| malate 13C (2) | 0.012 | 0 | –0.127 | methanol 13C (1) | 0.000 | 0 | –0.182 |
| alanine 1H (2) | 0.002 | 0 | –0.016 | glycine 1H (1) | 0.000 | 0 | –0.018 |
| alanine 13C (2) | 0.021 | 0 | –0.129 | glycine 13C (1) | 0.000 | 0 | –0.162 |
| leucine 1H (5) | 0.003 | 0 | –0.014 | succinate 1H (1) | 0.000 | 0 | –0.020 |
| leucine 13C (5) | 0.156 | 0 | –0.200 | succinate 13C (1) | 0.000 | 0 | –0.053 |
| threonine 1H (3) | 0.004 | 0 | –0.020 | N-acetyl-alanine 1H (2) | 0.005 | 0 | –0.019 |
| threonine 13C (3) | 0.034 | 0 | –0.062 | N-acetyl-alanine 13C (2) | 0.040 | 0 | –0.198 |
| β-alanine 1H (2) | 0.004 | 0 | –0.013 | acetic acid 1H (1) | 0.000 | 0 | –0.014 |
| β-alanine 13C (2) | 0.044 | 0 | –0.078 | acetic acid 13C (1) | 0.000 | 0 | –0.124 |
| uracil 1H (2) | 0.000 | 0 | –0.008 | putrescine 1H (2) | 0.001 | 0 | –0.012 |
| uracil 13C (2) | 0.046 | 0 | –0.033 | putrescine 13C (2) | 0.005 | 0 | –0.099 |
| tyrosine 1 1H (3) | 0.003 | 0 | –0.028 | thymidine 1 1H (2) | 0.004 | 0 | –0.006 |
| tyrosine 1 13C (2) | 0.014 | 0 | 0.084 | thymidine 1 13C (2) | 0.040 | 0 | –0.101 |
| tyrosine 2 1H (2) | 0.003 | 0 | –0.017 | thymidine 2 1H (6) | 0.004 | 0 | –0.010 |
| tyrosine 2 13C (2) | 0.049 | 0 | –0.097 | thymidine 2 13C (5) | 0.020 | 0 | –0.143 |
| phenylalanine 1 1H (3) | 0.003 | 0 | –0.021 | cytidine 1H (2) | 0.007 | 0 | –0.019 |
| phenylalanine 1 13C (2) | 0.030 | 0 | –0.090 | cytidine 13C (2) | 0.015 | 0 | –0.212 |
| phenylalanine 2 1H (3) | 0.004 | 0 | –0.005 | dTMP 1 1H (2) | 0.002 | 0 | –0.015 |
| phenylalanine 2 13C (3) | 0.015 | 0 | –0.059 | dTMP 1 13C (2) | 0.035 | 0 | –0.085 |
| arginine 1H (4) | 0.003 | 0 | –0.008 | dTMP 2 1H (5) | 0.020 | 0 | –0.036 |
| arginine 13C (4) | 0.088 | 0 | –0.069 | dTMP 2 13C (5) | 0.127 | 0 | –0.017 |
| γ-aminobutyrate 1H (3) | 0.003 | 0 | –0.015 | uridine 1 1H (6) | 0.005 | 0 | –0.008 |
| γ-aminobutyrate 13C (3) | 0.034 | 0 | –0.089 | uridine 1 13C (5) | 0.054 | 0 | –0.097 |
| aspartate 1H (3) | 0.003 | 0 | –0.012 | uridine 2 1H (2) | 0.010 | 0 | –0.004 |
| aspartate 13C (2) | 0.015 | 0 | –0.094 | uridine 2 13C (2) | 0.010 | 0 | –0.127 |
| glutamate 1H (3) | 0.001 | 0 | –0.011 | adenosine 1H (6) | 0.006 | 0 | –0.010 |
| glutamate 13C (3) | 0.048 | 0 | –0.042 | adenosine 13C (5) | 0.008 | 0 | –0.056 |
| lactate 1H (2) | 0.000 | 0 | –0.014 | inosine 1H (6) | 0.017 | 0 | –0.004 |
| lactate 13C (2) | 0.019 | 0 | –0.081 | inosine 13C (5) | 0.049 | 0 | –0.113 |
| nicotinic acid 1H (4) | 0.003 | 0 | –0.013 | glutathione reduced 1H (3) | 0.008 | 0 | –0.010 |
| nicotinic acid 13C (4) | 0.043 | 0 | –0.094 | glutathione reduced 13C (3) | 0.036 | 0 | –0.166 |
| fumarate 1H (1) | 0.000 | 0 | –0.012 | cystathionine 1H (3) | 0.006 | 0 | –0.023 |
| fumarate 13C (1) | 0.000 | 0 | –0.097 | cystathionine 13C (3) | 0.147 | 0 | –0.398 |
The numbers that follow certain compound names outside the parentheses are used only when more than one spin system of a metabolite is observed in the table; they denote the different spin systems of that metabolite. The "1H" and "13C" labels after compound names indicate whether the queried trace is a 1H HSQC-TOCSY trace or a 13C HSQC-TOCSY trace.
RMSD: chemical shift root-mean-square difference (in ppm) between the input and database chemical shifts.
M: integer mismatch parameter, which is the absolute value of the difference between the number of input and database chemical shifts.
shift: amount by which the input chemical shifts were uniformly shifted (in ppm) so that the RMSD with respect to the database chemical shifts is minimized.