Qiang Zhu, Tao Qin, Ying-Ying Jiang, Cong Ji, De-Xin Kong, Bin-Guang Ma, Hong-Yu Zhang.
Abstract
Although the metabolic networks of the three domains of life consist of different constituents and metabolic pathways, they exhibit the same scale-free organization. This phenomenon has been hypothetically explained by the preferential attachment principle, according to which newly recruited metabolites attach preferentially to those that are already well connected. However, since metabolites are usually small molecules and metabolic processes are basically chemical reactions, we speculate that the organization of metabolic networks may have a chemical basis. In this paper, chemoinformatic analyses were performed on the metabolic networks of the Kyoto Encyclopedia of Genes and Genomes (KEGG), Escherichia coli and Saccharomyces cerevisiae. Qualitative and quantitative correlations were found between network topology and the chemical properties of metabolites: metabolites with larger degrees of connectivity (hubs) are relatively more polar. This suggests that metabolic networks are chemically organized to a certain extent, which was further elucidated in terms of the high concentrations that metabolic hubs require to drive a variety of reactions. This finding not only provides a chemical explanation for the preferential attachment principle of metabolic network expansion, but also has important implications for metabolic network design and metabolite concentration prediction.
Year: 2011 PMID: 22022254 PMCID: PMC3192814 DOI: 10.1371/journal.pcbi.1002214
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Mean values of some chemical descriptors for KEGG-recorded metabolites.
| Descriptor | Characterization | Mean value, degree 1 (n = 1180) | Mean value, degree 2-6 (n = 3327) | Mean value, degree > 6 (n = 368) |
| ClogP | Partition coefficient octanol/water | 1.30 | 0.70 | −1.10 |
| FPSA3 | Ratio of atomic charge weighted partial positive surface area on total molecular surface area | 0.062 | 0.067 | 0.079 |
| LogD | Octanol-water partition coefficient calculated taking into account the ionization states of the molecule | 0.43 | −0.53 | −2.31 |
| Molecular Solubility | Water solubility, expressed as logS, where S is the solubility in mol/L | −2.91 | −2.82 | −0.98 |
calculated with Cerius2 (Version 4.11L. Accelrys Inc. San Diego, CA.).
calculated with Sybyl (Version 7.0. Tripos Associates Inc. St. Louis, MO.).
calculated with Pipeline Pilot (Student Edition. Version 6.1.5. SciTegic Accelrys Inc. San Diego, CA.).
Kruskal-Wallis Test significance at the 0.01 level.
Figure 1. Correlations between topological and chemical properties of KEGG metabolites.
(A) Degree-ALogP (mean ± SE) correlation for KEGG metabolites (R = −0.778, P<0.001). (B) Degree-Molecular Solubility (mean ± SE) correlation for KEGG metabolites (R = 0.795, P<0.001).
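One plausible reading of a "degree-descriptor (mean ± SE) correlation" is that metabolites are grouped by degree, the descriptor is summarized per group as mean and standard error, and the group means are then correlated with degree. The sketch below illustrates that reading only; the function names and the toy values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def mean_se_by_degree(records):
    """Group (degree, descriptor) pairs by degree; return {degree: (mean, SE)}."""
    groups = defaultdict(list)
    for degree, value in records:
        groups[degree].append(value)
    summary = {}
    for degree, values in sorted(groups.items()):
        # Standard error of the mean; defined as 0.0 for a single observation.
        se = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
        summary[degree] = (mean(values), se)
    return summary


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: (degree, ALogP-like descriptor value).
toy = [(1, 1.5), (1, 1.1), (2, 0.9), (2, 0.5),
       (3, -0.2), (3, -0.6), (4, -1.0), (4, -1.4)]
summary = mean_se_by_degree(toy)
degrees = list(summary)
means = [summary[d][0] for d in degrees]
r = pearson_r(degrees, means)  # negative: the descriptor falls as degree rises
```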
Figure 2. Correlations between topological and chemical properties of E. coli metabolites.
(A) Degree-Molecular Solubility (mean ± SE) correlation (R = 0.835, P<0.001). (B) Degree-PNSA3 (mean ± SE) correlation (R = 0.796, P<0.001). (C) Degree-Hydrophobe (mean ± SE) correlation (R = −0.743, P<0.005). PNSA3 is defined as the atomic charge weighted partial negative surface area. Hydrophobe is the number of hydrophobes in the molecule.
Mean values of some chemical descriptors for S. cerevisiae metabolites.
| Descriptor | Characterization | Mean value, degree 1-3 (n = 301) | Mean value, degree 4-15 (n = 285) | Mean value, degree > 15 (n = 26) |
| ClogP | Partition coefficient octanol/water | 0.46 | −0.54 | −3.05 |
| FPSA3 | Ratio of atomic charge weighted partial positive surface area on total molecular surface area | 0.066 | 0.068 | 0.080 |
| LogD | Octanol-water partition coefficient calculated taking into account the ionization states of the molecule | −0.89 | −1.94 | −3.88 |
| Molecular Solubility | Water solubility, expressed as logS, where S is the solubility in mol/L | −2.47 | −1.99 | 0.11 |
calculated with Cerius2 (Version 4.11L. Accelrys Inc. San Diego, CA.).
calculated with Sybyl (Version 7.0. Tripos Associates Inc. St. Louis, MO.).
calculated with Pipeline Pilot (Student Edition. Version 6.1.5. SciTegic Accelrys Inc. San Diego, CA.).
Kruskal-Wallis Test significance at the 0.05 level.
Kruskal-Wallis Test significance at the 0.01 level.
Figure 3. Degree-concentration correlation for E. coli metabolites (P<0.01, Kruskal-Wallis test).
Figure 4. Theoretical fitting of E. coli metabolite concentrations by chemical properties.
A stepwise multiple linear regression analysis was conducted to select the most meaningful chemical properties that correlate with concentration (C). The final regression equation is: −LogC = 6.105 + 0.431 × ClogP + 15.595 × FNSA3 + 16.727 × FPSA3 − 5.333 × RPCG. The negative logarithm of the fitted concentrations (−LogC_f) for 80 E. coli metabolites correlates well with that of the experimental values (−LogC_e) (R = 0.704, P<0.0001).
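The regression equation can be evaluated directly once the four descriptors are known. The following is a minimal sketch using the coefficients quoted in the caption; it assumes that Log is the base-10 logarithm and that C is in mol/L, and the descriptor values in the example are hypothetical, not measured.

```python
# Coefficients of the fitted equation quoted in the Figure 4 caption.
INTERCEPT = 6.105
COEFFS = {"ClogP": 0.431, "FNSA3": 15.595, "FPSA3": 16.727, "RPCG": -5.333}


def neg_log_concentration(descriptors):
    """Return -LogC for a metabolite given its four descriptor values."""
    return INTERCEPT + sum(COEFFS[name] * descriptors[name] for name in COEFFS)


def concentration(descriptors):
    """Return C itself. ASSUMPTION: Log is base 10 and C is in mol/L."""
    return 10 ** (-neg_log_concentration(descriptors))


# Hypothetical descriptor values for a polar metabolite (illustration only).
example = {"ClogP": -1.0, "FNSA3": -0.1, "FPSA3": 0.08, "RPCG": 0.2}
neg_log_c = neg_log_concentration(example)
```

With these signs, a more negative ClogP or FNSA3 and a larger RPCG lower −LogC, that is, they raise the fitted concentration, while a larger FPSA3 raises −LogC.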
Mean values of some chemical descriptors for early and late metabolites of S. cerevisiae.
| Descriptor | Characterization | Mean value, early metabolites (n = 243) | Mean value, late metabolites (n = 369) |
| ClogP | Partition coefficient octanol/water | −1.98 | 0.98 |
| FPSA3 | Ratio of atomic charge weighted partial positive surface area on total molecular surface area | 0.079 | 0.061 |
| LogD | Octanol-water partition coefficient calculated taking into account the ionization states of the molecule | −3.12 | −0.44 |
| Molecular Solubility | Water solubility, expressed as logS, where S is the solubility in mol/L | −0.74 | −3.06 |
calculated with Cerius2 (Version 4.11L. Accelrys Inc. San Diego, CA.).
calculated with Sybyl (Version 7.0. Tripos Associates Inc. St. Louis, MO.).
calculated with Pipeline Pilot (Student Edition. Version 6.1.5. SciTegic Accelrys Inc. San Diego, CA.).
Mann-Whitney Test significance at the 0.01 level.
Figure 5. Numerical simulations of metabolic network expansion.
The simulations were based on three rules: i) n metabolites are added in each expansion step (n = 1 in the present simulations); ii) the newly added metabolites have lower concentrations than the old ones; iii) metabolites of higher concentration have a higher probability of being involved in the emerging reactions (edges). The simulations start with 1 metabolite at an initial concentration (C_i) of 1,000,000 and terminate when a metabolite reaches a concentration (C_f) of ≤ 10. The concentration decline (d) in each step is 1,000, with a random fluctuation (f) of 1,500. (A) The number of reactions (edges) added in each step is 5; (B) The number of reactions (edges) added in each step is 10. In both simulations, the number of metabolites (N) decays with the increase of degrees (D) and follows the equation N = aD.
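The three rules can be turned into a short simulation. The sketch below is not the authors' code, and it fills in details the caption leaves open: the fluctuation is assumed to be uniform with total width f around the decline d (so every step is a strict decrease), each edge is assumed to join two distinct metabolites drawn with probability proportional to concentration, and repeated edges between the same pair are allowed. The function name `simulate_expansion` is hypothetical.

```python
import random
from collections import Counter


def simulate_expansion(edges_per_step=5, c_init=1_000_000.0, c_final=10.0,
                       decline=1_000.0, fluctuation=1_500.0, seed=0):
    """Grow a network one metabolite per step; return (concentrations, degrees)."""
    rng = random.Random(seed)
    conc = [c_init]  # the simulation starts with a single metabolite
    degree = [0]
    while conc[-1] > c_final:
        # Rules i and ii: add one metabolite with a lower concentration.
        # ASSUMPTION: the drop is d plus a uniform fluctuation of total
        # width f, i.e. between 250 and 1,750 for the default parameters.
        drop = decline + rng.uniform(-fluctuation / 2, fluctuation / 2)
        conc.append(max(conc[-1] - drop, 0.0))
        degree.append(0)
        # Rule iii: each new reaction (edge) joins two distinct metabolites
        # drawn with probability proportional to concentration.
        # ASSUMPTION: repeated edges between the same pair are allowed.
        nodes = range(len(conc))
        for _ in range(edges_per_step):
            a, b = rng.choices(nodes, weights=conc, k=2)
            while a == b:
                a, b = rng.choices(nodes, weights=conc, k=2)
            degree[a] += 1
            degree[b] += 1
    return conc, degree


conc, degree = simulate_expansion(edges_per_step=5)
# Number of metabolites (N) observed at each degree (D).
degree_counts = Counter(degree)
```

Calling `simulate_expansion(edges_per_step=10)` corresponds to the setting of panel (B). Plotting `degree_counts` on log-log axes is one way to inspect how N decays with D under these assumptions.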
Mean values of some chemical descriptors for hubs of KEGG-based network and cores of organic chemical network.
| Descriptor | Characterization | Mean value, KEGG hubs (n = 279) | Mean value, chemical cores (n = 300) |
| ClogP | Partition coefficient octanol/water | −1.26 | 2.11 |
| FNSA3 | Ratio of atomic charge weighted partial negative surface area on total molecular surface area | −0.110 | −0.060 |
| FPSA3 | Ratio of atomic charge weighted partial positive surface area on total molecular surface area | 0.080 | 0.040 |
| LogD | Octanol-water partition coefficient calculated taking into account the ionization states of the molecule | −2.56 | 2.08 |
| Molecular Solubility | Water solubility, expressed as logS, where S is the solubility in mol/L | −0.80 | −2.61 |
| RPCG | Ratio of most positive charge on sum total positive charge (Relative positive charge) | 0.158 | 0.233 |
calculated with Cerius2 (Version 4.11L. Accelrys Inc. San Diego, CA.).
calculated with Sybyl (Version 7.0. Tripos Associates Inc. St. Louis, MO.).
calculated with Pipeline Pilot (Student Edition. Version 6.1.5. SciTegic Accelrys Inc. San Diego, CA.).
Mann-Whitney Test significance at the 0.01 level.
Figure 6. Theoretical fitting of E. coli metabolite concentrations by the SVR model.
The negative logarithm of the fitted concentrations (−LogC_f) for 80 E. coli metabolites correlates well with that of the experimental values (−LogC_e): −LogC_f = 0.9678 × (−LogC_e) (R = 0.827, P<0.0001, regression without intercept).
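For a regression without intercept, the least-squares slope has the closed form Σxy / Σx². The sketch below shows that calculation on hypothetical toy values; it is meant only to illustrate how a slope such as the one in the caption is obtained, and it does not reproduce the paper's data.

```python
def slope_through_origin(xs, ys):
    """Least-squares slope b of y = b * x (no intercept term)."""
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("all x values are zero; the slope is undefined")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


# Hypothetical toy values: experimental (x) and fitted (y) -LogC.
experimental = [2.0, 3.0, 4.0]
fitted = [2.1, 2.8, 3.9]
slope = slope_through_origin(experimental, fitted)
```

A slope close to 1 means the fitted values track the experimental ones without a systematic scale bias.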
Performance of SVR models evaluated by descriptor deletion.
| Deleted descriptor | Characterization | Squared correlation coefficient | Total mean squared error |
| Degree | Number of edges linked to the node of network | 0.4547 | 0.7094 |
| ClogP | Partition coefficient octanol/water | 0.5185 | 0.6304 |
| Amide Molecules | Number of amide groups | 0.5489 | 0.5952 |
| N Count | Number of nitrogen atoms | 0.5674 | 0.5963 |
| 6mem rings Molecules | Number of 6-membered rings | 0.5680 | 0.5628 |
| FNSA3 | Ratio of atomic charge weighted partial negative surface area on total molecular surface area | 0.5691 | 0.5594 |
| HBD Count | Number of hydrogen bond donating groups in the molecule | 0.5717 | 0.5744 |
| FPSA3 | Ratio of atomic charge weighted partial positive surface area on total molecular surface area | 0.5778 | 0.5482 |
| ALogP | The Ghose and Crippen octanol-water partition coefficient | 0.5806 | 0.5449 |
| LScore Molecules | Floating point Lipinski measure | 0.5860 | 0.5373 |
| RPCG | Ratio of most positive charge on sum total positive charge (Relative positive charge) | 0.6045 | 0.5134 |
calculated by Network Analyzer Plugin in Cytoscape-2.7.0.
calculated with Cerius2 (Version 4.11L. Accelrys Inc. San Diego, CA.).
calculated with Tripos Benchware DataMiner (Version 1.6. Tripos Associates Inc. St. Louis, MO.).
calculated with Sybyl (Version 7.0. Tripos Associates Inc. St. Louis, MO.).
calculated with Pipeline Pilot (Student Edition. Version 6.1.5. SciTegic Accelrys Inc. San Diego, CA.).
derived from leave-one-out cross validation.
Comparison of predicted and experimental concentrations for some E. coli metabolites.
| Metabolite | Predicted concentration (SVR model) | Predicted concentration (NET method), lower limit | Predicted concentration (NET method), upper limit | Experimental concentration |
| 13DPG | n.a. | 3.237 | 3.959 | n.d. |
| 2PG | 3.347 | 3.292 | 3.770 | 2.394 |
| 3PG | 3.260 | 2.387 | 2.495 | 2.394 |
| 3PHP | 2.906 | 5.046 | 7.000 | n.d. |
| DHAP | 3.221 | 3.155 | 3.252 | 3.174 |
| F6P | 3.416 | 3.796 | 6.000 | 3.319 |
| G1P | 3.935 | 3.959 | 6.000 | n.d. |
| G6P | 3.577 | 3.301 | 3.523 | 3.319 |
| G3P | 3.170 | 4.301 | 5.046 | 3.174 |
| R5P | 3.341 | 3.959 | 4.699 | 3.824 |
| RU5P | 3.617 | 3.824 | 4.699 | 3.824 |
| X5P | 3.594 | 3.959 | 6.000 | 3.824 |
Abbreviations: 13DPG, 1,3-diphosphoglycerate; 2PG, 2-phospho-D-glycerate; 3PG, 3-phospho-D-glycerate; 3PHP, 3-phospho-hydroxypyruvate; DHAP, dihydroxyacetone phosphate; F6P, D-fructose-6-phosphate; G1P, D-glucose-1-phosphate; G6P, D-glucose-6-phosphate; G3P, D-glyceraldehyde-3-phosphate; R5P, D-ribose-5-phosphate; RU5P, ribulose-5-phosphate; X5P, xylulose 5-phosphate.
Negative logarithm (-Log) of E. coli metabolite concentrations (mol/L) predicted by SVR model.
Negative logarithm (-Log) of E. coli metabolite concentrations (mol/L) predicted by NET method [16].
Negative logarithm (-Log) of E. coli metabolite concentrations (mol/L) determined by prior experiments [16].
n.a.: not available, because the metabolite is not involved in the metabolic network of E. coli.
Mean of concentrations for α- and β-G1P.
Mean of concentrations for α- and β-G6P.
Mean of concentrations for D- and L-RU5P.
Mean of concentrations for D- and L-X5P.
n.d.: not determined.
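The agreement between the SVR predictions and the experimental values can be summarized from the table itself. The sketch below assumes that the first numeric column holds the SVR prediction and the last column the experimental value, both as −Log of the concentration in mol/L; rows with an n.a. or n.d. entry in either column are left out.

```python
# (metabolite, SVR prediction, experimental value), copied from the table.
ROWS = [
    ("2PG", 3.347, 2.394),
    ("3PG", 3.260, 2.394),
    ("DHAP", 3.221, 3.174),
    ("F6P", 3.416, 3.319),
    ("G6P", 3.577, 3.319),
    ("G3P", 3.170, 3.174),
    ("R5P", 3.341, 3.824),
    ("RU5P", 3.617, 3.824),
    ("X5P", 3.594, 3.824),
]

# Absolute error in -Log units; 1.0 corresponds to a tenfold difference
# in concentration if Log is base 10 (an assumption).
abs_errors = {name: abs(pred - exp) for name, pred, exp in ROWS}
mean_abs_error = sum(abs_errors.values()) / len(abs_errors)
worst = max(abs_errors, key=abs_errors.get)
```

On these nine rows the largest deviations are for 2PG and 3PG, while DHAP, F6P and G3P agree to within 0.1 log units.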