Le-Le Hu, Chen Chen, Tao Huang, Yu-Dong Cai, Kuo-Chen Chou.
Abstract
Given a compound, how can we effectively predict its biological function? This is a fundamentally important problem because the information obtained may improve the understanding of many basic biological processes and provide useful clues for drug design. In this study, a novel method based on chemical-chemical interaction information was developed to identify which of the following eleven metabolic pathway classes a query compound may be involved in: (1) Carbohydrate Metabolism, (2) Energy Metabolism, (3) Lipid Metabolism, (4) Nucleotide Metabolism, (5) Amino Acid Metabolism, (6) Metabolism of Other Amino Acids, (7) Glycan Biosynthesis and Metabolism, (8) Metabolism of Cofactors and Vitamins, (9) Metabolism of Terpenoids and Polyketides, (10) Biosynthesis of Other Secondary Metabolites, (11) Xenobiotics Biodegradation and Metabolism. The overall success rate obtained by the method in a 5-fold cross-validation test on a benchmark dataset of 3,137 compounds was 77.97%, much higher than the 10.45% obtained by random guessing. In addition, because some compounds may be involved in more than one metabolic pathway class, the method provides, for each query compound, a series of candidate metabolic pathway classes ranked in descending order of likelihood. The method was also applied to 5,549 compounds whose metabolic pathway classes are unknown. Interestingly, the results obtained are quite consistent with deductions from reports by other investigators.
As chemical-chemical interaction data continue to grow, the method is expected to gain power and accuracy and to become a useful complementary tool for annotating the biological functions of uncharacterized compounds.
Year: 2011 PMID: 22220213 PMCID: PMC3248422 DOI: 10.1371/journal.pone.0029491
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Distribution of the 4,366 (Group-I) and 3,137 (Group-II) compounds in the 11 metabolic pathway classes.
| Class code | Metabolic pathway | Number of different compounds, Group-I (4,366) | Number of different compounds, Group-II (3,137) |
| 1 | Carbohydrate Metabolism | 444 | 394 |
| 2 | Energy Metabolism | 129 | 120 |
| 3 | Lipid Metabolism | 610 | 383 |
| 4 | Nucleotide Metabolism | 145 | 132 |
| 5 | Amino Acid Metabolism | 563 | 483 |
| 6 | Metabolism of Other Amino Acids | 212 | 154 |
| 7 | Glycan Biosynthesis and Metabolism | 68 | 43 |
| 8 | Metabolism of Cofactors and Vitamins | 396 | 309 |
| 9 | Metabolism of Terpenoids and Polyketides | 713 | 499 |
| 10 | Biosynthesis of Other Secondary Metabolites | 722 | 519 |
| 11 | Xenobiotics Biodegradation and Metabolism | 858 | 570 |
| Overall | | 4,860 | 3,606 |
The 4,366 compounds in Group-I were selected from KEGG as those having metabolic pathway information. The 3,137 compounds in Group-II are the subset of those 4,366 compounds that interact with at least one other compound, as annotated in the STITCH database. Because a compound may occur in more than one pathway class, the sum over the 11 pathway classes exceeds the number of different compounds in both Group-I and Group-II.
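The arithmetic behind the note above can be checked directly from the table. The sketch below sums the class sizes and also computes a random-guess baseline. The baseline formula is an assumption, not something stated in this section: it takes the expected hit rate of one uniform guess among 11 classes, given the average number of classes per compound. Under that assumption it reproduces the 10.45% quoted in the abstract. The function names are illustrative only.

```python
# Class sizes copied from the distribution table (class codes 1..11).
GROUP_I = [444, 129, 610, 145, 563, 212, 68, 396, 713, 722, 858]
GROUP_II = [394, 120, 383, 132, 483, 154, 43, 309, 499, 519, 570]


def total_memberships(class_sizes):
    """Sum of class sizes; a multi-class compound is counted once per class."""
    return sum(class_sizes)


def random_guess_rate(class_sizes, n_compounds):
    """Assumed baseline: expected hit rate (%) of one uniform guess among the classes.

    A guess is a hit if it lands on any of the compound's classes, so the rate is
    (mean number of classes per compound) / (number of classes).
    """
    mean_classes_per_compound = total_memberships(class_sizes) / n_compounds
    return 100.0 * mean_classes_per_compound / len(class_sizes)


print(total_memberships(GROUP_I), total_memberships(GROUP_II))  # 4860 3606
print(round(random_guess_rate(GROUP_II, 3137), 2))              # 10.45
```

Both totals match the Overall row, and each exceeds the corresponding number of different compounds (4,366 and 3,137) because multi-class compounds are counted more than once.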
Figure 1. Illustration of the accuracy of each of the 11 order predictions for the 3,137 compounds in the 5-fold cross-validation.
From the first order to the last, the 11 accuracies form a downward-sloping curve.
Accuracy (%) of each of the 11 prediction orders for the metabolic pathway classes of the 3,137 compounds in the 5-fold cross-validation test.
| Class code | 1st | 2nd | 3rd | 4th | 5th | 6th | 7th | 8th | 9th | 10th | 11th |
| 1 | 80.96 | 8.38 | 5.08 | 1.78 | 1.02 | 1.27 | 0.25 | 1.02 | 0.00 | 0.25 | 0.00 |
| 2 | 31.67 | 30.00 | 18.33 | 7.50 | 4.17 | 3.33 | 0.00 | 3.33 | 0.00 | 0.83 | 0.83 |
| 3 | 73.89 | 6.27 | 6.27 | 4.44 | 1.57 | 2.61 | 2.35 | 0.52 | 0.52 | 1.31 | 0.26 |
| 4 | 65.15 | 11.36 | 6.82 | 5.30 | 3.03 | 1.52 | 2.27 | 0.00 | 4.55 | 0.00 | 0.00 |
| 5 | 61.70 | 19.88 | 10.97 | 5.38 | 1.04 | 0.83 | 0.21 | 0.00 | 0.00 | 0.00 | 0.00 |
| 6 | 29.87 | 27.27 | 11.69 | 11.69 | 5.19 | 5.19 | 3.90 | 3.25 | 1.95 | 0.00 | 0.00 |
| 7 | 20.93 | 25.58 | 11.63 | 9.30 | 6.98 | 4.65 | 2.33 | 4.65 | 0.00 | 2.33 | 11.63 |
| 8 | 61.17 | 17.15 | 9.39 | 4.21 | 3.24 | 2.91 | 0.97 | 0.32 | 0.32 | 0.32 | 0.00 |
| 9 | 74.35 | 8.42 | 4.01 | 3.41 | 1.20 | 1.20 | 2.20 | 2.40 | 1.00 | 1.20 | 0.60 |
| 10 | 68.98 | 8.67 | 4.62 | 3.66 | 5.01 | 4.24 | 1.54 | 1.54 | 0.77 | 0.58 | 0.39 |
| 11 | 78.77 | 8.42 | 5.09 | 2.81 | 2.63 | 1.40 | 0.35 | 0.35 | 0.18 | 0.00 | 0.00 |
| Overall | 77.97 | 14.19 | 8.07 | 4.88 | 2.93 | 2.55 | 1.43 | 1.28 | 0.70 | 0.57 | 0.38 |
See the distribution table above for the number of the 3,137 compounds in each of the 11 metabolic pathway classes.
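The Overall row can be related to the per-class rows with a short calculation. The sketch below assumes, since this section does not state it, that each per-class accuracy is the share of that class's Group-II compounds hit at a given order, and that the overall accuracy is the total number of hits divided by the 3,137 different compounds. Under these assumptions the 1st- and 2nd-order overall values in the table are recovered. The names are illustrative.

```python
# Group-II class sizes (distribution table) and per-class accuracies (%) for the
# 1st and 2nd prediction orders (accuracy table), listed for class codes 1..11.
CLASS_SIZES = [394, 120, 383, 132, 483, 154, 43, 309, 499, 519, 570]
ACC_1ST = [80.96, 31.67, 73.89, 65.15, 61.70, 29.87, 20.93, 61.17, 74.35, 68.98, 78.77]
ACC_2ND = [8.38, 30.00, 6.27, 11.36, 19.88, 27.27, 25.58, 17.15, 8.42, 8.67, 8.42]
N_COMPOUNDS = 3137


def hits_per_class(accuracies, sizes):
    """Recover integer hit counts from rounded percentages (assumed definition)."""
    return [round(acc / 100.0 * n) for acc, n in zip(accuracies, sizes)]


def overall_accuracy(accuracies, sizes, n_compounds):
    """Assumed overall accuracy (%): total hits divided by the number of compounds."""
    return 100.0 * sum(hits_per_class(accuracies, sizes)) / n_compounds


print(round(overall_accuracy(ACC_1ST, CLASS_SIZES, N_COMPOUNDS), 2))  # 77.97
print(round(overall_accuracy(ACC_2ND, CLASS_SIZES, N_COMPOUNDS), 2))  # 14.19
```

Note that the denominator is the number of different compounds (3,137), not the 3,606 class memberships; dividing by the latter would not reproduce the Overall row.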
Interactions of dihydrouracil with other compounds in the benchmark dataset of Group-II.
| Query KEGG ligand | Query name | Pathway class codes of query | Interacting KEGG ligand | Interacting compound name | Pathway class codes of interacting compound | Interaction confidence |
| C00429 | Dihydrouracil | 4, 6, 8 | C00106 | Uracil | 4, 6, 8 | 0.981 |
| C00429 | Dihydrouracil | 4, 6, 8 | C02642 | N-carbamoyl-be. | 4, 6, 8 | 0.945 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00006 | NADP | 2, 6, 8 | 0.921 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00005 | NADP(H) | 2, 6 | 0.902 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00001 | Hydroxyl radic. | 2, 8 | 0.899 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00013 | Pyrophosphate | 2 | 0.899 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00119 | Phosphoribosyl. | 1, 4, 5 | 0.899 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00906 | Dihydrothymine | 4 | 0.855 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00178 | Thymine | 4 | 0.814 |
| C00429 | Dihydrouracil | 4, 6, 8 | C07649 | 5-fluorouracil | 11 | 0.744 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00099 | Beta-alanine | 1, 4, 6, 8 | 0.650 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00380 | Cytosine | 4 | 0.551 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00262 | Hypoxanthine | 4 | 0.436 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00299 | Uridine | 4 | 0.433 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00295 | Orotic acid | 4 | 0.386 |
| C00429 | Dihydrouracil | 4, 6, 8 | C05145 | Beta-aminoisob. | 4 | 0.362 |
| C00429 | Dihydrouracil | 4, 6, 8 | C02067 | Pseudouridine | 4 | 0.353 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00881 | Deoxycytidine | 4 | 0.350 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00147 | Adenine | 4, 9 | 0.308 |
| C00429 | Dihydrouracil | 4, 6, 8 | C05100 | Beta-ureidoiso. | 4 | 0.286 |
| C00429 | Dihydrouracil | 4, 6, 8 | C03056 | 2,6-dihydroxyp. | 8 | 0.274 |
| C00429 | Dihydrouracil | 4, 6, 8 | C02565 | N-methylhydant. | 5 | 0.272 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00337 | Dihydroorotate | 4 | 0.262 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00757 | Berberine | 10 | 0.252 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00222 | Malonate semia. | 1, 11, 6 | 0.218 |
| C00429 | Dihydrouracil | 4, 6, 8 | C12650 | Capecitabine | 11 | 0.214 |
| C00429 | Dihydrouracil | 4, 6, 8 | C12673 | Tegafur | 11 | 0.210 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00522 | Pantoate | 8 | 0.207 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00864 | Pantothenic ac. | 6, 8 | 0.205 |
| C00429 | Dihydrouracil | 4, 6, 8 | C11736 | FUdR | 11 | 0.199 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00366 | Uric acid | 4 | 0.167 |
| C00429 | Dihydrouracil | 4, 6, 8 | C00219 | Arachidonic ac. | 3 | 0.154 |
See the distribution table above for the codes of the metabolic pathway classes.
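To illustrate how interaction data such as the rows above could yield a ranked list of pathway classes, here is a minimal sketch. The scoring rule is an assumption, since this section does not spell it out: each class is scored by the highest interaction confidence among the interacting compounds that belong to it, and classes are ranked by descending score with ties broken by class code. Only a subset of the table rows is used, and `rank_classes` is a hypothetical helper.

```python
# A subset of rows from the dihydrouracil table:
# (interacting compound, its pathway class codes, interaction confidence).
NEIGHBORS = [
    ("Uracil", (4, 6, 8), 0.981),
    ("NADP", (2, 6, 8), 0.921),
    ("Phosphoribosyl.", (1, 4, 5), 0.899),
    ("5-fluorouracil", (11,), 0.744),
    ("Adenine", (4, 9), 0.308),
    ("Berberine", (10,), 0.252),
    ("Arachidonic ac.", (3,), 0.154),
]


def rank_classes(neighbors, n_classes=11):
    """Rank class codes by an assumed max-confidence score (hypothetical rule)."""
    scores = {}
    for _name, codes, confidence in neighbors:
        for code in codes:
            # Keep the strongest interaction seen for each class.
            scores[code] = max(scores.get(code, 0.0), confidence)
    # Classes without any interacting compound score 0.0 and sort last.
    return sorted(range(1, n_classes + 1),
                  key=lambda code: (-scores.get(code, 0.0), code))


print(rank_classes(NEIGHBORS))  # [4, 6, 8, 2, 1, 5, 11, 9, 10, 3, 7]
```

Under this assumed rule the first three predicted codes are 4, 6, and 8, which are the codes listed for dihydrouracil in the table. Class 7 has no interacting compound in the subset and therefore ranks last.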
Interactions of N-acetylgalactosamine 4-sulfate and cyclopropylamine with other compounds whose metabolic pathway classes are known.
| Query KEGG ligand | Query name | Interacting KEGG ligand | Interacting compound name | Pathway class codes of interacting compound | Interaction confidence |
| C16265 | N-acetylgalactosamine 4-sulfate | C00059 | sulfate | 2, 4, 5 | 0.956 |
| C16265 | N-acetylgalactosamine 4-sulfate | C00333 | glucuronic acid | 1 | 0.931 |
| C16265 | N-acetylgalactosamine 4-sulfate | C15923 | galactose | 1 | 0.904 |
| C16265 | N-acetylgalactosamine 4-sulfate | C01508 | xylose | 1 | 0.9 |
| C16265 | N-acetylgalactosamine 4-sulfate | C01721 | fucose | 1 | 0.899 |
| C16265 | N-acetylgalactosamine 4-sulfate | C01330 | Na(+) | 2 | 0.899 |
| C16265 | N-acetylgalactosamine 4-sulfate | C00116 | glycerol | 1, 3 | 0.899 |
| C16265 | N-acetylgalactosamine 4-sulfate | C00009 | phosphate | 2, 7 | 0.899 |
| C16265 | N-acetylgalactosamine 4-sulfate | C00053 | 3′-phospho.pho. | 2, 4, 7 | 0.47 |
| C16265 | N-acetylgalactosamine 4-sulfate | C02591 | sugar-1-phosph. | 1 | 0.312 |
| C16265 | N-acetylgalactosamine 4-sulfate | C01170 | UDP-GlcNAc | 1 | 0.27 |
| C16265 | N-acetylgalactosamine 4-sulfate | C03506 | indole-3-glyce. | 10, 5 | 0.256 |
| C16265 | N-acetylgalactosamine 4-sulfate | C01132 | N-acetyl-D-glucosamine | 1 | 0.235 |
| C16265 | N-acetylgalactosamine 4-sulfate | C00096 | GDP-mannose | 1, 7 | 0.183 |
| C14150 | cyclopropylamine | C06554 | cyanuric acid | 11 | 0.918 |
| C14150 | cyclopropylamine | C00014 | ammonia | 2, 4, 5, 6 | 0.907 |
| C14150 | cyclopropylamine | C14149 | N-cyclopropylammelide | 11 | 0.899 |
| C14150 | cyclopropylamine | C14148 | c0761 | 11 | 0.899 |
| C14150 | cyclopropylamine | C00001 | hydroxyl radicals | 2, 8 | 0.899 |
| C14150 | cyclopropylamine | C00969 | reuterin | 3 | 0.378 |
| C14150 | cyclopropylamine | C06547 | polyethylene | 11, 5 | 0.347 |
| C14150 | cyclopropylamine | C01234 | 1-aminocyclopropane-1-carboxylic acid | 1, 5 | 0.29 |
| C14150 | cyclopropylamine | C00218 | methylamine | 2 | 0.29 |
| C14150 | cyclopropylamine | C16267 | cyclopropanecarboxylic acid | 11 | 0.286 |
| C14150 | cyclopropylamine | C16318 | methyl jasmonate | 3 | 0.273 |
| C14150 | cyclopropylamine | C11512 | methyl jasmonate | 3 | 0.273 |
| C14150 | cyclopropylamine | C05593 | 3-hydroxyphenylacetic acid | 11, 5 | 0.267 |
| C14150 | cyclopropylamine | C00261 | benzaldehyde | 11 | 0.238 |
| C14150 | cyclopropylamine | C01054 | 2,3-oxidosqualene | 3 | 0.236 |
| C14150 | cyclopropylamine | C01013 | 3-hydroxypropionate | 1, 6 | 0.224 |
| C14150 | cyclopropylamine | C01471 | acrolein | 11 | 0.208 |
| C14150 | cyclopropylamine | C01746 | calcium channel blocker | 10 | 0.205 |
| C14150 | cyclopropylamine | C00571 | cyclohexylamine | 11 | 0.205 |
| C14150 | cyclopropylamine | C00144 | guanosine monophosphate | 4 | 0.205 |
| C14150 | cyclopropylamine | C00903 | cinnamaldehyde | 10 | 0.171 |
| C14150 | cyclopropylamine | C07113 | acetophenone | 11 | 0.168 |
| C14150 | cyclopropylamine | C01724 | lanosterol | 3 | 0.152 |
See the distribution table above for the codes of the metabolic pathway classes.