Ryosuke L A Watanabe, Enrique Morett, Edgar E Vallejo.
Abstract
BACKGROUND: Non-homology-based methods such as phylogenetic profiles are effective for predicting functional relationships between proteins that share no considerable sequence or structural similarity. These methods rely heavily on traditional similarity metrics defined on pairs of phylogenetic patterns. However, proteins do not interact exclusively in pairs: the final biological function of a protein in the cellular context is often held by a group of proteins. To infer modules of functionally interacting proteins accurately, both direct and indirect relationships must be considered. In this paper, we used the Bond Energy Algorithm (BEA) to predict functionally related groups of proteins. With BEA, we create clusters of phylogenetic profiles based on the associations of the surrounding elements of the analyzed data, using a metric that considers linked relationships among elements in the data set.
Year: 2008 PMID: 18559112 PMCID: PMC2474619 DOI: 10.1186/1471-2105-9-285
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
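To make the clustering idea in the abstract concrete, the following is a minimal Python sketch of a BEA-style ordering. It is not the authors' implementation. It assumes binary phylogenetic profiles (one row per protein, one column per genome), takes the bond between two profiles to be their inner product, and uses the greedy insertion step of the classic Bond Energy Algorithm. The names `bond` and `bea_order` and the toy matrix are hypothetical, and the step that cuts the ordered list into clusters is left out because this record does not describe it.

```python
import numpy as np


def bond(a, b):
    """Bond strength between two profiles (assumed here: their inner product)."""
    return float(np.dot(a, b))


def bea_order(profiles):
    """Greedy BEA-style ordering of rows so that similar profiles become adjacent.

    profiles: 2D array-like, one row per protein, one column per genome
              (1 = ortholog present, 0 = absent).
    Returns the row indices in their placed order.
    """
    profiles = np.asarray(profiles, dtype=float)
    n = profiles.shape[0]
    if n == 0:
        return []
    order = [0]  # seed with the first row (an arbitrary choice in this sketch)
    for k in range(1, n):
        best_pos, best_gain = 0, -np.inf
        # Try every insertion slot, including both ends of the current order.
        for pos in range(len(order) + 1):
            left = profiles[order[pos - 1]] if pos > 0 else None
            right = profiles[order[pos]] if pos < len(order) else None
            gain = 0.0
            if left is not None:
                gain += 2 * bond(left, profiles[k])
            if right is not None:
                gain += 2 * bond(profiles[k], right)
            if left is not None and right is not None:
                # Inserting k between two rows breaks their existing bond.
                gain -= 2 * bond(left, right)
            if gain > best_gain:
                best_pos, best_gain = pos, gain
        order.insert(best_pos, k)
    return order


# Toy data (hypothetical): rows 0 and 2 share one profile, rows 1 and 3 another.
toy = [[1, 1, 0, 0],
       [0, 0, 1, 1],
       [1, 1, 0, 0],
       [0, 0, 1, 1]]
print(bea_order(toy))
```

In the toy example, rows with identical profiles end up next to each other in the returned order. Because each insertion is scored against both neighbours, a row can be pulled toward a group through a chain of adjacent rows, which is one way to read the "linked relationships" mentioned in the abstract.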
Resulting classification for all methods
| Code | COGs | PREDICTED BEA | PREDICTED HIERARCHICAL | PREDICTED K-MEANS | PREDICTED PAM |
| Information storage and processing | | | | | |
| J | 217 | 213 | 321 | 544 | 754 |
| K | 132 | 122 | 4 | 0 | 0 |
| L | 184 | 181 | 39 | 0 | 0 |
| Cellular processes | | | | | |
| D | 32 | 32 | 0 | 0 | 0 |
| O | 110 | 104 | 26 | 0 | 0 |
| M | 155 | 153 | 140 | 84 | 0 |
| N | 133 | 130 | 46 | 81 | 100 |
| P | 160 | 149 | 10 | 0 | 0 |
| T | 97 | 83 | 0 | 0 | 0 |
| Metabolism | | | | | |
| C | 224 | 220 | 65 | 24 | 0 |
| G | 171 | 164 | 17 | 197 | 62 |
| E | 233 | 226 | 584 | 260 | 312 |
| F | 85 | 83 | 18 | 0 | 0 |
| H | 154 | 141 | 49 | 0 | 0 |
| I | 75 | 72 | 0 | 0 | 0 |
| Q | 62 | 55 | 0 | 0 | 0 |
| Poorly characterized | | | | | |
| R | 449 | 431 | 113 | 250 | 0 |
| S | 750 | 748 | 1875 | 1867 | 2079 |
Classification by COG functional category (COGs with multiple classifications are counted under one classification).
Validation 1 for DIP in the same cluster
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 62.37662 | 37.62338 |
| K-MEANS | 40.09091 | 59.90909 |
| HIERARCHICAL | 22.77922 | 77.22078 |
| PAM | 8.41558 | 91.58442 |
Classification for Database of Interacting Proteins.
Validation 2 for DIP in the near cluster
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 71.428571 | 28.571429 |
| K-MEANS | 46.1038961 | 53.8961039 |
| HIERARCHICAL | 22.7272727 | 77.2727273 |
| PAM | 21.4285714 | 78.5714286 |
Classification for Database of Interacting Proteins. Neighbor Cluster Maximum 1.
Validation 3 for DIP in the surrounding five clusters
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 86.3636364 | 13.6363636 |
| K-MEANS | 57.7922078 | 42.2077922 |
| HIERARCHICAL | 40.9090909 | 59.0909091 |
| PAM | 55.1948052 | 44.8051948 |
Classification for Database of Interacting Proteins. Neighbor Cluster Maximum 5.
Validation 1 for ECOCYC in the same cluster
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 84.375 | 15.625 |
| K-MEANS | 50.00 | 50.00 |
| HIERARCHICAL | 37.50 | 62.50 |
| PAM | 5.7291667 | 94.2708333 |
Classification for Database ECOCYC.
Validation 2 for ECOCYC in the near cluster
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 88.5416667 | 11.4583333 |
| K-MEANS | 52.0833333 | 47.9166667 |
| HIERARCHICAL | 38.0208333 | 61.9791667 |
| PAM | 15.625 | 84.375 |
Classification for Database ECOCYC. Neighbor Cluster Maximum 1.
Validation 3 for ECOCYC in the surrounding five clusters
| ALGORITHM | CORRECT (%) | INCORRECT (%) |
| BEA | 95.8333333 | 4.1666667 |
| K-MEANS | 65.625 | 34.375 |
| HIERARCHICAL | 46.3541667 | 53.6458333 |
| PAM | 47.9166667 | 52.0833333 |
Classification for Database ECOCYC. Neighbor Cluster Maximum 5.
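The three validation levels in the tables above (same cluster, neighbor cluster maximum 1, neighbor cluster maximum 5) can be illustrated with a short Python sketch. It rests on an assumption that this record does not confirm: a known pair from DIP or EcoCyc counts as correct when the cluster indices of its two proteins differ by at most the stated maximum. The function name `percent_within`, the protein labels, and the cluster assignments are hypothetical.

```python
def percent_within(pairs, cluster_of, max_distance=0):
    """Return (correct %, incorrect %) for known pairs under a cluster-distance limit.

    pairs:        iterable of (protein_a, protein_b) known to be related.
    cluster_of:   dict mapping protein -> integer cluster index, where adjacent
                  indices are assumed to denote neighboring clusters.
    max_distance: 0 = same cluster, 1 = nearest neighbor, 5 = surrounding five.
    Pairs with a protein missing from cluster_of are skipped.
    """
    scored = [(a, b) for a, b in pairs if a in cluster_of and b in cluster_of]
    if not scored:
        raise ValueError("no pair has both proteins assigned to a cluster")
    hits = sum(1 for a, b in scored
               if abs(cluster_of[a] - cluster_of[b]) <= max_distance)
    correct = 100.0 * hits / len(scored)
    return correct, 100.0 - correct


# Hypothetical cluster assignments and known pairs, for illustration only.
clusters = {"A": 0, "B": 0, "C": 1, "D": 6}
known = [("A", "B"), ("A", "C"), ("A", "D"), ("C", "D")]
for limit in (0, 1, 5):
    print(limit, percent_within(known, clusters, max_distance=limit))
```

Under this reading, the correct percentage can only stay the same or grow as the limit is relaxed, which matches the trend across the three tables for BEA and K-MEANS.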
Figure 1. Example of a BEA cluster. This figure shows an example of the clusters produced by BEA clustering.
Figure 2. Example of a BEA clustering heatmap. This figure shows an example of a heatmap of BEA clustering.
Figure 3. BEA clusters. This figure shows the distribution of the clusters for BEA.
COG FUNCTIONAL CATEGORIES
| Code | COGs | Domains | Description | Pathways and functional systems |
| Information storage and processing | | | | |
| J | 217 | 6449 | Translation, ribosomal structure and biogenesis | 4 |
| K | 132 | 5438 | Transcription | 3 |
| L | 184 | 5337 | DNA replication, recombination and repair | 2 |
| Cellular processes | | | | |
| D | 32 | 842 | Cell division and chromosome partitioning | - |
| O | 110 | 3165 | Posttranslational modification, protein turnover, chaperones | - |
| M | 155 | 4079 | Cell envelope biogenesis, outer membrane | 1 |
| N | 133 | 3110 | Cell motility and secretion | 2 |
| P | 160 | 5112 | Inorganic ion transport and metabolism | 1 |
| T | 97 | 3627 | Signal transduction mechanisms | - |
| Metabolism | | | | |
| C | 224 | 5594 | Energy production and conversion | 7 |
| G | 171 | 5262 | Carbohydrate transport and metabolism | 4 |
| E | 233 | 8383 | Amino acid transport and metabolism | 10 |
| F | 85 | 2364 | Nucleotide transport and metabolism | 5 |
| H | 154 | 4057 | Coenzyme metabolism | 11 |
| I | 75 | 2609 | Lipid metabolism | 2 |
| Q | 62 | 2754 | Secondary metabolites biosynthesis, transport and catabolism | - |
| Poorly characterized | | | | |
| R | 449 | 11948 | General function prediction only | - |
| S | 750 | 6416 | Function unknown | - |
Classification for COG's Database of Protein Functional Category.