Woochang Hwang, Young-Rae Cho, Aidong Zhang, Murali Ramanathan.
Abstract
BACKGROUND: The sparse connectivity of protein-protein interaction data sets makes the identification of functional modules challenging. The purpose of this study is to critically evaluate a novel technique, termed STM, for clustering and detecting functional modules in protein-protein interaction networks.
Year: 2006 PMID: 17147822 PMCID: PMC1764415 DOI: 10.1186/1748-7188-1-24
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Figure 1. The pharmacodynamic signal transduction model, whose impulse response is an Erlang distribution. Here b is the time constant for signal transfer and c is the number of compartments.
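To make the role of b and c concrete, the following is a minimal sketch of an Erlang impulse response. It assumes the standard Erlang density with scale b and integer shape c; whether the paper's own equation uses exactly this parameterization is an assumption, and the function name `erlang_impulse_response` is hypothetical.

```python
import math


def erlang_impulse_response(t, b, c):
    """Standard Erlang density at time t.

    Assumption: b is the time constant (scale) and c is the integer
    number of compartments (shape), as named in the Figure 1 caption.
    """
    if b <= 0 or c < 1 or int(c) != c:
        raise ValueError("b must be positive and c a positive integer")
    if t < 0:
        return 0.0
    c = int(c)
    # f(t) = t^(c-1) * exp(-t/b) / (b^c * (c-1)!)
    return (t ** (c - 1)) * math.exp(-t / b) / (b ** c * math.factorial(c - 1))
```

With c = 1 the response reduces to a plain exponential decay; larger c delays and broadens the peak, which under this parameterization sits at t = (c - 1) * b.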
Figure 2. A simple network example. Each box contains the numerical values obtained from Equation 2 from nodes A, F, G, and H to other target nodes. Results for other nodes are not shown.
STM algorithm
| 1: V: set of nodes in Graph G |
| 2: F(c): Transduction behavior function |
| 3: S( |
| 4: C: the list of final clusters |
| 5: PreClusters: the list of preliminary clusters |
| 6: |
| 7: distance( |
| 8: set parameter |
| 9: signal( |
| 10: |
| 11: |
| 12: |
| 13: |
| 14: make cluster_ |
| 15: cluster_ |
| 16: PreClusters.add(cluster_ |
| 17: |
| 18: cluster_ |
| 19: |
| 20: |
| 21: |
Procedure: Merge(C)
| 1: C: the cluster list |
| 2: MaxPair: the cluster pair( |
| 3: Max.value: interconnections between cluster pair |
| 4: MaxPair ← findMaxPair(C,null) |
| 5: |
| 6: newCluster ← merge MaxPair |
| 7: Replace cluster |
| 8: Remove cluster |
| 9: MaxPair ← findMaxPair(C,newCluster) |
| 10: |
| 11: |
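The Merge procedure above repeatedly finds the most interconnected cluster pair, merges it, and searches again. The sketch below illustrates that greedy loop under stated assumptions: "interconnections" is taken to be the number of edges running between two clusters, the stopping rule is a hypothetical `threshold` parameter (the original loop condition is not recoverable from the listing), and the max pair is recomputed from scratch each round rather than updated incrementally around the new cluster.

```python
from itertools import combinations


def interconnections(a, b, edges):
    """Count edges with one endpoint in cluster a and the other in cluster b.

    Assumption: this simple count stands in for the paper's measure.
    """
    return sum(1 for u, v in edges if (u in a and v in b) or (u in b and v in a))


def find_max_pair(clusters, edges):
    """Return (value, i, j) for the most interconnected pair, or None."""
    best = None
    for i, j in combinations(range(len(clusters)), 2):
        value = interconnections(clusters[i], clusters[j], edges)
        if best is None or value > best[0]:
            best = (value, i, j)
    return best


def merge(clusters, edges, threshold=1):
    """Greedily merge the best pair until its value drops below threshold.

    `threshold` is a hypothetical stopping rule, not taken from the source.
    """
    clusters = [set(c) for c in clusters]
    while len(clusters) > 1:
        value, i, j = find_max_pair(clusters, edges)
        if value < threshold:
            break
        clusters[i] = clusters[i] | clusters[j]  # replace one cluster with the merged one
        del clusters[j]                          # remove the other (j > i, so i stays valid)
    return clusters
```

For example, with edges `[(1, 2), (2, 3), (3, 4), (5, 6)]` and clusters `{1, 2}`, `{3, 4}`, `{5, 6}`, the first two clusters share one edge and are merged, while `{5, 6}` has no links to the result and stays separate.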
Figure 3. Accumulation of lethal proteins for various percentiles of degree (gray line), betweenness centrality (dashed line), or the STM signal transduction metric (solid line). The results are shown for the top 555 proteins obtained from the yeast PPI network, ordered so that the highest values of these metrics are closest to the origin.
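An accumulation curve of this kind can be sketched as follows: rank proteins by a metric in descending order and count how many lethal proteins have been seen at each rank. The function name, the dictionary-based inputs, and the tie handling (input order) are assumptions for illustration only.

```python
def lethal_accumulation(scores, lethal, top_n=None):
    """Cumulative count of lethal proteins along a descending ranking.

    scores: dict mapping protein -> metric value (degree, betweenness, ...).
    lethal: set of proteins annotated as lethal.
    top_n:  optionally keep only the top-ranked proteins (e.g. 555).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    if top_n is not None:
        ranked = ranked[:top_n]
    counts = []
    total = 0
    for protein in ranked:
        if protein in lethal:
            total += 1
        counts.append(total)
    return counts
```

Computing one such list per metric and plotting them against rank gives curves that can be compared directly: a metric that places lethal proteins near the top rises faster near the origin.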
STM clustering result on the yeast PPI dataset
| Cluster | Size | Density | H (%) | D (%) | U (%) | -Log p | Function |
| 1 | 214 | 0.019 | 24.7 | 69.6 | 5.6 | 43.9 | Nuclear transport |
| 2 | 188 | 0.015 | 69.1 | 25.0 | 5.8 | 36.4 | Cell cycle and DNA processing |
| 3 | 181 | 0.022 | 22.0 | 72.3 | 5.5 | 17.2 | Cytoplasmic and nuclear protein degradation |
| 4 | 170 | 0.028 | 46.4 | 42.9 | 10.5 | 31.6 | Transported compounds (substrates) |
| 5 | 131 | 0.028 | 37.4 | 55.7 | 6.8 | 28.6 | Vesicular transport (Golgi network, etc.) |
| 6 | 125 | 0.030 | 60.8 | 33.6 | 5.6 | 32.2 | tRNA synthesis |
| 7 | 113 | 0.027 | 19.4 | 71.6 | 8.8 | 11.8 | Actin cytoskeleton |
| 8 | 79 | 0.045 | 17.7 | 73.4 | 8.8 | 12.3 | Homeostasis of protons |
| 9 | 78 | 0.033 | 26.9 | 62.8 | 10.2 | 12.5 | Ribosome biogenesis |
| 10 | 76 | 0.041 | 38.1 | 59.2 | 2.6 | 20.2 | rRNA processing |
| 11 | 72 | 0.030 | 5.6 | 84.7 | 9.7 | 6.2 | Calcium binding |
| 12 | 68 | 0.064 | 66.1 | 25.0 | 8.8 | 44.5 | mRNA processing |
| 13 | 61 | 0.041 | 40.9 | 52.4 | 6.5 | 11.5 | Cytoskeleton |
| 14 | 58 | 0.064 | 72.4 | 27.6 | 0.0 | 37.4 | General transcription activities |
| 15 | 53 | 0.048 | 15.0 | 71.6 | 13.2 | 7.9 | MAPKKK cascade |
| 16 | 50 | 0.064 | 66.0 | 32.0 | 2.0 | 33.5 | rRNA processing |
| 17 | 45 | 0.055 | 24.4 | 73.3 | 2.2 | 11.1 | Metabolism of energy reserves |
| 18 | 44 | 0.058 | 59.0 | 36.3 | 4.5 | 5.1 | Metabolism |
| 19 | 39 | 0.072 | 10.2 | 89.7 | 0.0 | 7.3 | Cell-cell adhesion |
| 20 | 36 | 0.125 | 58.3 | 36.1 | 5.5 | 16.9 | Vesicular transport |
| 21 | 29 | 0.091 | 55.1 | 44.8 | 0.0 | 8.3 | Phosphate metabolism |
| 22 | 28 | 0.074 | 14.2 | 78.5 | 7.1 | 4.5 | Lysosomal and vacuolar protein degradation |
| 23 | 27 | 0.119 | 29.6 | 66.6 | 3.7 | 7.3 | Cytokinesis (cell division)/septum formation |
| 24 | 26 | 0.153 | 53.8 | 46.1 | 0.0 | 28.6 | Peroxisomal transport |
| 25 | 25 | 0.090 | 28.0 | 68.0 | 4.0 | 4.6 | Regulation of C-compound and carbohydrate utilization |
| 26 | 25 | 0.116 | 68.0 | 28.0 | 4.0 | 12.9 | Cell fate |
| 27 | 22 | 0.151 | 59.0 | 36.3 | 4.5 | 11.4 | DNA conformation modification |
| 28 | 21 | 0.147 | 76.1 | 19.0 | 4.7 | 23.9 | Mitochondrial transport |
| 29 | 20 | 0.200 | 75.0 | 20.0 | 5.0 | 24.0 | rRNA synthesis |
| 30 | 19 | 0.228 | 78.9 | 15.7 | 5.2 | 17.9 | Splicing |
| 31 | 17 | 0.220 | 70.5 | 29.4 | 0.0 | 19.7 | Microtubule cytoskeleton |
| 32 | 17 | 0.183 | 23.5 | 76.4 | 0.0 | 8.2 | Regulation of nitrogen utilization |
| 33 | 15 | 0.304 | 86.6 | 13.3 | 0.0 | 31.3 | Energy generation |
| 34 | 14 | 0.142 | 50.0 | 42.8 | 7.1 | 9.0 | Small GTPase mediated signal transduction |
| 35 | 13 | 0.564 | 76.9 | 23.0 | 0.0 | 15.9 | Mitosis |
| 36 | 13 | 0.358 | 84.6 | 15.4 | 0.0 | 12.4 | DNA conformation modification |
| 37 | 13 | 0.410 | 69.2 | 23.0 | 7.6 | 17.6 | 3'-end processing |
| 38 | 13 | 0.179 | 61.5 | 30.7 | 7.6 | 6.7 | DNA recombination and DNA repair |
| 39 | 12 | 0.196 | 16.6 | 75.0 | 8.3 | 3.9 | Unspecified signal transduction |
| 40 | 12 | 0.363 | 58.3 | 41.6 | 0.0 | 14.7 | Posttranslational modification of amino acids |
| 41 | 12 | 0.166 | 16.6 | 75.0 | 8.3 | 2.4 | Autoproteolytic processing |
| 42 | 11 | 0.218 | 54.5 | 45.4 | 0.0 | 2.9 | Transcriptional control |
| 43 | 11 | 0.200 | 72.7 | 27.2 | 0.0 | 8.2 | Enzymatic activity regulation/enzyme regulator |
| 44 | 10 | 0.466 | 80.0 | 20.0 | 0.0 | 14.8 | Translation initiation |
| 45 | 9 | 0.361 | 77.7 | 22.2 | 0.0 | 12.8 | Translation initiation |
| 46 | 8 | 0.321 | 50.0 | 37.5 | 12.5 | 5.6 | Metabolism of energy reserves |
| 47 | 8 | 0.321 | 75.0 | 25.0 | 0.0 | 9.0 | Modification by ubiquitination, deubiquitination |
| 48 | 8 | 0.321 | 37.5 | 62.5 | 0.0 | 3.7 | Mitosis |
| 49 | 7 | 0.333 | 42.8 | 57.1 | 0.0 | 3.5 | DNA damage response |
| 50 | 7 | 0.333 | 57.1 | 28.5 | 14.2 | 4.1 | Vacuolar transport |
| 51 | 7 | 0.285 | 28.5 | 71.4 | 0.0 | 4.4 | Biosynthesis of serine |
| 52 | 6 | 0.333 | 50.0 | 33.3 | 16.6 | 2.38 | Modification by phosphorylation, dephosphorylation, etc. |
| 53 | 5 | 0.400 | 100 | 0.0 | 0.0 | 7.0 | Meiosis |
| 54 | 5 | 0.600 | 100 | 0.0 | 0.0 | 7.0 | Vacuolar transport |
| 55 | 5 | 0.400 | 100 | 0.0 | 0.0 | 8.5 | ER to Golgi transport |
| 56 | 5 | 0.400 | 20.0 | 40.0 | 40.0 | 1.8 | cAMP mediated signal transduction |
| 57 | 5 | 0.500 | 40.0 | 40.0 | 20.0 | 3.1 | Oxidative stress response |
| 58 | 5 | 0.500 | 80.0 | 20.0 | 0.0 | 4.4 | Intracellular signalling |
| 59 | 5 | 0.600 | 40.0 | 60.0 | 0.0 | 4.2 | Tetracyclic and pentacyclic triterpenes |
| 60 | 5 | 0.400 | 60.0 | 40.0 | 0.0 | 4.1 | Mitochondrial transport |
The first column is a cluster identifier; the Size column indicates the number of proteins in each cluster; the Density column indicates the density of the cluster; the H column indicates the percentage of proteins concordant with the major function indicated in the last column; the D column indicates the percentage of proteins discordant with the major function; and the U column indicates the percentage of proteins not assigned to any function.
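The column definitions above can be illustrated with a short sketch. It assumes that density is the usual graph density 2e / (n(n - 1)) for a cluster with n proteins and e internal edges; this definition is not stated in the text, although it is consistent with the tabulated values (a density of 0.600 at size 5 would correspond to 6 internal edges). The function names and the annotation format are hypothetical.

```python
def cluster_density(n_nodes, n_edges):
    """Graph density 2e / (n(n-1)); assumed definition, see lead-in."""
    if n_nodes < 2:
        return 0.0
    return 2.0 * n_edges / (n_nodes * (n_nodes - 1))


def hdu_percentages(members, annotations, major_function):
    """Return (H, D, U) percentages for one cluster.

    annotations: dict mapping protein -> set of functions; a missing or
    empty entry means the protein is not assigned to any function.
    """
    if not members:
        raise ValueError("cluster must not be empty")
    hit = discordant = unknown = 0
    for protein in members:
        functions = annotations.get(protein)
        if not functions:
            unknown += 1      # U: no functional annotation
        elif major_function in functions:
            hit += 1          # H: concordant with the major function
        else:
            discordant += 1   # D: annotated, but with other functions
    n = len(members)
    return (100.0 * hit / n, 100.0 * discordant / n, 100.0 * unknown / n)
```

By construction the three percentages sum to 100 for every cluster, which matches the rows of the table up to rounding.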
Figure 4. Distribution of the three classes across the 60 clusters: the hit percentage with the assigned function, the discordant percentage from the assigned function, and the unknown percentage.
Comparison of STM to competing clustering methods for clusters with 5 or more members
| Method | Number of clusters | Average size | Discard (%) | Function (-log p) | Location (-log p) |
| Maximal clique | 120 | 5.65 | 98.4 | 10.6 | 7.93 |
| Quasi clique | 103 | 11.2 | 80.8 | 11.5 | 6.58 |
| Samanta | 64 | 7.9 | 79.9 | 9.16 | 4.89 |
| Minimum cut | 114 | 13.5 | 35.0 | 8.36 | 4.75 |
| Betweenness cut | 180 | 10.26 | 21.0 | 8.19 | 4.18 |
| MCL | 163 | 9.79 | 36.7 | 8.18 | 3.97 |
Comparison of STM to competing clustering methods for the yeast protein-protein interaction data set, for clusters with 5 or more members. The Number column indicates the number of clusters identified by each method; the Size column indicates the average number of proteins in each cluster; the Discard(%) column indicates the percentage of proteins not assigned to any cluster. The Function and Location columns give the -log p values for biological function and cellular location.
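The summary columns described above can be sketched as follows. The first function follows the stated definitions directly. The second is only an assumption: the text does not say how p is computed or which logarithm base is used, so the sketch uses a hypergeometric enrichment tail and base 10, a common choice for this kind of analysis. All names are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def summarize(clusters, n_proteins, min_size=5):
    """Return (number of clusters, average size, discard %) for clusters
    with at least min_size members, out of n_proteins in the network."""
    kept = [set(c) for c in clusters if len(c) >= min_size]
    assigned = set().union(*kept) if kept else set()
    number = len(kept)
    avg_size = sum(len(c) for c in kept) / number if number else 0.0
    discard = 100.0 * (n_proteins - len(assigned)) / n_proteins
    return number, avg_size, discard


def neg_log10_hypergeom_p(N, K, n, k):
    """-log10 of P(X >= k) for X ~ Hypergeometric(N, K, n).

    Assumed test: N proteins in total, K annotated with a category,
    a cluster of n proteins, k of them carrying the category.
    """
    tail = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    if tail == 0:
        return math.inf
    # Work with logs of exact integers to avoid float underflow.
    return -(math.log10(tail) - math.log10(math.comb(N, n)))
```

Raising `min_size` from 5 to 9 reproduces the kind of filtering that separates the two comparison tables: fewer clusters survive, so the discard percentage rises.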
Comparison of STM to competing clustering methods for clusters with 9 or more members
| Method | Number of clusters | Average size | Discard (%) | Function (-log p) | Location (-log p) |
| Maximal clique | N/A | N/A | N/A | N/A | N/A |
| Quasi clique | 46 | 16.7 | 86.7 | 15.3 | 9.34 |
| Samanta | 17 | 12.3 | 93.3 | 15.9 | 7.65 |
| Minimum cut | 44 | 24.3 | 55.0 | 14.8 | 8.78 |
| Betweenness cut | 78 | 14.4 | 50.5 | 11.3 | 6.05 |
| MCL | 55 | 16.7 | 69.4 | 11.5 | 5.42 |
Comparison of STM to competing clustering methods for the yeast protein-protein interaction data set, for clusters with 9 or more members. The maximal clique method does not identify any clusters with 9 or more members. The column definitions are the same as in Table 4.