Tijana Milenković, Natasa Przulj.
Abstract
MOTIVATION: Proteins are essential macromolecules of life, and thus understanding their function is of great importance. The number of functionally unclassified proteins is large even for simple and well-studied organisms such as baker's yeast. Methods for determining protein function have shifted their focus from targeting specific proteins based solely on sequence homology to analyses of the entire proteome based on protein-protein interaction (PPI) networks. Since proteins interact to perform a certain function, analyzing structural properties of PPI networks may provide useful clues about the biological function of individual proteins, protein complexes they participate in, and even larger subcellular machines.
Year: 2008 PMID: 19259413 PMCID: PMC2623288
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. The thirty 2-, 3-, 4-, and 5-node graphlets G0, G1, …, G29 and their automorphism orbits 0, 1, 2, …, 72. In a graphlet Gi, i ∈ {0, 1, …, 29}, nodes belonging to the same orbit are of the same shade (Pržulj, 2006).
Figure 2. Signature vectors of proteins with signature similarities: (A) above 0.90; and (B) below 0.40. The 73 orbits are presented on the abscissa, and the number of times that a node touches each orbit is presented on the ordinate in log scale. For the sake of the plot, we added 1 to all orbit frequencies so that the logarithm is defined for orbit frequencies of 0.
Figure 3. An example of a three-node cluster consisting of proteins RPO26, SMD1, and SMB1. The categories of biological functions that the proteins belong to are presented below the protein names.
Figure 4. The results of applying our method to the six yeast PPI networks (vonMering-core, vonMering, Krogan, DIP-core, DIP, and MIPS) and the three protein properties (subcellular localizations, biological functions, and protein complexes) in accordance with the flexible scheme: (A) hit-rates for subcellular localizations; (B) miss-rates for subcellular localizations; (C) hit-rates for biological functions; (D) miss-rates for biological functions; (E) hit-rates for protein complexes; (F) miss-rates for protein complexes.
Figure 5. The results of applying our method to the three human PPI networks (BIOGRID, HPRD, and Rual) and the three protein properties (cellular components, tissue expressions, and biological processes): (A) hit-rates for cellular components; (B) miss-rates for cellular components; (C) hit-rates for tissue expressions; (D) miss-rates for tissue expressions; (E) hit-rates for biological processes; (F) miss-rates for biological processes.
Table 1. Predicted functions with prediction hit-rates of 50% or higher according to the strict scheme for yeast proteins that are unannotated in MIPS and that have degrees higher than four in any of the six yeast PPI networks. The column denoted by “Protein of interest” contains the protein for which the function is predicted. The column denoted by “Degree” contains the degree of that protein in the corresponding PPI network. The column denoted by “PPI Network” contains the PPI network from which the protein function was derived. The column denoted by “Number of proteins in cluster” contains the total number of proteins in the cluster, including the protein of interest. The column denoted by “Number of unclassified proteins in cluster” contains the number of functionally unclassified proteins in the cluster, including the protein of interest. The column denoted by “Majority (and predicted) function” contains the functions common to at least 50% of the proteins in the cluster, which are also the predicted functions for the protein of interest. The column denoted by “Number of proteins in cluster with the majority function” contains the number of nodes in the cluster with the majority function. The column denoted by “Hit-rate” contains the percentage of annotated proteins in the cluster that have the majority function; only the maximum hit-rate is reported for a protein of interest. Finally, the column denoted by “Miss-rate” contains the percentage of annotated nodes in the cluster that do not have a common function with any other annotated node in the cluster.
| Protein of interest | Degree | PPI Network | Number of proteins in cluster | Number of unclassified proteins in cluster | Majority (and predicted) function | Number of proteins in cluster with the majority function | Hit-rate | Miss-rate |
|---|---|---|---|---|---|---|---|---|
| PWP1 (YLR196W) | 22 | vonMering | 23 | 1 | rRNA processing | 13 | 59.09% | 13.64% |
| STO1 (YMR125W) | 42 | vonMering | 6 | 1 | ATP binding | 3 | 60.00% | 20.00% |
| YMR074C | 6 | vonMering | 3 | 1 | Ribosomal proteins | 2 | 100.00% | 0.00% |
| YMR310C | 51 | vonMering | 7 | 1 | Ribosomal proteins | 5 | 83.33% | 0.00% |
| YNL122C | 6 | vonMering | 3 | 1 | Aminoacyl-tRNA-synthetases | 2 | 100.00% | 0.00% |
| YOR093C | 15 | vonMering | 3 | 1 | Lipid, fatty acid and isoprenoid metabolism | 2 | 100.00% | 0.00% |
| COS6 (YGR295C) | 6 | DIP-core | 9 | 1 | Protein targeting, sorting and translocation | 4 | 50.00% | 50.00% |
| YAL027W | 19 | Krogan | 9 | 3 | rRNA processing | 3 | 50.00% | 33.33% |
| YLR455W | 19 | Krogan | 7 | 2 | rRNA processing | 3 | 60.00% | 40.00% |
| PBY1 (YBR094W) | 23 | MIPS | 6 | 1 | Cell wall | 3 | 60.00% | 0.00% |
| YER084W | 5 | MIPS | 7 | 2 | Vacuolar/lysosomal transport | 3 | 60.00% | 20.00% |
| YPT35 (YHR105W) | 5 | MIPS | 7 | 2 | Nuclear transport | 3 | 60.00% | 40.00% |
| ILM1 (YJR118C) | 11 | MIPS | 8 | 2 | DNA repair | 3 | 50.00% | 16.67% |
| | | | | | Meiotic recombination | 3 | | |
| | | | | | Protein binding | 3 | | |
| IES2 (YNL215W) | 7 | MIPS | 5 | 1 | Transcriptional control | 2 | 50.00% | 50.00% |
| YAL018C | 9 | DIP | 3 | 1 | Protein targeting, sorting and translocation | 2 | 100.00% | 0.00% |
| OLA1 (YBR025C) | 8 | DIP | 9 | 2 | Protein binding | 4 | 57.14% | 14.29% |
| COS4 (YFL062W) | 22 | DIP | 5 | 1 | Transport facilities | 2 | 50.00% | 50.00% |
| YFR016C | 5 | DIP | 5 | 1 | DNA conformation modification (e.g. chromatin) | 2 | 50.00% | 50.00% |
| YOR220W | 6 | DIP | 5 | 2 | Protein binding | 3 | 100.00% | 0.00% |
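The Hit-rate values in the table above are consistent with dividing the number of proteins that have the majority function by the number of annotated proteins in the cluster (cluster size minus unclassified proteins). The following is a minimal sketch of that arithmetic, together with the miss-rate as described in the table caption. The function names `hit_rate` and `miss_rate`, the dictionary layout, and the toy annotation data are assumptions made for illustration and are not taken from the authors' implementation.

```python
from typing import Dict, Set


def hit_rate(cluster_size: int, unclassified: int, with_majority: int) -> float:
    """Percentage of annotated cluster proteins that have the majority function."""
    annotated = cluster_size - unclassified
    if annotated <= 0:
        raise ValueError("cluster has no annotated proteins")
    return 100.0 * with_majority / annotated


def miss_rate(annotations: Dict[str, Set[str]]) -> float:
    """Percentage of annotated proteins that share no function with any
    other annotated protein in the same cluster.

    `annotations` maps each annotated protein to its set of functions
    (hypothetical layout chosen for this sketch).
    """
    if not annotations:
        raise ValueError("cluster has no annotated proteins")
    misses = 0
    for protein, functions in annotations.items():
        shares_function = any(
            functions & other_functions
            for other, other_functions in annotations.items()
            if other != protein
        )
        if not shares_function:
            misses += 1
    return 100.0 * misses / len(annotations)


# Rows taken from the table above: (cluster size, unclassified, majority count).
print(round(hit_rate(23, 1, 13), 2))  # PWP1:    59.09
print(round(hit_rate(6, 1, 3), 2))    # STO1:    60.0
print(round(hit_rate(7, 1, 5), 2))    # YMR310C: 83.33

# Toy cluster with made-up proteins and functions (not from the paper).
toy_cluster = {
    "A": {"rRNA processing"},
    "B": {"rRNA processing", "Protein binding"},
    "C": {"Cell wall"},
}
print(round(miss_rate(toy_cluster), 2))  # 33.33: only C shares nothing
```

Dividing by annotated rather than all proteins keeps unclassified proteins, including the protein of interest, from lowering the rate.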
Table 2. Predicted functions with prediction hit-rates higher than 50% according to the flexible scheme for yeast proteins that are unannotated in MIPS and that have degrees higher than four in any of the six yeast PPI networks. The columns have the same meaning as in Table 1.
| Protein of interest | Degree | PPI Network | Number of proteins in cluster | Number of unclassified proteins in cluster | Majority (and predicted) function | Number of proteins in cluster with the majority function | Hit-rate | Miss-rate |
|---|---|---|---|---|---|---|---|---|
| PWP1 (YLR196W) | 22 | vonMering | 23 | 1 | Transcription | 17 | 77.27% | 0.00% |
| STO1 (YMR125W) | 42 | vonMering | 6 | 1 | Protein with binding function or cofactor requirement (structural or catalytic) | 5 | 100.00% | 0.00% |
| | | | | | Protein fate (folding, modification, destination) | 4 | | |
| | | | | | Transcription | 3 | | |
| OLA1 (YBR025C) | 10 | vonMering | 3 | 1 | Cell rescue, defense and virulence | 2 | 100.00% | 0.00% |
| YMR074C | 6 | vonMering | 3 | 1 | Protein synthesis | 2 | 100.00% | 0.00% |
| YMR310C | 51 | vonMering | 7 | 1 | Protein synthesis | 5 | 83.33% | 0.00% |
| YNL122C | 6 | vonMering | 3 | 1 | Protein synthesis | 2 | 100.00% | 0.00% |
| YOR093C | 15 | vonMering | 3 | 1 | Metabolism | 2 | 100.00% | 0.00% |
| COS6 (YGR295C) | 6 | DIP-core | 9 | 1 | Cellular transport, transport facilities and transport routes | 6 | 75.00% | 12.50% |
| | | | | | Protein fate (folding, modification, destination) | 5 | | |
| YAL027W | 19 | Krogan | 9 | 3 | Transcription | 4 | 66.67% | 0.00% |
| | | | | | Protein with binding function or cofactor requirement (structural or catalytic) | 4 | | |
| GDT1 (YBR187W) | 6 | Krogan | 3 | 1 | Transcription | 2 | 100.00% | 0.00% |
| YLR455W | 19 | Krogan | 7 | 2 | Transcription | 4 | 80.00% | 0.00% |
| | | | | | Protein with binding function or cofactor requirement (structural or catalytic) | 3 | | |
| PBY1 (YBR094W) | 23 | MIPS | 6 | 1 | Cell cycle and DNA processing | 3 | 60.00% | 0.00% |
| | | | | | Cellular transport, transport facilities and transport routes | 3 | | |
| | | | | | Biogenesis of cellular components | 3 | | |
| SHU2 (YDR078C) | 5 | MIPS | 4 | 1 | Protein fate (folding, modification, destination) | 2 | 66.67% | 0.00% |
| | | | | | Protein with binding function or cofactor requirement (structural or catalytic) | 2 | | |
| YER084W | 5 | MIPS | 7 | 2 | Cellular transport, transport facilities and transport routes | 4 | 80.00% | 20.00% |
| | | | | | Protein fate (folding, modification, destination) | 3 | | |
| YPT35 (YHR105W) | 5 | MIPS | 7 | 2 | Protein fate (folding, modification, destination) | 3 | 60.00% | 0.00% |
| | | | | | Cellular transport, transport facilities and transport routes | 3 | | |
| EAF6 (YJR082C) | 16 | MIPS | 13 | 2 | Transcription | 6 | 54.55% | 9.09% |
| ILM1 (YJR118C) | 11 | MIPS | 8 | 2 | Cell cycle and DNA processing | 4 | 66.67% | 0.00% |
| YKL061W | 7 | MIPS | 4 | 1 | Metabolism | 2 | 66.67% | 33.33% |
| RAD33 (YML011C) | 5 | MIPS | 9 | 2 | Protein fate (folding, modification, destination) | 4 | 57.14% | 14.29% |
| | | | | | Cellular transport, transport facilities and transport routes | 4 | | |
| IES2 (YNL215W) | 7 | MIPS | 5 | 1 | Cell cycle and DNA processing | 3 | 75.00% | 0.00% |
| SGT2 (YOR007C) | 5 | MIPS | 10 | 1 | Transcription | 5 | 55.56% | 0.00% |
| YPR084W | 6 | MIPS | 4 | 2 | Cellular transport, transport facilities and transport routes | 2 | 100.00% | 0.00% |
| YAL018C | 9 | DIP | 3 | 1 | Protein fate (folding, modification, destination) | 2 | 100.00% | 0.00% |
| | | | | | Cellular transport, transport facilities and transport routes | 2 | | |
| UIP3 (YAR027W) | 38 | DIP | 5 | 2 | Interaction with the environment | 2 | 66.67% | 33.33% |
| YAR028W | 11 | DIP | 3 | 1 | Cellular transport, transport facilities and transport routes | 2 | 100.00% | 0.00% |
| OLA1 (YBR025C) | 8 | DIP | 9 | 2 | Protein with binding function or cofactor requirement (structural or catalytic) | 6 | 85.71% | 0.00% |
| YDL089W | 8 | DIP | 4 | 2 | Protein fate (folding, modification, destination) | 2 | 100.00% | 0.00% |
| YEL068C | 5 | DIP | 6 | 3 | Cell cycle and DNA processing | 2 | 66.67% | 0.00% |
| COS4 (YFL062W) | 22 | DIP | 5 | 1 | Cellular transport, transport facilities and transport routes | 3 | 75.00% | 25.00% |
| | | | | | Biogenesis of cellular components | 3 | | |
| YHR140W | 61 | DIP | 16 | 2 | Metabolism | 10 | 71.43% | 7.14% |
| YET1 (YKL065C) | 51 | DIP | 34 | 4 | Cellular transport, transport facilities and transport routes | 16 | 53.33% | 0.00% |
| YLL023C | 22 | DIP | 17 | 2 | Cellular transport, transport facilities and transport routes | 10 | 66.67% | 0.00% |
| RAD33 (YML011C) | 5 | DIP | 9 | 3 | Metabolism | 4 | 66.67% | 0.00% |
| YNL092W | 29 | DIP | 3 | 1 | Metabolism | 2 | 100.00% | 0.00% |
| PRM1 (YNL279W) | 6 | DIP | 4 | 1 | Metabolism | 2 | 66.67% | 33.33% |
| | | | | | Biogenesis of cellular components | 2 | | |
| YOR164C | 6 | DIP | 6 | 2 | Biogenesis of cellular components | 3 | 75.00% | 25.00% |
| YOR220W | 6 | DIP | 5 | 2 | Protein with binding function or cofactor requirement (structural or catalytic) | 3 | 100.00% | 0.00% |
| | | | | | Cellular transport, transport facilities and transport routes | 2 | | |
| | | | | | Interaction with the environment | 2 | | |
| | | | | | Biogenesis of cellular components | 2 | | |
Figure 6. Signature vectors of proteins belonging to the TP53 cluster. The cluster is formed using the threshold of 0.95. The axes have the same meaning as in Figure 2.