Xiaomin Wang, Zhengzhi Wang, Jun Ye.
Abstract
With more and more genome-scale protein-protein interaction (PPI) networks becoming available, research interest is gradually shifting to the systematic analysis of these large data sets. A key topic is predicting protein complexes in PPI networks by identifying clusters that are densely connected internally but sparsely connected to the rest of the network. In this paper, we present a new topology-based algorithm, HKC, for detecting protein complexes in genome-scale PPI networks. HKC relies mainly on the concepts of the highest k-core and cohesion to predict protein complexes by identifying overlapping clusters. Experiments on two data sets and two benchmarks show that our algorithm achieves a relatively high F-measure and performs better than some other methods.
Year: 2011 PMID: 22174556 PMCID: PMC3228514 DOI: 10.1155/2011/480294
Source DB: PubMed Journal: J Biomed Biotechnol ISSN: 1110-7243
Figure 1. Illustration of the k-cores of a graph. (a) Graph G; (b) the 2-core of G; (c) the 3-core of G, which is also the highest k-core of G.
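The k-core notion illustrated in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration using the standard definition (the k-core is the largest subgraph in which every node has degree at least k, obtained by repeatedly peeling off lower-degree nodes); it is not the authors' HKC implementation. The helper names `build_adjacency`, `k_core`, and `highest_k_core`, and the toy edge list, are assumptions made for this example and do not reproduce graph G from the figure.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)} from edge pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops in this sketch
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def k_core(adj, k):
    """Return the node set of the k-core by peeling nodes of degree < k."""
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}
    stack = [v for v in adj if degree[v] < k]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue  # already removed via an earlier stack entry
        alive.remove(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    stack.append(u)
    return alive


def highest_k_core(adj):
    """Return (k, nodes) for the largest k whose k-core is non-empty."""
    k, core = 0, set(adj)
    while True:
        candidate = k_core(adj, k + 1)
        if not candidate:
            return k, core
        k, core = k + 1, candidate


# Hypothetical toy graph: a 4-clique {a, b, c, d}, node e attached to a and b,
# and a pendant node f attached to e.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"),
             ("c", "d"), ("e", "a"), ("e", "b"), ("f", "e")]
toy_adj = build_adjacency(toy_edges)
k, core = highest_k_core(toy_adj)
print(k, sorted(core))
```

In the toy graph, the 2-core drops the pendant node f, and the 3-core keeps only the 4-clique, which is therefore the highest k-core.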
Figure 2. Illustration of a typical cluster with a densely connected core and a less densely connected non-core.
Comparison with MCODE. P, R, and F stand for precision, recall, and F-measure, respectively; their definitions are given in Section 3.1. The MIPS data set contains 4,554 proteins and 12,526 interactions, and the SGD-MC data set contains 4,448 proteins and 29,068 interactions. AC is the number of all clusters predicted by the algorithm; EC is the number of effective clusters (those with at least one matching complex above an overlap ratio of 0.4) found by the algorithm; MC is the number of matched complexes in the benchmark set. The sizes of the complexcat and Gavin benchmarks are 217 and 204, respectively. For HKC, the optimized parameters are T1, T2, and T3, in that order; for MCODE, they are NodeScoreCutoff, fluff (T for true, F for false), and haircut (T for true, F for false), with all other parameters left at their default values.
| Algorithm | Data set | Benchmark | P | R | F | AC | EC | MC | Optimized parameters |
|---|---|---|---|---|---|---|---|---|---|
| MCODE | MIPS | complexcat | 0.455 | 0.194 | 0.271 | 66 | 30 | 42 | 0.05, F, F |
| MCODE | SGD-MC | complexcat | 0.213 | 0.221 | 0.217 | 197 | 42 | 48 | 0.05, F, T |
| MCODE | MIPS | Gavin | 0.303 | 0.098 | 0.148 | 66 | 20 | 20 | 0.05, F, T |
| MCODE | SGD-MC | Gavin | 0.283 | 0.152 | 0.198 | 106 | 30 | 31 | 0, F, T |
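The relationship between the count columns (AC, EC, MC) and the P, R, and F columns can be checked with a short Python sketch. Section 3.1, which defines the metrics, is not reproduced here, so the formulas below are an assumption: precision as EC/AC, recall as MC divided by the benchmark size, and F-measure as their harmonic mean. They do reproduce the MCODE values reported in the table to three decimals. The function name `evaluate` and its parameter names are hypothetical.

```python
def evaluate(ac, ec, mc, benchmark_size):
    """Return (precision, recall, f_measure) from cluster and complex counts.

    Assumed definitions (Section 3.1 is not shown in this record):
      precision = EC / AC              effective clusters over all clusters
      recall    = MC / benchmark_size  matched complexes over benchmark size
      F-measure = harmonic mean of precision and recall
    """
    precision = ec / ac if ac else 0.0
    recall = mc / benchmark_size if benchmark_size else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# MCODE on MIPS, evaluated on the complexcat benchmark (217 complexes).
p, r, f = evaluate(ac=66, ec=30, mc=42, benchmark_size=217)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.455 0.194 0.271
```

The same call with the counts from the other three rows (using 204 as the size of the Gavin benchmark) yields the remaining tabulated values.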
Figure 3. Influence of different parameters on algorithm performance. The F-measure on the y-axis is the average of all F-measures obtained with one parameter held at the specified value, taken over the 120 groups of experimental results evaluated on the complexcat benchmark. Triangles mark the results on the MIPS network, and circles mark the results on the SGD-MC network.
Figure 4. Number of effective clusters and number of matched complexes found by HKC and MCODE at different overlap ratio (OR) thresholds. The results are for the MIPS data set, evaluated on the complexcat benchmark. Triangles mark the results of HKC and circles mark the results of MCODE.
Figure 5. Precision versus recall plots of all MCODE and HKC results with different parameters on various data sets. (a) Input data set: MIPS, benchmark: complexcat. (b) Input data set: MIPS, benchmark: Gavin. (c) Input data set: SGD-MC, benchmark: complexcat. (d) Input data set: SGD-MC, benchmark: Gavin. In all four cases, the data points produced by HKC lie in the upper right portion of the plot, corresponding to high F-measure values, while most of the data points produced by MCODE lie in the lower left part of the plot. Furthermore, the HKC data points are much more tightly clustered than those of MCODE.
Figure 6. Examples of clusters predicted by HKC in the MIPS data set. The nodes encircled by the red dotted line form a known complex in the complexcat benchmark, the nodes contained within the blue circle form a cluster predicted by HKC, and the yellow node in each cluster denotes the seed of the cluster. (a) A cluster of size 11 that perfectly matches the TRAPP complex. (b) A cluster of size 14 that shares 14 proteins with the SAGA complex (size 16); their overlap ratio is 0.93. The two proteins YCL010c and YGL066w, which are not contained in the predicted cluster, are isolated nodes with only one edge connecting them to the cluster. (c) An example of a well-matched cluster of 7 proteins, 6 of which are shared with the cytoplasmic translation initiation factor 3 (eIF3) complex. (d) A novel cluster detected by HKC that does not match any known protein complex in the complexcat benchmark; according to GO annotation, the proteins in the black circle form the docking complex that facilitates the import of peroxisomal matrix proteins.
All novel predictions by HKC on the MIPS data set. The columns give the original cluster ID (ID), cluster score (S), number of nodes (N), number of edges (E), and the protein names in each cluster.
| ID | S | N | E | Protein names |
|---|---|---|---|---|
| 1 | 21.5 | 38 | 406 | YLR200w,YNL153c,YDR318w,YOR349w,YHR191c,YEL003w,YJL013c,YLR085c,YPL269w,YHR129c,YOR026w,YMR055c,YPL155c,YCL029c,YMR048w, YML094w,YJL030w,YGR188c,YCL016c,YPR135w,YML124c,YPR141c,YER016w,YOR058c,YER007w,YDR150w,YGR078c,YJR053w,YEL061c, YMR294w,YMR299c,YPL008w,YMR078c,YGL086w,YPL174c,YFL037w,YOL012c,YPL241c |
| 2 | 20.1 | 33 | 328 | YPL008w,YMR078c,YGR188c,YGL086w,YCL016c,YPL241c,YMR294w,YER007w,YOR058c,YER016w,YPR141c,YPR135w,YJL030w,YPL155c,YOR026w, YHR129c,YPL269w,YJL013c,YDR254w,YDR318w,YFL037w,YPL174c,YEL061c,YGR078c,YLR200w,YML124c,YML094w,YCL029c,YEL003w,YOR349w, YNL153c,YMR138w,YOR265w |
| 3 | 19.7 | 27 | 260 | YGL086w,YMR078c,YCL016c,YG |
| 4 | 18.9 | 28 | 260 | YGR188c,YLR381w,YPL018w,YDR254w,YDR318w,YOL012c,YGL086w,YPL008w,YGR078c,YDL003w,YLR200w,YML094w,YMR048w,YLR085c, YEL003w,YOR349w,YNL153c,YNL298w,YMR078c,YEL061c,YER016w,YPR141c,YPR135w,YCL016c,YJL030w,YHR191c, YOR195w,YGL216w |
| 5 | 15.2 | 24 | 179 | YNL271c,YNL322c,YBL061c,YBR023c,YGR229c,YBL007c,YER155c,YLR330w,YHR030c,YNL298w,YDL029w,YJR075w,YGR078c,YLR200w,YLR337c, YML094w,YDR388w,YCR009c,YBR234c,YEL003w,YJL095w,YNL153c,YJL020c,YMR109w |
| 6 | 15.2 | 25 | 185 | YER155c,YEL003w,YGR078c,YML094w,YNL271c,YGR229c,YNL298w,YLR337c,YNL153c,YLR200w,YHR030c,YBL007c,YCR009c,YJL095w,YDR245w, YBL061c,YHR142w,YLR330w, YDL029w,YDR388w,YBR023c,YBR234c,YJR075w,YNL322c,YDR129c |
| 7 | 14.8 | 27 | 196 | YBL007c,YEL003w,YGR078c,YLR200w,YML094w,YNL153c,YNL298w,YJL095w,YHR111w,YGR229c,YBR234c,YCR009c,YER111c,YBR023c,YDR388w, YHR030c,YDL029w,YLR330w,YHR142w,YDR424c,YNL271c,YFR019w,YBL061c,YPL031c,YBR200w,YER155c,YOR326w |
| 8 | 14.6 | 25 | 178 | YLR337c,YNL322c,YNL271c,YBL007c,YLR200w,YBL061c,YHR142w,YLR330w,YML094w,YBR023c,YNL233w,YCR009c,YJR075w,YNL153c,YNL298w, YDL029w,YHR030c,YJL183w,YDR388w,YBR234c,YGR229c,YLR342w,YJL095w,YJL099w,YJR118c |
| 9 | 14.3 | 21 | 146 | YNL250w,YER016w,YPR141c,YHR191c,YGL163c,YMR078c,YMR190c,YKL113c,YOR144c,YCL061c,YER173w,YPR135w,YCL016c,YMR048w,YLR234w, YJL092w,YLR103c,YML032c,YPL194w,YNL273w,YJR043c |
| 10 | 14.3 | 21 | 147 | YML032c,YER016w,YJR043c,YMR078c,YNL273w,YLR103c,YPR135w,YCL016c,YMR048w,YHR191c,YGL163c,YMR190c,YDR004w,YOR144c,YER095w, YCL061c,YDR076w,YER173w,YNL250w,YJL092w,YKL113c |
| 11 | 14.2 | 21 | 146 | YNL298w,YHR030c,YJL095w,YMR109w,YBR023c,YDL029w,YBR234c,YGR078c,YLR200w,YBL061c,YML094w,YDR388w,YJL020c,YCR009c,YGR229c, YEL003w,YNL153c,YLR337c,YEL031w,YBL007c,YJR075w |
| 13 | 13.9 | 23 | 155 | YLR200w,YNL271c,YNL153c,YBR234c,YNL322c,YHR142w,YER111c,YBL007c,YCR009c,YGR229c,YJL095w,YNL298w,YLR330w,YBR023c,YJR075w, YBL061c,YDL029w,YHR030c,YJR118c,YDR388w,YBL047c,YNL233w,YLR342w |
| 14 | 13.7 | 23 | 153 | YHR191c,YML032c,YHR031c,YMR078c,YCL061c,YCL016c,YMR048w,YNL273w,YNL250w,YPR135w,YMR190c,YKL113c,YLR234w,YJR043c,YOR144c, YLR103c,YJL092w,YOL006c,YPL024w, YBR098w,YDR386w,YLR235c,YDR363w |
| 15 | 13.6 | 23 | 154 | YER173w,YKL113c,YML032c,YNL273w,YOR144c,YMR048w,YJR043c,YMR078c,YCL061c,YLR103c,YPR135w,YCL016c,YHR191c,YMR190c,YPL024w, YDR052c,YNL250w,YHR031c,YLR235c,YLR234w,YJL092w,YDL017w,YHR154w |
| 16 | 13.3 | 21 | 136 | YPL194w,YML032c,YNL250w,YJL092w,YLR103c,YMR078c,YCL016c,YHR191c,YGL163c,YJR043c,YMR190c,YKL113c,YNL273w,YCL061c,YER016w, YER173w,YPR135w,YMR048w,YOR144c,YDL013w,YER116c |
| 17 | 12.9 | 22 | 138 | YER016w,YGL163c,YHR191c,YNL273w,YML032c,YLR103c,YMR078c,YKL113c,YOR144c,YCL061c,YPR135w,YNL250w,YCL016c,YHR031c,YMR048w, YLR234w,YJL092w,YJR043c,YMR190c,YDR363w,YBR228w,YLR135w |
| 20 | 10.8 | 13 | 66 | YER146w,YDR378c,YNL147w,YLR438c-a,YER112w,YJL124c,YCR077c,YBL026w,YJR022w,YOL149w,YGL173c,YNL118c,YEL015w |
| 21 | 10.8 | 11 | 54 | YBL026w,YER146w,YDR378c,YJR022w,YNL147w,YLR438c-a,YER112w,YOL149w,YJL124c,YCR077c,YGL173c |
| 24 | 10 | 17 | 82 | YBL061c,YDR388w,YBR023c,YJL095w,YNL298w,YNL271c,YLR330w,YHR030c,YER111c,YGR229c,YER155c,YPL031c,YMR307w,YOR008c,YLR342w, YLR371w,YLR332w |
| 25 | 9.9 | 17 | 81 | YJL095w,YDL029w,YCR009c,YNL298w,YBL061c,YDR388w,YER111c,YGR229c,YNL322c,YLR330w,YHR030c,YBR023c,YMR307w,YGL027c,YML115c, YGL200c,YPR159w |
| 26 | 9.5 | 17 | 78 | YNL298w,YDR388w,YER111c,YNL271c,YLR330w,YHR030c,YBL061c,YBR023c,YNL322c,YMR307w,YLR039c,YLR262c,YGR229c,YLR342w,YJR073c, YJL183w,YKL190w |
| 27 | 8.6 | 14 | 56 | YLR039c,YPL051w,YLR262c,YJL154c,YDR126w,YGL005c,YOR132w,YOR069w,YNL051w,YHL031c,YNL041c,YOR070c,YML071c,YBR164c |
| 28 | 8.5 | 14 | 55 | YOL012c,YLR039c,YLR262c,YLR085c,YLR418c,YAL011w,YAL013w,YMR263w,YJL168c,YML041c,YGL244w,YPL181w,YDR334w,YOR123c |
| 29 | 7.7 | 13 | 46 | YNL051w,YHL031c,YPL051w,YOR070c,YNL041c,YLR262c,YML071c,YDR126w,YKL190w,YLR039c,YBR164c,YKR001c,YMR004w |
| 30 | 7.6 | 11 | 39 | YBR023c,YKL190w,YNL322c,YHR030c,YGR229c,YMR307w,YJR073c,YLR262c,YDR162c,YLR039c,YDL006w |
| 31 | 7.5 | 13 | 45 | YNL051w,YML071c,YPL051w,YOR070c,YNL041c,YLR262c,YOL018c,YDR126w,YHL031c,YBR164c,YLR039c,YNL238w,YLL040c |
| 39 | 6 | 17 | 51 | YBR200w,YNL298w,YOR127w,YHR061c,YHL007c,YER149c,YGR152c,YLL021w,YPL115c,YLR319c,YDR309c,YBL085w,YAL041w,YER114c,YOR188w, YMR273c,YLR229c |
| 40 | 6 | 8 | 25 | YCR088w,YOR284w,YGR268c,YDR388w,YBL007c,YNL094w,YNL243w,YHR016c |
| 42 | 6 | 7 | 18 | YER155c,YAL040c,YDL047w,YKR028w,YJL098w,YFR040w,YKR072c |
| 43 | 5.8 | 10 | 27 | YMR263w,YOR123c,YLR085c,YGL244w,YML041c,YJL168c,YAL013w,YLR418c,YBL008w,YOR038c |
| 48 | 5.5 | 8 | 26 | YIL144w,YDR201w,YFL008w,YOL069w,YFR031c,YEL043w,YDL074c,YJL074c |
| 50 | 5.3 | 17 | 55 | YLR347c,YNL189w,YML064c,YLR082c,YLR245c,YDR321w,YPL070w,YBR176w,YNL044w,YNL331c,YPL124w,YER023w,YLR423c,YJR056c,YEL066w, YJL218w,YNR012w |
| 51 | 5.3 | 9 | 26 | YOR284w,YJL199c,YNL189w,YML064c,YBR176w,YLR245c,YPL070w,YLR291c,YHL018w |
| 55 | 4.9 | 8 | 18 | YHR066w,YPL093w,YER006w,YPL211w,YMR049c,YMR290c,YNL002c,YGR103w |
| 56 | 4.9 | 8 | 17 | YGL244w,YLR039c,YLR262c,YOR216c,YLR085c,YLR418c,YIR005w,YGL174w |
| 59 | 4.5 | 17 | 49 | YML064c,YNL189w,YLR347c,YGR010w,YKL067w,YFR047c,YGL175c,YOR020c,YLR328w,YLR335w,YGL040c,YGL037c,YNL333w,YFL059w,YPL111w, YGR267c,YLR377c |
| 60 | 4.5 | 5 | 11 | YFL059w,YNL333w,YMR322c,YMR095c,YMR096w |
| 61 | 4.5 | 5 | 10 | YNL214w,YDR244w,YGL153w,YDR142c,YLR191w |
| 65 | 4.4 | 6 | 12 | YLR310c,YLL016w,YAL024c,YNL098c,YAR019c,YOR101w |
| 70 | 4.3 | 7 | 16 | YML064c,YPL124w,YLR423c,YPL070w,YDR148c,YER086w,YDL239c |
| 71 | 4.3 | 7 | 14 | YDL226c,YGR172c,YLR324w,YGL198w,YDR425w,YGL161c,YPL095c |
| 73 | 4.3 | 7 | 14 | YLR347c,YMR047c,YKL068w,YBR216c,YML007w,YER107c,YDL207w |
| 74 | 4.3 | 7 | 14 | YLR039c,YPL234c,YKL190w,YLR262c,YER081w,YHR060w,YGR020c |
| 77 | 4 | 5 | 11 | YER093c,YJL058c,YJR066w,YKL203c,YNL006w |
| 78 | 4 | 5 | 12 | YKL203c,YJR066w,YNL006w,YHR186c,YPL180w |
| 82 | 4 | 5 | 8 | YLR039c,YLR262c,YCR020c-a,YPR051w,YEL053c |
| 90 | 4 | 5 | 8 | YAR019c,YGR092w,YDL047w,YPR111w,YIL106w |
| 93 | 4 | 5 | 8 | YLR039c,YOR070c,YBR036c,YMR272c,YPL057c |
| 94 | 4 | 5 | 8 | YMR197c,YHL031c,YLR026c,YKL006c-a,YKL196c |