Ying Xu, Jiaogen Zhou, Shuigeng Zhou, Jihong Guan.
Abstract
BACKGROUND: Effectively predicting protein complexes not only helps to understand the structures and functions of proteins and their complexes, but is also useful for diagnosing disease and developing new drugs. To date, many methods have been developed to detect complexes by mining dense subgraphs from static protein-protein interaction (PPI) networks, but they ignore the value of other biological information and the dynamic properties of cellular systems.
Keywords: GO annotation; Gene expression; PPI network; Protein complex
Year: 2017 PMID: 29322927 PMCID: PMC5763309 DOI: 10.1186/s12918-017-0504-3
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Fig. 1 The flowchart of CPredictor3.0. 1) Detecting active proteins; 2) Clustering proteins by function; 3) Computing active proteins of similar function; 4) Extracting candidate complexes from PPI networks; 5) Expanding candidate complexes; 6) Merging candidate complexes
The statistics of PPI datasets
| PPI network | # proteins | # interactions |
|---|---|---|
| Krogan | 2674 | 7075 |
| Collins | 1622 | 9074 |
| WI-PHI | 6400 | 50000 |
The statistics of benchmark datasets
| benchmark database | # complexes | # proteins |
|---|---|---|
| MIPS | 313 | 1237 |
| CYC2008 | 349 | 1627 |
Fig. 2 The distribution of protein complex size. a Krogan PPI data set. b Collins PPI data set. c WI-PHI PPI data set
Fig. 3 The effect of K and β on prediction performance. a Krogan PPI data and MIPS reference complexes set. b Krogan PPI data and CYC2008 reference complexes set. c Collins PPI data and MIPS reference complexes set. d Collins PPI data and CYC2008 reference complexes set. e WI-PHI PPI data and MIPS reference complexes set. f WI-PHI PPI data and CYC2008 reference complexes set
Fig. 4 Performance comparison with eight existing protein complex prediction algorithms in terms of recall, precision, and F1-measure. Our method CPredictor3.0 achieves the highest F1-measure in five of the six experimental settings. a Results with Krogan as PPI dataset and MIPS as complex reference set. b Results with Krogan as PPI dataset and CYC2008 as complex reference set. c Results with Collins as PPI dataset and MIPS as complex reference set. d Results with Collins as PPI dataset and CYC2008 as complex reference set. e Results with WI-PHI as PPI dataset and MIPS as complex reference set. f Results with WI-PHI as PPI dataset and CYC2008 as complex reference set
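The recall, precision, and F1-measure named in the Fig. 4 caption can be illustrated with a minimal Python sketch. This record does not state how a predicted complex is matched to a reference complex, so the matching rule here is an assumption: the commonly used neighborhood affinity score |A∩B|² / (|A|·|B|) with a threshold of 0.2. The function names `affinity` and `evaluate` are hypothetical and are not taken from CPredictor3.0.

```python
from typing import FrozenSet, List, Tuple

Complex = FrozenSet[str]  # a complex is modeled as a set of protein identifiers


def affinity(a: Complex, b: Complex) -> float:
    """Neighborhood affinity |A∩B|^2 / (|A|*|B|); an ASSUMED matching score."""
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def evaluate(
    predicted: List[Complex],
    reference: List[Complex],
    threshold: float = 0.2,  # ASSUMED cutoff, not stated in this record
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under the assumed matching rule."""
    # Predicted complexes that match at least one reference complex.
    matched_pred = sum(
        1 for p in predicted
        if any(affinity(p, r) >= threshold for r in reference)
    )
    # Reference complexes that are matched by at least one prediction.
    matched_ref = sum(
        1 for r in reference
        if any(affinity(p, r) >= threshold for p in predicted)
    )
    precision = matched_pred / len(predicted) if predicted else 0.0
    recall = matched_ref / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data: one of two predictions matches one of two reference complexes.
predicted = [frozenset("ABC"), frozenset("XY")]
reference = [frozenset("ABCD"), frozenset("EF")]
print(evaluate(predicted, reference))
```

F1 is the harmonic mean of precision and recall, so a method that predicts many spurious complexes (low precision) or misses many reference complexes (low recall) is penalized in both cases.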
The average F1-measure values of the nine algorithms on various PPI datasets and complexes reference sets
| | Collins, CYC2008 | Collins, MIPS | Krogan, CYC2008 | Krogan, MIPS | WI-PHI, CYC2008 | WI-PHI, MIPS |
|---|---|---|---|---|---|---|
| F1 | 0.5518 | 0.4837 | 0.4376 | 0.3534 | 0.2672 | 0.1861 |
Overlapping protein ratios between PPI datasets and complex reference sets
| Benchmark database | Krogan | Collins | WI-PHI |
|---|---|---|---|
| MIPS | 30.6% | 49.2% | 19.1% |
| CYC2008 | 43.1% | 68.8% | 25.3% |
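An overlap ratio of this kind could be computed as in the following minimal Python sketch. The record does not define the denominator, so the sketch assumes the ratio is the fraction of proteins in the PPI network that also occur in at least one reference complex. That reading is consistent with the protein counts in the dataset tables, but it remains an assumption. The function name `overlap_ratio` and the toy data are hypothetical.

```python
from typing import Iterable, Set, Tuple


def overlap_ratio(
    ppi_edges: Iterable[Tuple[str, str]],
    complexes: Iterable[Set[str]],
) -> float:
    """Fraction of PPI-network proteins that appear in any reference complex.

    ASSUMPTION: the denominator is the number of distinct proteins in the
    PPI network; the source record does not state this explicitly.
    """
    # Distinct proteins that take part in at least one interaction.
    ppi_proteins = {protein for edge in ppi_edges for protein in edge}
    # Distinct proteins that belong to at least one reference complex.
    benchmark_proteins: Set[str] = set().union(*complexes)
    if not ppi_proteins:
        return 0.0
    return len(ppi_proteins & benchmark_proteins) / len(ppi_proteins)


# Toy data: 2 of the 4 network proteins (A and B) occur in a complex.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
reference = [{"A", "B"}, {"E"}]
print(f"{overlap_ratio(edges, reference):.1%}")
```

A low ratio means that most proteins in the network have no counterpart in the benchmark, so complexes predicted among them cannot be confirmed against that reference set.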