Bin Xu, Yang Wang, Zewei Wang, Jiaogen Zhou, Shuigeng Zhou, Jihong Guan.
Abstract
BACKGROUND: Predicting protein complexes from protein-protein interaction (PPI) networks has been studied for a decade. Various methods have been proposed to address the challenging issues of this problem, including overlapping clusters, high false positive/negative rates of PPI data, and diverse complex structures. It is well known that most current methods can effectively detect only complexes of size ≥ 3, which account for only about half of the total existing complexes. Recently, a method was proposed specifically for finding small complexes (size = 2 and 3) from PPI networks. However, up to now there is no effective approach that can predict both small (size ≤ 3) and large (size > 3) complexes from PPI networks.
Keywords: Large protein complex; Protein complex prediction; Protein-protein interaction; Small protein complex
Year: 2017 PMID: 29072136 PMCID: PMC5657047 DOI: 10.1186/s12859-017-1820-8
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The workflow of CPredictor2.0
The numbers of proteins and interactions in the three PPI datasets
| PPI dataset | #Proteins | #Interactions |
|---|---|---|
| Gavin et al. | 1855 | 7669 |
| Krogan et al. | 2674 | 7075 |
| Collins et al. | 1622 | 9074 |
The numbers of small and large complexes in the two benchmark datasets
| Complex dataset | #Small complexes | #Large complexes |
|---|---|---|
| MIPS | 103 | 170 |
| CYC2008 | 222 | 127 |
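The counts above determine what fraction of each benchmark consists of small complexes, which is the portion a method restricted to large complexes cannot recover. A minimal Python sketch of that arithmetic follows; the function name `small_share` is illustrative and does not come from the paper.

```python
# Counts of (small, large) complexes taken from the benchmark table above.
benchmarks = {
    "MIPS": (103, 170),
    "CYC2008": (222, 127),
}


def small_share(n_small: int, n_large: int) -> float:
    """Fraction of complexes in a benchmark that are small (size <= 3)."""
    return n_small / (n_small + n_large)


for name, (n_small, n_large) in benchmarks.items():
    share = small_share(n_small, n_large)
    print(f"{name}: {n_small + n_large} complexes, {share:.1%} small")
```

The share differs noticeably between the two benchmarks, so per-group results (small versus large) are easier to compare across benchmarks than the pooled totals.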
Fig. 2 Recall and precision using the MIPS dataset as benchmark. (a) PPI dataset of Gavin et al., (b) PPI dataset of Krogan et al., (c) PPI dataset of Collins et al.
Fig. 3 Recall and precision using the CYC2008 dataset as benchmark. (a) PPI dataset of Gavin et al., (b) PPI dataset of Krogan et al., (c) PPI dataset of Collins et al.
Fig. 4 Performance comparison. Protein complexes are detected from three PPI datasets and MIPS is used as benchmark
Fig. 5 Performance comparison. Protein complexes are detected from three PPI datasets and CYC2008 is used as benchmark
Performance comparison. Here, protein complexes are detected from three PPI datasets and MIPS is used as benchmark
| Methods | Small Recall | Small Precision | Small F1 | Large Recall | Large Precision | Large F1 | Total Recall | Total Precision | Total F1 |
|---|---|---|---|---|---|---|---|---|---|
| (a) Gavin et al. | | | | | | | | | |
| MCODE | 0.058 | 0.143 | 0.083 | 0.371 | 0.411 | 0.390 | 0.311 | 0.392 | 0.347 |
| RNSC | 0.184 | 0.071 | 0.102 | 0.524 | 0.393 | 0.449 | 0.487 | 0.221 | 0.304 |
| DPClus | 0.165 | 0.114 | 0.135 | 0.529 | 0.331 | 0.408 | 0.495 | 0.267 | 0.347 |
| CORE | 0.291 | 0.109 | 0.159 | 0.135 | 0.452 | 0.208 | 0.330 | 0.245 | 0.281 |
| ClusterONE | 0.087 | | 0.133 | 0.488 | 0.481 | 0.485 | 0.403 | | 0.443 |
| CPredictor | 0.117 | 0.212 | 0.150 | 0.506 | 0.421 | 0.459 | 0.458 | 0.437 | 0.447 |
| CPredictor2.0 | | 0.146 | | | | | | 0.418 | |
| (b) Krogan et al. | | | | | | | | | |
| MCODE | 0.039 | 0.120 | 0.059 | 0.288 | 0.511 | 0.368 | 0.234 | | 0.304 |
| RNSC | 0.408 | 0.063 | 0.110 | 0.394 | 0.390 | 0.392 | 0.549 | 0.147 | 0.233 |
| DPClus | 0.369 | 0.074 | 0.123 | 0.418 | 0.352 | 0.382 | 0.549 | 0.169 | 0.258 |
| CORE | | 0.088 | 0.147 | 0.047 | 0.250 | 0.079 | 0.377 | 0.165 | 0.229 |
| ClusterONE | 0.184 | 0.088 | 0.119 | 0.441 | 0.132 | 0.203 | 0.495 | 0.176 | 0.259 |
| CPredictor | 0.233 | 0.132 | 0.169 | 0.453 | 0.425 | 0.438 | 0.513 | 0.338 | 0.407 |
| CPredictor2.0 | 0.417 | | | | | | | 0.391 | |
| (c) Collins et al. | | | | | | | | | |
| MCODE | 0.058 | 0.139 | 0.082 | 0.459 | 0.560 | 0.504 | 0.388 | | 0.449 |
| RNSC | 0.398 | 0.139 | 0.206 | 0.471 | 0.533 | 0.500 | 0.590 | 0.298 | 0.396 |
| DPClus | 0.350 | 0.146 | 0.206 | 0.512 | 0.440 | 0.473 | 0.579 | 0.313 | 0.407 |
| CORE | 0.388 | 0.141 | 0.206 | 0.247 | 0.605 | 0.351 | 0.451 | 0.298 | 0.358 |
| ClusterONE | 0.350 | 0.187 | 0.244 | 0.553 | 0.431 | 0.484 | 0.586 | 0.346 | 0.435 |
| CPredictor | 0.272 | | 0.238 | 0.524 | 0.509 | 0.516 | 0.546 | 0.430 | 0.481 |
| CPredictor2.0 | | 0.200 | | | | | | 0.466 | |
Each bold value means the largest performance measure among the compared methods on the given PPI dataset
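The reported F1 values are consistent with the standard definition of F1 as the harmonic mean of precision and recall. The following minimal Python sketch checks this against the MCODE row for the Gavin et al. dataset with MIPS as benchmark; the function name `f1_score` is illustrative and is not taken from the paper, and the sketch assumes the paper uses the standard F1 definition.

```python
def f1_score(recall: float, precision: float) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (recall, precision, reported F1) for MCODE on Gavin et al., MIPS benchmark.
mcode_gavin_mips = {
    "small": (0.058, 0.143, 0.083),
    "large": (0.371, 0.411, 0.390),
    "total": (0.311, 0.392, 0.347),
}

for group, (recall, precision, reported) in mcode_gavin_mips.items():
    computed = f1_score(recall, precision)
    print(f"{group}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

Because the inputs are themselves rounded to three decimals, a recomputed F1 can differ from the reported one in the last digit.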
Performance comparison. Here protein complexes are detected from three PPI datasets and CYC2008 is used as benchmark
| Methods | Small Recall | Small Precision | Small F1 | Large Recall | Large Precision | Large F1 | Total Recall | Total Precision | Total F1 |
|---|---|---|---|---|---|---|---|---|---|
| (a) Gavin et al. | | | | | | | | | |
| MCODE | 0.023 | 0.143 | 0.039 | 0.457 | 0.558 | 0.502 | 0.255 | 0.577 | 0.354 |
| RNSC | 0.234 | 0.208 | 0.221 | 0.567 | 0.453 | 0.504 | 0.453 | 0.356 | 0.399 |
| DPClus | 0.162 | 0.198 | 0.178 | | 0.425 | 0.505 | 0.438 | 0.399 | 0.418 |
| CORE | 0.239 | 0.186 | 0.209 | 0.134 | 0.484 | 0.210 | 0.312 | 0.360 | 0.334 |
| ClusterONE | 0.072 | | 0.127 | 0.567 | 0.580 | 0.574 | 0.347 | | 0.465 |
| CPredictor | 0.095 | 0.365 | 0.150 | 0.575 | 0.517 | 0.545 | 0.384 | 0.624 | 0.475 |
| CPredictor2.0 | | 0.273 | | 0.535 | | | | 0.562 | |
| (b) Krogan et al. | | | | | | | | | |
| MCODE | 0.023 | 0.200 | 0.040 | 0.291 | | 0.408 | 0.146 | | 0.239 |
| RNSC | | 0.163 | 0.244 | 0.465 | 0.550 | 0.504 | | 0.272 | 0.374 |
| DPClus | 0.414 | 0.179 | 0.250 | 0.520 | 0.512 | 0.516 | 0.564 | 0.306 | 0.397 |
| CORE | 0.414 | 0.172 | 0.243 | 0.079 | 0.357 | 0.129 | 0.444 | 0.283 | 0.346 |
| ClusterONE | 0.176 | 0.186 | 0.180 | | 0.181 | 0.269 | 0.499 | 0.308 | 0.381 |
| CPredictor | 0.243 | 0.282 | 0.261 | 0.488 | 0.549 | 0.517 | 0.447 | 0.516 | 0.479 |
| CPredictor2.0 | 0.410 | | | 0.441 | 0.657 | | 0.516 | 0.535 | |
| (c) Collins et al. | | | | | | | | | |
| MCODE | 0.027 | 0.167 | 0.047 | 0.512 | 0.733 | 0.603 | 0.258 | | 0.379 |
| RNSC | 0.401 | 0.315 | 0.353 | 0.575 | 0.707 | | | 0.499 | 0.526 |
| DPClus | 0.369 | 0.329 | 0.348 | 0.591 | 0.595 | 0.593 | 0.547 | 0.513 | 0.530 |
| CORE | 0.401 | 0.313 | 0.351 | 0.315 | | 0.450 | 0.473 | 0.512 | 0.491 |
| ClusterONE | 0.320 | 0.392 | 0.352 | | 0.549 | 0.580 | 0.550 | 0.587 | 0.568 |
| CPredictor | 0.257 | | 0.327 | 0.598 | 0.652 | 0.624 | 0.473 | 0.657 | 0.550 |
| CPredictor2.0 | | 0.356 | | 0.559 | 0.699 | 0.621 | | 0.644 | |
Each bold value means the largest performance measure among the compared methods on the given PPI dataset