Le Ou-Yang, Xiao-Fei Zhang, Dao-Qing Dai, Meng-Yun Wu, Yuan Zhu, Zhiyong Liu, Hong Yan.
Abstract
BACKGROUND: Protein complexes are the key molecular entities that perform many essential biological functions. In recent years, high-throughput experimental techniques have generated a large amount of protein interaction data, and computational analysis of these data for protein complex detection has consequently received increased attention. However, most existing methods predict protein complexes from a single type of data, either physical interaction data or co-complex interaction data. These two types of data provide compatible and complementary information, so integrating them is necessary to uncover the underlying structures and to improve complex detection performance.
Keywords: Multi-view learning; Protein complex; Protein-protein interaction
Year: 2016 PMID: 27623844 PMCID: PMC5022186 DOI: 10.1186/s12859-016-1164-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 The overall framework of PSMVC. Schematic overview of the algorithm
Fig. 2 The effect of K and λ. Performance of PSMVC on protein complex detection with different values of K and λ, measured by Acc with respect to MIPS. The x-axis denotes the value of log λ and the y-axis denotes the value of Acc
Fig. 3 The effect of η. Performance of PSMVC on protein complex detection with different values of η, measured by Acc with respect to MIPS. The x-axis denotes the value of η and the y-axis denotes the value of Acc
Fig. 4 Single view vs. multi-view. Acc, Recall and FRAC of PSMVC, PSMVC-FS and PSMVC-TAP with respect to (a) CYC2008 and (b) SGD
Comparison between PSMVC and various protein complex detection algorithms in terms of three evaluation metrics with respect to two reference sets
| Methods | # complexes | # proteins | CYC2008 FRAC | CYC2008 Recall | CYC2008 Acc | SGD FRAC | SGD Recall | SGD Acc |
|---|---|---|---|---|---|---|---|---|
| PSMVC | 1534 | 5508 | 0.712 | 0.706 | 0.814 | 0.607 | 0.598 | 0.699 |
| EC-BNMF [ | 400 | 1936 | 0.577 | 0.558 | 0.763 | 0.530 | 0.497 | 0.681 |
| InteHC [ | 366 | 2763 | 0.571 | 0.527 | 0.765 | 0.530 | 0.466 | 0.697 |
| ClusterONE [ | 362 | 1394 | 0.337 | 0.353 | 0.559 | 0.333 | 0.333 | 0.512 |
| CMC [ | 566 | 1391 | 0.442 | 0.468 | 0.523 | 0.388 | 0.420 | 0.475 |
| Linkcomm [ | 1531 | 2640 | 0.399 | 0.492 | 0.549 | 0.399 | 0.455 | 0.516 |
| MCODE [ | 83 | 952 | 0.166 | 0.139 | 0.435 | 0.109 | 0.094 | 0.388 |
| MINE [ | 231 | 1247 | 0.337 | 0.312 | 0.526 | 0.295 | 0.275 | 0.497 |
| MF-PINCoC [ | 1099 | 2838 | 0.399 | 0.368 | 0.563 | 0.355 | 0.330 | 0.520 |
| PINCoC [ | 1101 | 4457 | 0.423 | 0.394 | 0.573 | 0.404 | 0.366 | 0.535 |
| RANCoC [ | 1069 | 2797 | 0.436 | 0.406 | 0.596 | 0.410 | 0.379 | 0.542 |
| SPICi [ | 420 | 2041 | 0.350 | 0.329 | 0.563 | 0.339 | 0.313 | 0.510 |
| BT [ | 409 | 1286 | 0.509 | 0.463 | 0.749 | 0.508 | 0.461 | 0.678 |
| C2S [ | 1035 | 4499 | 0.571 | 0.527 | 0.781 | 0.519 | 0.463 | 0.692 |
| CACHET [ | 449 | 963 | 0.472 | 0.665 | 0.697 | 0.448 | 0.626 | 0.632 |
| Hart [ | 390 | 1307 | 0.509 | 0.467 | 0.746 | 0.481 | 0.421 | 0.665 |
| Pu [ | 400 | 1504 | 0.479 | 0.418 | 0.729 | 0.497 | 0.429 | 0.669 |
Here “# complexes” denotes the number of complexes predicted by each algorithm, and “# proteins” denotes the number of proteins covered by the complexes predicted by each algorithm
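The two summary columns described in this note can be computed directly from a set of predicted complexes. The following is a minimal sketch, assuming each predicted complex is represented as a set of protein identifiers; the function name `summarize_predictions` and the toy protein names are hypothetical and are not taken from the paper.

```python
from typing import Iterable, Set, Tuple


def summarize_predictions(complexes: Iterable[Set[str]]) -> Tuple[int, int]:
    """Return (# complexes, # proteins covered by at least one complex)."""
    complexes = [set(c) for c in complexes]
    # A protein shared by several overlapping complexes is counted once.
    covered = set().union(*complexes)
    return len(complexes), len(covered)


# Toy, made-up predictions; the protein names are placeholders only.
predicted = [{"P1", "P2", "P3"}, {"P3", "P4"}, {"P5", "P6", "P7"}]
n_complexes, n_proteins = summarize_predictions(predicted)
print(n_complexes, n_proteins)
```

Because predicted complexes may overlap, the number of covered proteins is the size of the union of all complexes, not the sum of their sizes.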
The number of complexes detected by various algorithms that match with known complexes and the number of known complexes that are discovered by various algorithms
| Methods | Predicted complexes matched by the reference complexes (CYC2008) | Predicted complexes matched by the reference complexes (SGD) | Reference complexes matched by the predicted complexes (CYC2008) | Reference complexes matched by the predicted complexes (SGD) |
|---|---|---|---|---|
| PSMVC | 113 | 107 | 116 | 111 |
| EC-BNMF | 87 | 85 | 94 | 97 |
| InteHC | 78 | 75 | 93 | 97 |
| ClusterONE | 59 | 61 | 55 | 61 |
| CMC | 80 | 81 | 72 | 71 |
| Linkcomm | 95 | 92 | 65 | 73 |
| MCODE | 22 | 17 | 27 | 20 |
| MINE | 49 | 49 | 55 | 54 |
| MF-PINCoC | 57 | 58 | 65 | 65 |
| PINCoC | 61 | 63 | 69 | 74 |
| RANCoC | 63 | 66 | 71 | 75 |
| SPICi | 52 | 55 | 57 | 62 |
| BT | 69 | 77 | 83 | 93 |
| C2S | 78 | 76 | 93 | 95 |
| CACHET | 171 | 169 | 77 | 82 |
| Hart | 70 | 69 | 83 | 88 |
| Pu | 61 | 69 | 78 | 91 |
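The two kinds of counts in this table need not be equal, because several predicted complexes can match the same reference complex. The sketch below illustrates this on toy data. It assumes a match is declared when the neighborhood affinity score |A∩B|² / (|A|·|B|) reaches a threshold of 0.25, a criterion widely used in this literature; this excerpt does not state which matching criterion or threshold the authors applied, so both are assumptions, and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def neighborhood_affinity(a: Set[str], b: Set[str]) -> float:
    """Overlap score |A ∩ B|^2 / (|A| * |B|); 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def count_matches(
    predicted: List[Set[str]],
    reference: List[Set[str]],
    threshold: float = 0.25,  # assumed cutoff, not taken from this excerpt
) -> Tuple[int, int]:
    """Return (# predicted complexes matched, # reference complexes matched)."""
    matched_predicted = sum(
        1 for p in predicted
        if any(neighborhood_affinity(p, r) >= threshold for r in reference)
    )
    matched_reference = sum(
        1 for r in reference
        if any(neighborhood_affinity(p, r) >= threshold for p in predicted)
    )
    return matched_predicted, matched_reference


# Toy, made-up data: two predictions overlap the same reference complex.
predicted = [{"A", "B", "C"}, {"A", "B"}, {"X", "Y"}]
reference = [{"A", "B", "C", "D"}, {"M", "N"}]
print(count_matches(predicted, reference))
```

In this toy case two predicted complexes are matched but only one reference complex is recovered, which is the same kind of asymmetry seen in rows where the first pair of counts is much larger than the second.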
Fig. 5 Comparison with ensemble clustering and data integration algorithms. Acc, Recall and FRAC of PSMVC, EC-BNMF and InteHC with respect to (a) CYC2008 and (b) SGD
Fig. 6 The mitochondrial inner membrane protein insertion complex as detected by different computational methods. The shaded area shows the complex predicted by each method, blue rounded-rectangle nodes represent subunits of the mitochondrial inner membrane protein insertion complex in SGD, and green circle nodes represent proteins with other functions. The lines between nodes represent the physical interactions between proteins. (a) ClusterONE. (b) EC-BNMF. (c) InteHC. (d) PSMVC
Fig. 7 The NuA4 histone acetyltransferase complex as detected by different computational methods. The shaded area shows the complex predicted by each method, and blue rounded-rectangle nodes represent subunits of the NuA4 histone acetyltransferase complex in SGD. The lines between nodes represent the physical interactions between proteins. (a) ClusterONE. (b) EC-BNMF. (c) InteHC. (d) C2S. (e) PSMVC