Zhipeng Xie, Chee Keong Kwoh, Xiao-Li Li, Min Wu.
Abstract
MOTIVATION: Protein complexes are of great importance for unraveling the secrets of cellular organization and function. The AP-MS technique has provided an effective high-throughput screening to directly measure the co-complex relationship among multiple proteins, but its performance suffers from both false positives and false negatives. To computationally predict complexes from AP-MS data, most existing approaches either required the additional knowledge from known complexes (supervised learning), or had numerous parameters to tune.
Year: 2011 PMID: 21685066 PMCID: PMC3117344 DOI: 10.1093/bioinformatics/btr212
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Four evidence types for a protein pair {x, y}.
Details of AP-MS datasets
| AP-MS datasets | #Purifications | #Baits | #Preys |
|---|---|---|---|
| Gavin | 2166 | 1993 | 2671 |
| Krogan | 4332 | 2294 | 5333 |
| Krogan-HighConf | 3575 | 2143 | 2567 |
Statistics of predicted complex sets to be compared
| Predicted complex set | #Complexes | Avg. complex size | #distinct proteins |
|---|---|---|---|
| C2S | 1039 | 4.93 | 5121 |
| C2S-HighConf | 679 | 3.79 | 2571 |
| Hart | 390 | 4.33 | 1689 |
| Pu | 400 | 5.14 | 1913 |
| BT-893 | 893 | 6.25 | 5187 |
Sensitivities (Sn), PPVs and Acc compared on reference complexes
| Predicted complex set | CYC2008 (408) Sn | CYC2008 (408) PPV | CYC2008 (408) Acc | MIPS (214) Sn | MIPS (214) PPV | MIPS (214) Acc |
|---|---|---|---|---|---|---|
| C2S | 0.680 | 0.837 | 0.755 | 0.582 | 0.821 | 0.692 |
| C2S-HighConf | 0.643 | 0.889 | 0.756 | 0.5294 | 0.889 | 0.686 |
| Pu (400) | 0.691 | 0.789 | 0.738 | 0.593 | 0.795 | 0.686 |
| Hart (390) | 0.610 | 0.863 | 0.725 | 0.514 | 0.846 | 0.660 |
| BT-893 | 0.720 | 0.759 | 0.740 | 0.582 | 0.773 | 0.671 |
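The record does not define Sn, PPV and Acc. As a minimal sketch, the snippet below assumes the definitions commonly used when benchmarking predicted complexes against a reference set: both measures are computed from a contingency table of shared proteins, and Acc is the geometric mean of Sn and PPV. The function name `sn_ppv_acc` and the toy protein labels are hypothetical and are not taken from the paper.

```python
from math import sqrt


def sn_ppv_acc(reference, predicted):
    """Return (Sn, PPV, Acc) for two lists of protein sets.

    Assumed definitions (not stated in this record):
      T[i][j] = proteins shared by reference complex i and predicted complex j
      Sn  = sum_i max_j T[i][j] / sum_i |reference_i|
      PPV = sum_j max_i T[i][j] / sum_j sum_i T[i][j]
      Acc = sqrt(Sn * PPV)
    """
    # Contingency table: rows are reference complexes, columns are predictions.
    T = [[len(r & p) for p in predicted] for r in reference]

    ref_total = sum(len(r) for r in reference)
    sn = sum(max(row, default=0) for row in T) / ref_total if ref_total else 0.0

    cols = list(zip(*T))  # one tuple per predicted complex
    col_total = sum(sum(c) for c in cols)
    ppv = sum(max(c) for c in cols) / col_total if col_total else 0.0

    return sn, ppv, sqrt(sn * ppv)


# Toy data only: two reference complexes and two predicted complexes.
reference = [{"A", "B", "C"}, {"D", "E"}]
predicted = [{"A", "B"}, {"C", "D", "E"}]
sn, ppv, acc = sn_ppv_acc(reference, predicted)
```

Under this assumption the tabulated values are mutually consistent up to rounding: for C2S-HighConf against CYC2008, sqrt(0.643 × 0.889) ≈ 0.756, which matches the listed Acc.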
Fig. 2. The comparison of recall with different values of ω.
Fig. 3. The effect of varying tpr on sensitivity, PPV and accuracy.
Comparison of co-localization and functional co-annotation within complexes
| Predicted complex set | COLOC (%) | GO-BP (%) | GO-MF (%) |
|---|---|---|---|
| C2S-HighConf-405 | 89.2 | 85.9 | 80.3 |
| BT-409 | 89.1 | 86.5 | 79.3 |
| Pu | 84.6 | 85.8 | 77.7 |
| Hart | 88.1 | 87.5 | 78.0 |
Comparisons of Sensitivities (Sn), PPVs and Acc of predictions on Gavin data alone against two reference complex sets
| Predicted complex set | CYC2008 (408) Sn | CYC2008 (408) PPV | CYC2008 (408) Acc | MIPS (214) Sn | MIPS (214) PPV | MIPS (214) Acc |
|---|---|---|---|---|---|---|
| C2S-Gavin (474) | 0.588 | 0.884 | 0.721 | 0.500 | 0.895 | 0.669 |
| BT-Gavin (381) | 0.631 | 0.756 | 0.691 | 0.547 | 0.774 | 0.650 |
| Zhang-Gavin (851) | 0.607 | 0.679 | 0.642 | 0.547 | 0.699 | 0.618 |
| Gavin-Core (478) | 0.392 | 0.914 | 0.598 | 0.350 | 0.907 | 0.564 |
| Gavin-All (491) | 0.570 | 0.552 | 0.561 | 0.517 | 0.605 | 0.559 |
| CODEC-w0-Gavin (1082) | 0.552 | 0.506 | 0.528 | 0.486 | 0.535 | 0.510 |
| CODEC-w1-Gavin (1005) | 0.549 | 0.542 | 0.546 | 0.484 | 0.600 | 0.539 |
Fig. 4. Comparison of recall by varying ω for predictions on Gavin data alone.
Comparison of co-localization, functional co-annotation for predictions on Gavin data alone
| Predicted complex set | COLOC (%) | GO-BP (%) | GO-MF (%) |
|---|---|---|---|
| C2S-Gavin | 85.29 | 81.64 | 76.05 |
| BT-Gavin | 78.93 | 78.35 | 71.92 |
| Zhang-Gavin | 75.51 | 75.64 | 70.92 |
| Gavin-All | 70.27 | 74.36 | 68.19 |
| CODEC-w0-Gavin | 68.83 | 69.75 | 66.15 |
| CODEC-w1-Gavin | 76.78 | 78.71 | 72.87 |
Fig. 5. The flowchart of protein complex prediction based on C2S score matrix.