Peiying Ruan, Morihiro Hayashida, Osamu Maruyama, Tatsuya Akutsu.
Abstract
BACKGROUND: Protein complexes play important roles in biological systems such as gene regulatory networks and metabolic pathways. Most methods for predicting protein complexes try to find complexes of size more than three. It is known, however, that smaller protein complexes account for a large fraction of all complexes in several species. In our previous work, we developed a method with several feature space mappings and the domain composition kernel for predicting heterodimeric protein complexes, which outperforms existing methods.
Year: 2014 PMID: 24564744 PMCID: PMC4016531 DOI: 10.1186/1471-2105-15-S2-S6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Feature space mapping from three distinct proteins P_i, P_j, P_k.
| Feature | Definition |
|---|---|
| (F1) | |
| (F2) | |
| (F3) | |
| (F4) | |
| (F5) | |
| (F6) | max{# domains of |
| (F7) | min{# domains of |
Figure 1. Example of a subgraph including three focused proteins P_i, P_j, P_k. In this example, one protein is adjacent to two of the other proteins shown.
Figure 2. Example of a subgraph including a focused set of proteins and neighboring sets of proteins. Each neighboring set of three proteins shares at least one protein with the focused set (black circle). In this example, set S1 shares two proteins with the focused set, while S2 and S3 each share one protein.
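The sharing relation described in the Figure 2 caption can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the function name `neighboring_sets`, the protein labels, and the example sets are all hypothetical, and excluding the focused set from its own neighbors is an assumption.

```python
from typing import FrozenSet, Iterable, List, Tuple

Triple = FrozenSet[str]  # a set of three protein identifiers


def neighboring_sets(
    focused: Triple, candidates: Iterable[Triple]
) -> List[Tuple[Triple, int]]:
    """Return (candidate, shared_count) for each candidate set that shares
    at least one protein with the focused set.

    Assumption: the focused set itself is not counted as its own neighbor.
    """
    result = []
    for cand in candidates:
        if cand == focused:
            continue
        shared = len(focused & cand)  # number of proteins in common
        if shared >= 1:
            result.append((cand, shared))
    return result


# Hypothetical protein labels mirroring the caption:
# S1 shares two proteins with the focused set, S2 and S3 share one each.
focused = frozenset({"A", "B", "C"})
s1 = frozenset({"A", "B", "D"})
s2 = frozenset({"C", "E", "F"})
s3 = frozenset({"A", "G", "H"})
s4 = frozenset({"X", "Y", "Z"})  # shares nothing, so it is not a neighbor

neighbors = neighboring_sets(focused, [s1, s2, s3, s4])
for cand, shared in neighbors:
    print(sorted(cand), shared)
```

Representing each set as a `frozenset` makes the overlap a plain set intersection and keeps the sets hashable, so they can also be stored in other sets or used as dictionary keys.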
Average accuracy, precision, recall, and F-measure of our proposed methods and NWE.
| SVM+SVM | SVM+RVM | SVM | NWE | ||||
|---|---|---|---|---|---|---|---|
| α | 0 | 0.5 | 0 | 0.5 | 0 | 0.5 | |
| accuracy | 0.885 | 0.810 | 0.853 | 0.861 | 0.876 | - | |
| precision | 0.869 | 0.847 | 0.899 | 0.909 | 0.873 | 0.352 | |
| recall | 0.840 | 0.770 | 0.766 | 0.819 | 0.862 | 0.218 | |
| F-measure | 0.880 | 0.767 | 0.810 | 0.854 | 0.862 | 0.270 | |
'SVM+SVM' and 'SVM+RVM' denote two-phase methods using SVM and RVM as the second classifier, respectively. 'SVM' denotes the usual SVM using only features (1). α denotes the coefficient of the domain composition kernel K. Note that accuracy is not defined for NWE because it is unsupervised and predicts protein complexes of various sizes. The precision and recall for NWE were calculated as TP divided by the numbers of predicted and known heterotrimers, respectively.
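The precision and recall computation described in the note can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' evaluation code: it assumes a true positive is a predicted heterotrimer that exactly matches a known one, and that the F-measure is the standard harmonic mean of precision and recall (the note does not define it). The function name `precision_recall_f` and the protein labels are hypothetical.

```python
from typing import FrozenSet, Set, Tuple

Triple = FrozenSet[str]  # a heterotrimer as a set of three protein identifiers


def precision_recall_f(
    predicted: Set[Triple], known: Set[Triple]
) -> Tuple[float, float, float]:
    """Precision = TP / #predicted, recall = TP / #known.

    Assumptions: TP counts exact matches between predicted and known
    heterotrimers; F-measure is the harmonic mean of precision and recall.
    """
    tp = len(predicted & known)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(known) if known else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f_measure


# Hypothetical example: 4 predicted heterotrimers, 5 known, 2 in common.
predicted = {
    frozenset({"A", "B", "C"}),
    frozenset({"A", "B", "D"}),
    frozenset({"E", "F", "G"}),
    frozenset({"H", "I", "J"}),
}
known = {
    frozenset({"A", "B", "C"}),
    frozenset({"E", "F", "G"}),
    frozenset({"K", "L", "M"}),
    frozenset({"N", "O", "P"}),
    frozenset({"Q", "R", "S"}),
}

precision, recall, f_measure = precision_recall_f(predicted, known)
print(f"precision={precision:.3f} recall={recall:.3f} F={f_measure:.3f}")
```

Accuracy is deliberately absent here: it needs true negatives, and an unsupervised predictor that outputs complexes of various sizes has no natural set of negative examples to count.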