Bin Liu, Bingquan Liu, Fule Liu, Xiaolong Wang.
Abstract
Identification of protein binding sites is critical for studying protein function. In this paper, we proposed a method for protein binding site prediction that combines order profile propensities with a hidden Markov support vector machine (HM-SVM). The method applies the sequential labeling technique to protein binding site prediction. The input features of the HM-SVM include the profile-based propensities, the Position-Specific Score Matrix (PSSM), and Accessible Surface Area (ASA). When tested on different data sets, the proposed method showed promising results and outperformed some closely related methods by more than 10% in terms of AUC.
Year: 2014 PMID: 25133234 PMCID: PMC4122092 DOI: 10.1155/2014/464093
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Summary of six data sets.
| Data set | Chains | Residues | Surface residues | Interface residues |
|---|---|---|---|---|
| Heterocomplex I (a) | 504 | 109829 | 92797 | 26085 |
| Homocomplex I | 620 | 172917 | 141295 | 38170 |
| Mix (b) I | 1124 | 282746 | 234092 | 64255 |
| Heterocomplex II (c) | 504 | 109829 | 92797 | 32386 |
| Homocomplex II | 620 | 172917 | 141295 | 45633 |
| Mix II | 1124 | 282746 | 234092 | 78019 |

(a) Type I data set with minor interface as negative samples.
(b) The mixed data set of heterocomplexes and homocomplexes.
(c) Type II data set with minor interface as positive samples.
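
As a reading aid, the counts in the summary table can be checked for internal consistency and turned into class-balance figures. The following is a minimal sketch, not part of the original work: the helper names `interface_fraction` and `mix_is_sum` are hypothetical, and the fraction assumes (as the column order suggests, but the record does not state) that interface residues are counted among the surface residues.

```python
# Counts copied from the data set summary table:
# (chains, residues, surface residues, interface residues).
DATASETS = {
    "Heterocomplex I": (504, 109829, 92797, 26085),
    "Homocomplex I": (620, 172917, 141295, 38170),
    "Mix I": (1124, 282746, 234092, 64255),
    "Heterocomplex II": (504, 109829, 92797, 32386),
    "Homocomplex II": (620, 172917, 141295, 45633),
    "Mix II": (1124, 282746, 234092, 78019),
}


def interface_fraction(name):
    """Share of surface residues labeled as interface (hypothetical helper).

    Assumes interface residues are a subset of surface residues.
    """
    _, _, surface, interface = DATASETS[name]
    return interface / surface


def mix_is_sum(kind):
    """Check that the mixed set equals hetero + homo in every column."""
    hetero = DATASETS[f"Heterocomplex {kind}"]
    homo = DATASETS[f"Homocomplex {kind}"]
    combined = tuple(a + b for a, b in zip(hetero, homo))
    return combined == DATASETS[f"Mix {kind}"]


if __name__ == "__main__":
    for name in DATASETS:
        print(f"{name}: {interface_fraction(name):.1%} interface among surface residues")
    print("Mix I consistent:", mix_is_sum("I"))
    print("Mix II consistent:", mix_is_sum("II"))
```

Under that assumption, positives make up roughly a quarter to a third of the surface residues in every data set, which is why threshold-independent measures such as AUC and balance-aware measures such as MCC are reported alongside accuracy.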
Figure 1. Overview of the proposed framework for protein binding site prediction.
Performance of HM-SVM based method with and without order profile propensities.
| Dataset | Method | Sp % | Sn % | F1 % | Acc % | MCC | AUC % |
|---|---|---|---|---|---|---|---|
| Heterocomplex I | HM-SVM 1 (a) | 44.9 | 56.0 | 49.8 | 68.3 | 0.274 | 69.5 |
| | HM-SVM 2 (b) | | | | | | |
| Homocomplex I | HM-SVM 1 | 45.4 | 60.0 | 51.70 | 69.7 | 0.309 | 72.2 |
| | HM-SVM 2 | | | | | | |
| Mix I | HM-SVM 1 | 45.5 | 58.0 | 51.0 | 69.4 | 0.297 | 71.2 |
| | HM-SVM 2 | | | | | | |
| Heterocomplex II | HM-SVM 1 | 54.0 | 56.7 | 55.3 | 68.0 | 0.305 | 70.7 |
| | HM-SVM 2 | | | | | | |
| Homocomplex II | HM-SVM 1 | 53.3 | 60.1 | 56.5 | 70.1 | 0.340 | 73.4 |
| | HM-SVM 2 | | | | | | |
| Mix II | HM-SVM 1 | 53.6 | 58.6 | 56.0 | 69.3 | 0.326 | 72.4 |
| | HM-SVM 2 | | | | | | |

(a) Results of HM-SVM 1 on the six data sets are obtained from [13]. HM-SVM 1 represents the HM-SVM predictor with the basic feature set using PSSM and ASA features.
(b) HM-SVM 2 represents the HM-SVM predictor with the feature set using PSSM, ASA, and order profile propensity features.
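
This record does not define the column abbreviations, so the relationship between Sp, Sn, and F1 can only be inferred. The sketch below tests one reading: that F1 is the harmonic mean of the Sp and Sn columns, which would mean Sp plays the role of precision, TP/(TP+FP), rather than the true negative rate. This is an assumption checked against the HM-SVM 1 rows only; `f1_from` and `max_f1_gap` are hypothetical helper names.

```python
# (Sp %, Sn %, reported F1 %) for HM-SVM 1, copied from the performance table.
HM_SVM_1 = {
    "Heterocomplex I": (44.9, 56.0, 49.8),
    "Homocomplex I": (45.4, 60.0, 51.70),
    "Mix I": (45.5, 58.0, 51.0),
    "Heterocomplex II": (54.0, 56.7, 55.3),
    "Homocomplex II": (53.3, 60.1, 56.5),
    "Mix II": (53.6, 58.6, 56.0),
}


def f1_from(sp, sn):
    """Harmonic mean of Sp and Sn, both given in percent.

    Assumes Sp acts as precision and Sn as recall (sensitivity).
    """
    return 2 * sp * sn / (sp + sn)


def max_f1_gap(rows):
    """Largest absolute gap between recomputed and reported F1 values."""
    return max(abs(f1_from(sp, sn) - f1) for sp, sn, f1 in rows.values())


if __name__ == "__main__":
    for name, (sp, sn, reported) in HM_SVM_1.items():
        print(f"{name}: recomputed F1 = {f1_from(sp, sn):.2f}, reported = {reported}")
    print(f"largest gap: {max_f1_gap(HM_SVM_1):.3f}")
```

For all six HM-SVM 1 rows the recomputed value agrees with the reported F1 to within rounding, which supports the precision reading of Sp for this table.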