Ronesh Sharma, Shiu Kumar, Tatsuhiko Tsunoda, Ashwini Patil, Alok Sharma.
Abstract
BACKGROUND: Intrinsically Disordered Proteins (IDPs) lack an ordered three-dimensional structure and are enriched in various biological processes. Molecular Recognition Features (MoRFs) are functional regions within IDPs that undergo a disorder-to-order transition on binding to a partner protein. Identifying MoRFs in IDPs using computational methods is a challenging task.
Keywords: Hidden Markov model profiles; Intrinsically disordered proteins; Intrinsically disordered regions; Molecular recognition features; Support vector machines
Year: 2016 PMID: 28155710 PMCID: PMC5259822 DOI: 10.1186/s12859-016-1375-0
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Overview of the proposed method
Fig. 2 Two segments from each training sequence, discriminating the MoRF region from its surroundings in the IDR
AUC, Success rate and FPR for varying flank size with RBF and sigmoid kernels (C value used is 1000)
| | AUC | AUC | AUC | AUC | Success rate | Success rate | Success rate | Success rate | FPR @ 0.222 TPR | FPR @ 0.222 TPR | FPR @ 0.222 TPR | FPR @ 0.222 TPR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Kernel | RBF | RBF | Sigmoid | Sigmoid | RBF | RBF | Sigmoid | Sigmoid | RBF | RBF | Sigmoid | Sigmoid |
| Gamma | 0.0038 | 5 | 0.0038 | 5 | 0.0038 | 5 | 0.0038 | 5 | 0.0038 | 5 | 0.0038 | 5 |
| Flank size | | | | | | | | | | | | |
| 1 | 0.658 | 0.587 | 0.570 | | 0.680 | 0.660 | 0.658 | | 0.057 | 0.160 | 0.090 | |
| 2 | 0.659 | 0.597 | 0.590 | | 0.651 | 0.737 | 0.653 | | 0.053 | 0.190 | 0.088 | |
| 3 | | | 0.580 | | | | 0.660 | | | | 0.080 | |
| 4 | 0.660 | | 0.580 | 0.340 | 0.640 | | 0.589 | 0.370 | 0.053 | | 0.090 | 0.380 |
| 5 | | 0.600 | 0.587 | 0.650 | | 0.720 | 0.618 | 0.600 | | 0.180 | 0.098 | 0.060 |
| 6 | | | 0.589 | 0.648 | | | 0.618 | 0.572 | | | 0.098 | 0.066 |
| 7 | 0.664 | 0.601 | 0.588 | 0.340 | 0.644 | 0.756 | 0.642 | 0.460 | 0.051 | 0.170 | 0.090 | 0.360 |
| 8 | 0.652 | 0.602 | 0.595 | 0.350 | 0.653 | 0.740 | 0.600 | 0.470 | 0.059 | 0.170 | 0.095 | 0.360 |
| 9 | 0.646 | 0.584 | 0.582 | 0.618 | 0.653 | 0.699 | 0.584 | 0.390 | 0.061 | 0.180 | 0.010 | 0.073 |
| 10 | 0.644 | 0.587 | 0.640 | 0.590 | 0.656 | 0.699 | 0.432 | 0.590 | 0.065 | 0.175 | 0.077 | 0.100 |
| 11 | 0.645 | 0.605 | 0.640 | 0.639 | 0.668 | 0.749 | 0.604 | 0.390 | 0.066 | 0.160 | 0.105 | 0.080 |
| 12 | 0.640 | 0.600 | 0.600 | 0.630 | 0.670 | 0.810 | 0.630 | 0.36 | 0.070 | 0.160 | 0.090 | 0.080 |
Bold numbers indicate the best performance metrics for different kernels, gamma values and Flank sizes
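For readers unfamiliar with the two kernels compared in this table, the following is a minimal sketch of the standard RBF and sigmoid kernel definitions, evaluated at the two gamma values listed. The function names, the example vectors, and the sigmoid offset `coef0 = 0` are illustrative assumptions and are not taken from the paper.

```python
import math

def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, y, gamma, coef0=0.0):
    """Standard sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    dot = sum(a * b for a, b in zip(x, y))
    return math.tanh(gamma * dot + coef0)

# Hypothetical feature vectors, for illustration only.
u = [0.2, 0.5, 0.1]
v = [0.3, 0.4, 0.0]

# The two gamma values that appear in the table.
for gamma in (0.0038, 5):
    print(gamma, rbf_kernel(u, v, gamma), sigmoid_kernel(u, v, gamma))
```

With a small gamma the RBF kernel treats most pairs of vectors as similar (values close to 1), while a large gamma makes similarity fall off quickly with distance, which is one reason the two settings can behave so differently.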
Selected SVM models with respective gamma and window size values
| SVM models | window size | kernel | gamma |
|---|---|---|---|
| 1 | 11 | RBF | 0.0038 |
| 2 | 7 | RBF | 5 |
| 3 | 3 | Sigmoid | 5 |
| 4 | 13 | RBF | 0.0038 |
| 5 | 9 | RBF | 5 |
| 6 | 5 | Sigmoid | 5 |
| 7 | 7 | RBF | 0.0038 |
| 8 | 13 | RBF | 5 |
| 9 | 7 | Sigmoid | 5 |
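The window sizes listed here appear to correspond to the flank sizes of the previous table through window size = 2 × flank size + 1 (for example, flank 5 gives window 11). That relation is an inference from the two tables, not a statement from the source. The sketch below shows how such a window could be cut around a residue. It works on raw residue letters for simplicity, and the padding character `X` and both function names are hypothetical.

```python
def window_size(flank):
    """Assumed relation: one central residue plus `flank` residues on each side."""
    return 2 * flank + 1

def extract_window(sequence, center, flank, pad="X"):
    """Return the residues within `flank` positions of index `center`.

    Positions that fall outside the sequence are filled with `pad`
    (an illustrative choice, not taken from the paper).
    """
    padded = pad * flank + sequence + pad * flank
    # After padding, the window for `center` starts at index `center`.
    return padded[center:center + window_size(flank)]

# Hypothetical sequence, for illustration only.
seq = "MKTAYIAK"
print(extract_window(seq, 3, 1))  # window of size 3 around index 3
print(extract_window(seq, 0, 2))  # window of size 5 at the N-terminus, padded
```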
Selected SVM models with increased sampling ratio
| SVM model | AUC (1:1) | Success rate (1:1) | FPR (1:1) | AUC (1:2) | Success rate (1:2) | FPR (1:2) |
|---|---|---|---|---|---|---|
| 1 | 0.660 | 0.669 | 0.050 | | | |
| 2 | 0.600 | 0.770 | 0.180 | | | |
| 3 | 0.648 | 0.701 | 0.070 | | | |
| 4 | 0.659 | 0.649 | 0.053 | | | |
| 5 | | | | 0.600 | 0.700 | 0.190 |
| 6 | 0.653 | 0.680 | 0.065 | | | |
| 7 | 0.650 | 0.640 | 0.047 | | | |
| 8 | 0.600 | 0.749 | 0.180 | | | |
| 9 | | | | 0.650 | 0.656 | 0.065 |
Bold numbers indicate performance metrics for best models
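As a rough illustration of what a training sampling ratio of 1:1 versus 1:2 could mean, the sketch below undersamples negatives to a fixed multiple of the positives. It assumes the ratio is positives to negatives and that negatives are drawn at random; the source does not confirm either point, and the function name and toy data are hypothetical.

```python
import random

def sample_training_set(positives, negatives, neg_per_pos, seed=0):
    """Build a labelled training set with `neg_per_pos` negatives per positive.

    Negatives are drawn without replacement; if there are too few,
    all of them are used.
    """
    rng = random.Random(seed)
    n_neg = min(len(negatives), neg_per_pos * len(positives))
    chosen = rng.sample(negatives, n_neg)
    return [(x, 1) for x in positives] + [(x, 0) for x in chosen]

# Toy samples, for illustration only.
pos = ["p%d" % i for i in range(10)]
neg = ["n%d" % i for i in range(100)]

print(len(sample_training_set(pos, neg, 1)))  # 1:1 ratio
print(len(sample_training_set(pos, neg, 2)))  # 1:2 ratio
```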
Comparison of results
| Method/predictors | TPR | AUC | Success rate | FPR | Accuracy |
|---|---|---|---|---|---|
| ANCHOR | | | | | |
| MoRFPred | | | | | |
| Proposed method | 0.222 | 0.702 | 0.711 | 0.036 | 0.949 |
Accuracy and FPR are functions of TPR; the underlined values are taken from Disfani et al. [8]
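Because FPR and accuracy depend on the operating point, they are reported at a fixed TPR. A minimal sketch of that idea follows: it picks the highest score threshold at which the TPR reaches a target value and reports FPR and accuracy there. The function name, the tie-handling rule, and the toy scores are assumptions for illustration, not the evaluation code used in the paper.

```python
def metrics_at_tpr(scores, labels, target_tpr):
    """Return (threshold, tpr, fpr, accuracy) at the highest threshold
    whose TPR is at least `target_tpr`. Labels are 1 (MoRF) or 0 (non-MoRF)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    for thr in sorted(set(scores), reverse=True):
        # Residues scoring at or above the threshold are predicted positive.
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        tpr = tp / n_pos
        if tpr >= target_tpr:
            fpr = fp / n_neg
            accuracy = (tp + (n_neg - fp)) / len(labels)
            return thr, tpr, fpr, accuracy
    raise ValueError("target TPR not reachable")

# Toy per-residue scores and labels, for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]

print(metrics_at_tpr(scores, labels, 0.5))
```

Lowering the threshold raises TPR and FPR together, so comparing predictors at the same TPR isolates how many false positives each one pays for that level of recall.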
Overall comparison of results
| | Proposed method | MoRFPred | ANCHOR |
|---|---|---|---|
| Efficiency (residues/min) | 405 | 48 | 4 × 10^6 |
| Max sequence size | Unlimited | 1000 residues | Unlimited |
| AUC | 0.702 | 0.673 | 0.600 |
| FPR at 0.222 TPR | 0.036 | 0.037 | 0.092 |
| FPR at 0.389 TPR | 0.109 | 0.137 | 0.253 |
| Number of component predictors | 1 | 8 | 0 |
| MoRF size limitations | No limits | No limits | No limits |