Debasree Sarkar, Tanmoy Jana, Sudipto Saha.
Abstract
Protein-peptide interactions form an important subset of the total protein interaction network in the cell and play key roles in signaling and regulatory networks, and in major biological processes such as cellular localization, protein degradation, and immune response. In this work, we describe the LMDIPred web server, an online resource for generalized prediction of linear peptide sequences that may bind to the three most prevalent and well-studied peptide recognition modules (PRMs): SH3, WW and PDZ. We have developed support vector machine (SVM)-based prediction models that achieved a maximum Matthews Correlation Coefficient (MCC) of 0.85 with an accuracy of 94.55% for SH3, an MCC of 0.90 with an accuracy of 95.82% for WW, and an MCC of 0.83 with an accuracy of 92.29% for PDZ binding peptides. LMDIPred output combines predictions from these SVM models with predictions from Position-Specific Scoring Matrices (PSSMs) and from string-matching methods that use known domain-binding motif instances and regular expressions. All of these methods were evaluated using five-fold cross-validation on both balanced and unbalanced datasets, and were also validated on independent datasets. LMDIPred aims to provide a preliminary bioinformatics platform for sequence-based prediction of probable binding sites for SH3, WW or PDZ domains.
Year: 2018 PMID: 30001346 PMCID: PMC6042728 DOI: 10.1371/journal.pone.0200430
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Comparison of prediction performance of SVM prediction models developed using different input features for different domain-binding peptides (6-mer peptides for SH3 and WW and 4-mer peptides for PDZ).
| Input features | SH3-binding AUC (%) | WW-binding AUC (%) | PDZ-binding AUC (%) |
|---|---|---|---|
| Amino Acid Composition (AAC) | 88.05 | 93.54 | 92.31 |
| Dipeptide Composition (DPC) | 86.79 | 96.33 | 93.65 |
| Tripeptide Composition (TPC) | 94.72 | 96.11 | 92.44 |
| AAC + DPC | 94.63 | 97.77 | 93.98 |
| AAC + TPC | 95.56 | 97.86 | |
| DPC + TPC | 95.34 | 97.58 | 94.89 |
| AAC + DPC + TPC | 90.49 | | |
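
The composition features compared in this table can be illustrated with a short sketch. It assumes the common definitions (AAC as the fraction of each of the 20 standard residues, DPC as the fraction of each of the 400 ordered residue pairs) and simple concatenation for the combined "AAC + DPC" input; the source does not state the exact normalization used, and the example peptide is made up.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(peptide):
    """Amino acid composition: fraction of each residue (20 values)."""
    n = len(peptide)
    return [peptide.count(a) / n for a in AMINO_ACIDS]


def dpc(peptide):
    """Dipeptide composition: fraction of each ordered residue pair (400 values)."""
    # Collect overlapping pairs explicitly; str.count would skip overlaps.
    pairs = [peptide[i:i + 2] for i in range(len(peptide) - 1)]
    total = len(pairs)
    return [pairs.count(a + b) / total for a, b in product(AMINO_ACIDS, repeat=2)]


def aac_dpc(peptide):
    """Combined 'AAC + DPC' input, assumed here to be a plain concatenation."""
    return aac(peptide) + dpc(peptide)


# Hypothetical 6-mer, not taken from the LMDIPred datasets.
features = aac_dpc("PPLPPR")
print(len(features))  # 420 = 20 + 400
```

Tripeptide composition (TPC) follows the same pattern with 8000 ordered triplets. For peptides as short as 4 to 6 residues these vectors are very sparse.
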
Performance of SVM models for different domain binding peptides (6-mer peptides for SH3 and WW and 4-mer peptides for PDZ) on respective unbalanced datasets.
| Peptide class | P:N Ratio* | Threshold | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|---|
| SH3-binding | ~1:4 | -0.25 | 0.9391 | 0.9471 | 0.9455 | 0.8475 |
| WW-binding | ~1:3 | -0.05 | 0.9571 | 0.9585 | 0.9582 | 0.8973 |
| PDZ-binding | ~1:2 | -0.10 | 0.9152 | 0.9263 | 0.9229 | 0.8259 |
*P:N Ratio denotes ratio of positive to negative data
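
The four reported measures all derive from the confusion matrix. The sketch below uses the standard definitions and made-up counts at a 1:4 positive-to-negative ratio; the counts are not the paper's data, and the function name is illustrative.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a row or column of the matrix is empty; return 0.0 then.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 100 positives, 400 negatives (1:4).
for name, value in binary_metrics(tp=90, fn=10, tn=380, fp=20).items():
    print(f"{name}: {value:.4f}")
```

On unbalanced data, accuracy is dominated by the larger negative class, which is why MCC is reported alongside it.
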
Performance of PSSMs for different domain binding peptides (6-mer peptides for SH3 and WW and 4-mer peptides for PDZ) on respective unbalanced datasets.
| Peptide class | P:N Ratio | Threshold | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|---|
| SH3-binding | ~1:4 | 1.00 | 0.6957 | 0.9264 | 0.8782 | 0.6167 |
| WW-binding | ~1:3 | 0.50 | 0.8786 | 0.8415 | 0.8509 | 0.6640 |
| PDZ-binding | ~1:2 | 0.60 | 0.6857 | 0.9467 | 0.8636 | 0.6774 |
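
As a rough illustration of threshold-based PSSM scoring, the sketch below builds a log-odds matrix from aligned, equal-length peptides and sums per-position scores. The uniform background, the pseudocount, and the training peptides are all assumptions; the source does not describe how LMDIPred constructs or scales its PSSMs, so the thresholds in the table are not comparable to scores from this code.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Log-odds PSSM (one dict per position) from equal-length peptides."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    denom = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(peptides[0])):
        column = [p[pos] for p in peptides]
        pssm.append({
            aa: math.log2(((column.count(aa) + pseudocount) / denom) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, peptide):
    """Sum of per-position log-odds scores for a peptide of matching length."""
    return sum(column[aa] for column, aa in zip(pssm, peptide))


def predict(pssm, peptide, threshold):
    """Call the peptide a binder when its score reaches the threshold."""
    return score(pssm, peptide) >= threshold


# Hypothetical training peptides, not taken from the LMDIPred datasets.
pssm = build_pssm(["PPLPPR", "PPVPPR", "APLPPK"])
print(score(pssm, "PPLPPR"), score(pssm, "GGGGGG"))
```

Raising the threshold trades sensitivity for specificity, which is the trade-off the Threshold column reflects.
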
Performance of RES method for different domain binding peptide classes on respective unbalanced datasets.
| Peptide class | P:N Ratio | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|
| SH3-binding | ~1:4 | 0.8087 | 0.9126 | 0.8909 | 0.6735 |
| WW-binding | ~1:3 | 0.8929 | 0.9902 | 0.9655 | 0.9064 |
| PDZ-binding | ~1:2 | 0.7657 | 0.8667 | 0.8345 | 0.6305 |
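
RES presumably refers to the regular-expression matching mentioned in the abstract. A minimal sketch of that kind of scan is shown below. The three patterns are simplified textbook consensus motifs (a PxxP core for SH3, PPxY for WW, a class I C-terminal motif for PDZ) chosen for illustration; they are assumptions, not the expressions LMDIPred uses, and the example sequence is made up.

```python
import re

# Simplified consensus motifs, used here as assumptions only.
MOTIFS = {
    "SH3": r"P..P",        # proline-rich PxxP core
    "WW": r"PP.Y",         # PPxY motif
    "PDZ": r"[ST].[VIL]$",  # class I motif anchored at the C-terminus
}


def scan(sequence):
    """Return (domain, start, matched_text) for every motif occurrence."""
    hits = []
    for domain, motif in MOTIFS.items():
        # Wrap the motif in a lookahead so overlapping occurrences are reported.
        lookahead = re.compile("(?=(" + motif + "))")
        for match in lookahead.finditer(sequence):
            hits.append((domain, match.start(), match.group(1)))
    return hits


example = "AAPPLPPRPPPYETSV"  # hypothetical sequence
for hit in scan(example):
    print(hit)
```

Short, degenerate patterns like these match often by chance, so a scan of this kind is best treated as a candidate filter.
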
Performance of MIM method for different domain binding peptide classes on respective unbalanced datasets.
| Peptide class | P:N Ratio | Sensitivity | Specificity | Accuracy | MCC |
|---|---|---|---|---|---|
| SH3-binding | ~1:4 | 0.1739 | 1.0000 | 0.8273 | 0.3642 |
| WW-binding | ~1:3 | 0.1286 | 1.0000 | 0.7782 | 0.2915 |
| PDZ-binding | ~1:2 | 0.3029 | 1.0000 | 0.7782 | 0.4655 |
Comparison of sensitivity shown by different prediction methods on the independent datasets.
| Method | Sensitivity (%) for SH3 ligands | Sensitivity (%) for PDZ ligands |
|---|---|---|
| SVM | 60.00 | 75.81 |
| PSSM | 32.00 | 69.35 |
| RES | 80.00 | 93.55 |
| MIM | 16.00 | 29.03 |
| LMDIPred (Combined)* | 92.00 | 97.00 |
| MoDPepInt | 40.00 | 100.00 |
| ELM | 84.00 | 53.23 |
*Hits from any one of the four methods.
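
The combined rule in the footnote amounts to a logical OR over the four per-method calls. The sketch below shows that rule and the resulting sensitivity on a set of known binders; the function names and the toy calls are hypothetical.

```python
def combined_call(calls):
    """True when at least one method flags the peptide (logical OR)."""
    return any(calls.values())


def sensitivity(binder_calls):
    """Fraction of known binders flagged by the combined rule."""
    flagged = sum(1 for calls in binder_calls if combined_call(calls))
    return flagged / len(binder_calls)


# Made-up per-method calls for three known binders.
toy_binders = [
    {"SVM": True, "PSSM": False, "RES": True, "MIM": False},
    {"SVM": False, "PSSM": False, "RES": True, "MIM": False},
    {"SVM": False, "PSSM": False, "RES": False, "MIM": False},
]
print(sensitivity(toy_binders))
```

Because a union can only add hits, the combined sensitivity is at least that of the best single method, while the combined specificity can be no higher than that of the least specific method.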