Carlos H M Rodrigues, Anjali Garg, David Keizer, Douglas E V Pires, David B Ascher.
Abstract
Peptides are attractive alternatives for the development of new therapeutic strategies due to their versatility and low complexity of synthesis. Increasing interest in these molecules has led to the creation of large collections of experimentally characterized therapeutic peptides, which greatly contribute to the development of data-driven computational approaches. Here we propose CSM-peptides, a novel machine learning method for rapid identification of eight different types of therapeutic peptides: anti-angiogenic, anti-bacterial, anti-cancer, anti-inflammatory, anti-viral, cell-penetrating, quorum sensing, and surface binding. Our method has been shown to outperform existing approaches, achieving an AUC of up to 0.92 on independent blind tests, and consistent performance on cross-validation. We anticipate CSM-peptides to be of great value in helping to screen large libraries to identify novel peptides with therapeutic potential and have made it freely available as a user-friendly web server and Application Programming Interface at https://biosig.lab.uq.edu.au/csm_peptides.
Keywords: machine learning; peptide screening; therapeutic peptides; web server
Year: 2022 PMID: 36173168 PMCID: PMC9518225 DOI: 10.1002/pro.4442
Source DB: PubMed Journal: Protein Sci ISSN: 0961-8368 Impact factor: 6.993
FIGURE 1. Methodology workflow for CSM‐peptides. The development of CSM‐peptides can be divided into four main steps: (1) data are collected from the literature for eight different classes of therapeutic peptides; (2) features are calculated, including evolutionary scores from substitution tables, physicochemical properties and indexes calculated from each peptide sequence, and the predicted proportion of secondary structure; (3) feature selection and model training are carried out for each peptide class separately; (4) the best-performing models are deployed as a publicly available web server and API.
FIGURE 2. Performance of CSM‐peptides on 10‐fold CV and two independent blind tests for predictive models for eight classes of therapeutic peptides. Results are shown as ROC curves, where green lines represent results on 10‐fold CV, and yellow and red lines represent results on the main and alternative test sets, respectively. AAP, anti‐angiogenic; ABP, anti‐bacterial; ACP, anti‐cancer; AIP, anti‐inflammatory; AVP, anti‐viral; CPP, cell‐penetrating; QSP, quorum sensing; SBP, surface binding.
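As a reading aid for the ROC curves, the area under a ROC curve can be computed as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count as half). The sketch below is a minimal illustration of that definition only; the function name `roc_auc` and the toy labels and scores are assumptions made for this example and are not taken from CSM‐peptides or its data sets.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: iterable of 0/1 class labels (1 = therapeutic peptide, hypothetical).
    scores: iterable of predicted scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.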
Performance comparison of CSM‐peptides with other methods on two independent test sets for predictive models of each therapeutic peptide class
| Class | Method | AUC (main test set) | TPR (main test set) | TNR (main test set) | MCC (main test set) | AUC (alternative test set) | TPR (alternative test set) | TNR (alternative test set) | MCC (alternative test set) |
|---|---|---|---|---|---|---|---|---|---|
| AAP | CSM‐peptides | 0.76 | 0.57 | 0.92 | 0.53 | 0.86 | 0.67 | 0.86 | 0.18 |
| | PPTPP | 0.77 | 0.71 | 0.78 | 0.50 | 0.75 | 0.71 | 0.70 | 0.10 |
| | PEPred | 0.80 | – | – | – | 0.77 | – | – | – |
| ABP | CSM‐peptides | 1.00 | 0.98 | 0.99 | 0.97 | 1.00 | 0.96 | 0.98 | 0.90 |
| | PPTPP | 0.98 | 0.92 | 0.96 | 0.89 | 0.96 | 1.00 | 0.93 | 0.61 |
| | PEPred | 0.97 | – | – | – | 0.96 | – | – | – |
| ACP | CSM‐peptides | 0.97 | 0.90 | 0.90 | 0.80 | 0.83 | 0.87 | 0.50 | 0.15 |
| | PPTPP | 0.87 | 0.80 | 0.81 | 0.62 | 0.71 | 0.80 | 0.38 | 0.07 |
| | PEPred | 0.94 | – | – | – | 0.63 | – | – | – |
| | AI4ACP | 0.49 | 0.88 | 0.1 | −0.02 | 0.5 | 0.15 | 0.85 | 0.00 |
| AIP | CSM‐peptides | 0.71 | 0.43 | 0.93 | 0.43 | 0.54 | 0.67 | 0.35 | 0.02 |
| | PPTPP | 0.70 | 1.00 | 0.04 | 0.15 | 0.39 | 0.08 | 0.83 | −0.08 |
| | PEPred | 0.75 | – | – | – | 0.63 | – | – | – |
| AVP | CSM‐peptides | 0.94 | 0.90 | 0.86 | 0.76 | 0.97 | 0.96 | 0.74 | 0.27 |
| | PPTPP | 0.96 | 0.91 | 0.77 | 0.70 | 0.90 | 0.91 | 0.47 | 0.13 |
| | PEPred | 0.94 | – | – | – | 0.95 | – | – | – |
| | FIRM‐AVP | 0.67 | 0.90 | 0.44 | 0.20 | 0.51 | 0.30 | 0.72 | 0.01 |
| CPP | CSM‐peptides | 0.99 | 0.90 | 0.96 | 0.87 | 0.97 | 0.85 | 0.98 | 0.78 |
| | PPTPP | 0.96 | 0.86 | 0.88 | 0.75 | 0.85 | 0.86 | 0.55 | 0.17 |
| | PEPred | 0.95 | – | – | – | 0.87 | – | – | – |
| QSP | CSM‐peptides | 0.98 | 0.90 | 0.95 | 0.85 | 0.94 | 0.95 | 0.87 | 0.24 |
| | PPTPP | 0.94 | 0.80 | 1.00 | 0.81 | 0.85 | 0.75 | 0.77 | 0.122 |
| | PEPred | 0.96 | – | – | – | 0.89 | – | – | – |
| SBP | CSM‐peptides | 0.94 | 0.83 | 0.91 | 0.75 | 0.98 | 0.87 | 0.97 | 0.51 |
| | PPTPP | 0.77 | 0.75 | 0.66 | 0.41 | 0.84 | 0.66 | 0.87 | 0.17 |
| | PEPred | 0.67 | – | – | – | 0.79 | – | – | – |
Note: Results are shown in terms of area under the ROC curve (AUC), sensitivity (TPR), specificity (TNR), and Matthews correlation coefficient (MCC). Cells filled with a dash (–) indicate cases where results were not available or could not be generated.
Abbreviations: AAP, anti‐angiogenic; ABP, anti‐bacterial; ACP, anti‐cancer; AIP, anti‐inflammatory; AVP, anti‐viral; CPP, cell‐penetrating; QSP, quorum sensing; and SBP, surface binding.
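The three threshold-based metrics in the table can all be derived from the counts of a confusion matrix. The sketch below is a minimal illustration using the standard definitions; the function name `confusion_metrics` and both sets of counts are hypothetical, chosen only to show how the formulas behave, and do not come from the CSM‐peptides test sets (whose composition is not given here).

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return (TPR, TNR, MCC) from confusion-matrix counts.

    tp/fn: positives predicted correctly / missed.
    tn/fp: negatives predicted correctly / wrongly flagged as positive.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    tpr = tp / (tp + fn)  # sensitivity
    tnr = tn / (tn + fp)  # specificity

    # Matthews correlation coefficient; conventionally 0 when undefined.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tpr, tnr, mcc


# Hypothetical balanced set: 100 positives, 100 negatives.
balanced = confusion_metrics(tp=90, fn=10, tn=90, fp=10)

# Hypothetical imbalanced set: 10 positives, 1000 negatives,
# with the same sensitivity and specificity as above.
imbalanced = confusion_metrics(tp=9, fn=1, tn=900, fp=100)

print("balanced   TPR=%.2f TNR=%.2f MCC=%.2f" % balanced)
print("imbalanced TPR=%.2f TNR=%.2f MCC=%.2f" % imbalanced)
```

With identical TPR and TNR, the MCC is much lower on the imbalanced counts, because MCC also reflects how many of the predicted positives are false. This is a general property of the metric, shown here on invented numbers.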