Chenkai Li, Darcy Sutherland, S Austin Hammond, Chen Yang, Figali Taho, Lauren Bergman, Simon Houston, René L Warren, Titus Wong, Linda M N Hoang, Caroline E Cameron, Caren C Helbing, Inanc Birol.
Abstract
BACKGROUND: Antibiotic resistance is a growing global health concern prompting researchers to seek alternatives to conventional antibiotics. Antimicrobial peptides (AMPs) are attracting attention again as therapeutic agents with promising utility in this domain, and using in silico methods to discover novel AMPs is a strategy that is gaining interest. Such methods can sift through large volumes of candidate sequences and reduce lab screening costs.
Keywords: Antimicrobial peptide; Attention mechanism; Deep learning
Year: 2022 PMID: 35078402 PMCID: PMC8788131 DOI: 10.1186/s12864-022-08310-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Model architecture of AMPlify. Residues of a peptide sequence are one-hot encoded and passed to three hidden layers in order: the bidirectional long short-term memory (Bi-LSTM) layer, the multi-head scaled dot-product attention (MHSDPA) layer and the context attention (CA) layer. The output layer generates the probability that the input sequence is an AMP.
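The first step of the architecture and the core operation of the MHSDPA layer can be illustrated with a minimal NumPy sketch. This is not the AMPlify implementation: the residue alphabet order, the function names, and the choice to apply a single attention head directly to the one-hot matrix (rather than to Bi-LSTM outputs with learned projections) are all assumptions made to keep the example short.

```python
import numpy as np

# 20 standard amino acids; the ordering is an assumption for illustration.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot_encode(sequence: str) -> np.ndarray:
    """Return a (length, 20) matrix with a single 1 per residue."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)), dtype=np.float32)
    for pos, aa in enumerate(sequence):
        encoded[pos, index[aa]] = 1.0
    return encoded


def scaled_dot_product_attention(q, k, v):
    """Single-head attention: softmax(q k^T / sqrt(d_k)) v."""
    d_k = q.shape[-1]
    scores = (q @ k.T) / np.sqrt(d_k)
    # Subtract the row maximum before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Hypothetical usage: self-attention over the raw one-hot matrix.
# In the described model, the attention input would be the Bi-LSTM output.
x = one_hot_encode("FLFPLITSFLSKFLGK")
context, weights = scaled_dot_product_attention(x, x, x)
print(x.shape, context.shape)
```

Each row of `weights` sums to 1, so every output position is a weighted average over all residue positions; a multi-head layer would repeat this with several learned projections and concatenate the results.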
Performance comparison among different tools on the test set. The performance of each tool is reported as a percentage for five metrics: accuracy (acc), sensitivity (sens), specificity (spec), F1 score (F1) and area under the receiver operating characteristic curve (AUROC)
| Tool | Model | Acc | Sens | Spec | F1 | AUROC |
|---|---|---|---|---|---|---|
| iAMPpred | original (a) | 74.01 | 87.90 | 60.12 | 77.18 | 80.70 |
| iAMP-2L | original (a) | 77.96 | 88.26 | 67.66 | 80.02 | – |
| AMP Scanner Vr.2 | original (a) | 78.50 | 90.66 | 66.35 | 80.83 | 88.33 |
| AMP Scanner Vr.2 | re-trained, 10 epochs (b) | 90.66 | 91.14 | 90.18 | 90.70 | 97.40 |
| AMP Scanner Vr.2 | re-trained, early stopped (c) | 91.20 | 90.42 | 91.98 | 91.13 | 97.03 |
| AMPlify | single sub-model 1 | 92.40 | 90.90 | 93.89 | 92.28 | 97.54 |
| AMPlify | single sub-model 2 | 91.98 | 91.02 | 92.93 | 91.90 | 97.40 |
| AMPlify | single sub-model 3 | 92.51 | 92.69 | 92.34 | 92.53 | 97.82 |
| AMPlify | single sub-model 4 | 92.10 | 90.90 | 93.29 | 92.00 | 97.27 |
| AMPlify | single sub-model 5 | 92.57 | 92.57 | 92.57 | 92.57 | 97.98 |
(a) Models presented in the referenced papers are available through online servers
(b) The best hyperparameter as stated in the referenced paper
(c) The optimal number of training epochs determined by early stopping is 16
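The relationship between the threshold-based metrics in the table can be sketched as follows. The function name and the `pos_fraction` parameter are hypothetical. The default of 0.5 assumes a class-balanced test set, which is not stated in this excerpt; it is inferred from the fact that each reported accuracy equals the mean of the reported sensitivity and specificity.

```python
def metrics_from_rates(sens_pct: float, spec_pct: float, pos_fraction: float = 0.5):
    """Derive accuracy and F1 (in percent) from sensitivity and specificity.

    pos_fraction is the share of positives (AMPs) in the test set.
    The 0.5 default is an assumption, not a value taken from the source.
    """
    sens, spec = sens_pct / 100.0, spec_pct / 100.0
    # Confusion-matrix cells expressed as fractions of the whole test set.
    tp = sens * pos_fraction
    fn = (1.0 - sens) * pos_fraction
    tn = spec * (1.0 - pos_fraction)
    fp = (1.0 - spec) * (1.0 - pos_fraction)
    accuracy = tp + tn
    f1 = 2.0 * tp / (2.0 * tp + fp + fn)
    return accuracy * 100.0, f1 * 100.0


# iAMPpred row of the table: sens = 87.90, spec = 60.12
acc, f1 = metrics_from_rates(87.90, 60.12)
print(f"acc={acc:.2f} F1={f1:.2f}")  # acc=74.01 F1=77.18
```

AUROC cannot be derived this way because it depends on the full ranking of prediction scores rather than on a single threshold.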
Fig. 2 Performance comparison of different AMP prediction tools based on the similarity of test sequences to their corresponding training sets. F1 scores of AMP prediction tools were calculated on test subsets defined by similarity to sequences in the training sets. All the AMP/non-AMP test subsets were derived from the AMPlify test data, with subsets containing 10 or fewer sequences removed. The size of the round markers indicates the number of sequences remaining in the test subset at the given similarity threshold
Fig. 3 Visualization of AMPlify model performance and the AMP discovery pipeline application results. (a) Receiver operating characteristic (ROC) curves of AMPlify and comparators are plotted, with round dots marking the performance at the threshold of 0.5. The iAMP-2L online server outputs only AMP/non-AMP labels without the corresponding probabilities, so it appears as a single point on the plot. (b) AMPlify prediction scores plotted against peptide length for the 101 sequences analyzed by AMPlify. The grey dotted line represents the score threshold of 0.5 used to distinguish AMPs from non-AMPs. The inset magnifies the upper left region of the plot to better show the majority of the selected sequences
Putative and reported AMP sequences discovered from Rana [Lithobates] catesbeiana. Genomic and transcriptomic resources from Rana [Lithobates] catesbeiana [33] were mined using the AMP discovery pipeline based on AMPlify. Top-scoring peptide sequences were selected for synthesis and validation in vitro
| Peptide Name | Sequence | # aa | Net charge (a) | MW (Da) | AMPlify Score |
|---|---|---|---|---|---|
| | GLLDIIKTTGKDFAVKILDNLKCKLAGGCPP | 31 | 2 | 3242.93 | 1.0000 |
| | FFPIIARLAAKVIPSLVCAVTKKC | 24 | 4 | 2589.28 | 1.0000 |
| | AFLSTVKNTLTNVAGTMIDTFKCKITGVC | 29 | 2 | 3077.66 | 1.0000 |
| | FLFPLITSFLSKFLGK | 16 | 2 | 1858.30 | 1.0000 |
| | GFLDIIKDTGKEFAVKILNNLKCKLAGGCPP | 31 | 2 | 3303.97 | 1.0000 |
| | GLFLDTLKGAAKDVAGKLLEGLKCKITGCKP | 31 | 3 | 3188.88 | 1.0000 |
| | GLWETIKTTGKSIALNLLDKIKCKIAGGCPP | 31 | 3 | 3269.95 | 1.0000 |
| | GVFLDTLKGLAGKMLESLKCKIAGCKP | 27 | 3 | 2821.49 | 0.9999 |
| | FLTFPGMTFGKLLGK | 15 | 2 | 1657.05 | 0.9997 |
| | GLLDIIKDTGKTTGILMDTLKCQMTGRCPPSS | 32 | 1 | 3395.02 | 0.9996 |
| | ATAWRIPPPGMQPIIPIRIRPLCGKQ | 26 | 4 | 2910.58 | 0.9994 |
| | FFPRVLPLANKFLPTIYCALPKSVGN | 26 | 3 | 2906.52 | 0.9985 |
| | FPAIICKVSKNC | 12 | 2 | 1322.65 | 0.9961 |
| | FYFPVSRKFGGK | 12 | 3 | 1432.69 | 0.9412 |
| | ALVAKIQKFPVFNTLKLCKLELEII | 25 | 2 | 2872.59 | 0.6063 |
| | SNRDFFKVNIFRLCG | 15 | 2 | 1816.11 | 0.6058 |
* Previously reported amphibian peptide sequences [34, 38, 39]
+ Previously reported as a full-length AMP precursor sequence. UniProt ID: C5IB07
(a) Net charge at pH = 7
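The net charge and molecular weight columns can be approximated directly from the sequence. The sketch below is an illustration, not the authors' stated method: it counts Lys/Arg as +1 and Asp/Glu as -1 (ignoring His, Cys and the termini), and it sums standard average residue masses plus one water for a linear, unmodified peptide. The function names and the mass table are assumptions introduced here.

```python
# Average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153  # added once for the free N- and C-termini


def net_charge(sequence: str) -> int:
    """Integer net charge near pH 7: (K + R) - (D + E). A simplification."""
    positive = sum(sequence.count(aa) for aa in "KR")
    negative = sum(sequence.count(aa) for aa in "DE")
    return positive - negative


def molecular_weight(sequence: str) -> float:
    """Average molecular weight of a linear, unmodified peptide in Da."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


seq = "FLFPLITSFLSKFLGK"
print(len(seq), net_charge(seq), round(molecular_weight(seq), 2))  # 16 2 1858.3
```

For the sequences checked, this simple counting reproduces the tabulated charge and mass. A pKa-based calculation would give fractional charges and would treat histidine and the termini explicitly.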
Minimum inhibitory concentrations (MIC) and minimum bactericidal concentrations (MBC) of selected AMP candidates following antimicrobial susceptibility testing (AST) in vitro. Candidate antimicrobial peptides were synthesized by and purchased from Genscript. AST and MIC/MBC determination were performed as outlined by the Clinical and Laboratory Standards Institute (CLSI) [40], with modifications as recommended by Hancock [41]. Data are presented as the lowest effective peptide concentration range (μM) observed in three independent experiments. LL37 (human cathelicidin) and a peptide from Tp0751 of Treponema pallidum were used as the positive and negative control peptides [34], respectively
| | | | | | | | | | | | MDR | MDR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Gram-positive | Gram-positive | Gram-positive | Gram-positive | Gram-negative | Gram-negative | Gram-negative | Gram-negative | Gram-negative | Gram-negative | Gram-negative | Gram-negative |
| (μM) | MIC | MBC | MIC | MBC | MIC | MBC | MIC | MBC | MIC | MBC | MIC | MBC |
| | NI | NI | 79 | ≥79 | NI | NI | 20 – 39 | 39 – 79 | 10 – 20 | 10 – 39 | 20 – 39 | 20 – 39 |
| | 1 – 2 | 1 – 2 | 25 – 49 | 25 – 49 | 25 – 49 | 49 – ≥99 | 3 – 6 | 3 – 6 | 2 – 6 | 2 – 6 | 2 – 6 | 2 – 6 |
| | ≥78 | NI | 39 | 39 – ≥78 | 20 – ≥78 | 39 – ≥78 | 5 – 10 | 5 – 10 | 2 – 5 | 2 – 5 | 5 – 10 | 5 – 20 |
| | NI | NI | NI | NI | NI | NI | NI | NI | – | – | – | – |
| | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI |
| | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI |
| | ≥88 | NI | NI | NI | NI | NI | 11 – 22 | 11 – 88 | 6 – 44 | 6 – 44 | 6 – 44 | 6 – 44 |
| | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI |
| | NI | NI | NI | NI | NI | NI | NI | NI | – | – | – | – |
| | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI |
| | NI | NI | NI | NI | NI | NI | NI | NI | – | – | – | – |
| | NI | NI | NI | NI | 7 – ≥57 | 7 – ≥57 | 2 – 4 | 4 – 7 | 2 – 4 | 2 – 4 | 2 – 4 | 2 – 4 |
| | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI | NI |
a Bacteria obtained and tested at the University of Victoria
b Unknown strain; hospital isolate
c ATCC quality control strain #25922 purchased from Cedarlane Laboratories (Burlington, Ontario, Canada)
d Clinical isolate obtained and tested at the British Columbia Centre for Disease Control
NI, no inhibition observed in vitro
'–' = not tested
Abbreviations: Staphylococcus aureus, Streptococcus pyogenes, Pseudomonas aeruginosa, Escherichia coli; ATCC, American Type Culture Collection; CPO, carbapenemase-producing organism; MDR, multi-drug resistant; NDM, New Delhi metallo-beta-lactamase
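Because the MIC and MBC values are reported in μM, comparing them with values reported by mass requires the peptide's molecular weight. A minimal conversion sketch follows; the function names are hypothetical, and the molecular weight used in the example is borrowed from the peptide table purely as an illustrative input, since this excerpt does not say which peptide each MIC row belongs to.

```python
def um_to_ug_per_ml(conc_um: float, mw_da: float) -> float:
    """Convert micromolar to micrograms per millilitre.

    1 uM = 1e-6 mol/L, so mass concentration = conc_um * mw_da * 1e-6 g/L,
    which equals conc_um * mw_da / 1000 in ug/mL.
    """
    return conc_um * mw_da / 1000.0


def ug_per_ml_to_um(conc_ug_ml: float, mw_da: float) -> float:
    """Inverse conversion: micrograms per millilitre to micromolar."""
    return conc_ug_ml * 1000.0 / mw_da


# Illustrative inputs only: 39 uM of a peptide weighing 3242.93 Da.
mass_conc = um_to_ug_per_ml(39, 3242.93)
print(round(mass_conc, 2))  # 126.47
```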