Yuan Lin, Yinyin Cai, Juan Liu, Chen Lin, Xiangrong Liu.
Abstract
BACKGROUND: Antimicrobial peptides (AMPs) are essential components of the innate immune system and can protect the host from various pathogenic bacteria. The marine environment is known to be one of the richest sources of AMPs. Effective use of AMPs and their derivatives can greatly improve the immunity and breeding survival rate of aquatic products. Computational tools that rapidly and accurately identify AMPs and their functional types are therefore highly desirable, as they can help in designing new and more effective antimicrobial agents.
Keywords: Antimicrobial peptides; Feature extraction; Machine learning; Multi-label classification
Year: 2019 PMID: 31182007 PMCID: PMC6557738 DOI: 10.1186/s12859-019-2766-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Preprocessed benchmark dataset
| Dataset | Function type | Number of sequences |
|---|---|---|
| AMPs | Wound healing | 18 |
| | Spermicidal | 13 |
| | Insecticidal | 28 |
| | Chemotactic | 57 |
| | Antifungal | 593 |
| | Anti-protist | 4 |
| | Antioxidant | 22 |
| | Antibacterial | 1297 |
| | Antibiotic | 32 |
| | Antimalarial | 25 |
| | Antiparasital | 101 |
| | Antiviral | 125 |
| | Anticancer | 125 |
| | Anti-HIV | 109 |
| | Proteinase inhibitor | 26 |
| | Surface immobilized | 43 |
| | Total | 2618 |
| non-AMPs | | 4371 |
188-D feature of cecropin A
| Sequence | KWKLFKKIEKVGQNIRDGIIKAGPAVAVVGQATQIAK | | | | | | |
|---|---|---|---|---|---|---|---|
| Property | Value of feature vector | | | | | | |
| Amino acid composition | 13.5135 | 0.0000 | 2.7027 | 2.7027 | 2.7027 | 10.8108 | 0.0000 |
| | 13.5135 | 18.9189 | 2.7027 | 0.0000 | 2.7027 | 2.7027 | 8.1081 |
| | 2.7027 | 0.0000 | 2.7027 | 10.8108 | 2.7027 | 0.0000 | |
| Hydrophobicity | 37.8378 | 29.7297 | 32.4324 | 19.4444 | 30.5555 | 19.4444 | 2.7027 |
| | 16.2162 | 35.1351 | 45.9459 | 100.0000 | 32.4324 | 48.6486 | 64.8648 |
| | 81.0810 | 97.2972 | 5.4054 | 13.5135 | 40.5405 | 70.2702 | 94.5945 |
Fig. 1 The main flowchart of the AMP identification and prediction process
Performance comparison of first-layer classifiers on test dataset S
| Classifier | AMPs TPR | AMPs FPR | AMPs AUC | non-AMPs TPR | non-AMPs FPR | non-AMPs AUC | Acc (%) |
|---|---|---|---|---|---|---|---|
| 188D-RF-R | 0.892 | 0.205 | 0.897 | 0.795 | 0.108 | 0.897 | 81.145 |
| 188D-Bagging-W | 0.888 | 0.205 | 0.899 | 0.795 | 0.112 | 0.899 | 81.084 |
| 188D-Bagging-R | 0.921 | 0.220 | 0.897 | 0.780 | 0.079 | 0.897 | 80.361 |
| 40D-RF-R | 0.874 | 0.194 | 0.890 | 0.806 | 0.126 | 0.890 | 81.747 |
W = weighted random sampling, R = random under-sampling, 188D = SVM-prot 188-D, 40D = Co-Pse-AAC 40-D
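Random under-sampling balances the two classes by discarding majority-class examples at random. The sketch below illustrates the idea on placeholder identifiers sized like the benchmark dataset (2618 AMPs, 4371 non-AMPs). The function name, the seed, and the choice to shrink the majority class to exactly the minority-class size are assumptions for illustration; the record does not give the exact procedure used.

```python
import random


def random_under_sample(majority, minority, seed=0):
    """Randomly keep as many majority-class items as there are minority items."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    kept = rng.sample(majority, k=len(minority))  # sampling without replacement
    return kept, list(minority)


# Placeholder identifiers, not real sequences.
amps = [f"amp_{i}" for i in range(2618)]
non_amps = [f"non_amp_{i}" for i in range(4371)]

balanced_non_amps, balanced_amps = random_under_sample(non_amps, amps)
print(len(balanced_non_amps), len(balanced_amps))
```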
Performance comparison of second-layer classifiers (10-fold cross-validation)
| Models | Acc | EMR | H-Loss | F1-Micro | F1-Macro | One-error | Rank-Loss | Log-Loss |
|---|---|---|---|---|---|---|---|---|
| BR-RF | 0.839 | 0.785 | 0.021 | 0.920 | 0.941 | 0.122 | 0.019 | 0.076 |
| CC-RF | 0.844 | 0.794 | 0.021 | 0.922 | 0.942 | 0.165 | 0.051 | 0.057 |
| BCC-RF | 0.847 | 0.801 | 0.020 | 0.924 | 0.943 | 0.160 | 0.051 | 0.056 |
| BRkNN | 0.696 | 0.561 | 0.044 | 0.838 | 0.783 | 0.238 | 0.101 | 0.121 |
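Three of the columns above (Acc, EMR, H-Loss) are example-based multi-label metrics. The sketch below uses their common textbook definitions, which are assumed here because this record does not spell them out: accuracy as the mean Jaccard overlap between true and predicted label sets, exact match ratio (EMR) as the fraction of samples predicted perfectly, and Hamming loss as the mean fraction of labels that disagree. The function name and the toy label sets are made up for illustration.

```python
def multilabel_metrics(true_sets, pred_sets, n_labels):
    """Compute example-based accuracy, exact match ratio, and Hamming loss."""
    n = len(true_sets)
    acc = emr = hloss = 0.0
    for y, z in zip(true_sets, pred_sets):
        union = y | z
        # Jaccard overlap; two empty sets count as a perfect match.
        acc += len(y & z) / len(union) if union else 1.0
        emr += 1.0 if y == z else 0.0
        # Labels present in exactly one of the two sets are errors.
        hloss += len(y ^ z) / n_labels
    return {
        "accuracy": acc / n,
        "exact_match_ratio": emr / n,
        "hamming_loss": hloss / n,
    }


# Toy labels, not taken from the benchmark data.
y_true = [{"Antibacterial", "Antifungal"}, {"Antiviral"}, {"Anticancer", "Antibacterial"}]
y_pred = [{"Antibacterial"}, {"Antiviral"}, {"Anticancer", "Antifungal"}]

# 16 is the number of function types listed in the benchmark dataset table.
scores = multilabel_metrics(y_true, y_pred, n_labels=16)
print(scores)
```

Higher is better for accuracy and EMR, while lower is better for Hamming loss, which is why BCC-RF's 0.020 H-Loss is the best entry in that column.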
Fig. 2 Predicting function types of s
Performance comparison of MAMPs-Pred and iAMP-2L first-layer classifiers on the dataset
| Method | Acc | SN | SP | MCC |
|---|---|---|---|---|
| MAMPs-Pred | 93.91% | 92.83% | 94.99% | 0.878 |
| iAMP-2L | 92.23% | 97.72% | 86.74% | 0.845 |
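The four columns above are standard binary-classification measures: accuracy (Acc), sensitivity (SN), specificity (SP), and the Matthews correlation coefficient (MCC). As a minimal sketch, they can be computed from confusion-matrix counts as follows. The function name and the counts in the usage line are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute Acc, SN, SP, and MCC from confusion-matrix counts."""
    sn = tp / (tp + fn)  # sensitivity: fraction of positives recovered
    sp = tn / (tn + fp)  # specificity: fraction of negatives recovered
    acc = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when a row or column of the confusion matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Acc": acc, "SN": sn, "SP": sp, "MCC": mcc}


# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fn=10, tn=80, fp=20)
print({name: round(value, 3) for name, value in m.items()})
```

When the positive and negative classes are the same size, Acc is the mean of SN and SP, which is the relationship both rows of the table show.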
Performance comparison of MAMPs-Pred, iAMP-2L, and LIFT second-layer classifiers on the dataset
| Method | Acc | EMR | Precision | Recall | H-Loss |
|---|---|---|---|---|---|
| MAMPs-Pred | 0.856 | 0.825 | 0.918 | 0.929 | 0.020 |
| iAMP-2L | 0.669 | 0.43 | 0.833 | 0.75 | 0.164 |
| LIFT | 0.700 | 0.5365 | 0.838 | 0.741 | 0.1392 |
Fig. 3 AMP activity prediction of 126 shrimp sequences