You Li, Xueyong Li, Yuewu Liu, Yuhua Yao, Guohua Huang.
Abstract
Bioactive peptides are typically small functional peptides of 2-20 amino acid residues that play versatile roles in metabolic and biological processes. Because bioactive peptides are often multi-functional, accurately detecting all of their functions simultaneously is very challenging. We propose MPMABP, a deep learning method based on a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM), for recognizing the multiple activities of bioactive peptides. MPMABP stacks five CNNs at different scales and uses a residual network to limit information loss. Empirical results show that MPMABP outperforms state-of-the-art methods. Analysis of the amino acid distribution indicates that lysine occurs preferentially in anti-cancer peptides, leucine in anti-diabetic peptides, and proline in anti-hypertensive peptides. The method and analysis should help in recognizing the multiple activities of bioactive peptides.
Keywords: bioactive peptide; convolution neural network; deep learning; long short-term memory; multi-label issues
Year: 2022 PMID: 35745625 PMCID: PMC9231127 DOI: 10.3390/ph15060707
Source DB: PubMed Journal: Pharmaceuticals (Basel) ISSN: 1424-8247
Figure 1. Predictive accuracy under various hyper-parameter settings. (A) ED, (B) LR, (C) DP, and (D) PS denote the embedding dimension in the embedding layer, the learning rate, the dropout rate, and the pooling size in the pooling layer, respectively.
The details of hyper-parameters in the MPMABP.
| Layer | Hyper-parameter | Value |
|---|---|---|
| Embedding | embedding dimensions | 100 |
| CNN layer 1 | number of kernels | 64 |
| | size of kernels | 3 |
| CNN layer 2 | number of kernels | 64 |
| | size of kernels | 5 |
| CNN layer 3 | number of kernels | 64 |
| | size of kernels | 8 |
| CNN layer 4 | number of kernels | 64 |
| | size of kernels | 10 |
| CNN layer 5 | number of kernels | 64 |
| | size of kernels | 12 |
| Pooling layer | size of pooling | 3 |
| | stride | 1 |
| Bi-LSTM layer | number of neurons | 32 |
| Dense1 | number of neurons | 64 |
| | activation function | relu |
| Dense2 | number of neurons | 128 |
| | activation function | relu |
| Dense3 | number of neurons | 5 |
| | activation function | relu |
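To give a feel for the scale of the convolutional part, the kernel settings in the table can be turned into weight counts. The following is a minimal sketch, assuming that each of the five CNN layers reads the 100-dimensional embedding directly and carries one bias per kernel. The source does not state how the layers are wired, so the totals are illustrative only, and the helper name `conv1d_params` is hypothetical.

```python
# Hyper-parameters taken from the table above.
EMBEDDING_DIM = 100
NUM_KERNELS = 64
KERNEL_SIZES = [3, 5, 8, 10, 12]


def conv1d_params(kernel_size, in_channels, num_kernels):
    """Weights plus biases of a 1D convolution (standard formula)."""
    return kernel_size * in_channels * num_kernels + num_kernels


# Assumption: every CNN layer takes the embedding output as its input.
params_per_layer = {
    k: conv1d_params(k, EMBEDDING_DIM, NUM_KERNELS) for k in KERNEL_SIZES
}
total_conv_params = sum(params_per_layer.values())

for k, n in params_per_layer.items():
    print(f"kernel size {k:>2}: {n} trainable parameters")
print(f"all five CNN layers: {total_conv_params}")
```

Under this assumption the widest kernel (size 12) holds four times as many weights as the narrowest (size 3), since the count grows linearly with kernel size.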
The 5-fold cross-validation results of the training dataset.
| Model | Precision | Coverage | Accuracy | Absolute True | Absolute False |
|---|---|---|---|---|---|
| MPMABP | 0.731 ± 0.011 | 0.738 ± 0.012 | 0.722 ± 0.010 | 0.696 ± 0.013 | 0.099 ± 0.006 |
| MLBP | 0.697 ± 0.012 | 0.701 ± 0.014 | 0.695 ± 0.012 | 0.685 ± 0.011 | 0.109 ± 0.004 |
Note: ± indicates standard deviation over the 5-fold cross-validations.
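The five column headings are set-based multi-label measures. The source does not restate their formulas, so the sketch below assumes the definitions commonly used for multi-label peptide predictors: the overlap between the true and predicted label sets of each peptide, averaged over all peptides. The function name `multilabel_metrics` and the three example peptides are hypothetical.

```python
def multilabel_metrics(true_sets, pred_sets, num_labels):
    """Average set-based multi-label scores over all samples.

    Assumed definitions (not restated in the source), with T the true
    label set and P the predicted label set of one sample:
      precision      = |T & P| / |P|
      coverage       = |T & P| / |T|
      accuracy       = |T & P| / |T | P|
      absolute true  = 1 if T == P else 0
      absolute false = (|T | P| - |T & P|) / num_labels
    """
    n = len(true_sets)
    totals = dict.fromkeys(
        ["precision", "coverage", "accuracy", "absolute_true", "absolute_false"],
        0.0,
    )
    for t, p in zip(true_sets, pred_sets):
        inter, union = len(t & p), len(t | p)
        # Empty sets contribute 0 instead of dividing by zero.
        totals["precision"] += inter / len(p) if p else 0.0
        totals["coverage"] += inter / len(t) if t else 0.0
        totals["accuracy"] += inter / union if union else 0.0
        totals["absolute_true"] += 1.0 if t == p else 0.0
        totals["absolute_false"] += (union - inter) / num_labels
    return {name: value / n for name, value in totals.items()}


# Hypothetical labels for three peptides, five possible activities.
true_labels = [{"ACP", "AMP"}, {"AHP"}, {"ADP", "AIP"}]
predicted = [{"ACP", "AMP"}, {"AHP", "ACP"}, {"ADP"}]
scores = multilabel_metrics(true_labels, predicted, num_labels=5)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Higher is better for the first four scores. Absolute false is an error rate, so lower is better.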
The independent test results.
| Model | Precision | Coverage | Accuracy | Absolute True | Absolute False |
|---|---|---|---|---|---|
| MPMABP | 0.728 | 0.749 | 0.727 | 0.704 | 0.101 |
| MLBP | 0.710 | 0.720 | 0.709 | 0.697 | 0.106 |
| CLR | 0.667 | 0.677 | 0.666 | 0.655 | 0.133 |
| RAKEL | 0.649 | 0.648 | 0.648 | 0.647 | 0.141 |
| MLDF | 0.649 | 0.649 | 0.648 | 0.646 | 0.119 |
| RBRL | 0.650 | 0.651 | 0.649 | 0.646 | 0.140 |
Figure 2. Predictive performance on single-functional bioactive peptides. (A) Comparison of MPMABP with other methods on sensitivity (SN). (B) Comparison of MPMABP with other methods on specificity (SP).
Comparison with four existing state-of-the-art methods.
| Type | MPMABP | IAMP-RAAC | mAHTPred | AHPPred | AIPpred |
|---|---|---|---|---|---|
| AMP | 0.872 | 0.788 | - | - | - |
| ACP | 0.505 | 0.333 | - | - | - |
| AHP | 0.889 | - | 0.986 | 0.361 | - |
| AIP | 0.914 | - | - | - | 0.827 |
Comparison of MPMABP with two other algorithms by case study.
| Sequence | True labels | MPMABP prediction | MLBP prediction | MultiPep prediction |
|---|---|---|---|---|
| ACP-499 | ACP | ACP | ACP | AMP/anti-virus/ACP/anti-bacterial |
| ADP-156 | ADP | ADP | ADP | ACE inhibitor/AHP |
| AHP-665 | AHP | AHP | AHP | Neuropeptide/peptide hormone |
| AIP-1046 | AIP | AIP | AIP | AMP/anti-bacterial |
| AMP-1389 | AMP | AMP | AMP | AMP/anti-bacterial |
| ACP-29 | ACP/AMP | ACP/AMP | AMP | ACP/anti-bacterial/anti-fungal |
| ACP-220 | ACP/AMP | ACP/AMP | None | AMP/anti-bacterial/anti-fungal |
| ADP-463 | ADP/AHP | ADP/AHP | ADP | ADP |
| AIP-1050 | AIP/ADP | ADP/AIP | ADP/AHP | ADP |
| AHP-483 | AHP/ACP | AHP | AHP | Antioxidative/ACE inhibitor/AHP |
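Because label sets are unordered, `AIP/ADP` and `ADP/AIP` in the table denote the same prediction. A small sketch of that comparison follows, using the MPMABP and MLBP columns exactly as listed above. MultiPep is left out because its labels in the table come from a different vocabulary. The helper `to_set` is a hypothetical name.

```python
# (sequence, true labels, MPMABP prediction, MLBP prediction) from the table.
CASES = [
    ("ACP-499", "ACP", "ACP", "ACP"),
    ("ADP-156", "ADP", "ADP", "ADP"),
    ("AHP-665", "AHP", "AHP", "AHP"),
    ("AIP-1046", "AIP", "AIP", "AIP"),
    ("AMP-1389", "AMP", "AMP", "AMP"),
    ("ACP-29", "ACP/AMP", "ACP/AMP", "AMP"),
    ("ACP-220", "ACP/AMP", "ACP/AMP", "None"),
    ("ADP-463", "ADP/AHP", "ADP/AHP", "ADP"),
    ("AIP-1050", "AIP/ADP", "ADP/AIP", "ADP/AHP"),
    ("AHP-483", "AHP/ACP", "AHP", "AHP"),
]


def to_set(labels):
    """Split 'ACP/AMP' into {'ACP', 'AMP'}; 'None' means no label predicted."""
    return set() if labels == "None" else set(labels.split("/"))


exact = {"MPMABP": 0, "MLBP": 0}
missed = {"MPMABP": [], "MLBP": []}
for name, truth, mpmabp, mlbp in CASES:
    for model, pred in (("MPMABP", mpmabp), ("MLBP", mlbp)):
        if to_set(pred) == to_set(truth):
            exact[model] += 1
        else:
            missed[model].append(name)

print(exact)
print(missed)
```

On these ten peptides MPMABP reproduces the full label set nine times and MLBP five times. Both miss the ACP label of AHP-483.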
The predictive performances of MPMABPwr and MPMABPsc.
| Model | Precision | Coverage | Accuracy | Absolute True | Absolute False |
|---|---|---|---|---|---|
| MPMABPwr a | 0.702 | 0.723 | 0.701 | 0.678 | 0.108 |
| MPMABPwr b | 0.697 ± 0.013 | 0.704 ± 0.022 | 0.688 ± 0.013 | 0.663 ± 0.013 | 0.105 ± 0.003 |
| MPMABPsc a | 0.697 | 0.719 | 0.696 | 0.672 | 0.109 |
| MPMABPsc b | 0.704 ± 0.019 | 0.710 ± 0.023 | 0.694 ± 0.019 | 0.668 ± 0.018 | 0.103 ± 0.006 |
Note: a denotes the independent test and b the 5-fold cross-validation.
The predictive performance of 5-fold cross-validation.
| Model | Precision | Coverage | Accuracy | Absolute True | Absolute False |
|---|---|---|---|---|---|
| MPMABP | 0.731 ± 0.011 | 0.738 ± 0.012 | 0.722 ± 0.010 | 0.696 ± 0.013 | 0.099 ± 0.006 |
| No CNN | 0.724 ± 0.011 | 0.729 ± 0.010 | 0.714 ± 0.011 | 0.689 ± 0.013 | 0.101 ± 0.004 |
| No LSTM | 0.708 ± 0.017 | 0.708 ± 0.014 | 0.698 ± 0.017 | 0.678 ± 0.020 | 0.102 ± 0.004 |
| Degeneration | 0.725 ± 0.015 | 0.733 ± 0.015 | 0.716 ± 0.014 | 0.688 ± 0.013 | 0.101 ± 0.009 |
The predictive performance of the independent test.
| Model | Precision | Coverage | Accuracy | Absolute True | Absolute False |
|---|---|---|---|---|---|
| MPMABP | 0.728 | 0.749 | 0.727 | 0.704 | 0.101 |
| No CNN | 0.676 | 0.688 | 0.675 | 0.662 | 0.105 |
| No LSTM | 0.659 | 0.670 | 0.658 | 0.645 | 0.109 |
| Degeneration | 0.690 | 0.708 | 0.689 | 0.670 | 0.111 |
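The contribution of each component is easier to read as a difference from the full model. The sketch below subtracts the independent-test values of each variant from the MPMABP row. It uses only numbers from the table above, and the variable names are illustrative.

```python
# Independent-test accuracy and absolute-true values from the table above.
RESULTS = {
    "MPMABP": {"accuracy": 0.727, "absolute_true": 0.704},
    "No CNN": {"accuracy": 0.675, "absolute_true": 0.662},
    "No LSTM": {"accuracy": 0.658, "absolute_true": 0.645},
    "Degeneration": {"accuracy": 0.689, "absolute_true": 0.670},
}

full = RESULTS["MPMABP"]
# Drop of each variant relative to the full model, rounded to 3 decimals.
drops = {
    variant: {
        metric: round(full[metric] - value, 3) for metric, value in row.items()
    }
    for variant, row in RESULTS.items()
    if variant != "MPMABP"
}

for variant, row in drops.items():
    print(
        f"{variant}: accuracy -{row['accuracy']:.3f}, "
        f"absolute true -{row['absolute_true']:.3f}"
    )
```

On the independent test the "No LSTM" variant loses the most accuracy (0.069), followed by "No CNN" (0.052) and "Degeneration" (0.038).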
Figure 3. Heat map of the amino acid distribution in five types of bioactive peptides.
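A distribution of this kind can be obtained by counting residues per activity class and normalising by the total. A minimal sketch follows, assuming one-letter amino acid sequences grouped by class. The two short sequences per class are made up for illustration and are not taken from the dataset, and `residue_frequencies` is a hypothetical helper name.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def residue_frequencies(sequences):
    """Relative frequency of each standard amino acid across the sequences."""
    counts = Counter(
        res for seq in sequences for res in seq if res in AMINO_ACIDS
    )
    total = sum(counts.values())
    return {aa: (counts[aa] / total if total else 0.0) for aa in AMINO_ACIDS}


# Hypothetical toy sequences, NOT taken from the paper's dataset.
toy_peptides = {
    "ACP": ["KKLLK", "KWKLF"],
    "AHP": ["IPP", "VPP"],
}

for activity, seqs in toy_peptides.items():
    freqs = residue_frequencies(seqs)
    top = max(freqs, key=freqs.get)
    print(activity, top, round(freqs[top], 2))
```

Each class yields one row of 20 frequencies, which is the shape a heat map with classes on one axis and residues on the other would need.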
Figure 4. Venn diagram of the dataset.
Figure 5. The architecture of the MPMABP. Conv1D represents the CNN layer, MaxPooling1D the pooling layer, and Dense the fully connected layer.