Lina Zhang, Runtao Yang, Chengjin Zhang.
Abstract
Anti-angiogenic peptides perform distinct physiological functions and are potential therapies for angiogenesis-related diseases. Accurate identification of anti-angiogenic peptides may provide significant clues for understanding the essential angiogenic homeostasis within tissues and for developing antineoplastic therapies. In this study, an ensemble predictor is proposed for anti-angiogenic peptide prediction by fusing the individual classifier with the best sensitivity and the individual classifier with the best specificity. We investigate the predictive capabilities of various feature spaces with respect to the corresponding optimal individual classifiers and ensemble classifiers. The accuracy and Matthews Correlation Coefficient (MCC) of the ensemble classifier trained on Bi-profile Bayes (BpB) features are 0.822 and 0.649, respectively, the highest among the investigated prediction models. Discriminative features are obtained from BpB using the Relief algorithm followed by the Incremental Feature Selection (IFS) method. The sensitivity, specificity, accuracy, and MCC of the ensemble classifier trained on the discriminative features reach 0.776, 0.888, 0.832, and 0.668, respectively. Experimental results indicate that the proposed method outperforms the previous study for anti-angiogenic peptide prediction (accuracy 0.832 versus 0.748; MCC 0.668 versus 0.50).
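The sensitivity (Sn), specificity (Sp), accuracy (Acc), and MCC values quoted in the abstract all derive from confusion-matrix counts. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics` and the example counts are assumptions: the counts are hypothetical values for a balanced set of 107 positive and 107 negative peptides, chosen only because they reproduce the reported rates after rounding.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity, accuracy, and MCC from confusion counts."""
    sn = tp / (tp + fn)                      # true positive rate
    sp = tn / (tn + fp)                      # true negative rate
    acc = (tp + tn) / (tp + fn + tn + fp)    # overall fraction correct
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, sp, acc, mcc

# Hypothetical counts (an assumption, not taken from the source).
sn, sp, acc, mcc = binary_metrics(tp=83, fn=24, tn=95, fp=12)
print(round(sn, 3), round(sp, 3), round(acc, 3), round(mcc, 3))
# prints: 0.776 0.888 0.832 0.668
```

With equally sized positive and negative classes, accuracy reduces to the mean of Sn and Sp, which is consistent with every row of the result tables.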
Year: 2018 PMID: 30218091 PMCID: PMC6138733 DOI: 10.1038/s41598-018-32443-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The construction process of the proposed anti-angiogenic peptide prediction model.
Prediction performance of various feature spaces with respect to the corresponding optimal individual classifiers.
| Feature Space | Optimal Classifier | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|---|
| BpB | NB | 0.682 | 0.925 | 0.804 | 0.626 | 0.902 |
| CTD | RBFNetwork | 0.551 | 0.766 | 0.659 | 0.325 | 0.698 |
| DFT | NNA | 0.692 | 0.579 | 0.636 | 0.273 | 0.636 |
| BpB + CTD | RBFNetwork | 0.701 | 0.804 | 0.752 | 0.507 | 0.806 |
| BpB + DFT | RF | 0.710 | 0.850 | 0.780 | 0.566 | 0.843 |
| CTD + DFT | RF | 0.664 | 0.682 | 0.673 | 0.346 | 0.699 |
| BpB + CTD + DFT | RF | 0.673 | 0.794 | 0.734 | 0.471 | 0.802 |
Figure 2. ROC curves of various feature spaces with respect to the corresponding optimal individual classifiers.
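The AUC column summarizes each ROC curve as a single number. One equivalent way to compute it is as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below illustrates that pairwise definition; the function name `auc_from_scores` and the scores are hypothetical, and nothing here is taken from the authors' implementation.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical classifier scores, for illustration only.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
auc = auc_from_scores(pos, neg)  # 8 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small benchmark set; a rank-based formulation would be preferable for large ones.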
Prediction performance of various feature spaces with respect to the corresponding optimal ensemble classifiers.
| Feature Space | Optimal Classifier | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|---|
| BpB | NB + LR | 0.766 | 0.879 | 0.822 | 0.649 | 0.870 |
| CTD | RBFNetwork + NNA | 0.617 | 0.57 | 0.593 | 0.187 | 0.676 |
| DFT | NB + NNA | 0.701 | 0.579 | 0.64 | 0.282 | 0.645 |
| BpB + CTD | NB + LR | 0.794 | 0.72 | 0.757 | 0.515 | 0.842 |
| BpB + DFT | NB + LR | 0.748 | 0.701 | 0.724 | 0.449 | 0.831 |
| CTD + DFT | NB + RF | 0.542 | 0.72 | 0.631 | 0.266 | 0.700 |
| BpB + CTD + DFT | NB + LR | 0.748 | 0.738 | 0.743 | 0.486 | 0.838 |
Figure 3. ROC curves of various feature spaces with respect to the corresponding optimal ensemble classifiers.
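Each ensemble row pairs two individual classifiers, for example NB + LR. This record does not state the fusion rule, so the sketch below assumes the simplest option: averaging the two classifiers' positive-class probabilities and thresholding the mean at 0.5. The function name `fuse_probabilities`, the threshold, and the probability values are all hypothetical.

```python
def fuse_probabilities(p_a, p_b, threshold=0.5):
    """Average two classifiers' positive-class probabilities, then threshold.

    Assumed fusion rule for illustration; the source does not specify one.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both classifiers must score the same peptides")
    fused = [(a + b) / 2.0 for a, b in zip(p_a, p_b)]
    labels = [1 if p >= threshold else 0 for p in fused]
    return fused, labels

# Hypothetical outputs of a high-specificity and a high-sensitivity classifier.
p_nb = [0.95, 0.40, 0.10, 0.55]
p_lr = [0.70, 0.75, 0.30, 0.35]
fused, labels = fuse_probabilities(p_nb, p_lr)
```

In this toy case the second peptide is rejected by the first classifier alone (0.40) but accepted after fusion, which is the kind of sensitivity gain the NB + LR row shows relative to NB on BpB features (0.766 versus 0.682).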
Individual performance of the NB classifier on different feature spaces.
| Feature Space | Classifier | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|---|
| BpB | NB | 0.682 | 0.925 | 0.804 | 0.626 | 0.902 |
| BpB + CTD | NB | 0.626 | 0.832 | 0.734 | 0.478 | 0.729 |
| BpB + DFT | NB | 0.570 | 0.804 | 0.687 | 0.384 | 0.704 |
| BpB + CTD + DFT | NB | 0.589 | 0.841 | 0.715 | 0.444 | 0.715 |
Individual performance of the LR classifier on different feature spaces.
| Feature Space | Classifier | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|---|
| BpB | LR | 0.785 | 0.748 | 0.766 | 0.533 | 0.766 |
| BpB + CTD | LR | 0.757 | 0.720 | 0.738 | 0.477 | 0.782 |
| BpB + DFT | LR | 0.738 | 0.682 | 0.710 | 0.421 | 0.710 |
| BpB + CTD + DFT | LR | 0.748 | 0.710 | 0.729 | 0.458 | 0.729 |
Figure 4. The IFS curve: the accuracy of the prediction model trained on different feature subsets.
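Incremental Feature Selection can be sketched as follows: starting from a ranked feature list (here the ranking would come from Relief), features are added one at a time, the model is re-evaluated on each growing subset, and the subset with the best accuracy is kept. This is a minimal illustration under stated assumptions, not the authors' implementation. The `evaluate` callback is a hypothetical stand-in for training and cross-validating the classifier, and the feature names and accuracies are toy values.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """Evaluate each top-k prefix of a ranked feature list and keep the best.

    ranked_features: feature names ordered from most to least relevant.
    evaluate: callable mapping a feature list to an accuracy in [0, 1].
    Returns the best subset, its accuracy, and the (k, accuracy) curve.
    """
    best_subset, best_acc, curve = [], -1.0, []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        acc = evaluate(subset)
        curve.append((k, acc))
        if acc > best_acc:  # strict '>' keeps the smallest subset on ties
            best_subset, best_acc = subset, acc
    return best_subset, best_acc, curve

# Toy stand-in for model evaluation: accuracy depends only on subset size.
toy_acc = {1: 0.70, 2: 0.78, 3: 0.83, 4: 0.83, 5: 0.80}
ranked = ["f3", "f1", "f5", "f2", "f4"]
subset, acc, curve = incremental_feature_selection(
    ranked, lambda s: toy_acc[len(s)]
)
```

The returned `curve` holds the points an IFS plot displays: accuracy against the number of top-ranked features used.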
Prediction results with and without the proposed feature selection method.
| Method | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|
| Without feature selection | 0.766 | 0.879 | 0.822 | 0.649 | 0.870 |
| With feature selection | 0.776 | 0.888 | 0.832 | 0.668 | 0.872 |
Figure 5. ROC curves with and without the proposed feature selection method.
Performance comparison with the existing method on the benchmark dataset.
| Method | Sn | Sp | Acc | MCC | AUC |
|---|---|---|---|---|---|
| Ref.[ | 0.757 | 0.738 | 0.748 | 0.50 | — |
| This study | 0.776 | 0.888 | 0.832 | 0.668 | 0.872 |