Hong-Fei Li, Xian-Fang Wang, Hua Tang.
Abstract
Bacteriophages are viruses that infect host bacteria, and they have been applied in the treatment of pathogenic bacterial infections. Phage enzymes and hydrolases play the most important role in the destruction of bacterial cells. Correctly identifying the hydrolases encoded by phages not only benefits the study of their function but is also conducive to antibacterial drug discovery. Thus, this work aims to recognize the enzymes and hydrolases in phages. A combination of different features was used to represent the phage and hydrolase samples. A feature selection technique based on analysis of variance (ANOVA) was developed to optimize the features. Classification was performed with a support vector machine (SVM). The prediction process consists of two steps: the first identifies phage enzymes, and the second determines whether a phage enzyme is a hydrolase. The jackknife cross-validated results showed that the method achieves overall accuracies of 85.1% and 94.3%, respectively, for the two predictions, demonstrating that the proposed method is promising.
Keywords: analysis of variance; bacteriophage enzymes; classification; hydrolase; sequence feature
Year: 2020 PMID: 32266225 PMCID: PMC7105632 DOI: 10.3389/fbioe.2020.00183
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
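The jackknife (leave-one-out) cross-validation named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the classifier is a simple nearest-centroid stand-in rather than the SVM used in the paper, and the function names (`nearest_centroid_fit`, `nearest_centroid_predict`, `jackknife_accuracy`) are hypothetical.

```python
def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the paper's SVM): store one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def nearest_centroid_predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def jackknife_accuracy(X, y, fit, predict):
    """Hold out each sample once, train on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]   # every sample except the i-th
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        if predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy data (made up for illustration): two well-separated classes.
X_toy = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
y_toy = [0, 0, 0, 1, 1, 1]
toy_accuracy = jackknife_accuracy(X_toy, y_toy, nearest_centroid_fit, nearest_centroid_predict)
```

Any classifier exposing the same `fit`/`predict` pair could be passed in. In the two-step scheme the abstract describes, the same loop would presumably be run once per prediction task.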
Results of using different features for phage enzyme prediction.
| Combined vector features | Original features: accuracy | Original features: dimensions | Optimal features: accuracy | Optimal features: dimensions |
| GGDC + PseAAC | 74.5% | 550 | 83.1% | 154 |
| GTPC + CTD | 67.8% | 164 | 77.6% | 35 |
| GGDC + PseAAC + GTPC + CTD | 72.9% | 714 | 85.1% | 191 |
FIGURE 1. A plot showing the F-values for (A) discriminating phage enzymes from nonenzymes and (B) discriminating phage hydrolases from other enzymes.
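For readers unfamiliar with the F-values plotted in Figure 1, the sketch below computes a one-way ANOVA F-value for each feature and ranks features by it. It assumes the standard definition (between-class variance divided by within-class variance); the function names and toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean


def anova_f_value(groups):
    """One-way ANOVA F-value for one feature; `groups` holds its values per class."""
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand_mean = sum(sum(g) for g in groups) / n
    # Between-class sum of squares (k - 1 degrees of freedom)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-class sum of squares (n - k degrees of freedom)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(X, y):
    """Return (feature indices sorted by decreasing F-value, list of F-values)."""
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, lab in zip(X, y) if lab == c] for c in labels]
        scores.append(anova_f_value(groups))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X_demo = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0], [4.0, 5.0], [5.0, 4.0], [6.0, 6.0]]
y_demo = [0, 0, 0, 1, 1, 1]
order_demo, scores_demo = rank_features(X_demo, y_demo)
```

An optimal subset like the ones reported in the tables could then be sought by adding features in ranked order and keeping the subset size that gives the best cross-validated accuracy. That search strategy is an assumption here, not something this record states.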
The comparison of different classifiers for predicting phage enzymes.
| Classifier | Sn | Sp | Ac | MCC | AUC |
| KNN | 0.98 | 0.16 | 0.702 | 0.232 | 0.664 |
| RF | 0.73 | 0.76 | 0.752 | 0.490 | 0.798 |
| SVM | 0.83 | 0.88 | 0.851 | 0.703 | 0.897 |
| MLP | 0.77 | 0.84 | 0.812 | 0.610 | 0.858 |
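The Sn, Sp, Ac, and MCC columns presumably denote sensitivity, specificity, accuracy, and the Matthews correlation coefficient. Assuming the usual confusion-matrix definitions, they can be computed as in the sketch below. The counts in the example are invented for illustration and do not correspond to the paper's data. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Sn, Sp, Ac, and MCC from the four confusion-matrix counts."""
    sn = tp / (tp + fn)                    # sensitivity: recall on positives
    sp = tn / (tn + fp)                    # specificity: recall on negatives
    ac = (tp + tn) / (tp + fn + tn + fp)   # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when a margin is empty; return 0.0 in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "Ac": ac, "MCC": mcc}


# Hypothetical counts: 50 positives (40 found), 50 negatives (45 found).
example = binary_metrics(tp=40, fn=10, tn=45, fp=5)
```

Because Ac weights Sn and Sp by class size, a classifier such as KNN in the table above can pair a very high Sn with a low Sp and still report a moderate Ac, which is why MCC is a useful companion metric.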
Results of using different features for discriminating phage hydrolases from other enzymes.
| Combined vector features | Original features: accuracy | Original features: dimensions | Optimal features: accuracy | Optimal features: dimensions |
| GGDC + PseAAC | 75.8% | 550 | 94.3% | 61 |
| GTPC + CTD | 76.6% | 164 | 86.4% | 37 |
| GGDC + PseAAC + GTPC + CTD | 75.8% | 714 | 92.7% | 89 |
The comparison of different classifiers for discriminating phage hydrolases from other enzymes.
| Classifier | Sn | Sp | Ac | MCC | AUC |
| KNN | 0.70 | 0.89 | 0.814 | 0.588 | 0.863 |
| RF | 0.91 | 0.80 | 0.86 | 0.722 | 0.898 |
| SVM | 0.96 | 0.93 | 0.943 | 0.886 | 0.961 |
| MLP | 0.93 | 0.91 | 0.927 | 0.837 | 0.948 |
Comparison of predictive performance with the existing method.
| Prediction task | Method | Ac | Sp | Sn |
| Discriminating phage enzymes from nonenzymes | ( | 84.3% | 81.7% | 87.1% |
| Discriminating phage enzymes from nonenzymes | This study | 85.1% | 88.0% | 83.0% |
| Discriminating phage hydrolases from other enzymes | ( | 93.5% | 92.8% | 94.5% |
| Discriminating phage hydrolases from other enzymes | This study | 94.3% | 93.0% | 96.0% |