Peng-Mian Feng, Hui Ding, Wei Chen, Hao Lin.
Abstract
Knowledge of the protein composition of phage virions is a key step toward understanding the functions of phage virion proteins. However, experimental identification of virion proteins is time-consuming and expensive. Thus, it is highly desirable to develop novel computational methods for phage virion protein identification. In this study, a Naïve Bayes based method was proposed to predict phage virion proteins using amino acid composition and dipeptide composition. To remove redundant information, a novel feature selection technique was employed to single out optimized features. In the jackknife test, the proposed method achieved an accuracy of 79.15% for classifying phage virion and non-virion proteins, which matches or exceeds the accuracy of the other classifiers compared, together with the highest auROC (0.855). These results indicate that the proposed method could serve as an effective and promising high-throughput method in phage proteomics research.
Year: 2013 PMID: 23762187 PMCID: PMC3671239 DOI: 10.1155/2013/530696
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
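
The abstract evaluates a Naïve Bayes classifier with the jackknife (leave-one-out) test: each protein is held out once, the model is trained on the rest, and the held-out protein is predicted. The following is a minimal sketch of that procedure, not the authors' implementation. It assumes a Gaussian likelihood per feature (the record does not say which Naïve Bayes variant was used), and the function names and the toy data are hypothetical.

```python
import math


def fit_gaussian_nb(X, y):
    """Estimate a log prior plus per-feature mean and variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # Small constant keeps the variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, x):
    """Return the class with the highest log posterior for feature vector x."""
    def log_posterior(params):
        log_prior, means, variances = params
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=lambda label: log_posterior(model[label]))


def jackknife_accuracy(X, y):
    """Leave-one-out: train on all samples but one, test on the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_gaussian_nb(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += int(predict(model, X[i]) == y[i])
    return correct / len(X)


# Hypothetical toy data: label 1 = virion, label 0 = non-virion.
X = [[0.10, 0.20], [0.20, 0.10], [0.15, 0.25],
     [0.80, 0.90], [0.90, 0.80], [0.85, 0.95]]
y = [1, 1, 1, 0, 0, 0]
toy_accuracy = jackknife_accuracy(X, y)
```

Because every sample is used for testing exactly once, the jackknife gives a single deterministic accuracy for a given dataset, at the cost of training the model once per sample.
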
Predictive performance of Naïve Bayes based on different features. Sn: sensitivity; Sp: specificity; Acc: accuracy; auROC: area under the ROC curve.
| Feature dimensions | Sn (%) | Sp (%) | Acc (%) | auROC |
|---|---|---|---|---|
| 420 | 53.54 | 83.17 | 75.57 | 0.758 |
| 38 | 75.76 | 80.77 | 79.15 | 0.855 |
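
The 420 feature dimensions in the first row are consistent with 20 amino acid frequencies plus 400 (20 × 20) dipeptide frequencies. The sketch below shows one plausible way to build such a vector. The function name, the example sequence, and the handling of non-standard residue symbols are assumptions, not details taken from the study.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_SET = set(AMINO_ACIDS)
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def composition_features(sequence):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies."""
    seq = sequence.upper()
    # Count single residues, ignoring non-standard symbols (assumption).
    aa_counts = Counter(c for c in seq if c in AA_SET)
    # Count overlapping adjacent pairs, skipping pairs with a non-standard symbol.
    dp_counts = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AA_SET and seq[i + 1] in AA_SET
    )
    aa_total = sum(aa_counts.values()) or 1  # avoid division by zero
    dp_total = sum(dp_counts.values()) or 1
    aac = [aa_counts[a] / aa_total for a in AMINO_ACIDS]
    dpc = [dp_counts[d] / dp_total for d in DIPEPTIDES]
    return aac + dpc


# Hypothetical short sequence, used only to illustrate the output shape.
features = composition_features("MKVLAAGIK")
print(len(features))  # 420
```

A feature selection step, such as the one that reduced the vector to 38 dimensions here, would then keep only a subset of these 420 entries. The selection technique itself is not described in this record, so it is not sketched.
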
Comparison of Naïve Bayes with other methods by using optimized features.
| Classifier | Sn (%) | Sp (%) | Acc (%) | auROC |
|---|---|---|---|---|
| BayesNet | 68.69 | 79.81 | 76.22 | 0.799 |
| RBFnetwork | 72.73 | 82.21 | 79.15 | 0.839 |
| Random Forest | 55.56 | 84.62 | 75.24 | 0.802 |
| LogitBoost | 52.53 | 85.10 | 74.59 | 0.795 |
| SVM | 63.64 | 86.54 | 79.15 | 0.836 |
| J48 | 61.62 | 77.88 | 72.64 | 0.671 |
| Naïve Bayes | 75.76 | 80.77 | 79.15 | 0.855 |
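
The Sn, Sp, and Acc columns follow the standard confusion-matrix definitions of sensitivity, specificity, and accuracy. A minimal sketch of the arithmetic is given below. The counts are illustrative assumptions: they are not stated in this record and were chosen only because they reproduce the percentages in the Naïve Bayes row.

```python
def classification_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity, and accuracy from confusion counts."""
    sn = tp / (tp + fn)                     # fraction of positives correctly predicted
    sp = tn / (tn + fp)                     # fraction of negatives correctly predicted
    acc = (tp + tn) / (tp + fn + tn + fp)   # fraction of all samples correctly predicted
    return sn, sp, acc


# Illustrative counts (assumed, not from the record): 99 positives, 208 negatives.
sn, sp, acc = classification_metrics(tp=75, fn=24, tn=168, fp=40)
print(f"Sn={sn:.2%} Sp={sp:.2%} Acc={acc:.2%}")
# prints: Sn=75.76% Sp=80.77% Acc=79.15%
```

Accuracy alone can hide an imbalance between the two classes, which is why classifiers with the same Acc in the table above can still differ in Sn, Sp, and auROC.
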