Xiaowei Zhao, Jiakui Li, Yanxin Huang, Zhiqiang Ma, Minghao Yin.
Abstract
Bioluminescent proteins are important in many applications, such as gene expression analysis, drug discovery, bioluminescent imaging, toxicity determination, and DNA sequencing studies. Hence, the correct identification of bioluminescent proteins is of great importance, both for supporting genome annotation and for complementing experimental research into the functions of these proteins. However, few computational methods are available for identifying bioluminescent proteins. In this paper we therefore develop a new method for predicting bioluminescent proteins, using a model based on the position specific scoring matrix (PSSM) and auto covariance (AC). The proposed model reaches an accuracy of 85.17% in 10-fold cross-validation on the training dataset and 90.71% on the independent testing dataset. These results indicate that our predictor is a useful tool for predicting bioluminescent proteins. This is the first study in which evolutionary information and local sequence environment information have been successfully integrated for predicting bioluminescent proteins. A web server (BLPre) that implements the proposed predictor is freely available.
Keywords: bioluminescent proteins; evolutionary information; position specific scoring matrix; support vector machine
Year: 2012 PMID: 22489173 PMCID: PMC3317733 DOI: 10.3390/ijms13033650
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. Schematic representation of the transformation of each protein sequence into a PSSM-400 matrix.
Figure 2. Detailed system flow of the prediction system.
Figure 3. Accuracies of the prediction model with AC at different values of lg.
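The two encodings named in the captions above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the PSSM is given as one row of 20 scores per residue, in the column order of PSI-BLAST ASCII output, and it uses the commonly cited auto covariance definition, AC(j, lg) = 1/(L - lg) · Σ_i (P[i][j] - mean_j)(P[i+lg][j] - mean_j). The function names `pssm_400` and `auto_covariance`, the division by sequence length, and the handling of non-standard residues are all assumptions made for this example.

```python
# Column order of a PSI-BLAST ASCII PSSM (assumed input layout).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def pssm_400(sequence, pssm):
    """Collapse an L x 20 PSSM into a fixed-length vector of 400 values.

    Rows belonging to the same residue type are summed, giving a 20 x 20
    matrix that is flattened row by row. Dividing by the sequence length
    is an assumption of this sketch, not a detail taken from the paper.
    """
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    matrix = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        k = index.get(residue)
        if k is None:  # skip non-standard residues such as 'X'
            continue
        for j in range(20):
            matrix[k][j] += row[j]
    length = len(sequence)
    return [value / length for matrix_row in matrix for value in matrix_row]


def auto_covariance(pssm, max_lg):
    """Return AC features for lg = 1..max_lg over every PSSM column.

    The result has max_lg * 20 values, ordered by lg first and then by
    column. Each value is the mean product of centred scores that lie
    lg positions apart in the same column.
    """
    length = len(pssm)
    if not 1 <= max_lg < length:
        raise ValueError("max_lg must satisfy 1 <= max_lg < sequence length")
    n_cols = len(pssm[0])
    means = [sum(row[j] for row in pssm) / length for j in range(n_cols)]
    features = []
    for lg in range(1, max_lg + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - means[j]) * (pssm[i + lg][j] - means[j])
                for i in range(length - lg)
            )
            features.append(total / (length - lg))
    return features
```

Both functions return a vector whose length does not depend on the protein length, which is what allows sequences of different sizes to be fed to the same classifier.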
The performance comparison of different encoding strategies on the training dataset.
| Method | | | Accuracy (%) |
|---|---|---|---|
| PSSM-400 | 72.00 | 86.33 | 79.32 |
| PSSM-AC | 79.33 | 91.00 | 85.17 |
| BLProt [ | 74.47 | 84.21 | 80.00 |
Figure 4. ROC curves calculated from the ten-fold cross-validation of the PSSM and PSSM-AC encoding strategies.
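For readers unfamiliar with how such a curve is built, the sketch below shows one generic way to derive ROC points and the area under the curve from classifier scores pooled over cross-validation folds. It is an illustration only: the record does not describe how the authors produced their curves, and the function names `roc_points` and `auc` and the toy scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points.

    labels are 1 for positive and 0 for negative samples; scores are
    classifier outputs where a higher value means "more likely positive".
    The threshold is swept from the highest score downwards, and samples
    with tied scores are added to the counts together.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy example with made-up scores (not data from the paper).
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
area = auc(curve)
```

In this toy case eight of the nine positive-negative pairs are ranked correctly, so the area comes out as 8/9.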