Luo Xu1, Zhirui Guo1, Xiao Liu2. 1. School of Microelectronics and Communication Engineering, Chongqing University, 174 ShaPingBa District, Chongqing, 400044, China. 2. School of Microelectronics and Communication Engineering, Chongqing University, 174 ShaPingBa District, Chongqing, 400044, China. liuxiao@cqu.edu.cn.
Abstract
BACKGROUND: Rapid identification of new essential genes is necessary for understanding biological mechanisms and for identifying potential targets for antimicrobial drugs. Many computational methods have been proposed for this task. OBJECTIVES: To construct an essential-gene classifier that applies to a wider range of organisms, and to study the redundancy of the features used to predict essential genes. METHODS: We designed a 57-12-1 artificial neural network model to predict the essential genes of 31 prokaryotic genomes. Four evaluation schemes were applied to assess predictive performance: self-prediction of each organism, the leave-one-genome-out method, prediction of all organisms from a single organism, and self-prediction of all organisms. In addition, the 57 features used in the artificial neural network model were analyzed by weighted principal component analysis to screen for the key features most strongly related to gene essentiality. RESULTS: Compared with previous studies, our models showed better generalizability. Furthermore, the feature screening reduced the number of features to 29 while maintaining stable overall prediction performance, suggesting that some features are redundant for predicting gene essentiality and that the screened features carry the more important biological information. CONCLUSION: This study demonstrated the effectiveness and generalizability of our artificial neural network model. In addition, the screened features could be used as key features in computational analysis and biological experiments.
Keywords:
Essential genes; Feature selection; Neural network; Principal component analysis
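As a rough illustration of the setup described in the abstract, the sketch below reads "57-12-1" in the usual way (57 input features, 12 hidden units, 1 output) and shows how leave-one-genome-out splits over 31 genomes could be generated. The sigmoid activation, the weight initialisation, the placeholder genome names, and all function names are assumptions made for illustration; the abstract does not specify them, and training is not shown.

```python
import math
import random

# Layer sizes taken from the abstract's "57-12-1" description.
N_INPUT, N_HIDDEN, N_OUTPUT = 57, 12, 1


def init_network(seed=0):
    """Random weights for a 57-12-1 network (initialisation is an assumption)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(N_INPUT)] for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(N_HIDDEN)]
    b2 = 0.0
    return w1, b1, w2, b2


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(network, features):
    """Forward pass: 57 features -> 12 hidden units -> 1 score in (0, 1)."""
    w1, b1, w2, b2 = network
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(w1, b1)]
    return sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)


def leave_one_genome_out(genome_ids):
    """Yield (held_out, training_genomes): each genome is held out once
    while the remaining genomes form the training set."""
    for held_out in genome_ids:
        yield held_out, [g for g in genome_ids if g != held_out]


# Hypothetical usage: 31 placeholder genome names and a dummy feature vector.
genomes = [f"genome_{i}" for i in range(31)]
splits = list(leave_one_genome_out(genomes))
score = predict(init_network(), [0.5] * N_INPUT)
```

Each of the 31 splits trains on 30 genomes and tests on the remaining one, which is what makes this scheme a direct check of cross-organism generalizability.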