Jianying Lin, Hui Chen, Shan Li, Yushuang Liu, Xuan Li, Bin Yu.
Abstract
Discovering and accurately locating drug targets is of great significance for the research and development of new drugs. As an alternative to traditional drug development, machine learning algorithms can predict drug targets by mining existing data. Because this approach is fast and inexpensive, it has received increasing attention in recent years. In this paper, we propose a novel method for predicting druggable proteins. First, features of the protein sequence are extracted by combining Chou's pseudo amino acid composition (PseAAC), dipeptide composition (DPC) and reduced sequence (RS), giving a 591-dimensional feature vector for each protein in the drug target dataset. Then, the features of the druggable proteins dataset are selected by a genetic algorithm (GA). Finally, we use Bagging ensemble learning to improve an SVM classifier and obtain the final prediction model. The prediction accuracy reaches 93.78% under 5-fold cross-validation, and the method is compared with other state-of-the-art prediction methods. The results indicate that the proposed method has high reference value for the prediction of potential drug targets and can play a key role in drug research and development. The source code and all datasets are available at https://github.com/QUST-AIBBDRC/GA-Bagging-SVM.
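To make the feature extraction step concrete, here is a minimal sketch of dipeptide composition (DPC) alone, assuming the common definition in which each of the 400 ordered pairs of the 20 standard amino acids is counted over adjacent residues and normalized by the total number of counted pairs. The abstract does not state the exact normalization used, and the function name, constants and example sequence below are illustrative rather than taken from the authors' repository.

```python
from itertools import product

# Assumption: the 20 standard amino acids, in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, e.g. "AA", "AC", ..., "YY".
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list of relative dipeptide frequencies.

    Hypothetical helper: pairs containing a non-standard residue are skipped.
    """
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    for i in range(len(seq) - 1):
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence too short or made only of non-standard residues.
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


# Made-up example sequence, used only to exercise the function.
features = dipeptide_composition("MKTAYIAKQR")
print(len(features))
```

In a pipeline like the one described, a vector of this kind would be concatenated with the PseAAC and RS features before feature selection; those parts are not sketched here.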
Keywords: Bagging; Druggable proteins; Ensemble classifier; Feature extraction; Genetic algorithm; Support vector machine
Year: 2019; PMID: 31521251; DOI: 10.1016/j.artmed.2019.07.005
Source DB: PubMed; Journal: Artif Intell Med; ISSN: 0933-3657; Impact factor: 5.326