Zhe Yang1, Juan Wang1, Jia Yang2, Zhi Qi2, Jiahao He3.
Abstract
BACKGROUND: We study proteins with binding functions in Elymus nutans. Identifying protein function is essential to biological research, and machine learning methods have been widely used for protein prediction.
METHODS: We used BLAST to annotate the functions of Elymus nutans proteins, and machine learning methods to recognize proteins that the software could not annotate, focusing on proteins with binding functions. Features were extracted by four algorithms and then selected with a mutual information estimator. Three classifiers were constructed based on the K-nearest neighbour and gradient boosting algorithms.
RESULTS AND CONCLUSION: Experimental results show 848 proteins with ATP binding function, 113 with heme binding function, 315 with zinc-ion binding function, 135 with GTP binding function and 21 with ADP binding function. Furthermore, we successfully predicted the functions of 10 special protein sequences whose function annotations could not be obtained by sequence alignment against seven well-known protein databases. Of these, seven have ATP binding function, one has heme binding function, one has zinc-ion binding function and one has GTP binding function.
Copyright © Bentham Science Publishers; for any queries, please email epub@benthamscience.net.
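The pipeline described in METHODS (extract features, rank them by mutual information, classify) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption for illustration: the abstract does not name its four feature algorithms, so amino acid composition stands in for them; the mutual information estimate is a simple equal-width-binned plug-in estimate, not necessarily the estimator the authors used; the k-nearest-neighbour classifier is hand-written; the gradient boosting classifier is not shown; and the four training sequences are invented toy strings, not Elymus nutans data.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def composition(seq):
    """Amino acid composition: fraction of each standard residue (assumed feature set)."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS) or 1
    return [counts[a] / total for a in AMINO_ACIDS]

def mutual_information(feature, labels, bins=4):
    """Plug-in MI estimate (nats) between an equal-width-binned feature and class labels."""
    lo, hi = min(feature), max(feature)
    width = (hi - lo) / bins or 1.0  # constant feature: fall back to a single bin
    binned = [min(int((x - lo) / width), bins - 1) for x in feature]
    n = len(labels)
    joint = Counter(zip(binned, labels))
    px, py = Counter(binned), Counter(labels)
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())

def select_top(X, y, k):
    """Indices of the k features with the highest estimated MI with the labels."""
    scores = [mutual_information([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def knn_predict(train_X, train_y, query, k=3):
    """Majority label among the k training points nearest to query (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Invented toy sequences and labels, for illustration only.
train = [
    ("GKSGKSTLLGAGKT", "ATP"),
    ("GSGKTTLAGKSGLL", "ATP"),
    ("CHCCHHCPECGKCF", "zinc"),
    ("CCHHCKCPHCEKCF", "zinc"),
]
X = [composition(seq) for seq, _ in train]
y = [label for _, label in train]

keep = select_top(X, y, k=5)  # retain the 5 most informative features

def reduce_features(row):
    return [row[j] for j in keep]

pred = knn_predict([reduce_features(r) for r in X], y,
                   reduce_features(composition("CHCKCPHCCEHF")), k=3)
print(pred)
```

With real data, each binding function would need far more labelled sequences than this, and a held-out set to estimate accuracy before predicting on unannotated proteins.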
Keywords: ATP; GTP; Protein; binding function; feature; machine learning
Year: 2020
PMID: 32223731
DOI: 10.2174/1386207323666200330120154
Source DB: PubMed
Journal: Comb Chem High Throughput Screen
ISSN: 1386-2073
Impact factor: 1.339