Hao Wang, Qilemuge Xi, Pengfei Liang, Lei Zheng, Yan Hong, Yongchun Zuo.
Abstract
Enzymes have been proven to play considerable roles in disease diagnosis and biological functions. Feature extraction that truly reflects the intrinsic properties of proteins is the most critical step in the automatic identification of enzymes. Although many feature extraction methods have been proposed, some challenges remain. In this study, we developed a predictor called IHEC_RAAC, which can identify whether a protein is a human enzyme and distinguish the function of the human enzyme. To improve the feature representation ability, protein sequences were encoded by a new feature vector called 'reduced amino acid cluster'. We calculated 673 amino acid reduction alphabets to determine the optimal feature representation scheme. The tenfold cross-validation test showed that IHEC_RAAC identified human enzymes with an accuracy of 74.66% and further discriminated the human enzyme classes with an accuracy of 54.78%, which were 2.06% and 8.68% higher than the state-of-the-art predictors, respectively. Additionally, the results from the independent dataset indicated that IHEC_RAAC can effectively predict human enzymes and human enzyme classes, providing further guidance for protein research. A user-friendly web server, IHEC_RAAC, is freely accessible at http://bioinfor.imu.edu.cn/ihecraac.
Keywords: Human enzymes; Machine learning; Reduced amino acid cluster; Tenfold cross-validation test; Web-server
Year: 2021; PMID: 33486591; DOI: 10.1007/s00726-021-02941-9
Source DB: PubMed; Journal: Amino Acids; ISSN: 0939-4451; Impact factor: 3.520
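
The abstract describes encoding protein sequences with a reduced amino acid cluster before classification. The Python sketch below illustrates the general idea only: residues are collapsed into a smaller alphabet, and the sequence is then summarized as k-mer frequencies over that alphabet. The five-cluster grouping, the function names, and the choice of k = 2 are all assumptions made for illustration. They are not one of the 673 alphabets evaluated in the paper, and this is not the IHEC_RAAC implementation.

```python
from collections import Counter
from itertools import product

# Hypothetical 5-cluster reduction of the 20 standard amino acids.
# Illustrative only; not taken from the paper.
EXAMPLE_CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DEGP"]


def build_mapping(clusters):
    """Map every residue to the first letter of the cluster it belongs to."""
    return {aa: cluster[0] for cluster in clusters for aa in cluster}


def reduce_sequence(seq, mapping):
    """Rewrite a sequence in the reduced alphabet, skipping unknown residues."""
    return "".join(mapping[aa] for aa in seq.upper() if aa in mapping)


def kmer_composition(reduced, alphabet, k=2):
    """Return normalized k-mer frequencies in a fixed, reproducible order."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    n_windows = len(reduced) - k + 1
    counts = Counter(reduced[i:i + k] for i in range(n_windows))
    total = max(n_windows, 1)  # avoid division by zero for short sequences
    return [counts[kmer] / total for kmer in kmers]


def encode(seq, clusters=EXAMPLE_CLUSTERS, k=2):
    """Encode a protein sequence as a reduced-alphabet k-mer frequency vector."""
    mapping = build_mapping(clusters)
    alphabet = [cluster[0] for cluster in clusters]
    return kmer_composition(reduce_sequence(seq, mapping), alphabet, k)


if __name__ == "__main__":
    features = encode("MKVLAW")
    print(len(features))
```

With 5 clusters and k = 2, every sequence maps to a vector of 5^2 = 25 features, compared with 20^2 = 400 for the full alphabet. This kind of dimensionality reduction is what makes it practical to compare many candidate alphabets and keep the one that performs best under cross-validation.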