Accurate prediction of enzyme subfamily class using an adaptive fuzzy k-nearest neighbor method.

Wen-Lin Huang, Hung-Ming Chen, Shiow-Fen Hwang, Shinn-Ying Ho.

Abstract

Amphiphilic pseudo-amino acid composition (Am-Pse-AAC), which carries extra sequence-order information, is a useful feature for representing enzymes. This study first uses the k-nearest neighbor (k-NN) rule to analyze the distribution of enzymes in the Am-Pse-AAC feature space. The analysis indicates that the distributions of multiple enzyme classes overlap heavily. To cope with this overlap, the study proposes an efficient non-parametric classifier for predicting enzyme subfamily class: an adaptive fuzzy k-nearest neighbor (AFK-NN) method in which k and a fuzzy strength parameter m are adaptively specified. The fuzzy membership values of a query sample Q are determined dynamically from the position of Q and its weighted distances to the k nearest neighbors. Using the same enzymes of the oxidoreductase family for comparison, AFK-NN achieves a prediction accuracy of 76.6% in a jackknife test, which is better than Support Vector Machine (73.6%), the decision tree method C5.0 (75.4%) and the existing covariant-discriminant algorithm (70.6%). To evaluate the generalization ability of AFK-NN, datasets for all six families of entirely sequenced enzymes were built from the newly updated SWISS-PROT and ENZYME databases. The accuracy of AFK-NN on the new large-scale dataset of the oxidoreductase family is 83.3%, and the mean accuracy over the six families is 92.1%.
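The abstract does not spell out how AFK-NN computes memberships or adapts k and m, so the following is only a minimal sketch of the classic (non-adaptive) fuzzy k-nearest neighbor rule with a jackknife (leave-one-out) accuracy check. The function names, the inverse-distance weighting 1 / d^(2/(m-1)), the toy 2-D data and the fixed values of k and m are all assumptions made for illustration; they are not taken from the paper, and real inputs would be Am-Pse-AAC feature vectors.

```python
import math
from collections import defaultdict


def fuzzy_knn_memberships(query, samples, labels, k=3, m=2.0):
    """Return {class: membership} for `query` (hypothetical helper).

    Each of the k nearest neighbours votes for its own class with weight
    1 / d^(2/(m-1)); the votes are then normalised to sum to 1.
    This is the textbook fuzzy k-NN rule, not the paper's adaptive variant.
    """
    if m <= 1.0:
        raise ValueError("m must be greater than 1")
    neighbours = sorted(
        ((math.dist(query, x), y) for x, y in zip(samples, labels)),
        key=lambda pair: pair[0],
    )[:k]
    exponent = 2.0 / (m - 1.0)
    eps = 1e-12  # avoids division by zero if the query coincides with a sample
    votes = defaultdict(float)
    for distance, label in neighbours:
        votes[label] += 1.0 / (distance ** exponent + eps)
    total = sum(votes.values())
    return {label: vote / total for label, vote in votes.items()}


def fuzzy_knn_predict(query, samples, labels, k=3, m=2.0):
    """Predict the class with the largest fuzzy membership."""
    memberships = fuzzy_knn_memberships(query, samples, labels, k, m)
    return max(memberships, key=memberships.get)


def jackknife_accuracy(samples, labels, k=3, m=2.0):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    correct = 0
    for i, (x, y) in enumerate(zip(samples, labels)):
        rest_x = samples[:i] + samples[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        correct += fuzzy_knn_predict(x, rest_x, rest_y, k, m) == y
    return correct / len(samples)


# Toy 2-D data standing in for enzyme feature vectors (illustrative only).
samples = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
labels = ["A", "A", "A", "B", "B", "B"]

memberships = fuzzy_knn_memberships((0.6, 0.6), samples, labels, k=4, m=2.0)
prediction = fuzzy_knn_predict((0.6, 0.6), samples, labels, k=4, m=2.0)
accuracy = jackknife_accuracy(samples, labels, k=3, m=2.0)
```

One simple way to approximate the "adaptively specified" parameters would be to evaluate `jackknife_accuracy` over a small grid of k and m values and keep the best pair, though the paper may well use a different scheme.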

Year: 2006 PMID: 17140725 DOI: 10.1016/j.biosystems.2006.10.004

Source DB: PubMed Journal: Biosystems ISSN: 0303-2647 Impact factor: 1.973

Cited

6 in total

1. Computational Approaches for Automated Classification of Enzyme Sequences.

Authors: Akram Mohammed; Chittibabu Guda
Journal: J Proteomics Bioinform Date: 2011-08-23

2. A survey of computational intelligence techniques in protein function prediction. (Review)

Authors: Arvind Kumar Tiwari; Rajeev Srivastava
Journal: Int J Proteomics Date: 2014-12-11

3. DEEPre: sequence-based enzyme EC number prediction by deep learning.

Authors: Yu Li; Sheng Wang; Ramzan Umarov; Bingqing Xie; Ming Fan; Lihua Li; Xin Gao
Journal: Bioinformatics Date: 2018-03-01 Impact factor: 6.937

4. ProLoc-GO: utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization.

Authors: Wen-Lin Huang; Chun-Wei Tung; Shih-Wen Ho; Shiow-Fen Hwang; Shinn-Ying Ho
Journal: BMC Bioinformatics Date: 2008-02-01 Impact factor: 3.169

5. Identification of Multi-Functional Enzyme with Multi-Label Classifier.

Authors: Yuxin Che; Ying Ju; Ping Xuan; Ren Long; Fei Xing
Journal: PLoS One Date: 2016-04-14 Impact factor: 3.240

6. ECPred: a tool for the prediction of the enzymatic functions of protein sequences based on the EC nomenclature.

Authors: Alperen Dalkiran; Ahmet Sureyya Rifaioglu; Maria Jesus Martin; Rengul Cetin-Atalay; Volkan Atalay; Tunca Doğan
Journal: BMC Bioinformatics Date: 2018-09-21 Impact factor: 3.169
