Enzyme family classification by support vector machines.

C Z Cai, L Y Han, Z L Ji, Y Z Chen.

Abstract

One approach to facilitating protein function prediction is to classify proteins into functional families. Recent studies on the classification of G-protein coupled receptors and other proteins suggest that a statistical learning method, support vector machines (SVM), may be useful for classifying proteins into functional families. In this work, SVM is applied to and tested on the classification of enzymes into functional families defined by the Enzyme Nomenclature Committee of IUBMB. An SVM classification system for each family is trained from representative enzymes of that family and seed proteins of Pfam curated protein families. The classification accuracy is in the range of 50.0% to 95.7% for enzymes from 46 families and 79.0% to 100% for non-enzymes. The corresponding Matthews correlation coefficient is in the range of 54.1% to 96.1%. Moreover, 80.3% of the 8,291 correctly classified enzymes are uniquely classified into a specific enzyme family by using a scoring function, indicating that SVM may have a certain level of unique prediction capability. Testing results also suggest that SVM is, in some cases, capable of classifying distantly related enzymes and homologous enzymes of different functions. Efforts are being made to use a more comprehensive set of enzymes as training sets and to incorporate multi-class SVM classification systems to further enhance the unique prediction accuracy. Our results suggest the potential of SVM for enzyme family classification and for facilitating protein function prediction. Our software is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi. Copyright 2004 Wiley-Liss, Inc.
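
The abstract reports three per-family figures: accuracy on enzymes, accuracy on non-enzymes, and the Matthews correlation coefficient (MCC). As a minimal sketch of how such figures are commonly derived from a binary confusion matrix, the Python below uses the standard textbook definitions. The abstract does not give the authors' exact formulas, so treating "enzyme accuracy" as TP/(TP+FN) and "non-enzyme accuracy" as TN/(TN+FP) is an assumption, and the counts and function names are hypothetical, not taken from the paper.

```python
import math


def class_accuracies(tp, fn, tn, fp):
    """Return (positive-class accuracy, negative-class accuracy) as fractions.

    Assumed reading of the abstract: positives are members of one enzyme
    family, negatives are proteins outside that family.
    """
    pos = tp / (tp + fn) if (tp + fn) else 0.0
    neg = tn / (tn + fp) if (tn + fp) else 0.0
    return pos, neg


def matthews_cc(tp, fn, tn, fp):
    """Standard Matthews correlation coefficient, in the range [-1, 1]."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a single family classifier (not from the paper).
tp, fn, tn, fp = 90, 10, 950, 50
pos_acc, neg_acc = class_accuracies(tp, fn, tn, fp)
mcc = matthews_cc(tp, fn, tn, fp)
print(f"enzyme accuracy:     {pos_acc:.1%}")
print(f"non-enzyme accuracy: {neg_acc:.1%}")
print(f"MCC:                 {mcc:.1%}")
```

The MCC is printed as a percentage only to match how the abstract reports it. Unlike the two accuracies, it combines all four cells of the confusion matrix, so it stays informative when non-members greatly outnumber family members.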

Year: 2004   PMID: 14997540   DOI: 10.1002/prot.20045

Source DB: PubMed   Journal: Proteins   ISSN: 0887-3585


  35 in total

1.  EHPred: an SVM-based method for epoxide hydrolases recognition and classification.

Authors:  Jia Jia; Liang Yang; Zi-Zhang Zhang
Journal:  J Zhejiang Univ Sci B       Date:  2006-01       Impact factor: 3.066

2.  Classifying Multifunctional Enzymes by Incorporating Three Different Models into Chou's General Pseudo Amino Acid Composition.

Authors:  Hong-Liang Zou; Xuan Xiao
Journal:  J Membr Biol       Date:  2016-04-25       Impact factor: 1.843

3.  Computational Approaches for Automated Classification of Enzyme Sequences.

Authors:  Akram Mohammed; Chittibabu Guda
Journal:  J Proteomics Bioinform       Date:  2011-08-23

4.  iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.

Authors:  Zhen Chen; Pei Zhao; Chen Li; Fuyi Li; Dongxu Xiang; Yong-Zi Chen; Tatsuya Akutsu; Roger J Daly; Geoffrey I Webb; Quanzhi Zhao; Lukasz Kurgan; Jiangning Song
Journal:  Nucleic Acids Res       Date:  2021-06-04       Impact factor: 16.971

5.  Gene function prediction based on genomic context clustering and discriminative learning: an application to bacteriophages.

Authors:  Jason Li; Saman K Halgamuge; Christopher I Kells; Sen-Lin Tang
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169

6.  Identification of protein functions using a machine-learning approach based on sequence-derived properties.

Authors:  Bum Ju Lee; Moon Sun Shin; Young Joon Oh; Hae Seok Oh; Keun Ho Ryu
Journal:  Proteome Sci       Date:  2009-08-09       Impact factor: 2.480

7.  Enzyme informatics (Review).

Authors:  Rosanna G Alderson; Luna De Ferrari; Lazaros Mavridis; James L McDonagh; John B O Mitchell; Neetika Nath
Journal:  Curr Top Med Chem       Date:  2012       Impact factor: 3.295

8.  Predicting disordered regions in proteins using the profiles of amino acid indices.

Authors:  Pengfei Han; Xiuzhen Zhang; Zhi-Ping Feng
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169

9.  Enzyme classification with peptide programs: a comparative study.

Authors:  Daniel Faria; António E N Ferreira; André O Falcão
Journal:  BMC Bioinformatics       Date:  2009-07-24       Impact factor: 3.169

10.  Local combinational variables: an approach used in DNA-binding helix-turn-helix motif prediction with sequence information.

Authors:  Wenwei Xiong; Tonghua Li; Kai Chen; Kailin Tang
Journal:  Nucleic Acids Res       Date:  2009-08-03       Impact factor: 16.971
