Prediction of enzyme classification from protein sequence without the use of sequence similarity.

M des Jardins, P D Karp, M Krummenacker, T J Lee, C A Ouzounis.

Abstract

We describe a novel approach for predicting the function of a protein from its amino-acid sequence. Given features that can be computed from the amino-acid sequence in a straightforward fashion (such as pI, molecular weight, and amino-acid composition), the technique allows us to answer questions such as: Is the protein an enzyme? If so, in which Enzyme Commission (EC) class does it belong? Our approach uses machine learning (ML) techniques to induce classifiers that predict the EC class of an enzyme from features extracted from its primary sequence. We report on a variety of experiments in which we explored the use of three different ML techniques in conjunction with training datasets derived from PDB and from Swiss-Prot. We also explored the use of several different feature sets. Our method is able to predict the first EC number of an enzyme with 74% accuracy (thereby assigning the enzyme to one of six broad categories of enzyme function), and to predict the second EC number of an enzyme with 68% accuracy (thereby assigning the enzyme to one of 57 subcategories of enzyme function). This technique could be a valuable complement to sequence-similarity searches and to pathway-analysis methods.
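As an illustration of the kind of sequence-derived features the abstract mentions, the following is a minimal Python sketch that computes amino-acid composition and an approximate molecular weight from a primary sequence. It is not the authors' implementation: the function names (`amino_acid_composition`, `molecular_weight`, `feature_vector`) are hypothetical, the masses are commonly tabulated average residue masses, pI is omitted for brevity, and the abstract does not specify the exact feature sets or which ML techniques consumed them.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Average residue masses in daltons (assumed values from standard tables).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain accounts for the free termini


def amino_acid_composition(seq):
    """Return the fraction of each standard amino acid, in AMINO_ACIDS order."""
    counts = Counter(seq.upper())
    total = sum(counts[a] for a in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def molecular_weight(seq):
    """Approximate molecular weight; non-standard symbols are skipped."""
    residues = (AVG_RESIDUE_MASS[a] for a in seq.upper() if a in AVG_RESIDUE_MASS)
    return sum(residues) + WATER_MASS


def feature_vector(seq):
    """Concatenate the features into one list: 20 composition values, then weight."""
    return amino_acid_composition(seq) + [molecular_weight(seq)]
```

A vector built this way could be passed, together with a known EC class label, to any standard classifier; which classifiers and feature combinations performed best is reported in the paper itself, not in this sketch.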

Year:  1997        PMID: 9322021

Source DB:  PubMed          Journal:  Proc Int Conf Intell Syst Mol Biol        ISSN: 1553-0833


  13 in total

1.  Functional versatility and molecular diversity of the metabolic map of Escherichia coli.

Authors:  S Tsoka; C A Ouzounis
Journal:  Genome Res       Date:  2001-09       Impact factor: 9.043

2.  The past, present and future of genome-wide re-annotation. (Review)

Authors:  Christos A Ouzounis; Peter D Karp
Journal:  Genome Biol       Date:  2002-01-31       Impact factor: 13.583

3.  Expansion of the BioCyc collection of pathway/genome databases to 160 genomes.

Authors:  Peter D Karp; Christos A Ouzounis; Caroline Moore-Kochlacs; Leon Goldovsky; Pallavi Kaipa; Dag Ahrén; Sophia Tsoka; Nikos Darzentas; Victor Kunin; Núria López-Bigas
Journal:  Nucleic Acids Res       Date:  2005-10-24       Impact factor: 16.971

4.  Probabilistic annotation of protein sequences based on functional classifications.

Authors:  Emmanuel D Levy; Christos A Ouzounis; Walter R Gilks; Benjamin Audit
Journal:  BMC Bioinformatics       Date:  2005-12-14       Impact factor: 3.169

5.  Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach.

Authors:  L Y Han; C Z Cai; Z L Ji; Z W Cao; J Cui; Y Z Chen
Journal:  Nucleic Acids Res       Date:  2004-12-07       Impact factor: 16.971

6.  A computational strategy for protein function assignment which addresses the multidomain problem.

Authors:  A J Pérez; A Rodríguez; O Trelles; G Thode
Journal:  Comp Funct Genomics       Date:  2002

7.  Sequence-based feature prediction and annotation of proteins.

Authors:  Agnieszka S Juncker; Lars J Jensen; Andrea Pierleoni; Andreas Bernsel; Michael L Tress; Peer Bork; Gunnar von Heijne; Alfonso Valencia; Christos A Ouzounis; Rita Casadio; Søren Brunak
Journal:  Genome Biol       Date:  2009-02-02       Impact factor: 13.583

8.  CORRIE: enzyme sequence annotation with confidence estimates.

Authors:  Benjamin Audit; Emmanuel D Levy; Wally R Gilks; Leon Goldovsky; Christos A Ouzounis
Journal:  BMC Bioinformatics       Date:  2007-05-22       Impact factor: 3.169

9.  Prediction of functional class of proteins and peptides irrespective of sequence homology by support vector machines.

Authors:  Zhi Qun Tang; Hong Huang Lin; Hai Lei Zhang; Lian Yi Han; Xin Chen; Yu Zong Chen
Journal:  Bioinform Biol Insights       Date:  2009-11-24

10.  Characterising Complex Enzyme Reaction Data.

Authors:  Handan Melike Dönertaş; Sergio Martínez Cuesta; Syed Asad Rahman; Janet M Thornton
Journal:  PLoS One       Date:  2016-02-03       Impact factor: 3.240
