Non-Alignment Features Based Enzyme/Non-Enzyme Classification Using an Ensemble Method.

Nicholas J Davidson, Xueyi Wang.

Abstract

As a growing number of protein structures are resolved without known functions, computational methods that help predict protein function from structure are becoming increasingly important. Some computational methods predict function by aligning a protein to homologous proteins with known functions, but they fail when no such homology can be identified. In this paper we classify enzymes and non-enzymes using non-alignment features. We propose a new ensemble method that combines three support vector machines (SVMs) and two k-nearest neighbor (k-NN) classifiers under a simple majority voting rule. On a data set of 697 enzymes and 480 non-enzymes adapted from Dobson and Doig, the method achieves 85.59% accuracy in 10-fold cross-validation and 86.49% accuracy in leave-one-out validation. This accuracy is much better than that of other methods based on non-alignment features and even slightly better than that of methods based on alignment features. To our knowledge, this is the first use of an ensemble method to classify enzymes and non-enzymes, and it outperforms a single classifier.
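
The simple majority voting rule described in the abstract can be illustrated with a minimal sketch. It assumes each of the five base classifiers (three SVMs and two k-NN classifiers) has already produced a binary label per protein; the function names, variable names, and example predictions below are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(votes: Sequence[int]) -> int:
    """Return the label chosen by most base classifiers.

    With an odd number of binary voters (five here) a tie cannot occur.
    """
    if not votes:
        raise ValueError("need at least one vote")
    label, _count = Counter(votes).most_common(1)[0]
    return label


def ensemble_predict(per_classifier: Sequence[Sequence[int]]) -> List[int]:
    """Combine predictions by majority vote.

    per_classifier[i][j] is classifier i's label for protein j.
    """
    # zip(*...) regroups the predictions so each tuple holds all votes for one protein
    return [majority_vote(column) for column in zip(*per_classifier)]


# Hypothetical predictions for four proteins (1 = enzyme, 0 = non-enzyme)
svm_a = [1, 0, 1, 0]
svm_b = [1, 0, 0, 0]
svm_c = [1, 1, 0, 1]
knn_a = [0, 0, 1, 1]
knn_b = [1, 0, 1, 1]

combined = ensemble_predict([svm_a, svm_b, svm_c, knn_a, knn_b])
print(combined)  # [1, 0, 1, 1]
```

Using an odd number of voters on a two-class problem means the vote always yields a strict majority, so no tie-breaking rule is needed in this sketch.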

Year:  2010        PMID: 21572553      PMCID: PMC3091888          DOI: 10.1109/ICMLA.2010.167

Source DB:  PubMed          Journal:  Proc Int Conf Mach Learn Appl


