Distinguishing enzyme structures from non-enzymes without alignments.

Paul D Dobson, Andrew J Doig.

Abstract

The ability to predict protein function from structure is becoming increasingly important as the number of structures resolved is growing more rapidly than our capacity to study function. Current methods for predicting protein function are mostly reliant on identifying a similar protein of known function. For proteins that are highly dissimilar or are only similar to proteins also lacking functional annotations, these methods fail. Here, we show that protein function can be predicted as enzymatic or not without resorting to alignments. We describe 1178 high-resolution proteins in a structurally non-redundant subset of the Protein Data Bank using simple features such as secondary-structure content, amino acid propensities, surface properties and ligands. The subset is split into two functional groupings, enzymes and non-enzymes. We use the support vector machine-learning algorithm to develop models that are capable of assigning the protein class. Validation of the method shows that the function can be predicted to an accuracy of 77% using 52 features to describe each protein. An adaptive search of possible subsets of features produces a simplified model based on 36 features that predicts at an accuracy of 80%. We compare the method to sequence-based methods that also avoid calculating alignments and predict a recently released set of unrelated proteins. The most useful features for distinguishing enzymes from non-enzymes are secondary-structure content, amino acid frequencies, number of disulphide bonds and size of the largest cleft. This method is applicable to any structure as it does not require the identification of sequence or structural similarity to a protein of known function.
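As a rough illustration of the kind of alignment-free description the abstract refers to, the sketch below builds a small numeric feature vector from amino acid frequencies, secondary-structure content, disulphide bond count and largest cleft size. It is not the authors' 52-feature or 36-feature set: the function names, the one-letter secondary-structure string (H for helix, E for strand, anything else as "other"), the feature ordering and the example values are all assumptions made here for illustration. A vector like this would then be passed to a support vector machine or another classifier.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every protein
# yields a vector with the same layout (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Fraction of each standard amino acid among the standard residues."""
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def secondary_structure_features(ss_string):
    """Fractions of helix (H), strand (E) and all other states.

    ss_string is a hypothetical per-residue assignment string,
    one character per residue.
    """
    if not ss_string:
        raise ValueError("empty secondary-structure string")
    n = len(ss_string)
    helix = ss_string.count("H") / n
    strand = ss_string.count("E") / n
    return [helix, strand, 1.0 - helix - strand]


def feature_vector(sequence, ss_string, n_disulphides, largest_cleft_size):
    """Concatenate the feature groups into one flat list of floats.

    largest_cleft_size is taken as a precomputed number; how it is
    measured (and in which units) is left open here.
    """
    return (
        composition_features(sequence)
        + secondary_structure_features(ss_string)
        + [float(n_disulphides), float(largest_cleft_size)]
    )


# Toy input, not real data: a four-residue "protein".
example = feature_vector("ACCA", "HHEC", n_disulphides=1, largest_cleft_size=1200.0)
print(len(example))
```

Because no alignment is involved, the same function applies to any structure, including one with no detectable similarity to an annotated protein. In practice the features would need scaling before training, since frequencies lie in [0, 1] while cleft size does not.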

Year:  2003        PMID: 12850146     DOI: 10.1016/s0022-2836(03)00628-4

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  29 in total

1.  Descriptor-based protein remote homology identification.

Authors:  Ziding Zhang; Sunil Kochhar; Martin G Grigorov
Journal:  Protein Sci       Date:  2005-01-04       Impact factor: 6.725

2.  Predicting flexible length linear B-cell epitopes.

Authors:  Yasser El-Manzalawy; Drena Dobbs; Vasant Honavar
Journal:  Comput Syst Bioinformatics Conf       Date:  2008

3.  The enzymatic nature of an anonymous protein sequence cannot reliably be inferred from superfamily level structural information alone.

Authors:  Daniel Barry Roche; Thomas Brüls
Journal:  Protein Sci       Date:  2015-01-28       Impact factor: 6.725

4.  ProtDCal-Suite: A web server for the numerical codification and functional analysis of proteins.

Authors:  Sandra Romero-Molina; Yasser B Ruiz-Blanco; James R Green; Elsa Sanchez-Garcia
Journal:  Protein Sci       Date:  2019-09       Impact factor: 6.725

5.  Multi-algorithm and multi-model based drug target prediction and web server.

Authors:  Ying-tao Liu; Yi Li; Zi-fu Huang; Zhi-jian Xu; Zhuo Yang; Zhu-xi Chen; Kai-xian Chen; Ji-ye Shi; Wei-liang Zhu
Journal:  Acta Pharmacol Sin       Date:  2014-02-03       Impact factor: 6.150

6.  Non-Alignment Features Based Enzyme/Non-Enzyme Classification Using an Ensemble Method.

Authors:  Nicholas J Davidson; Xueyi Wang
Journal:  Proc Int Conf Mach Learn Appl       Date:  2010-12-12

7.  Identification of protein functions using a machine-learning approach based on sequence-derived properties.

Authors:  Bum Ju Lee; Moon Sun Shin; Young Joon Oh; Hae Seok Oh; Keun Ho Ryu
Journal:  Proteome Sci       Date:  2009-08-09       Impact factor: 2.480

8.  Protein functional annotation of simultaneously improved stability, accuracy and false discovery rate achieved by a sequence-based deep learning.

Authors:  Jiajun Hong; Yongchao Luo; Yang Zhang; Junbiao Ying; Weiwei Xue; Tian Xie; Lin Tao; Feng Zhu
Journal:  Brief Bioinform       Date:  2020-07-15       Impact factor: 11.622

9.  SitesIdentify: a protein functional site prediction tool.

Authors:  Tracey Bray; Pedro Chan; Salim Bougouffa; Richard Greaves; Andrew J Doig; Jim Warwicker
Journal:  BMC Bioinformatics       Date:  2009-11-18       Impact factor: 3.169

10.  TIM-Finder: a new method for identifying TIM-barrel proteins.

Authors:  Jing-Na Si; Ren-Xiang Yan; Chuan Wang; Ziding Zhang; Xiao-Dong Su
Journal:  BMC Struct Biol       Date:  2009-12-14