Ensemble classifier for protein fold pattern recognition.

Hong-Bin Shen, Kuo-Chen Chou.

Abstract

MOTIVATION: Prediction of protein folding patterns is one level deeper than that of protein structural classes, and hence is much more complicated and difficult. To deal with such a challenging problem, an ensemble classifier was introduced. It was formed from a set of basic classifiers, each trained on a different parameter system, such as predicted secondary structure, hydrophobicity, van der Waals volume, polarity, polarizability, as well as different dimensions of pseudo-amino acid composition, all extracted from a training dataset. The operation engine for the constituent individual classifiers was the OET-KNN (optimized evidence-theoretic k-nearest neighbors) rule. Their outcomes were combined through weighted voting to give a final determination for classifying a query protein. The recognition task was to find the true fold among the 27 possible patterns.
RESULTS: The overall success rate thus obtained was 62% for a testing dataset where most of the proteins have <25% sequence identity with the proteins used in training the classifier. Such a rate is 6-21% higher than the corresponding rates obtained by various existing NN (neural networks) and SVM (support vector machines) approaches, implying that the ensemble classifier is very promising and might become a useful vehicle in protein science, as well as proteomics and bioinformatics.
AVAILABILITY: The ensemble classifier, called PFP-Pred, is available as a web-server at http://202.120.37.186/bioinf/fold/PFP-Pred.htm for public usage.
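
The weighted-voting idea described in the abstract can be illustrated with a minimal sketch. This is not the PFP-Pred implementation: a plain k-nearest-neighbour vote stands in for OET-KNN, and the feature vectors, fold labels (`fold_A`, `fold_B`), weights, and function names are all hypothetical values chosen for illustration. Each base classifier sees only its own feature family, and the ensemble sums the weights of the classifiers that voted for each fold.

```python
import math
from collections import Counter, defaultdict


def knn_predict(train, query, k=3):
    """Plain k-nearest-neighbour majority vote (a stand-in, NOT OET-KNN).

    train: list of (feature_vector, label) pairs; query: feature vector.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def weighted_vote(predictions, weights):
    """Sum each classifier's weight onto the label it predicted."""
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name]
    return max(scores, key=scores.get)


def ensemble_predict(train_sets, query_sets, weights, k=3):
    """Run one base classifier per feature family, then combine the votes."""
    predictions = {
        name: knn_predict(train_sets[name], query_sets[name], k)
        for name in train_sets
    }
    return weighted_vote(predictions, weights)


# Hypothetical toy data: three feature families, two folds.
TRAIN = {
    "hydrophobicity": [((0.1, 0.2), "fold_A"), ((0.2, 0.1), "fold_A"),
                       ((0.9, 0.8), "fold_B"), ((0.8, 0.9), "fold_B")],
    "polarity": [((0.1, 0.9), "fold_A"), ((0.2, 0.8), "fold_A"),
                 ((0.9, 0.1), "fold_B"), ((0.8, 0.2), "fold_B")],
    "secondary_structure": [((0.5, 0.5), "fold_A"), ((0.4, 0.6), "fold_A"),
                            ((0.6, 0.4), "fold_B"), ((0.7, 0.3), "fold_B")],
}
QUERY = {
    "hydrophobicity": (0.15, 0.15),
    "polarity": (0.85, 0.15),
    "secondary_structure": (0.45, 0.55),
}
# Assumed weights; the abstract does not say how the weights were set.
WEIGHTS = {"hydrophobicity": 0.5, "polarity": 0.2, "secondary_structure": 0.3}

if __name__ == "__main__":
    print(ensemble_predict(TRAIN, QUERY, WEIGHTS, k=3))  # prints "fold_A"
```

In this toy run the polarity classifier votes for `fold_B`, but the two other classifiers carry a combined weight of 0.8 for `fold_A`, so the ensemble returns `fold_A`.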

Year:  2006        PMID: 16672258     DOI: 10.1093/bioinformatics/btl170

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  65 in total

1.  On the relation between the predicted secondary structure and the protein size.

Authors:  Lukasz Kurgan
Journal:  Protein J       Date:  2008-06       Impact factor: 2.371

2.  High-accuracy prediction of transmembrane inter-helix contacts and application to GPCR 3D structure modeling.

Authors:  Jing Yang; Richard Jang; Yang Zhang; Hong-Bin Shen
Journal:  Bioinformatics       Date:  2013-08-14       Impact factor: 6.937

3.  An ensemble classifier of support vector machines used to predict protein structural classes by fusing auto covariance and pseudo-amino acid composition.

Authors:  Jiang Wu; Meng-Long Li; Le-Zheng Yu; Chao Wang
Journal:  Protein J       Date:  2010-01       Impact factor: 2.371

4.  Sequence physical properties encode the global organization of protein structure space.

Authors:  S Rackovsky
Journal:  Proc Natl Acad Sci U S A       Date:  2009-08-12       Impact factor: 11.205

Review 5.  Methods of integrating data to uncover genotype-phenotype interactions.

Authors:  Marylyn D Ritchie; Emily R Holzinger; Ruowang Li; Sarah A Pendergrass; Dokyoon Kim
Journal:  Nat Rev Genet       Date:  2015-01-13       Impact factor: 53.242

6.  D-Glucose sensing by a plasma membrane regulator of G signaling protein, AtRGS1.

Authors:  Jeffrey C Grigston; Daniel Osuna; Wolf-Rüdiger Scheible; Chenggang Liu; Mark Stitt; Alan M Jones
Journal:  FEBS Lett       Date:  2008-09-24       Impact factor: 4.124

7.  Exploring protein structural dissimilarity to facilitate structure classification.

Authors:  Pooja Jain; Jonathan D Hirst
Journal:  BMC Struct Biol       Date:  2009-09-19

8.  Machine learning integration for predicting the effect of single amino acid substitutions on protein stability.

Authors:  Ayşegül Ozen; Mehmet Gönen; Ethem Alpaydın; Türkan Haliloğlu
Journal:  BMC Struct Biol       Date:  2009-10-19

9.  MULTI-K: accurate classification of microarray subtypes using ensemble k-means clustering.

Authors:  Eun-Youn Kim; Seon-Young Kim; Daniel Ashlock; Dougu Nam
Journal:  BMC Bioinformatics       Date:  2009-08-22       Impact factor: 3.169

10.  Enhanced protein fold recognition through a novel data integration approach.

Authors:  Yiming Ying; Kaizhu Huang; Colin Campbell
Journal:  BMC Bioinformatics       Date:  2009-08-26       Impact factor: 3.169
