
A machine learning information retrieval approach to protein fold recognition.

Jianlin Cheng, Pierre Baldi.

Abstract

MOTIVATION: Recognizing proteins that have similar tertiary structure is the key step of template-based protein structure prediction methods. Traditionally, a variety of alignment methods are used to identify similar folds, based on sequence similarity and sequence-structure compatibility. Although these methods are complementary, their integration has not been thoroughly exploited. Statistical machine learning methods provide tools for integrating multiple features, but so far these methods have been used primarily for protein and fold classification, rather than addressing the retrieval problem of fold recognition: finding a proper template for a given query protein.
RESULTS: Here we present a two-stage approach to fold recognition that combines machine learning with information retrieval. First, we use alignment methods to derive pairwise similarity features for query-template protein pairs. We also use global profile-profile alignments in combination with predicted secondary structure, relative solvent accessibility, contact map and beta-strand pairing to extract pairwise structural compatibility features. Second, we apply support vector machines to these features to predict the structural relevance (i.e. in the same fold or not) of the query-template pairs. For each query, the continuous relevance scores are used to rank the templates. The FOLDpro approach is modular, scalable and effective. Compared with 11 other fold recognition methods, FOLDpro yields the best results in almost all standard categories on a comprehensive benchmark dataset. Using predictions of the top-ranked template, the sensitivity is approximately 85%, 56% and 27% at the family, superfamily and fold levels, respectively. Using the five top-ranked templates, the sensitivity increases to 90%, 70% and 48%.
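
The ranking and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the FOLDpro implementation: a fixed linear scoring function stands in for the trained support vector machine, and the feature values, weights, template identifiers and helper names (linear_score, rank_templates, top_k_sensitivity) are all hypothetical. It shows only how continuous relevance scores can rank templates per query, and how sensitivity over the top k templates might be computed.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# One feature vector per query-template pair (illustrative values only;
# real features would come from alignment and structural compatibility scores).
Features = Sequence[float]
Ranking = List[Tuple[str, float]]


def linear_score(weights: Sequence[float], bias: float) -> Callable[[Features], float]:
    """Return a scorer computing w.x + b; a stand-in for a trained SVM's decision value."""
    def score(x: Features) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def rank_templates(pairs: Dict[str, Features], score: Callable[[Features], float]) -> Ranking:
    """Rank the templates of one query by descending relevance score."""
    scored = [(template_id, score(x)) for template_id, x in pairs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def top_k_sensitivity(rankings: Dict[str, Ranking], relevant: Dict[str, Set[str]], k: int) -> float:
    """Fraction of queries with at least one relevant template among the top k."""
    hits = 0
    for query_id, ranked in rankings.items():
        top = {template_id for template_id, _ in ranked[:k]}
        if top & relevant[query_id]:
            hits += 1
    return hits / len(rankings)


# Hypothetical toy data: two queries, three candidate templates each.
score = linear_score(weights=[1.0, 0.5], bias=-1.0)
features = {
    "q1": {"tA": [2.0, 1.0], "tB": [0.5, 0.2], "tC": [1.0, 1.0]},
    "q2": {"tA": [0.1, 0.4], "tB": [1.5, 0.0], "tC": [0.9, 2.0]},
}
# Assumed ground truth: which templates share the query's fold.
relevant = {"q1": {"tA"}, "q2": {"tB"}}

rankings = {q: rank_templates(pairs, score) for q, pairs in features.items()}
top1 = top_k_sensitivity(rankings, relevant, k=1)
top2 = top_k_sensitivity(rankings, relevant, k=2)
```

In this toy setting the correct template for q2 is ranked second, so it counts as a hit only when more than one top-ranked template is considered; this is the same effect that makes the reported top-five sensitivities higher than the top-one values.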

Year:  2006        PMID: 16547073     DOI: 10.1093/bioinformatics/btl102

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  65 in total

1.  Improving threading algorithms for remote homology modeling by combining fragment and template comparisons.

Authors:  Hongyi Zhou; Jeffrey Skolnick
Journal:  Proteins       Date:  2010-07

2.  PSS-3D1D: an improved 3D1D profile method of protein fold recognition for the annotation of twilight zone sequences.

Authors:  K Ganesan; S Parthasarathy
Journal:  J Struct Funct Genomics       Date:  2011-12-03

3.  Ty3 capsid mutations reveal early and late functions of the amino-terminal domain.

Authors:  Liza S Z Larsen; Min Zhang; Nadejda Beliakova-Bethell; Virginia Bilanchone; Anne Lamsa; Kunio Nagashima; Rani Najdi; Kathryn Kosaka; Vuk Kovacevic; Jianlin Cheng; Pierre Baldi; G Wesley Hatfield; Suzanne Sandmeyer
Journal:  J Virol       Date:  2007-04-18       Impact factor: 5.103

4.  On the relation between the predicted secondary structure and the protein size.

Authors:  Lukasz Kurgan
Journal:  Protein J       Date:  2008-06       Impact factor: 2.371

5.  Improving the prediction accuracy of residue solvent accessibility and real-value backbone torsion angles of proteins by guided-learning through a two-layer neural network.

Authors:  Eshel Faraggi; Bin Xue; Yaoqi Zhou
Journal:  Proteins       Date:  2009-03

6.  Protein structure prediction by pro-Sp3-TASSER.

Authors:  Hongyi Zhou; Jeffrey Skolnick
Journal:  Biophys J       Date:  2009-03-18       Impact factor: 4.033

7.  Structure prediction of domain insertion proteins from structures of individual domains.

Authors:  Monica Berrondo; Marc Ostermeier; Jeffrey J Gray
Journal:  Structure       Date:  2008-04       Impact factor: 5.006

8.  Progress and challenges in protein structure prediction. (Review)

Authors:  Yang Zhang
Journal:  Curr Opin Struct Biol       Date:  2008-04-22       Impact factor: 6.809

9.  BCL::contact-low confidence fold recognition hits boost protein contact prediction and de novo structure determination.

Authors:  Mert Karakaş; Nils Woetzel; Jens Meiler
Journal:  J Comput Biol       Date:  2010-02       Impact factor: 1.479

10.  Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of query and corresponding native properties of templates.

Authors:  Yuedong Yang; Eshel Faraggi; Huiying Zhao; Yaoqi Zhou
Journal:  Bioinformatics       Date:  2011-06-11       Impact factor: 6.937

