An optimal structure-discriminative amino acid index for protein fold recognition.

R H Leary, J B Rosen, P Jambeck.

Abstract

Identifying the fold class of a protein sequence of unknown structure is a fundamental problem in modern biology. We apply a supervised learning algorithm to the classification of protein sequences with low sequence identity from a library of 174 structural classes created with the Combinatorial Extension structural alignment methodology. A class of rules is considered that assigns test sequences to structural classes based on the closest match of an amino acid index profile of the test sequence to a profile centroid for each class. A mathematical optimization procedure is applied to determine an amino acid index of maximal structural discriminatory power by maximizing the ratio of between-class to within-class profile variation. The optimal index is computed as the solution to a generalized eigenvalue problem, and its performance for fold classification is compared to that of other published indices. The optimal index has significantly more structural discriminatory power than all currently known indices, including average surrounding hydrophobicity, which it most closely resembles. It demonstrates >70% classification accuracy over all folds and nearly 100% accuracy on several folds with distinctive conserved structural features. Finally, there is a compelling universality to the optimal index in that it does not appear to depend strongly on the specific structural classes used in its computation.
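The optimization described in the abstract can be illustrated with a minimal sketch. The abstract does not say how an index profile is built from a sequence, so this sketch assumes the simplest linear case: the profile of a sequence is a single number, the mean index value over its residues (the composition vector dotted with the index). Under that assumption, maximizing between-class over within-class variation reduces to a Fisher-type generalized eigenvalue problem B w = λ W w. The function names, the toy sequences, the class labels, and the `ridge` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Return the 20-dimensional amino acid frequency vector of a sequence."""
    counts = np.array([seq.count(a) for a in AMINO_ACIDS], dtype=float)
    return counts / counts.sum()

def optimal_index(seqs_by_class, ridge=1e-6):
    """Index w maximizing (w^T B w) / (w^T W w) under the assumed linear profile."""
    all_x = np.array([composition(s) for seqs in seqs_by_class.values() for s in seqs])
    grand_mean = all_x.mean(axis=0)
    B = np.zeros((20, 20))  # between-class scatter
    W = np.zeros((20, 20))  # within-class scatter
    for seqs in seqs_by_class.values():
        x = np.array([composition(s) for s in seqs])
        mu = x.mean(axis=0)
        d = x - mu
        W += d.T @ d
        m = (mu - grand_mean)[:, None]
        B += len(seqs) * (m @ m.T)
    # Compositions sum to 1, so W is singular; a small ridge (an assumption
    # of this sketch) makes it positive definite.
    W += ridge * np.eye(20)
    # Reduce B w = lambda W w to a symmetric standard problem via W = L L^T.
    L = np.linalg.cholesky(W)
    L_inv = np.linalg.inv(L)
    vals, vecs = np.linalg.eigh(L_inv @ B @ L_inv.T)
    w = L_inv.T @ vecs[:, -1]  # eigenvector of the largest eigenvalue
    return w / np.linalg.norm(w), vals[-1]

def classify(seq, w, centroids):
    """Assign seq to the class whose profile centroid is closest."""
    score = composition(seq) @ w
    return min(centroids, key=lambda c: abs(centroids[c] - score))

# Toy data, invented for illustration only.
toy = {
    "helix_like": ["AAELLKKAEELLKK", "AEELLKKLAEELKA", "LKKAEELLKKAEEL"],
    "sheet_like": ["VTVTVYVTIYVTV", "TVYVIVTVTYVIV", "VIVTYTVTVYVTV"],
}
w, ratio = optimal_index(toy)
centroids = {c: np.mean([composition(s) @ w for s in seqs]) for c, seqs in toy.items()}
print(classify("AEELKKLLAEEL", w, centroids))
print(classify("VTVYVTIVTV", w, centroids))
```

Because the objective is a ratio, the resulting index is defined only up to sign and scale, which is why the sketch normalizes `w` and classifies by absolute distance to each centroid. A real study would use far more sequences per class and would evaluate on held-out data.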

Year:  2004        PMID: 14695283      PMCID: PMC1303806          DOI: 10.1016/S0006-3495(04)74117-X

Source DB:  PubMed          Journal:  Biophys J        ISSN: 0006-3495            Impact factor:   4.033


