An optimal structure-discriminative amino acid index for protein fold recognition.

Abstract

Identifying the fold class of a protein sequence of unknown structure is a fundamental problem in modern biology. We apply a supervised learning algorithm to the classification of protein sequences with low sequence identity from a library of 174 structural classes created with the Combinatorial Extension structural alignment methodology. A class of rules is considered that assigns test sequences to structural classes based on the closest match of an amino acid index profile of the test sequence to a profile centroid for each class. A mathematical optimization procedure is applied to determine an amino acid index of maximal structural discriminatory power by maximizing the ratio of between-class to within-class profile variation. The optimal index is computed as the solution to a generalized eigenvalue problem, and its performance for fold classification is compared to that of other published indices. The optimal index has significantly more structural discriminatory power than all currently known indices, including average surrounding hydrophobicity, which it most closely resembles. It demonstrates >70% classification accuracy over all folds and nearly 100% accuracy on several folds with distinctive conserved structural features. Finally, there is a compelling universality to the optimal index in that it does not appear to depend strongly on the specific structural classes used in its computation.
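
The classification rule described in the abstract (assign a test sequence to the class whose profile centroid is the closest match to the sequence's amino acid index profile) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not say how profiles are constructed, so the fixed-length binned `profile` function is an assumption. The Kyte-Doolittle hydropathy scale is only a stand-in for the optimal index. The class names and sequences are invented toy data.

```python
from statistics import fmean

# Kyte-Doolittle hydropathy scale, used here only as a stand-in index.
# The optimal index from the paper is NOT reproduced here.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def profile(seq, index, bins=4):
    """Assumed profile: mean index value in each of `bins` equal segments."""
    if len(seq) < bins:
        raise ValueError("sequence must have at least `bins` residues")
    values = [index[aa] for aa in seq]
    n = len(values)
    return [fmean(values[b * n // bins:(b + 1) * n // bins])
            for b in range(bins)]


def centroid(profiles):
    """Component-wise mean of a list of equal-length profiles."""
    return [fmean(column) for column in zip(*profiles)]


def squared_distance(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def classify(seq, centroids, index, bins=4):
    """Return the class whose centroid is closest to the sequence profile."""
    p = profile(seq, index, bins)
    return min(centroids, key=lambda name: squared_distance(p, centroids[name]))


# Hypothetical toy training set: two invented classes, two sequences each.
training = {
    "toy_class_A": ["AILVAILVKDEKDEKD", "LLVVIIAAKKDDEEKK"],
    "toy_class_B": ["KDEKDEKDAILVAILV", "KKDDEEKKLLVVIIAA"],
}

# One profile centroid per class.
centroids = {
    name: centroid([profile(s, KD) for s in seqs])
    for name, seqs in training.items()
}

print(classify("VVLLAIIVEEKDDKKE", centroids, KD))
```

Only the `index` argument depends on the choice of amino acid index, so this is where an index chosen to maximize the ratio of between-class to within-class profile variation would be substituted for `KD`.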

Substances: Amino Acids; Proteins

Year: 2004 PMID: 14695283 PMCID: PMC1303806 DOI: 10.1016/S0006-3495(04)74117-X

Source DB: PubMed Journal: Biophys J ISSN: 0006-3495 Impact factor: 4.033

