
Structural alphabets for protein structure classification: a comparison study.

Quan Le, Gianluca Pollastri, Patrice Koehl.

Abstract

Finding structural similarities between proteins often helps reveal shared functionality that might not be detected from native sequence information alone. Such similarity is usually detected and quantified by protein structure alignment. Determining the optimal alignment between two protein structures, however, remains a hard problem. An alternative approach is to approximate each three-dimensional protein structure using a sequence of motifs derived from a structural alphabet. With this approach, structures are compared by comparing the corresponding motif sequences, or structural sequences. In this article, we measure the performance of such alphabets in the context of the protein structure classification problem. We consider both local and global structural sequences. Each letter of a local structural sequence corresponds to the fragment that best matches the corresponding local segment of the protein structure. The global structural sequence is designed to generate the best possible complete chain that matches the full protein structure. We use an alphabet of 20 letters, corresponding to a library of 20 motifs, or protein fragments, each four residues long. We show that the global structural sequences approximate the native structures of proteins well, with an average coordinate root mean square deviation of 0.69 Å over 2225 test proteins. The approximation is best for all-alpha proteins and relatively poorer for all-beta proteins. We then test four different sequence representations of proteins (their native sequence, the sequence of their secondary-structure elements, and the local and global structural sequences based on our fragment library) with different classifiers for their ability to classify proteins that belong to five distinct folds of CATH. Unsurprisingly, the primary sequence alone performs poorly as a structure classifier. We show that adding either secondary-structure information or local information from the structural sequence considerably improves the classification accuracy. The two fragment-based sequences perform better than the secondary-structure sequence, but not yet well enough to be a viable alternative to more computationally intensive methods based on protein structure alignment.
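The idea of a local structural sequence, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the two-letter library (`TOY_LIBRARY`) is a hypothetical stand-in for the 20-fragment library, the helper names are invented for illustration, windows are assumed to overlap and advance one residue at a time, and fragments are matched by distance RMS (dRMS) over pairwise C-alpha distances rather than by the coordinate RMS after superposition that the abstract reports. dRMS needs no superposition step, but unlike coordinate RMS it cannot tell a fragment from its mirror image.

```python
import math
from itertools import combinations


def helix_ca(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """Idealized alpha-helix C-alpha trace (approximate textbook geometry, in Å)."""
    turn = math.radians(turn_deg)
    return [(radius * math.cos(i * turn), radius * math.sin(i * turn), rise * i)
            for i in range(n)]


def strand_ca(n, spacing=3.8):
    """Straight, fully extended trace: a crude stand-in for a beta strand."""
    return [(spacing * i, 0.0, 0.0) for i in range(n)]


# Hypothetical two-letter library of four-residue fragments (assumption:
# the library in the paper has 20 fragments, which are not reproduced here).
TOY_LIBRARY = {"H": helix_ca(4), "E": strand_ca(4)}


def pair_distances(points):
    """All pairwise distances within one fragment, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def drms(frag_a, frag_b):
    """Distance RMS between two fragments of equal length (no superposition)."""
    da, db = pair_distances(frag_a), pair_distances(frag_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(da, db)) / len(da))


def local_structural_sequence(ca_trace, library=TOY_LIBRARY, size=4):
    """Assign each overlapping window the letter of its best-matching fragment."""
    letters = []
    for start in range(len(ca_trace) - size + 1):
        window = ca_trace[start:start + size]
        best = min(library, key=lambda letter: drms(window, library[letter]))
        letters.append(best)
    return "".join(letters)


print(local_structural_sequence(helix_ca(6)))   # HHH
print(local_structural_sequence(strand_ca(7)))  # EEEE
```

Under these assumptions a chain of n residues yields n - 3 letters, one per four-residue window. The resulting strings can then be compared with ordinary sequence-comparison tools, which is the appeal of the approach. A global structural sequence would additionally require that the chosen fragments chain together into one complete model of the protein, which this sketch does not attempt.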


Year:  2008        PMID: 19135454      PMCID: PMC2772874          DOI: 10.1016/j.jmb.2008.12.044

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


