A topological algorithm for identification of structural domains of proteins.

Frank Emmert-Streib, Arcady Mushegian.

Abstract

BACKGROUND: Identification of the structural domains of proteins is important for our understanding of the organizational principles and mechanisms of protein folding, and for insights into protein function and evolution. Algorithmic methods developed so far for dissecting proteins of known structure into domains are based on an examination of multiple geometrical, physical and topological features. Successful as many of these approaches are, they employ many heuristics, and it is not clear whether they illuminate any deep underlying principles of protein domain organization. Other well-performing domain dissection methods rely on comparative sequence analysis. These methods are applicable to sequences with known and unknown structure alike, and their success highlights a fundamental principle of protein modularity, but it does not directly improve our understanding of protein spatial structure.
RESULTS: We present a novel graph-theoretical algorithm for the identification of domains in proteins with known three-dimensional structure. We represent the protein structure as an undirected, unweighted and unlabeled graph whose nodes correspond to the secondary structure elements and whose edges represent physical proximity of at least one pair of alpha carbon atoms from two elements. Domains are identified as constrained partitions of the graph, corresponding to sets of vertices obtained by the maximization of the cycle distributions found in the graph. The decision to accept or reject a tentative cut position is based on a specific classifier. When a partition is accepted, the algorithm is applied iteratively to each of the resulting subgraphs, and it terminates automatically when partitions are no longer accepted. The distribution of cycles is the only type of information on which the decision about protein dissection is based. Despite its bare-bones simplicity, our algorithm comes close to the best heuristic algorithms in accuracy.
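The graph construction and recursive dissection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the proximity cutoff, the exact cycle-based objective, the partition constraints or the classifier, so the following are all assumptions made for illustration: a 7 Å cutoff (`CONTACT_CUTOFF`), triangles as the only cycles counted, cuts restricted to a single position along the chain order with at least `min_size` elements on each side, and an `accept` rule that stands in for the classifier.

```python
from itertools import combinations
from math import dist

# Assumed cutoff in angstroms; the abstract does not state the value used.
CONTACT_CUTOFF = 7.0


def build_sse_graph(sse_coords, cutoff=CONTACT_CUTOFF):
    """Build an undirected, unweighted graph as a dict of adjacency sets.

    sse_coords maps a secondary structure element (SSE) id to a list of
    alpha carbon (x, y, z) coordinates. Two SSEs are joined by an edge when
    at least one pair of their alpha carbons lies within `cutoff`.
    """
    graph = {sse: set() for sse in sse_coords}
    for a, b in combinations(sse_coords, 2):
        if any(dist(p, q) <= cutoff for p in sse_coords[a] for q in sse_coords[b]):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def count_triangles(graph, nodes=None):
    """Count 3-cycles in the subgraph induced by `nodes` (all nodes if None)."""
    nodes = set(graph) if nodes is None else set(nodes)
    return sum(
        1
        for a, b, c in combinations(sorted(nodes), 3)
        if b in graph[a] and c in graph[a] and c in graph[b]
    )


def best_cut(graph, order, min_size=3):
    """Return (cut index, triangles kept) for the cut keeping the most triangles.

    Only cuts leaving at least `min_size` SSEs on each side are considered
    (an assumed constraint). Returns None when no such cut exists.
    """
    best = None
    for i in range(min_size, len(order) - min_size + 1):
        kept = count_triangles(graph, order[:i]) + count_triangles(graph, order[i:])
        if best is None or kept > best[1]:
            best = (i, kept)
    return best


def dissect(graph, order, accept, min_size=3):
    """Recursively split `order` into domains until no cut is accepted."""
    candidate = best_cut(graph, order, min_size)
    if candidate is None:
        return [order]
    cut, kept = candidate
    if not accept(kept, count_triangles(graph, order)):
        return [order]
    return (dissect(graph, order[:cut], accept, min_size)
            + dissect(graph, order[cut:], accept, min_size))


# Toy input: two compact groups of three SSEs, one alpha carbon each,
# connected by a single contact between s1 and s3.
coords = {
    "s0": [(0.0, 0.0, 0.0)], "s1": [(4.0, 0.0, 0.0)], "s2": [(0.0, 4.0, 0.0)],
    "s3": [(10.0, 0.0, 0.0)], "s4": [(14.0, 0.0, 0.0)], "s5": [(10.0, 4.0, 0.0)],
}
graph = build_sse_graph(coords)

# Stand-in classifier (an assumption): accept a cut only if it breaks no triangle.
domains = dissect(graph, list(coords), lambda kept, total: total > 0 and kept == total)
print(domains)
```

In this toy case the only contact between the two groups is the s1-s3 edge, so the cut between `s2` and `s3` keeps both triangles intact and is accepted, while neither half can be cut further. A realistic version would count longer cycles as well and replace the `accept` rule with a trained classifier.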
CONCLUSION: Our graph-theoretical algorithm uses only topological information present in the protein structure itself to find the domains and does not rely on any geometrical or physical information about the protein molecule. Perhaps unexpectedly, these drastic constraints on resources, which result in a seemingly approximate description of protein structures and leave only a handful of parameters available for analysis, do not lead to any significant deterioration of algorithm accuracy. It appears that protein structures can be rigorously treated as topological rather than geometrical objects and that the majority of information about protein domains can be inferred from the coarse-grained measure of pairwise proximity between secondary structure elements.

Year:  2007        PMID: 17608939      PMCID: PMC1933582          DOI: 10.1186/1471-2105-8-237

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


