On the Natural Structure of Amino Acid Patterns in Families of Protein Sequences.

Pablo Turjanski, Diego U Ferreiro.

Abstract

All known terrestrial proteins are coded as continuous strings of ≈20 amino acids. The patterns formed by the repetitions of elements in groups of finite sequences describe the natural architectures of protein families. We present a method to search for patterns and groupings of patterns in protein sequences using a mathematically precise definition for "repetition", an efficient algorithmic implementation and a robust scoring system with no adjustable parameters. We show that the sequence patterns can be well separated into disjoint classes according to their recurrence in nested structures. The statistics of the occurrences of patterns indicate that short repetitions are sufficient to account for the differences between natural families and randomized groups of sequences by more than 10 standard deviations, while contiguous sequence patterns shorter than 5 residues are effectively random in their occurrences. A small subset of patterns is sufficient to account for a robust "familiarity" definition between arbitrary sets of sequences.

Year:  2018        PMID: 30239207      PMCID: PMC7184844          DOI: 10.1021/acs.jpcb.8b07206

Source DB:  PubMed          Journal:  J Phys Chem B        ISSN: 1520-5207            Impact factor:   2.991


