
The evolution of proteins from random amino acid sequences. I. Evidence from the lengthwise distribution of amino acids in modern protein sequences.

S H White, R E Jacobs.

Abstract

We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (ro) of Mood (1940, Ann. Math. Stat. 11, 367-392). The probability density of ro for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong alpha-helix propensity show a strong tendency to cluster whereas those with beta-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic "patterns" that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
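The binary encoding and run test described in the abstract can be illustrated with a minimal sketch. It assumes the textbook standardized runs statistic (observed number of runs minus its expectation, divided by its standard deviation); the exact form and sign convention of ro used in the paper may differ. The function names `binary_encode`, `count_runs`, and `runs_z` are hypothetical and chosen only for this illustration.

```python
import math

def binary_encode(seq, residues):
    """Assign 1 to amino acids of interest and 0 to all others."""
    return [1 if aa in residues else 0 for aa in seq]

def count_runs(bits):
    """Count maximal blocks of identical consecutive values."""
    if not bits:
        return 0
    return 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)

def runs_z(bits):
    """Standardized runs statistic (assumed textbook form, not
    necessarily the paper's ro). Approximately N(0,1) for a random
    ordering; with this sign convention, negative values mean fewer
    runs than expected, i.e. clustering."""
    n = len(bits)
    n1 = sum(bits)   # residues of interest
    n0 = n - n1      # all other residues
    if n1 == 0 or n0 == 0:
        return None  # statistic undefined when only one symbol occurs
    mean = 2.0 * n1 * n0 / n + 1.0
    var = 2.0 * n1 * n0 * (2.0 * n1 * n0 - n) / (n * n * (n - 1.0))
    if var <= 0:
        return None
    return (count_runs(bits) - mean) / math.sqrt(var)

# Toy sequences, not real proteins: clustered vs. alternating methionine.
clustered = binary_encode("MMMMKKKK", {"M"})
alternating = binary_encode("MKMKMKMK", {"M"})
print(runs_z(clustered), runs_z(alternating))
```

In this sketch a set of residues (for example a hydrophobic set) can be passed in place of the single residue, mirroring the grouped analyses mentioned in the abstract. The normal approximation is rough for sequences as short as these toy examples.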

Year:  1993        PMID: 8433379     DOI: 10.1007/bf02407307

Source DB:  PubMed          Journal:  J Mol Evol        ISSN: 0022-2844            Impact factor:   2.395


