A method for detecting distant evolutionary relationships between protein or nucleic acid sequences in the presence of deletions or insertions.

T C Elleman.   

Abstract

A method for detecting homology between two protein or nucleic acid sequences which require insertions or deletions for optimum alignment has been devised for use with a computer. Sequences are assessed for possible relationship by Monte Carlo methods involving comparisons between the alignment of the real sequences and alignments of randomly scrambled sequences of the same composition as the real sequences, each alignment having the optimum number of gaps. As each gap is successively introduced into a comparison (real or random) a maximum score is determined from the similarity of the aligned residues. From the distribution of the maximum alignment scores of randomly scrambled sequences having the same number of gaps, the percentage of random comparisons having higher scores is determined, and the smallest of these percentage levels for each pair of sequences (real or random) indicates the optimum alignment. The fraction of the comparisons of random sequences having percentage levels at their optimum alignment below that of the real sequence comparison at its optimum estimates the probability that such an alignment might have arisen by chance. Related sequences are detected since their optimum alignment score, by virtue of a contribution from ancestral homology in addition to optimised random considerations, occupies a more extreme position in the appropriate frequency distribution of score than do the majority of optimum scores of randomly scrambled sequences in their appropriate distributions. Application of this 'optimum match' method of sequence comparison shows that the sensitivity of the 'maximum match' method of Needleman and Wunsch (1970) decreases quite dramatically with sequence comparisons which require only a few gaps for a reasonable alignment, or when sequences differ greatly in length. 
The 'maximum match' method as applied by Barker and Dayhoff (1972) has the additional disadvantage that deletions which have occurred in the longer of two homologous protein sequences further decrease the sensitivity of detection of relationship. The 'constrained match' method of Sankoff and Cedergren (1973) is seen to be misleading since large increments in the alignment score from added gaps do not necessarily result in a high total alignment score required to demonstrate sequence homology.
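The Monte Carlo procedure described in the abstract can be illustrated with a minimal sketch. It is not the author's program, and it rests on several simplifying assumptions that are not in the source: residues are scored by identity only, each skipped residue counts as one gap, overhanging ends are free, and scores are computed for "at most g gaps" rather than exactly g. The function names `max_scores_by_gaps`, `percent_higher` and `optimum_match_probability` are hypothetical.

```python
import random
from bisect import bisect_right


def max_scores_by_gaps(a, b, max_gaps):
    """Best identity score using at most g single-residue gaps, for g = 0..max_gaps.

    Assumption: leading and trailing overhangs are free (not counted as gaps).
    """
    n, m = len(a), len(b)
    # dp[g][i][j]: best score for prefixes a[:i], b[:j] with at most g gaps
    dp = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(max_gaps + 1)]
    for g in range(max_gaps + 1):
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                best = dp[g][i - 1][j - 1] + (a[i - 1] == b[j - 1])
                if g > 0:  # spend one gap to skip a residue in a or in b
                    best = max(best, dp[g - 1][i - 1][j], dp[g - 1][i][j - 1])
                dp[g][i][j] = best
    # Free trailing overhang: take the best cell in the last row or last column
    return [max(max(dp[g][n]), max(row[m] for row in dp[g]))
            for g in range(max_gaps + 1)]


def percent_higher(sorted_scores, s):
    """Fraction of the random scores that are strictly higher than s."""
    return (len(sorted_scores) - bisect_right(sorted_scores, s)) / len(sorted_scores)


def optimum_match_probability(a, b, max_gaps=3, n_random=200, seed=0):
    """Return (percentage level of the real pair at its optimum, chance estimate)."""
    rng = random.Random(seed)
    real = max_scores_by_gaps(a, b, max_gaps)
    rand = []
    for _ in range(n_random):
        ra, rb = list(a), list(b)
        rng.shuffle(ra)  # scrambled sequences keep the original composition
        rng.shuffle(rb)
        rand.append(max_scores_by_gaps(ra, rb, max_gaps))
    # One null distribution of maximum scores per gap count
    by_gap = [sorted(r[g] for r in rand) for g in range(max_gaps + 1)]

    def optimum_level(scores):
        # The smallest percentage level over all gap counts marks the optimum
        return min(percent_higher(by_gap[g], scores[g])
                   for g in range(max_gaps + 1))

    real_level = optimum_level(real)
    # Fraction of random pairs whose optimum level is below that of the real pair
    p = sum(optimum_level(r) < real_level for r in rand) / n_random
    return real_level, p
```

Calling `optimum_match_probability(seq_a, seq_b)` on two strings returns the real pair's percentage level at its optimum and the estimated probability of a chance alignment. Because identity scores are discrete, ties are common in this sketch; a graded similarity matrix and a larger `n_random` would give a finer estimate.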

Year:  1978        PMID: 671562     DOI: 10.1007/BF01733890

Source DB:  PubMed          Journal:  J Mol Evol        ISSN: 0022-2844            Impact factor:   2.395


  13 in total

1.  A comparison of the heme binding pocket in globins and cytochrome b5.

Authors:  M G Rossmann; P Argos
Journal:  J Biol Chem       Date:  1975-09-25       Impact factor: 5.157

2.  The evolutionary origin of proinsulin. Amino acid sequence homology with the trypsin-related serine proteases detected and evaluated by new statistical methods.

Authors:  C de Haën; E Swanson; D C Teller
Journal:  J Mol Biol       Date:  1976-09-25       Impact factor: 5.469

3.  Three-dimensional Fourier synthesis of calf liver cytochrome b5 at 2.8 Å resolution.

Authors:  F S Mathews; M Levine; P Argos
Journal:  J Mol Biol       Date:  1972-03-14       Impact factor: 5.469

4.  A general method applicable to the search for similarities in the amino acid sequence of two proteins.

Authors:  S B Needleman; C D Wunsch
Journal:  J Mol Biol       Date:  1970-03       Impact factor: 5.469

5.  Further improvements in the method of testing for evolutionary homology among proteins.

Authors:  W M Fitch
Journal:  J Mol Biol       Date:  1970-04-14       Impact factor: 5.469

6.  Amino acid sequence similarity between cytochrome f from a blue-green bacterium and algal chloroplasts.

Authors:  R P Ambler; R G Bartsch
Journal:  Nature       Date:  1975-01-24       Impact factor: 49.962

7.  An evaluation of the relatedness of proteins based on comparison of amino acid sequences.

Authors:  J E Haber; D E Koshland
Journal:  J Mol Biol       Date:  1970-06-28       Impact factor: 5.469

8.  The structure and history of an ancient protein.

Authors:  R E Dickerson
Journal:  Sci Am       Date:  1972-04       Impact factor: 2.142

9.  Sequence and structure homologies in bacterial and mammalian-type cytochromes.

Authors:  R E Dickerson
Journal:  J Mol Biol       Date:  1971-04-14       Impact factor: 5.469

10.  Tests for comparing related amino-acid sequences. Cytochrome c and cytochrome c551.

Authors:  A D McLachlan
Journal:  J Mol Biol       Date:  1971-10-28       Impact factor: 5.469

  3 in total

1.  A comprehensive package for DNA sequence analysis in FORTRAN IV for the PDP-11.

Authors:  J Arnold; V K Eckenrode; K Lemke; G J Phillips; S W Schaeffer
Journal:  Nucleic Acids Res       Date:  1986-01-10       Impact factor: 16.971

2.  Prediction of oligonucleotide frequencies based upon dinucleotide frequencies obtained from the nearest neighbor analysis.

Authors:  J Hong
Journal:  Nucleic Acids Res       Date:  1990-03-25       Impact factor: 16.971

3.  Mono- through hexanucleotide composition of the Escherichia coli genome: a Markov chain analysis.

Authors:  G J Phillips; J Arnold; R Ivarie
Journal:  Nucleic Acids Res       Date:  1987-03-25       Impact factor: 16.971
