On the statistical significance of nucleic acid similarities.

D J Lipman, W J Wilbur, T F Smith, M S Waterman.   

Abstract

When evaluating sequence similarities among nucleic acids by the usual methods, statistical significance is often found when the biological significance of the similarity is dubious. We demonstrate that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures. We propose a series of models which account for some of these known statistical properties. The utility of the method is demonstrated in evaluating high relative similarity scores in four specific cases in which there is little biological context by which to judge the similarities. In two of the cases we identify the statistical properties which are responsible for the apparent similarity. In the other two cases the statistical significance of the similarity persists even when the known statistical properties of sequences are modelled. For one of these cases biological significance is likely while the other case remains an enigma.
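The dependence of significance on the assumed random model can be illustrated with a small Monte Carlo sketch. The abstract does not specify the similarity measure or the models used, so everything here is an assumption for illustration: a Smith-Waterman-style local score with arbitrary scoring values, and the simplest possible null model, a shuffle that preserves only base composition. The names `local_score`, `shuffled`, and `z_score` are hypothetical.

```python
import random
import statistics


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman style, linear gap penalty).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for x in a:
        cur = [0]
        for j, y in enumerate(b, start=1):
            diag = prev[j - 1] + (match if x == y else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def shuffled(seq, rng):
    """Random permutation of seq; preserves base composition and nothing else."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


def z_score(a, b, trials=200, seed=0):
    """Observed score in standard deviations above the shuffled-null mean."""
    rng = random.Random(seed)
    observed = local_score(a, b)
    null = [
        local_score(shuffled(a, rng), shuffled(b, rng)) for _ in range(trials)
    ]
    mu = statistics.mean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0:
        # Degenerate null distribution: no spread to measure against.
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sigma
```

A large z-score under this null says only that the similarity is unusual for sequences of the same base composition. Replacing `shuffled` with a generator that also preserves further properties (for example, neighbouring-base frequencies) would give a stricter test, which is the kind of comparison the abstract argues for.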

Year: 1984; PMID: 6694902; PMCID: PMC320998; DOI: 10.1093/nar/12.1part1.215

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


