The statistical distribution of nucleic acid similarities.

T F Smith, M S Waterman, C Burks.   

Abstract

All pairs of a large set of known vertebrate DNA sequences were searched by computer for most similar segments. Analysis of these data shows that the computed similarity scores are distributed proportionally to the logarithm of the product of the lengths of the sequences involved. This distribution is closely related to recent results of Erdős and others on the longest run of heads in coin tossing. A simple rule is derived for determining the statistical significance of the similarity scores and for assisting in relating statistical and biological significance.
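
The coin-tossing analogy mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method and does not reproduce its derived rule or constants. It only shows the well-known leading-order behaviour: the longest run of heads in n tosses grows like the logarithm of n to base 1/p, and the analogous scale for the longest exact match between two random sequences of lengths m and n is the logarithm of the product mn. The function names, the default match probability of 0.25 (uniform, independent DNA bases), and the trial counts are all assumptions made for illustration.

```python
import math
import random


def longest_head_run(n, p=0.5, rng=random):
    """Length of the longest run of heads in n independent tosses, P(heads) = p."""
    best = current = 0
    for _ in range(n):
        if rng.random() < p:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


def expected_run_scale(n, p=0.5):
    """Leading-order growth of the longest head run: log base 1/p of n."""
    return math.log(n) / math.log(1.0 / p)


def expected_match_scale(m, n, p=0.25):
    """Leading-order scale for the longest exact match between two random
    sequences of lengths m and n: log base 1/p of (m * n).

    p is the per-position match probability; 0.25 is an assumption that
    corresponds to uniform, independent DNA bases.
    """
    return math.log(m * n) / math.log(1.0 / p)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs agree
    for n in (1_000, 10_000, 100_000):
        runs = [longest_head_run(n, rng=rng) for _ in range(10)]
        mean_run = sum(runs) / len(runs)
        print(f"n={n}: mean longest run {mean_run:.1f}, "
              f"log2(n) = {expected_run_scale(n):.1f}")
```

The simulated mean should track the logarithmic scale up to an additive constant, which is the qualitative point of the analogy; similarity scores that allow mismatches and gaps need the fuller treatment given in the paper itself.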

Year:  1985        PMID: 3871073      PMCID: PMC341021          DOI: 10.1093/nar/13.2.645

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  30 in total

1.  Nucleotide sequence and amplification in bacteria of structural gene for rat growth hormone.

Authors:  P H Seeburg; J Shine; J A Martial; J D Baxter; H M Goodman
Journal:  Nature       Date:  1977-12-08       Impact factor: 49.962

2.  Reiterated sequences within the intron of an immediate-early gene of herpes simplex virus type 1.

Authors:  R J Watson; K Umene; L W Enquist
Journal:  Nucleic Acids Res       Date:  1981-08-25       Impact factor: 16.971

3.  Sequence banks. Searching for sequence similarities.

Authors:  T F Smith; C Burks
Journal:  Nature       Date:  1983-01-20       Impact factor: 49.962

4.  Low molecular weight RNAs transcribed in vitro by RNA polymerase III from Alu-type dispersed repeats in Chinese hamster DNA are also found in vivo.

Authors:  S R Haynes; W R Jelinek
Journal:  Proc Natl Acad Sci U S A       Date:  1981-10       Impact factor: 11.205

5.  Rearrangement of immunoglobulin gamma 1-chain gene and mechanism for heavy-chain class switch.

Authors:  T Kataoka; T Kawakami; N Takahashi; T Honjo
Journal:  Proc Natl Acad Sci U S A       Date:  1980-02       Impact factor: 11.205

6.  Structural analysis of interspersed repetitive polymerase III transcription units in human DNA.

Authors:  J Pan; J T Elder; C H Duncan; S M Weissman
Journal:  Nucleic Acids Res       Date:  1981-03-11       Impact factor: 16.971

7.  Molecular cloning and characterization of cDNA sequences coding for rat relaxin.

Authors:  P Hudson; J Haley; M Cronk; J Shine; H Niall
Journal:  Nature       Date:  1981-05-14       Impact factor: 49.962

8.  The ovalbumin gene family: structure of the X gene and evolution of duplicated split genes.

Authors:  R Heilig; F Perrin; F Gannon; J L Mandel; P Chambon
Journal:  Cell       Date:  1980-07       Impact factor: 41.582

9.  Isolation and sequence of the gene for actin in Saccharomyces cerevisiae.

Authors:  R Ng; J Abelson
Journal:  Proc Natl Acad Sci U S A       Date:  1980-07       Impact factor: 11.205

10.  Nucleotide sequence of Xenopus laevis 18S ribosomal RNA inferred from gene sequence.

Authors:  M Salim; B E Maden
Journal:  Nature       Date:  1981-05-21       Impact factor: 49.962

  46 in total

1.  The estimation of statistical parameters for local alignment score distributions.

Authors:  S F Altschul; R Bundschuh; R Olsen; T Hwa
Journal:  Nucleic Acids Res       Date:  2001-01-15       Impact factor: 16.971

Review 2.  Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements.

Authors:  A A Schäffer; L Aravind; T L Madden; S Shavirin; J L Spouge; Y I Wolf; E V Koonin; S F Altschul
Journal:  Nucleic Acids Res       Date:  2001-07-15       Impact factor: 16.971

3.  An accurate approximation to the distribution of the length of the longest matching word between two random DNA sequences.

Authors:  R F Mott; T B Kirkwood; R N Curnow
Journal:  Bull Math Biol       Date:  1990       Impact factor: 1.758

4.  Towards drug repositioning: a unified computational framework for integrating multiple aspects of drug similarity and disease similarity.

Authors:  Ping Zhang; Fei Wang; Jianying Hu
Journal:  AMIA Annu Symp Proc       Date:  2014-11-14

5.  An Eulerian path approach to local multiple alignment for DNA sequences.

Authors:  Yu Zhang; Michael S Waterman
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-24       Impact factor: 11.205

6.  Poisson, compound Poisson and process approximations for testing statistical significance in sequence comparisons.

Authors:  L Goldstein; M S Waterman
Journal:  Bull Math Biol       Date:  1992-09       Impact factor: 1.758

7.  Corruption of genomic databases with anomalous sequence.

Authors:  E D Lamperti; J M Kittelberger; T F Smith; L Villa-Komaroff
Journal:  Nucleic Acids Res       Date:  1992-06-11       Impact factor: 16.971

8.  Automatic generation of primary sequence patterns from sets of related protein sequences.

Authors:  R F Smith; T F Smith
Journal:  Proc Natl Acad Sci U S A       Date:  1990-01       Impact factor: 11.205

9.  Protein sequence similarity searches using patterns as seeds.

Authors:  Z Zhang; A A Schäffer; W Miller; T L Madden; D J Lipman; E V Koonin; S F Altschul
Journal:  Nucleic Acids Res       Date:  1998-09-01       Impact factor: 16.971

Review 10.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971
