
Rapid and accurate estimates of statistical significance for sequence data base searches.

M S Waterman, M Vingron.

Abstract

A central question in sequence comparison is the statistical significance of an observed similarity. For local alignment containing gaps to optimize sequence similarity this problem has so far not been solved mathematically. Using as a basis the Chen-Stein theory of Poisson approximation, we present a practical method to approximate the probability that a local alignment score is a result of chance alone. For a set of similarity scores and gap penalties only one simulation of random alignments needs to be calculated to derive the key information allowing us to estimate the significance of any alignment calculated under this setting. We present applications to data base searching and the analysis of pairwise and self-comparisons of proteins.
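The abstract does not state the approximation formula itself. As a rough sketch, assume the Poisson form often used for local alignment scores, P(S ≥ t) ≈ 1 − exp(−γ·m·n·p^t), where m and n are the sequence lengths and γ and p are the two parameters that a single simulation under a given scoring scheme would supply. The function name and the parameter values below are illustrative assumptions, not taken from the paper.

```python
import math


def poisson_pvalue(score, m, n, gamma, p):
    """Approximate P(S >= score) for two sequences of lengths m and n.

    Assumes the form 1 - exp(-gamma * m * n * p**score); gamma and p are
    taken as already estimated (hypothetical inputs, not values from the paper).
    """
    # Expected number of chance local alignments scoring at least `score`.
    expected = gamma * m * n * p ** score
    # Poisson approximation: probability that at least one such alignment occurs.
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-expected)


# Illustrative call with made-up parameter values.
print(poisson_pvalue(score=60, m=300, n=300, gamma=0.1, p=0.8))
```

Once γ and p are fixed for a scoring matrix and gap penalty, evaluating the significance of any further score is a single closed-form calculation, which is what makes the approach fast enough for data base searches.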

Year:  1994        PMID: 8197109      PMCID: PMC43840          DOI: 10.1073/pnas.91.11.4625

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


