Estimating the Gumbel scale parameter for local alignment of random sequences by importance sampling with stopping times.

Yonil Park, Sergey Sheetlin, John L Spouge.

Abstract

The gapped local alignment score of two random sequences follows a Gumbel distribution. If computers could estimate the parameters of the Gumbel distribution within one second, the use of arbitrary alignment scoring schemes could increase the sensitivity of searching biological sequence databases over the web. Accordingly, this article gives a novel equation for the scale parameter of the relevant Gumbel distribution. We speculate that the equation is exact, although present numerical evidence is limited. The equation involves ascending ladder variates in the global alignment of random sequences. In global alignment simulations, the ladder variates yield stopping times specifying random sequence lengths. Because of the random lengths, and because our trial distribution for importance sampling occurs on a different sample space from our target distribution, our study led to a mapping theorem, which led naturally in turn to an efficient dynamic programming algorithm for the importance sampling weights. Numerical studies using several popular alignment scoring schemes then examined the efficiency and accuracy of the resulting simulations.
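To make the idea of importance sampling with a stopping time concrete, here is a minimal sketch on a much simpler problem than the one the article treats: a one-dimensional random walk with negative drift, which is the standard model for ungapped alignment scores. It is not the article's algorithm for gapped alignment. The step distribution `STEPS` and the function names are assumptions chosen for illustration. In the ungapped case, the Gumbel scale parameter λ is the positive root of E[exp(λX)] = 1, and tilting the step distribution by exp(λx) gives a trial distribution under which the walk drifts upward. Each simulated walk stops the first time it reaches the level y, and its importance weight is exp(-λ S_T), where S_T is the score at the stopping time.

```python
import math
import random

# Toy step distribution (an assumption for illustration): step -> probability.
# The mean step is -0.1, so the untilted walk drifts downward.
STEPS = {-1: 0.7, 2: 0.3}


def solve_lambda(steps, lo=1e-9, hi=10.0, iters=200):
    """Find the positive root of E[exp(lam * X)] = 1 by bisection."""
    def f(lam):
        return sum(p * math.exp(lam * x) for x, p in steps.items()) - 1.0

    # With negative drift, f is negative just above 0 and positive for large lam.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def estimate_tail(y, n_samples=2000, seed=0):
    """Estimate P(max of the walk >= y) by sampling from the tilted walk.

    Each walk is simulated under the tilted (trial) distribution and stopped
    at the first time its score reaches y. The weight exp(-lam * S_T) is the
    likelihood ratio of the target path to the trial path up to that time.
    """
    rng = random.Random(seed)
    lam = solve_lambda(STEPS)
    values = list(STEPS)
    tilted = [STEPS[x] * math.exp(lam * x) for x in values]  # sums to 1
    total = 0.0
    for _ in range(n_samples):
        s = 0
        while s < y:  # stopping time: first passage to level y
            s += rng.choices(values, weights=tilted)[0]
        total += math.exp(-lam * s)
    return lam, total / n_samples


if __name__ == "__main__":
    lam, p_hat = estimate_tail(20)
    print(f"lambda = {lam:.4f}, estimated tail probability = {p_hat:.4e}")
```

Because the largest upward step is 2, the stopped score lies in {y, y + 1}, so every weight falls between exp(-λ(y + 1)) and exp(-λy). That narrow range is why the tilted trial distribution gives a low-variance estimate of a probability that direct simulation would rarely observe.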

Year:  2009        PMID: 20148197      PMCID: PMC2818155          DOI: 10.1214/08-AOS663

Source DB:  PubMed          Journal:  Ann Stat        ISSN: 0090-5364            Impact factor:   4.028


  16 in total

1.  Local sequence alignments with monotonic gap penalties.

Authors:  R Mott
Journal:  Bioinformatics       Date:  1999-06       Impact factor: 6.937

2.  The estimation of statistical parameters for local alignment score distributions.

Authors:  S F Altschul; R Bundschuh; R Olsen; T Hwa
Journal:  Nucleic Acids Res       Date:  2001-01-15       Impact factor: 16.971

3.  Rapid assessment of extremal statistics for gapped local alignment.

Authors:  R Olsen; R Bundschuh; T Hwa
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1999

4.  Accurate formula for P-values of gapped local sequence and profile alignments.

Authors:  R Mott
Journal:  J Mol Biol       Date:  2000-07-14       Impact factor: 5.469

5.  Statistical significance of probabilistic sequence alignment and related local hidden Markov models.

Authors:  Y K Yu; T Hwa
Journal:  J Comput Biol       Date:  2001       Impact factor: 1.479

6.  Asymmetric exclusion process and extremal statistics of random sequences.

Authors:  R Bundschuh
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2002-03-05

7.  Rapid significance estimation in local sequence alignment with gaps.

Authors:  Ralf Bundschuh
Journal:  J Comput Biol       Date:  2002       Impact factor: 1.479

8.  Amino acid substitution matrices from protein blocks.

Authors:  S Henikoff; J G Henikoff
Journal:  Proc Natl Acad Sci U S A       Date:  1992-11-15       Impact factor: 11.205

9.  Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.

Authors:  S Karlin; S F Altschul
Journal:  Proc Natl Acad Sci U S A       Date:  1990-03       Impact factor: 11.205

10.  An improved algorithm for matching biological sequences.

Authors:  O Gotoh
Journal:  J Mol Biol       Date:  1982-12-15       Impact factor: 5.469

  7 in total

1.  Objective method for estimating asymptotic parameters, with an application to sequence alignment.

Authors:  Sergey Sheetlin; Yonil Park; John L Spouge
Journal:  Phys Rev E Stat Nonlin Soft Matter Phys       Date:  2011-09-13

2.  ALP & FALP: C++ libraries for pairwise local alignment E-values.

Authors:  Sergey Sheetlin; Yonil Park; Martin C Frith; John L Spouge
Journal:  Bioinformatics       Date:  2015-10-01       Impact factor: 6.937

3.  Frameshift alignment: statistics and post-genomic applications.

Authors:  Sergey L Sheetlin; Yonil Park; Martin C Frith; John L Spouge
Journal:  Bioinformatics       Date:  2014-08-28       Impact factor: 6.937

4.  New finite-size correction for local alignment score distributions.

Authors:  Yonil Park; Sergey Sheetlin; Ning Ma; Thomas L Madden; John L Spouge
Journal:  BMC Res Notes       Date:  2012-06-12

5.  A new repeat-masking method enables specific detection of homologous sequences.

Authors:  Martin C Frith
Journal:  Nucleic Acids Res       Date:  2010-11-24       Impact factor: 16.971

6.  Estimating statistical significance of local protein profile-profile alignments.

Authors:  Mindaugas Margelevičius
Journal:  BMC Bioinformatics       Date:  2019-08-13       Impact factor: 3.169

7.  The whole alignment and nothing but the alignment: the problem of spurious alignment flanks.

Authors:  Martin C Frith; Yonil Park; Sergey L Sheetlin; John L Spouge
Journal:  Nucleic Acids Res       Date:  2008-09-16       Impact factor: 16.971
