
Accurate formula for P-values of gapped local sequence and profile alignments.

R Mott.

Abstract

A simple general approximation for the distribution of gapped local alignment scores is presented, suitable for assessing the significance of comparisons between two protein sequences or between a sequence and a profile. The approximation takes account of the scoring scheme (i.e. gap penalty and substitution matrix or profile), sequence composition and length. With this formula it is unnecessary to fit an extreme-value distribution to simulations or to the results of databank searches. The method is based on the theoretical ideas introduced by R. Mott and R. Tribe in 1999. Extensive simulation studies show that score thresholds produced by the method are accurate to within ±5% 95% of the time. We also investigate factors that affect the accuracy of alignment statistics, and show that any method based on asymptotic theory is limited because asymptotic behaviour is not strictly achieved for many real protein sequences, owing to extreme composition effects. Consequently, it may not be practicable to find a general formula that is significantly more accurate until the sub-asymptotic behaviour of alignments is better understood. Copyright 2000 Academic Press.
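
For orientation, the extreme-value (Gumbel) form that the abstract refers to can be sketched as follows. This is not the formula from the paper: it is the standard Karlin-Altschul style approximation P(S >= x) = 1 - exp(-K m n exp(-lambda x)), and the function names and the values of `lam` and `K` are illustrative assumptions. For gapped alignments these two parameters have traditionally been fitted to simulations, which is the step the abstract says its formula makes unnecessary.

```python
import math

def evd_pvalue(score, m, n, lam, K):
    """P(S >= score) under a Gumbel approximation for sequences of length m and n.

    lam and K are assumed to be known for the scoring scheme in use;
    the values passed below are placeholders, not taken from the paper.
    """
    expected_hits = K * m * n * math.exp(-lam * score)
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-expected_hits)

def score_threshold(p, m, n, lam, K):
    """Score whose P-value equals p (the inverse of evd_pvalue in score)."""
    expected_hits = -math.log1p(-p)
    return math.log(K * m * n / expected_hits) / lam

# Hypothetical parameters, for illustration only.
lam, K = 0.27, 0.04
p = evd_pvalue(60, 300, 300, lam, K)
threshold = score_threshold(0.01, 300, 300, lam, K)
```

A score threshold obtained this way is the kind of quantity whose accuracy the abstract reports as being within ±5% 95% of the time for the paper's own formula.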


Year:  2000        PMID: 10884359     DOI: 10.1006/jmbi.2000.3875

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  28 in total

1.  The estimation of statistical parameters for local alignment score distributions.

Authors:  S F Altschul; R Bundschuh; R Olsen; T Hwa
Journal:  Nucleic Acids Res       Date:  2001-01-15       Impact factor: 16.971

2.  ParAlign: a parallel sequence alignment algorithm for rapid and sensitive database searches.

Authors:  T Rognes
Journal:  Nucleic Acids Res       Date:  2001-04-01       Impact factor: 16.971

Review 3.  Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements.

Authors:  A A Schäffer; L Aravind; T L Madden; S Shavirin; J L Spouge; Y I Wolf; E V Koonin; S F Altschul
Journal:  Nucleic Acids Res       Date:  2001-07-15       Impact factor: 16.971

4.  Analysis of similarity within 142 pairs of orthologous intergenic regions of Caenorhabditis elegans and Caenorhabditis briggsae.

Authors:  Colleen T Webb; Svetlana A Shabalina; Aleksey Yu Ogurtsov; Alexey S Kondrashov
Journal:  Nucleic Acids Res       Date:  2002-03-01       Impact factor: 16.971

5.  Structural characterization of the human proteome.

Authors:  Arne Müller; Robert M MacCallum; Michael J E Sternberg
Journal:  Genome Res       Date:  2002-11       Impact factor: 9.043

6.  Protein structure prediction for the male-specific region of the human Y chromosome.

Authors:  Krzysztof Ginalski; Leszek Rychlewski; David Baker; Nick V Grishin
Journal:  Proc Natl Acad Sci U S A       Date:  2004-02-24       Impact factor: 11.205

Review 7.  The proteome: structure, function and evolution.

Authors:  Keiran Fleming; Lawrence A Kelley; Suhail A Islam; Robert M MacCallum; Arne Muller; Florencio Pazos; Michael J E Sternberg
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2006-03-29       Impact factor: 6.237

8.  Recent improvements to the SMART domain-based sequence annotation resource.

Authors:  Ivica Letunic; Leo Goodstadt; Nicholas J Dickens; Tobias Doerks; Joerg Schultz; Richard Mott; Francesca Ciccarelli; Richard R Copley; Chris P Ponting; Peer Bork
Journal:  Nucleic Acids Res       Date:  2002-01-01       Impact factor: 16.971

9.  Protectome analysis: a new selective bioinformatics tool for bacterial vaccine candidate discovery.

Authors:  Emrah Altindis; Roberta Cozzi; Benedetta Di Palo; Francesca Necchi; Ravi P Mishra; Maria Rita Fontana; Marco Soriani; Fabio Bagnoli; Domenico Maione; Guido Grandi; Sabrina Liberatori
Journal:  Mol Cell Proteomics       Date:  2014-11-03       Impact factor: 5.911

10.  Island method for estimating the statistical significance of profile-profile alignment scores.

Authors:  Aleksandar Poleksic
Journal:  BMC Bioinformatics       Date:  2009-04-20       Impact factor: 3.169
