A local alignment metric for accelerating biosequence database search.

Peter A Spiro, Natasa Macura.

Abstract

We introduce a metric for local sequence alignments that has utility for accelerating optimal alignment searches without loss of sensitivity. The metric's triangle inequality property permits identification of redundant database entries guaranteed to have optimal alignments to the query sequence that fall below a specified score threshold, thereby permitting comparisons to these entries to be skipped. We prove the existence of the metric for a variety of scoring systems, including the most commonly used ones, and show that a triangle inequality can be established as well for nucleotide-to-protein sequence comparisons. We discuss a database clustering and search strategy that takes advantage of the triangle inequality. The strategy permits moderate but significant acceleration of searches against the widely used "nr" protein database. It also provides a theoretically based method for database clustering in general and provides a standard against which to compare heuristic clustering strategies.
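The pruning idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the local alignment metric itself, so plain Levenshtein edit distance is used here purely as a stand-in metric (it is a global, not a local, measure and is not the authors' metric). The function names `build_clusters` and `search`, the greedy clustering rule, and the `radius` and `threshold` parameters are all assumptions made for illustration. A small distance plays the role of a high alignment score: an entry `m` clustered around a representative `r` can be skipped whenever the triangle-inequality lower bound `d(q, r) - d(r, m)` already exceeds the search threshold.

```python
from typing import Callable, Dict, List, Tuple

Distance = Callable[[str, str], int]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance; a stand-in metric, NOT the paper's local alignment metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def build_clusters(db: List[str], radius: int,
                   dist: Distance = edit_distance) -> Dict[str, List[Tuple[str, int]]]:
    """Hypothetical greedy clustering: an entry joins the first representative
    within `radius`, otherwise it becomes a new representative.
    Each member is stored with its precomputed distance to the representative."""
    clusters: Dict[str, List[Tuple[str, int]]] = {}
    for seq in db:
        for rep, members in clusters.items():
            d = dist(rep, seq)
            if d <= radius:
                members.append((seq, d))
                break
        else:
            clusters[seq] = []
    return clusters


def search(query: str, clusters: Dict[str, List[Tuple[str, int]]], threshold: int,
           dist: Distance = edit_distance) -> Tuple[List[str], int]:
    """Return all entries within `threshold` of the query, plus the number of
    distance computations actually performed."""
    hits: List[str] = []
    n_computed = 0
    for rep, members in clusters.items():
        d_qr = dist(query, rep)
        n_computed += 1
        if d_qr <= threshold:
            hits.append(rep)
        for member, d_rm in members:
            # Triangle inequality: d(q, m) >= d(q, r) - d(r, m).
            # If that lower bound exceeds the threshold, m cannot be a hit.
            if d_qr - d_rm > threshold:
                continue
            n_computed += 1
            if dist(query, member) <= threshold:
                hits.append(member)
    return hits, n_computed
```

Because skipped entries are provably outside the threshold, the result matches an exhaustive scan; the saving depends on how tightly the database clusters, which is consistent with the "moderate but significant" acceleration the abstract reports for "nr".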

Year: 2004    PMID: 15072689    DOI: 10.1089/106652704773416894

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479


  2 in total

1.  Mining the NCBI Influenza Sequence Database: adaptive grouping of BLAST results using precalculated neighbor indexing.

Authors:  Leonid Zaslavsky; Tatiana Tatusova
Journal:  PLoS Curr       Date:  2009-10-30

2.  Geometric aspects of biological sequence comparison.

Authors:  Aleksandar Stojmirović; Yi-Kuo Yu
Journal:  J Comput Biol       Date:  2009-04       Impact factor: 1.479
