Efficient q-gram filters for finding all epsilon-matches over a given length.

Kim R Rasmussen, Jens Stoye, Eugene W Myers.

Abstract

Fast and exact comparison of large genomic sequences remains a challenging task in biosequence analysis. We consider the problem of finding all epsilon-matches between two sequences, i.e., all local alignments over a given length with an error rate of at most epsilon. We study this problem theoretically, giving an efficient q-gram filter for solving it. Two applications of the filter are also discussed, in particular genomic sequence assembly and BLAST-like sequence comparison. Our results show that the method is 25 times faster than BLAST, while not being heuristic.
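
As a rough illustration of the idea behind q-gram filtering, the sketch below applies the classical q-gram lemma rather than the specific filter developed in the paper. The lemma states that two strings within edit distance k, the longer of length n, share at least n + 1 - q(k + 1) q-grams, so a pair sharing fewer can be discarded without computing an alignment. The function names are hypothetical, and passing an error count k directly (for an error rate epsilon over length n one might take k = floor(epsilon * n)) is an assumption made for this example.

```python
from collections import Counter


def qgrams(s, q):
    """Multiset of all length-q substrings (q-grams) of s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def shared_qgrams(a, b, q):
    """Number of q-grams common to a and b, counted with multiplicity."""
    common = qgrams(a, q) & qgrams(b, q)  # multiset intersection
    return sum(common.values())


def qgram_threshold(n, q, k):
    """Classical q-gram lemma bound: minimum number of shared q-grams
    for two strings (the longer of length n) within edit distance k."""
    return n + 1 - q * (k + 1)


def passes_filter(a, b, q, k):
    """True if the pair cannot be ruled out by the q-gram lemma.

    A pair that fails is guaranteed to have edit distance > k;
    a pair that passes still needs verification by alignment.
    """
    n = max(len(a), len(b))
    return shared_qgrams(a, b, q) >= qgram_threshold(n, q, k)


if __name__ == "__main__":
    a = "ACGTACGTAC"
    b = "ACGTTCGTAC"  # one substitution relative to a
    print(passes_filter(a, b, q=3, k=1))
    print(passes_filter(a, "TTTTTTTTTT", q=3, k=1))
```

The filter is lossless in the sense that it never rejects a true match, which is the property that lets such methods avoid heuristics. Candidates that pass still have to be verified by an alignment step.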

Year: 2006    PMID: 16597241    DOI: 10.1089/cmb.2006.13.296

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479


  30 in total

1.  A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads.

Authors:  Tobias Rausch; Sergey Koren; Gennady Denisov; David Weese; Anne-Katrin Emde; Andreas Döring; Knut Reinert
Journal:  Bioinformatics       Date:  2009-03-05       Impact factor: 6.937

2.  Sense from sequence reads: methods for alignment and assembly. (Review)

Authors:  Paul Flicek; Ewan Birney
Journal:  Nat Methods       Date:  2009-11       Impact factor: 28.547

3.  Assembling large genomes with single-molecule sequencing and locality-sensitive hashing.

Authors:  Konstantin Berlin; Sergey Koren; Chen-Shan Chin; James P Drake; Jane M Landolin; Adam M Phillippy
Journal:  Nat Biotechnol       Date:  2015-05-25       Impact factor: 54.908

4.  RazerS--fast read mapping with sensitivity control.

Authors:  David Weese; Anne-Katrin Emde; Tobias Rausch; Andreas Döring; Knut Reinert
Journal:  Genome Res       Date:  2009-07-10       Impact factor: 9.043

5.  A coverage criterion for spaced seeds and its applications to support vector machine string kernels and k-mer distances.

Authors:  Laurent Noé; Donald E K Martin
Journal:  J Comput Biol       Date:  2014-12       Impact factor: 1.479

6.  Short Read Mapping: An Algorithmic Tour.

Authors:  Stefan Canzar; Steven L Salzberg
Journal:  Proc IEEE Inst Electr Electron Eng       Date:  2015-09-07       Impact factor: 10.961

7.  Fast and SNP-tolerant detection of complex variants and splicing in short reads.

Authors:  Thomas D Wu; Serban Nacu
Journal:  Bioinformatics       Date:  2010-02-10       Impact factor: 6.937

8.  Lossless filter for multiple repeats with bounded edit distance.

Authors:  Pierre Peterlongo; Gustavo Akio Tominaga Sacomoto; Alair Pereira do Lago; Nadia Pisanti; Marie-France Sagot
Journal:  Algorithms Mol Biol       Date:  2009-01-30       Impact factor: 1.405

9.  r2cat: synteny plots and comparative assembly.

Authors:  Peter Husemann; Jens Stoye
Journal:  Bioinformatics       Date:  2009-12-16       Impact factor: 6.937

10.  Improving de novo sequence assembly using machine learning and comparative genomics for overlap correction.

Authors:  Lance E Palmer; Mathaeus Dejori; Randall Bolanos; Daniel Fasulo
Journal:  BMC Bioinformatics       Date:  2010-01-15       Impact factor: 3.169
