
A fast and symmetric DUST implementation to mask low-complexity DNA sequences.

Aleksandr Morgulis, E Michael Gertz, Alejandro A Schäffer, Richa Agarwala.

Abstract

The DUST module has been used within BLAST for many years to mask low-complexity sequences. In this paper, we present a new implementation of the DUST module that uses the same function to assign a complexity score to a sequence, but uses a different rule by which high-scoring sequences are masked. The new rule masks every nucleotide masked by the old rule and occasionally masks more. The new masking rule corrects two related deficiencies with the old rule. First, the new rule is symmetric with respect to reversing the sequence. Second, the new rule is not context sensitive; the decision to mask a subsequence does not depend on what sequences flank it. The new implementation is at least four times faster than the old on the human genome. We show that both the percentage of additional bases masked and the effect on MegaBLAST outputs are very small.
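The abstract does not spell out the scoring function or the masking rule, so the following is only a minimal Python sketch under stated assumptions. It assumes the triplet-count score commonly described for DUST (sum of c(c-1)/2 over the counts c of each distinct triplet, divided by the number of triplets minus one) and a brute-force rule that masks every interval within a window whose score exceeds a threshold and is not beaten by any of its own sub-intervals. The names `dust_score` and `mask_low_complexity`, and the defaults `threshold=2.0` and `window=64`, are illustrative assumptions and are not taken from the abstract or from the BLAST source code.

```python
from collections import Counter


def dust_score(seq: str) -> float:
    """Assumed DUST-style score: repeated-triplet pairs / (number of triplets - 1)."""
    n_triplets = len(seq) - 2
    if n_triplets < 2:
        return 0.0
    counts = Counter(seq[i:i + 3] for i in range(n_triplets))
    repeated_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return repeated_pairs / (n_triplets - 1)


def mask_low_complexity(seq: str, threshold: float = 2.0, window: int = 64) -> str:
    """Brute-force sketch of a symmetric, context-insensitive masking rule.

    An interval is masked (lowercased) if it is at most `window` long, its
    score exceeds `threshold`, and no sub-interval of it scores higher.
    The decision uses only the interval itself, never the flanking sequence.
    """
    n = len(seq)
    # Score every interval [i, j) of length 3..window once.
    score = {}
    for i in range(n):
        for j in range(i + 3, min(n, i + window) + 1):
            score[(i, j)] = dust_score(seq[i:j])

    masked = [False] * n
    for (i, j), s in score.items():
        if s <= threshold:
            continue
        # Keep the interval only if none of its sub-intervals scores higher.
        if all(score[(a, b)] <= s
               for a in range(i, j)
               for b in range(a + 3, j + 1)):
            for k in range(i, j):
                masked[k] = True
    return "".join(ch.lower() if m else ch for ch, m in zip(seq, masked))


if __name__ == "__main__":
    example = "ACGTTGCC" + "A" * 20 + "GATTACAGT"
    forward = mask_low_complexity(example)
    backward = mask_low_complexity(example[::-1])
    print(forward)
    # Symmetry check: masking the reversed sequence gives the reversed mask.
    print(backward[::-1] == forward)
```

Because the assumed score depends only on triplet counts, and each masking decision depends only on the interval being examined, reversing the input reverses the mask. This is the kind of symmetry the abstract describes. The sketch is far too slow for real genomes and says nothing about how the published implementation achieves its speed.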

Year: 2006    PMID: 16796549    DOI: 10.1089/cmb.2006.13.1028

Source DB: PubMed    Journal: J Comput Biol    ISSN: 1066-5277    Impact factor: 1.479

