
Mastering seeds for genomic size nucleotide BLAST searches.

Valer Gotea, Vamsi Veeramachaneni, Wojciech Makałowski.

Abstract

One of the most common activities in bioinformatics is the search for similar sequences. These searches are usually carried out with programs from the NCBI BLAST family. Because most searches are routinely performed with default parameters, a question that should be addressed is how reliable the results obtained with those defaults are, i.e. what fraction of potential matches these searches retrieve. Our primary focus is on the initial hit parameter, also known as the seed or word, used by NCBI BLASTn, MegaBLAST and similar programs in searches for similar nucleotide sequences. We show that using the default values for the initial hit parameter can substantially reduce the proportion of potentially similar sequences that are retrieved. We also show how the hit probability of different seeds varies with the minimum length and similarity of the sequences to be retrieved, and describe methods that help in determining appropriate seeds. The experimental results described in this paper illustrate the situations in which these methods are most applicable and also show the relationship between the various BLAST parameters.
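
The dependence of seed hit probability on alignment length and similarity can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes a contiguous seed and a simplified model in which every aligned position matches independently with probability equal to the sequence identity. The function name `seed_hit_probability` and the example lengths and identities are hypothetical; the word sizes 11 and 28 are the commonly documented defaults for BLASTn and MegaBLAST and are used here only for illustration.

```python
def seed_hit_probability(length, identity, word_size):
    """Probability that an ungapped alignment of `length` positions contains
    at least one run of `word_size` consecutive matches.

    Assumption (not from the paper): each position matches independently
    with probability `identity`.
    """
    if word_size <= 0:
        raise ValueError("word_size must be positive")
    if not 0.0 <= identity <= 1.0:
        raise ValueError("identity must be between 0 and 1")
    if length < word_size:
        return 0.0

    # state[k] = probability that no seed hit has occurred so far and the
    # alignment currently ends in a run of exactly k matches (k < word_size).
    state = [0.0] * word_size
    state[0] = 1.0
    hit = 0.0  # probability mass that has already produced a seed hit

    for _ in range(length):
        new_state = [0.0] * word_size
        for run, prob in enumerate(state):
            if prob == 0.0:
                continue
            # A mismatch resets the current run of matches.
            new_state[0] += prob * (1.0 - identity)
            # A match extends the run; reaching word_size is a seed hit.
            if run + 1 == word_size:
                hit += prob * identity
            else:
                new_state[run + 1] += prob * identity
        state = new_state
    return hit


if __name__ == "__main__":
    # Hypothetical scenarios: (alignment length, fractional identity).
    for length, identity in [(100, 0.80), (100, 0.90), (500, 0.80)]:
        for word_size in (11, 28):
            p = seed_hit_probability(length, identity, word_size)
            print(f"L={length} id={identity:.2f} W={word_size}: {p:.4f}")
```

Under this simplified model a longer seed never has a higher hit probability than a shorter one for the same length and identity, which is the kind of trade-off between seed choice and sensitivity that the abstract describes.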

Year:  2003        PMID: 14627826      PMCID: PMC290255          DOI: 10.1093/nar/gkg886

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


  16 in total

1.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

2.  Initial sequencing and comparative analysis of the mouse genome.

Authors:  Robert H Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R Brent; Daniel G Brown; Stephen D Brown; Carol Bult; John Burton; Jonathan Butler; Robert D Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T Chinwalla; Deanna M Church; Michele Clamp; Christopher Clee; Francis S Collins; Lisa L Cook; Richard R Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D Delehaunty; Justin Deri; Emmanouil T Dermitzakis; Colin Dewey; Nicholas J Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M Dunn; Sean R Eddy; Laura Elnitski; Richard D Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A Fewell; Paul Flicek; Karen Foley; Wayne N Frankel; Lucinda A Fulton; Robert S Fulton; Terrence S Furey; Diane Gage; Richard A Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A Graves; Eric D Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B Jaffe; L Steven Johnson; Matthew Jones; Thomas A Jones; Ann Joy; Michael Kamal; Elinor K Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W James Kent; Andrew Kirby; Diana L Kolbe; Ian Korf; Raju S Kucherlapati; Edward J Kulbokas; David Kulp; Tom Landers; J P Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R Maglott; Elaine R Mardis; Lucy Matthews; Evan Mauceli; John H Mayer; Megan McCarthy; W Richard McCombie; Stuart McLaren; 
Kirsten McLay; John D McPherson; Jim Meldrim; Beverley Meredith; Jill P Mesirov; Webb Miller; Tracie L Miner; Emmanuel Mongin; Kate T Montgomery; Michael Morgan; Richard Mott; James C Mullikin; Donna M Muzny; William E Nash; Joanne O Nelson; Michael N Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S Pohl; Alex Poliakov; Tracy C Ponce; Chris P Ponting; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A Roe; Krishna M Roskin; Edward M Rubin; Alistair G Rust; Ralph Santos; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Matthias S Schwartz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B Singer; Guy Slater; Arian Smit; Douglas R Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P Vinson; Andrew C Von Niederhausern; Claire M Wade; Melanie Wall; Ryan J Weber; Robert B Weiss; Michael C Wendl; Anthony P West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K Wilson; Eitan Winter; Kim C Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M Zdobnov; Michael C Zody; Eric S Lander
Journal:  Nature       Date:  2002-12-05       Impact factor: 49.962

3.  Serial BLAST searching.

Authors:  Ian Korf
Journal:  Bioinformatics       Date:  2003-08-12       Impact factor: 6.937

4.  Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes.

Authors:  Samuel Aparicio; Jarrod Chapman; Elia Stupka; Nik Putnam; Jer-Ming Chia; Paramvir Dehal; Alan Christoffels; Sam Rash; Shawn Hoon; Arian Smit; Maarten D Sollewijn Gelpke; Jared Roach; Tania Oh; Isaac Y Ho; Marie Wong; Chris Detter; Frans Verhoef; Paul Predki; Alice Tay; Susan Lucas; Paul Richardson; Sarah F Smith; Melody S Clark; Yvonne J K Edwards; Norman Doggett; Andrey Zharkikh; Sean V Tavtigian; Dmitry Pruss; Mary Barnstead; Cheryl Evans; Holly Baden; Justin Powell; Gustavo Glusman; Lee Rowen; Leroy Hood; Y H Tan; Greg Elgar; Trevor Hawkins; Byrappa Venkatesh; Daniel Rokhsar; Sydney Brenner
Journal:  Science       Date:  2002-07-25       Impact factor: 47.728

5.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

6.  Compact encoding strategies for DNA sequence similarity search.

Authors:  D J States; P Agarwal
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1996

Review 7.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

8.  Rapid and sensitive protein similarity searches.

Authors:  D J Lipman; W R Pearson
Journal:  Science       Date:  1985-03-22       Impact factor: 47.728

Review 9.  The human genome structure and organization.

Authors:  W Makałowski
Journal:  Acta Biochim Pol       Date:  2001       Impact factor: 2.149

10.  Human-mouse alignments with BLASTZ.

Authors:  Scott Schwartz; W James Kent; Arian Smit; Zheng Zhang; Robert Baertsch; Ross C Hardison; David Haussler; Webb Miller
Journal:  Genome Res       Date:  2003-01       Impact factor: 9.043

  10 in total

1.  Evolutionary dynamics of recently duplicated genes: Selective constraints on diverging paralogs in the Drosophila pseudoobscura genome.

Authors:  Richard P Meisel
Journal:  J Mol Evol       Date:  2009-06-18       Impact factor: 2.395

2.  Computational challenges in the analysis of ancient DNA.

Authors:  Kay Prüfer; Udo Stenzel; Michael Hofreiter; Svante Pääbo; Janet Kelso; Richard E Green
Journal:  Genome Biol       Date:  2010-05-06       Impact factor: 13.583

3.  An iterative workflow for mining the human intestinal metaproteome.

Authors:  Koos Rooijers; Carolin Kolmeder; Catherine Juste; Joël Doré; Mark de Been; Sjef Boeren; Pilar Galan; Christian Beauvallet; Willem M de Vos; Peter J Schaap
Journal:  BMC Genomics       Date:  2011-01-05       Impact factor: 3.969

4.  Using paired-end sequences to optimise parameters for alignment of sequence reads against related genomes.

Authors:  Abhirami Ratnakumar; Sean McWilliam; Wesley Barris; Brian P Dalrymple
Journal:  BMC Genomics       Date:  2010-08-03       Impact factor: 3.969

5.  CAFTAN: a tool for fast mapping, and quality assessment of cDNAs.

Authors:  Coral del Val; Vladimir Yurjevich Kuryshev; Karl-Heinz Glatting; Peter Ernst; Agnes Hotz-Wagenblatt; Annemarie Poustka; Sandor Suhai; Stefan Wiemann
Journal:  BMC Bioinformatics       Date:  2006-10-25       Impact factor: 3.169

6.  Complete Genome Sequence of Bacillus megaterium Bacteriophage Eldridge.

Authors:  Alexandra M Reveille; Kimberly A Eldridge; Louise M Temple
Journal:  Genome Announc       Date:  2016-04-21

7.  Complete Genome Sequence of Bacillus Phage Belinda from Grand Cayman Island.

Authors:  Eileen F Breslin; Jessica Cornell; Zachary Schuhmacher; Madison Himelright; Aya Andos; Ariel Childs; Ana Clem; Monica Gerber; Arissa Gordillo; Laith Harb; Reafa Hossain; Taylor Hutchinson; Isaac Miller; Edmund Morton; Ryan Walters; Destin Webb; Louise Temple
Journal:  Genome Announc       Date:  2016-10-13

8.  muBLASTP: database-indexed protein sequence search on multicore CPUs.

Authors:  Jing Zhang; Sanchit Misra; Hao Wang; Wu-Chun Feng
Journal:  BMC Bioinformatics       Date:  2016-11-04       Impact factor: 3.169

9.  CRISPRTarget: bioinformatic prediction and analysis of crRNA targets.

Authors:  Ambarish Biswas; Joshua N Gagnon; Stan J J Brouns; Peter C Fineran; Chris M Brown
Journal:  RNA Biol       Date:  2013-03-14       Impact factor: 4.652

10.  Dissection of the octoploid strawberry genome by deep sequencing of the genomes of Fragaria species.

Authors:  Hideki Hirakawa; Kenta Shirasawa; Shunichi Kosugi; Kosuke Tashiro; Shinobu Nakayama; Manabu Yamada; Mistuyo Kohara; Akiko Watanabe; Yoshie Kishida; Tsunakazu Fujishiro; Hisano Tsuruoka; Chiharu Minami; Shigemi Sasamoto; Midori Kato; Keiko Nanri; Akiko Komaki; Tomohiro Yanagi; Qin Guoxin; Fumi Maeda; Masami Ishikawa; Satoru Kuhara; Shusei Sato; Satoshi Tabata; Sachiko N Isobe
Journal:  DNA Res       Date:  2013-11-26       Impact factor: 4.458

