
Improved BLAST searches using longer words for protein seeding.

Sergey A Shiryev, Jason S Papadopoulos, Alejandro A Schäffer, Richa Agarwala.

Abstract

MOTIVATION: The blastp and tblastn modules of BLAST are widely used methods for searching protein queries against protein and nucleotide databases, respectively. One heuristic used in BLAST is to consider only database sequences that contain a high-scoring match of length at most 5 to the query.

RESULTS: We implemented the capability to use words of length 6 or 7. We demonstrate an improved trade-off between running time and retrieval accuracy, controlled by the score threshold used for short word matches. For example, the running time can be reduced by 20-30% while achieving ROC (receiver operator characteristic) scores similar to those obtained with current default parameters.

AVAILABILITY: The option to use long words is in the NCBI C and C++ toolkit code for BLAST, starting with version 2.2.16 of blastall. A Linux executable used to produce the results herein is available at: ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/protein_longwords
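The seeding heuristic described in the abstract (keep a database sequence only if it shares a word with the query that scores at or above a threshold) can be illustrated with a minimal sketch. This is not the NCBI implementation: it compares words by brute force rather than through a lookup table, and `toy_score`, `word_score`, and `seed_hits` are hypothetical names. The +4/-1 scoring is an assumed stand-in for a real substitution matrix such as BLOSUM62.

```python
from typing import Callable, List, Tuple


def toy_score(a: str, b: str) -> int:
    """Toy substitution score (assumption): +4 for identity, -1 otherwise."""
    return 4 if a == b else -1


def word_score(q_word: str, s_word: str,
               score: Callable[[str, str], int] = toy_score) -> int:
    """Sum of per-residue scores for two words of equal length."""
    return sum(score(a, b) for a, b in zip(q_word, s_word))


def seed_hits(query: str, subject: str, w: int, threshold: int,
              score: Callable[[str, str], int] = toy_score
              ) -> List[Tuple[int, int]]:
    """Return (query_offset, subject_offset) pairs whose length-w words
    score at least `threshold`. Brute force, for illustration only."""
    hits = []
    for i in range(len(query) - w + 1):
        for j in range(len(subject) - w + 1):
            if word_score(query[i:i + w], subject[j:j + w], score) >= threshold:
                hits.append((i, j))
    return hits


query = "MKVLAAGIVG"
subject = "MKVLSAGIVG"  # differs from the query at one position

# Short words (w=3) that must match exactly under the toy scoring.
short_hits = seed_hits(query, subject, w=3, threshold=12)

# Long words (w=6): demanding six identities finds no seed here,
# while a lower threshold that tolerates one mismatch does.
strict_long_hits = seed_hits(query, subject, w=6, threshold=24)
relaxed_long_hits = seed_hits(query, subject, w=6, threshold=19)
```

In this toy setting the word length and the score threshold together decide how many seeds survive, which is the knob the abstract describes for trading running time against retrieval accuracy.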

Year:  2007        PMID: 17921491     DOI: 10.1093/bioinformatics/btm479

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  30 in total

1.  HIV Tat controls RNA Polymerase II and the epigenetic landscape to transcriptionally reprogram target immune cells.

Authors:  Jonathan E Reeder; Youn-Tae Kwak; Ryan P McNamara; Christian V Forst; Iván D'Orso
Journal:  Elife       Date:  2015-10-21       Impact factor: 8.140

2.  SwiftOrtho: A fast, memory-efficient, multiple genome orthology classifier.

Authors:  Xiao Hu; Iddo Friedberg
Journal:  Gigascience       Date:  2019-10-01       Impact factor: 6.524

3.  Evolution of the Cytokinin Dehydrogenase (CKX) Domain.

Authors:  Siarhei A Dabravolski; Stanislav V Isayenkov
Journal:  J Mol Evol       Date:  2021-11-08       Impact factor: 2.395

4.  The evolution of stomatal traits along the trajectory toward C4 photosynthesis.

Authors:  Yong-Yao Zhao; Mingju Amy Lyu; FenFen Miao; Genyun Chen; Xin-Guang Zhu
Journal:  Plant Physiol       Date:  2022-08-29       Impact factor: 8.005

5.  SssP1, a Streptococcus suis Fimbria-Like Protein Transported by the SecY2/A2 System, Contributes to Bacterial Virulence.

Authors:  Yue Zhang; Pengpeng Lu; Zihao Pan; Yinchu Zhu; Jiale Ma; Xiaojun Zhong; Wenyang Dong; Chengping Lu; Huochun Yao
Journal:  Appl Environ Microbiol       Date:  2018-08-31       Impact factor: 4.792

6.  Intrinsic structural disorder confers cellular viability on oncogenic fusion proteins.

Authors:  Hedi Hegyi; László Buday; Peter Tompa
Journal:  PLoS Comput Biol       Date:  2009-10-30       Impact factor: 4.475

7.  Global invasion history of the agricultural pest butterfly Pieris rapae revealed with genomics and citizen science.

Authors:  Sean F Ryan; Eric Lombaert; Anne Espeset; Roger Vila; Gerard Talavera; Vlad Dincă; Meredith M Doellman; Mark A Renshaw; Matthew W Eng; Emily A Hornett; Yiyuan Li; Michael E Pfrender; DeWayne Shoemaker
Journal:  Proc Natl Acad Sci U S A       Date:  2019-09-10       Impact factor: 11.205

8.  Database indexing for production MegaBLAST searches.

Authors:  Aleksandr Morgulis; George Coulouris; Yan Raytselis; Thomas L Madden; Richa Agarwala; Alejandro A Schäffer
Journal:  Bioinformatics       Date:  2008-06-21       Impact factor: 6.937

9.  Integrating human omics data to prioritize candidate genes.

Authors:  Yong Chen; Xuebing Wu; Rui Jiang
Journal:  BMC Med Genomics       Date:  2013-12-18       Impact factor: 3.063

10.  Identifying potential cancer driver genes by genomic data integration.

Authors:  Yong Chen; Jingjing Hao; Wei Jiang; Tong He; Xuegong Zhang; Tao Jiang; Rui Jiang
Journal:  Sci Rep       Date:  2013-12-18       Impact factor: 4.379
