Sensitivity and selectivity in protein similarity searches: a comparison of Smith-Waterman in hardware to BLAST and FASTA.

E G Shpaer, M Robinson, D Yee, J D Candlin, R Mines, T Hunkapiller.

Abstract

To predict the functions of a possible protein product of any new or uncharacterized DNA sequence, it is important first to detect all significant similarities between the encoded amino acid sequence and any accumulated protein sequence data. We have implemented a set of queries and database sequences and proceeded to test and compare various similarity search methods and their parameterizations. We demonstrate here that the Smith-Waterman (S-W) dynamic programming method and the optimized version of FASTA are significantly better able to distinguish true similarities from statistical noise than is the popular database search tool BLAST. Also, a simple "log-length normalization" of S-W scores based on the query and target sequence lengths greatly increased the selectivity of the S-W searches, exceeding the default normalization method of FASTA. An implementation of the modified S-W algorithm in hardware (the Fast Data Finder) is able to match the accuracy of software versions while greatly speeding up its execution. We present here the selectivity and sensitivity data from these tests as well as results for various scoring matrices. We present data that will help users to choose threshold score values for evaluation of database search results. We also illustrate the impact of using simple-sequence masking tools such as SEG or XNU.
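The abstract names a "log-length normalization" of S-W scores but does not give its formula. As a rough illustration only, the sketch below computes a Smith-Waterman local alignment score and then divides it by the natural log of the product of the two sequence lengths. Both the normalization formula and the scoring parameters (`match`, `mismatch`, and a linear `gap` penalty instead of a substitution matrix) are assumptions made for this example, not the authors' method.

```python
import math


def smith_waterman_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score (linear gap penalty).

    The match/mismatch/gap values are illustrative assumptions; real
    protein searches would use a substitution matrix instead.
    """
    prev = [0] * (len(target) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            # Local alignment: a cell never drops below zero.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def log_length_normalized(score, query_len, target_len):
    """Hypothetical normalization: score / ln(query_len * target_len).

    This form is an assumption for illustration; the abstract does not
    state the formula used in the paper.
    """
    product = query_len * target_len
    if product <= 1:
        raise ValueError("sequence lengths must have a product greater than 1")
    return score / math.log(product)


raw = smith_waterman_score("ACDE", "ACDE")
print(raw, log_length_normalized(raw, 4, 4))
```

Under this assumed form, the same raw score counts for less against a longer target, which is the general intuition behind correcting scores for sequence length.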

Year:  1996        PMID: 8954800     DOI: 10.1006/geno.1996.0614

Source DB:  PubMed          Journal:  Genomics        ISSN: 0888-7543            Impact factor:   5.736


  14 in total

1.  Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs.

Authors:  Bastien Chevreux; Thomas Pfisterer; Bernd Drescher; Albert J Driesel; Werner E G Müller; Thomas Wetter; Sándor Suhai
Journal:  Genome Res       Date:  2004-05-12       Impact factor: 9.043

2.  LOESS correction for length variation in gene set-based genomic sequence analysis.

Authors:  Anton Aboukhalil; Martha L Bulyk
Journal:  Bioinformatics       Date:  2012-04-05       Impact factor: 6.937

3.  GNARE: automated system for high-throughput genome analysis with grid computational backend.

Authors:  Dinanath Sulakhe; Alex Rodriguez; Mark D'Souza; Michael Wilde; Veronika Nefedova; Ian Foster; Natalia Maltsev
Journal:  J Clin Monit Comput       Date:  2005-10       Impact factor: 2.502

4.  Characterization of a viral synergism in the monocot Brachypodium distachyon reveals distinctly altered host molecular processes associated with disease.

Authors:  Kranthi K Mandadi; Karen-Beth G Scholthof
Journal:  Plant Physiol       Date:  2012-09-06       Impact factor: 8.340

5.  In silico expressed sequence tag analysis in identification of probable diabetic genes as virtual therapeutic targets.

Authors:  Pabitra Mohan Behera; Deepak Kumar Behera; Aparajeya Panda; Anshuman Dixit; Payodhar Padhi
Journal:  Biomed Res Int       Date:  2013-02-11       Impact factor: 3.411

6.  A comparative genome-wide study of ncRNAs in trypanosomatids.

Authors:  Tirza Doniger; Rodolfo Katz; Chaim Wachtel; Shulamit Michaeli; Ron Unger
Journal:  BMC Genomics       Date:  2010-11-04       Impact factor: 3.969

7.  MatGAT: an application that generates similarity/identity matrices using protein or DNA sequences.

Authors:  James J Campanella; Ledion Bitincka; John Smalley
Journal:  BMC Bioinformatics       Date:  2003-07-10       Impact factor: 3.169

8.  More severe phenotype of early-onset osteoporosis associated with recessive form of LRP5 and combination with DKK1 or WNT3A.

Authors:  Caroline Caetano da Silva; Manon Ricquebourg; Philippe Orcel; Stéphanie Fabre; Thomas Funck-Brentano; Martine Cohen-Solal; Corinne Collet
Journal:  Mol Genet Genomic Med       Date:  2021-05-03       Impact factor: 2.183

9.  Using protein clusters from whole proteomes to construct and augment a dendrogram.

Authors:  Yunyun Zhou; Douglas R Call; Shira L Broschat
Journal:  Adv Bioinformatics       Date:  2013-02-20

10.  SyntTax: a web server linking synteny to prokaryotic taxonomy.

Authors:  Jacques Oberto
Journal:  BMC Bioinformatics       Date:  2013-01-16       Impact factor: 3.169
