Comparative accuracy of methods for protein sequence similarity search.

P Agarwal, D J States.

Abstract

MOTIVATION: Searching a protein sequence database for homologs is a powerful tool for discovering the structure and function of a sequence. Two new methods for searching sequence databases have recently been described: Probabilistic Smith-Waterman (PSW), which is based on Hidden Markov models for a single sequence using a standard scoring matrix, and a new version of BLAST (WU-BLAST2), which uses Sum statistics for gapped alignments.
RESULTS: This paper compares the effectiveness of these methods with that of three older methods: Smith-Waterman (as implemented in SSEARCH), FASTA and BLASTP. The analysis indicates that the new methods are useful and often offer improved accuracy. The tools are compared using a version of the annotated portion of PIR 39 curated by Bill Pearson. Three statistical criteria are used: equivalence number, minimum errors and the receiver operating characteristic. For complete-length protein query sequences from large families, PSW's accuracy is superior to that of the other methods, but its accuracy is poor with partial-length query sequences. False negatives are twice as common as false positives irrespective of the search method, even when a family-specific threshold score that minimizes the total number of errors (i.e. the most favorable threshold score possible) is used. Thus, sensitivity, not selectivity, is the major problem. Among the analyzed methods using default parameters, the best accuracy was obtained from SSEARCH and PSW for complete-length proteins, and from the two BLAST programs plus SSEARCH for partial-length proteins.
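
The three criteria named above can be illustrated with a minimal Python sketch. The abstract does not define them, so the definitions here are assumptions based on common usage: minimum errors is the smallest total of false positives plus false negatives over all score thresholds, equivalence number is the error count at the threshold where false positives and false negatives balance, and the receiver operating characteristic is summarized as the area under the curve. The function names and the `toy_hits` data are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Each hit is (similarity score, True if the sequence belongs to the query's family).
Hit = Tuple[float, bool]

# Hypothetical search result, invented purely for illustration.
toy_hits: List[Hit] = [
    (95, True), (90, True), (80, False), (75, True), (60, True),
    (55, False), (50, False), (40, True), (30, False),
]


def error_counts(hits: List[Hit], threshold: float) -> Tuple[int, int]:
    """Return (false positives, false negatives) when every hit with
    score >= threshold is called a family member."""
    fp = sum(1 for score, member in hits if score >= threshold and not member)
    fn = sum(1 for score, member in hits if score < threshold and member)
    return fp, fn


def candidate_thresholds(hits: List[Hit]) -> List[float]:
    """Every distinct score, plus infinity (nothing is called a member)."""
    return sorted({score for score, _ in hits}) + [float("inf")]


def minimum_errors(hits: List[Hit]) -> Tuple[float, int, int]:
    """Threshold minimizing FP + FN, returned with its FP and FN counts."""
    best = min(candidate_thresholds(hits),
               key=lambda t: sum(error_counts(hits, t)))
    fp, fn = error_counts(hits, best)
    return best, fp, fn


def equivalence_number(hits: List[Hit]) -> float:
    """Error count where FP and FN are most nearly equal (assumed definition).
    If they never balance exactly, the mean of the two is reported."""
    best = min(candidate_thresholds(hits),
               key=lambda t: abs(error_counts(hits, t)[0] - error_counts(hits, t)[1]))
    fp, fn = error_counts(hits, best)
    return (fp + fn) / 2


def roc_auc(hits: List[Hit]) -> float:
    """Area under the ROC curve: the fraction of (member, non-member) pairs
    in which the member scores higher; ties count as half."""
    members = [score for score, member in hits if member]
    others = [score for score, member in hits if not member]
    wins = sum(1.0 if m > o else 0.5 if m == o else 0.0
               for m in members for o in others)
    return wins / (len(members) * len(others))


if __name__ == "__main__":
    print(minimum_errors(toy_hits))
    print(equivalence_number(toy_hits))
    print(roc_auc(toy_hits))
```

On this toy data the error-minimizing threshold is 60, with one false positive and one false negative, so the equivalence number is 1.0 and the area under the curve is 0.75. The sketch only shows how the criteria are computed; it does not reproduce the 2:1 ratio of false negatives to false positives reported for the real searches.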

Year: 1998    PMID: 9520500    DOI: 10.1093/bioinformatics/14.1.40

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


