
Benchmarking PSI-BLAST in genome annotation.

A Müller, R M MacCallum, M J Sternberg.

Abstract

The recognition of remote protein homologies is a major aspect of the structural and functional annotation of newly determined genomes. Here we benchmark the coverage and error rate of genome annotation using the widely used homology-searching program PSI-BLAST (position-specific iterated basic local alignment search tool). This study evaluates the one-to-many success rate for recognition, as often there are several homologues in the database and only one needs to be identified for annotating the sequence. In contrast, previous benchmarks considered one-to-one recognition, in which a single query was required to find a particular target. The benchmark constructs a model genome from the full sequences of the structural classification of proteins (SCOP) database and searches against a target library of remote homologous domains (<20% identity). The structural benchmark provides a reliable list of correct and false homology assignments. PSI-BLAST successfully annotated 40% of the domains in the model genome that had at least one homologue in the target library. This coverage is more than three times that obtained when one-to-one recognition is evaluated (11% coverage of domains). Although a structural benchmark was used, the results apply equally to searches based on sequence homology alone. Accordingly, structural and sequence assignments were made to the sequences of Mycoplasma genitalium and Mycobacterium tuberculosis (see http://www.bmm.icnet.uk). The extent of missed assignments and of new superfamilies can be estimated for these genomes for both structural and functional annotations. Copyright 1999 Academic Press.
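The difference between one-to-many and one-to-one evaluation can be illustrated with a minimal sketch. It is not the authors' benchmark code: the function names, the toy domain identifiers and the pair-based definition of one-to-one coverage are assumptions made for illustration, and the numbers are invented rather than taken from the study.

```python
from typing import Dict, Set

# Hypothetical toy data (not from the paper).
# TRUE_HOMOLOGUES: for each query domain, its true remote homologues in the target library.
# HITS: for each query domain, the targets reported by the search.
TRUE_HOMOLOGUES: Dict[str, Set[str]] = {
    "d1": {"t1", "t2", "t3"},
    "d2": {"t4", "t5"},
    "d3": {"t6"},
    "d4": set(),  # no homologue in the library, so excluded from coverage
}
HITS: Dict[str, Set[str]] = {
    "d1": {"t2"},
    "d2": {"t9"},  # a false assignment
    "d3": {"t6"},
    "d4": {"t1"},  # a false assignment
}


def one_to_many_coverage(true: Dict[str, Set[str]], hits: Dict[str, Set[str]]) -> float:
    """Fraction of queries with at least one homologue for which any one homologue is found."""
    queries = [q for q, homs in true.items() if homs]
    if not queries:
        return 0.0
    found = sum(1 for q in queries if true[q] & hits.get(q, set()))
    return found / len(queries)


def one_to_one_coverage(true: Dict[str, Set[str]], hits: Dict[str, Set[str]]) -> float:
    """Fraction of individual (query, homologue) pairs that are found (assumed definition)."""
    total_pairs = sum(len(homs) for homs in true.values())
    if total_pairs == 0:
        return 0.0
    found_pairs = sum(len(homs & hits.get(q, set())) for q, homs in true.items())
    return found_pairs / total_pairs


if __name__ == "__main__":
    print(f"one-to-many: {one_to_many_coverage(TRUE_HOMOLOGUES, HITS):.2f}")  # 0.67
    print(f"one-to-one:  {one_to_one_coverage(TRUE_HOMOLOGUES, HITS):.2f}")   # 0.33
```

In this toy case two of the three annotatable queries find at least one homologue, while only two of the six query-target pairs are recovered, which is why one-to-many coverage is the higher figure.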

Year: 1999   PMID: 10547299   DOI: 10.1006/jmbi.1999.3233

Source DB: PubMed   Journal: J Mol Biol   ISSN: 0022-2836   Impact factor: 5.469


