
A computational strategy for protein function assignment which addresses the multidomain problem.

A J Pérez, A Rodríguez, O Trelles, G Thode.

Abstract

A method is presented for assigning functions to unknown sequences by finding correlations between short signals and functional annotations in a protein database. The approach relies on the keyword (KW) and feature (FT) information stored in the SWISS-PROT database: the former describes particular protein characteristics, and the latter locates these characteristics at specific sequence positions. A keyword is therefore assigned to a sequence only if sequence similarity is found at the position described by the FT field. Exhaustive tests on sequences with homologues (cluster set) and without homologues (singleton set) in the database show that function assignment is much 'cleaner' when domain information (FT field) is used than when only the keywords are used.
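
The position-aware rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Feature` record, the pairing of each FT region with a single keyword, the function names, and the `min_coverage` threshold are all assumptions made for illustration. The sketch only shows why requiring the similarity region to overlap the FT position avoids transferring keywords that belong to a different domain of a multidomain protein.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical record pairing a keyword with the FT region that supports it."""
    keyword: str  # KW term assumed to be associated with this feature
    start: int    # first residue of the FT region (1-based, inclusive)
    end: int      # last residue of the FT region (1-based, inclusive)


def overlap_length(a_start, a_end, b_start, b_end):
    """Number of residues shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def keywords_with_positions(hit_start, hit_end, features, min_coverage=0.5):
    """Transfer a keyword only if the similarity region covers enough of its FT region.

    hit_start/hit_end: region of the annotated database protein that aligns
    with the query. min_coverage is an assumed threshold, not a published value.
    """
    assigned = []
    for ft in features:
        ft_len = ft.end - ft.start + 1
        covered = overlap_length(hit_start, hit_end, ft.start, ft.end)
        if covered / ft_len >= min_coverage and ft.keyword not in assigned:
            assigned.append(ft.keyword)
    return assigned


def keywords_without_positions(features):
    """Baseline: transfer every keyword of the hit, ignoring where the similarity lies."""
    assigned = []
    for ft in features:
        if ft.keyword not in assigned:
            assigned.append(ft.keyword)
    return assigned


# Hypothetical two-domain database protein; the query matches only residues 30-120.
features = [Feature("ATP-binding", 40, 47), Feature("Zinc-finger", 210, 235)]
position_aware = keywords_with_positions(30, 120, features)
keyword_only = keywords_without_positions(features)
print(position_aware)  # only the keyword whose FT region lies inside the match
print(keyword_only)    # both keywords, including the one from the unmatched domain
```

In this toy case the keyword-only baseline also transfers the annotation of the second, unmatched domain, which is the kind of over-assignment the FT-based filter is meant to suppress.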

Year:  2002        PMID: 18629055      PMCID: PMC2447339          DOI: 10.1002/cfg.208

Source DB:  PubMed          Journal:  Comp Funct Genomics        ISSN: 1531-6912


