BACKGROUND: The protein sequence database has grown exponentially owing to advances in genome sequencing. However, experimentally characterized proteins constitute only a small portion of the database, so the majority of sequences have been annotated by computational approaches. Current automatic annotation pipelines inevitably introduce errors, making these annotations unreliable. Instead of such error-prone automatic annotations, functional interpretation should rely on the annotations of 'reference proteins' that have been experimentally characterized or manually curated.
RESULTS: The Seq2Ref server uses BLAST to detect proteins homologous to a query sequence and identifies the reference proteins among them. Seq2Ref then reports publications that experimentally characterize the identified reference proteins and might be relevant to the query. Furthermore, a plurality-based rating system has been developed to evaluate the homologous relationships and rank the reference proteins by their relevance to the query.
CONCLUSIONS: The reference proteins detected by our server will lend insight into proteins of unknown function and provide extensive information for developing an in-depth understanding of uncharacterized proteins. Seq2Ref is available at: http://prodata.swmed.edu/seq2ref.
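The first two steps described under RESULTS (detect homologs with BLAST, then keep only the reference proteins among them) can be illustrated with a minimal sketch. It assumes BLAST has already been run with the standard 12-column tabular output and that a set of reference protein identifiers is available. The names `Hit`, `parse_blast_tabular`, `select_reference_hits`, and the `max_evalue` threshold are hypothetical, and the ordering by E-value is a simple stand-in: the plurality-based rating is not specified in the abstract and is not reproduced here.

```python
from typing import Iterable, List, NamedTuple, Set


class Hit(NamedTuple):
    """One BLAST hit, reduced to the fields this sketch needs."""
    subject_id: str
    percent_identity: float
    evalue: float
    bit_score: float


def parse_blast_tabular(text: str) -> List[Hit]:
    """Parse standard 12-column BLAST tabular output.

    Column order: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore.
    """
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and comment lines
            continue
        fields = line.split("\t")
        hits.append(
            Hit(
                subject_id=fields[1],
                percent_identity=float(fields[2]),
                evalue=float(fields[10]),
                bit_score=float(fields[11]),
            )
        )
    return hits


def select_reference_hits(
    hits: Iterable[Hit],
    reference_ids: Set[str],
    max_evalue: float = 1e-3,  # assumed cutoff, not taken from Seq2Ref
) -> List[Hit]:
    """Keep hits that are known reference proteins and pass the cutoff.

    Hits are ordered by E-value (then by descending bit score). This is
    an illustrative ordering only, not Seq2Ref's plurality-based rating.
    """
    selected = [
        h for h in hits
        if h.subject_id in reference_ids and h.evalue <= max_evalue
    ]
    return sorted(selected, key=lambda h: (h.evalue, -h.bit_score))
```

In this sketch, `reference_ids` would hold the identifiers of experimentally characterized or manually curated proteins; how Seq2Ref itself assembles that set is not stated in the abstract.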