
Percolation of annotation errors through hierarchically structured protein sequence databases.

Walter R Gilks, Benjamin Audit, Daniela de Angelis, Sophia Tsoka, Christos A Ouzounis.

Abstract

Databases of protein sequences have grown rapidly in recent years as a result of genome sequencing projects. Annotating protein sequences with descriptions of their biological function ideally requires careful experimentation, but this work lags far behind. Instead, biological function is often imputed by copying annotations from similar protein sequences. This gives rise to annotation errors and, more seriously, to chains of misannotation. An earlier paper, "Percolation of annotation errors in a database of protein sequences" (2002), developed a probabilistic framework for exploring the consequences of this percolation of errors through protein databases, and applied the theory to a simple database model. Here we apply the theory to hierarchically structured protein sequence databases, and draw conclusions about database quality at different levels of the hierarchy.
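
As a rough illustration of how copied annotations can turn into chains of misannotation, the toy simulation below grows a database by copying annotations from existing entries. It is only a sketch under stated assumptions and is not the probabilistic framework of the paper: the function name `simulate_percolation`, the uniform random choice of source entry, the flat (non-hierarchical) database, and the single `transfer_error` probability are all hypothetical choices made for this example.

```python
import random


def simulate_percolation(n_seed, n_new, transfer_error, rng):
    """Return the fraction of erroneous annotations after the database grows.

    Assumptions (illustrative only): seed entries are correct, each new entry
    copies from one uniformly chosen existing entry, and each copy goes wrong
    independently with probability `transfer_error`.
    """
    # True means the annotation is correct.
    correct = [True] * n_seed
    for _ in range(n_new):
        source_ok = rng.choice(correct)            # annotation is copied from an existing entry
        copy_ok = rng.random() >= transfer_error   # the transfer itself may introduce an error
        # A new annotation is correct only if its source was correct
        # and the copy step did not introduce a fresh error.
        correct.append(source_ok and copy_ok)
    return 1 - sum(correct) / len(correct)


if __name__ == "__main__":
    rng = random.Random(0)
    for eps in (0.0, 0.05, 0.2):
        print(eps, simulate_percolation(10, 5000, eps, rng))
```

In this sketch an error never heals: once an entry is wrong, every entry copied from it is wrong too, which is the chaining effect the abstract describes. A hierarchical database would need a richer source-selection rule than the uniform choice assumed here.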


Year:  2005        PMID: 15748731     DOI: 10.1016/j.mbs.2004.08.001

Source DB:  PubMed          Journal:  Math Biosci        ISSN: 0025-5564            Impact factor:   2.144


