Manual curation is not sufficient for annotation of genomic databases.

William A Baumgartner, K Bretonnel Cohen, Lynne M Fox, George Acquaah-Mensah, Lawrence Hunter.

Abstract

MOTIVATION: Knowledge base construction has been an area of intense activity and great importance in the growth of computational biology. However, there is little or no history of work on the subject of evaluation of knowledge bases, either with respect to their contents or with respect to the processes by which they are constructed. This article proposes the application of a metric from software engineering known as the found/fixed graph to the problem of evaluating the processes by which genomic knowledge bases are built, as well as the completeness of their contents.
RESULTS: Well-understood patterns of change in the found/fixed graph are found to occur in two large publicly available knowledge bases. These patterns suggest that the current manual curation processes will take far too long to complete the annotations of even just the most important model organisms, and that at their current rate of production, they will never be sufficient for completing the annotation of all currently available proteomes.

Year: 2007; PMID: 17646325; PMCID: PMC2516305; DOI: 10.1093/bioinformatics/btm229

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937


