Empirical data on corpus design and usage in biomedical natural language processing.

K Bretonnel Cohen, Lynne Fox, Philip V Ogren, Lawrence Hunter.

Abstract

This paper describes the design of six publicly available biomedical corpora. We then present usage data for the six corpora. We show that corpora that are carefully annotated with respect to structural and linguistic characteristics and that are distributed in standard formats are more widely used than corpora that are not. These findings have implications for the design of the next generation of biomedical corpora.

Year: 2005 PMID: 16779021 PMCID: PMC1560643

Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076

4 in total

1. Constructing biological knowledge bases by extracting information from text sources.

Authors: M Craven; J Kumlien
Journal: Proc Int Conf Intell Syst Mol Biol Date: 1999

2. Automatic extraction of biological information from scientific text: protein-protein interactions.

Authors: C Blaschke; M A Andrade; C Ouzounis; A Valencia
Journal: Proc Int Conf Intell Syst Mol Biol Date: 1999

3. Protein names and how to find them.

Authors: Kristofer Franzén; Gunnar Eriksson; Fredrik Olsson; Lars Asker; Per Lidén; Joakim Cöster
Journal: Int J Med Inform Date: 2002-12-04 Impact factor: 4.046

4. GENETAG: a tagged corpus for gene/protein named entity recognition.

Authors: Lorraine Tanabe; Natalie Xie; Lynne H Thom; Wayne Matten; W John Wilbur
Journal: BMC Bioinformatics Date: 2005-05-24 Impact factor: 3.169

5 in total

1. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications.

Authors: Guergana K Savova; James J Masanz; Philip V Ogren; Jiaping Zheng; Sunghwan Sohn; Karin C Kipper-Schuler; Christopher G Chute
Journal: J Am Med Inform Assoc Date: 2010 Sep-Oct Impact factor: 4.497

2. Detection of gene interactions based on syntactic relations.

Authors: Mi-Young Kim
Journal: J Biomed Biotechnol Date: 2008

3. Chapter 16: text mining for translational bioinformatics.

Authors: K Bretonnel Cohen; Lawrence E Hunter
Journal: PLoS Comput Biol Date: 2013-04-25 Impact factor: 4.475

4. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools.

Authors: Karin Verspoor; Kevin Bretonnel Cohen; Arrick Lanfranchi; Colin Warner; Helen L Johnson; Christophe Roeder; Jinho D Choi; Christopher Funk; Yuriy Malenkiy; Miriam Eckert; Nianwen Xue; William A Baumgartner; Michael Bada; Martha Palmer; Lawrence E Hunter
Journal: BMC Bioinformatics Date: 2012-08-17 Impact factor: 3.169

5. Corpus refactoring: a feasibility study.

Authors: Helen L Johnson; William A Baumgartner; Martin Krallinger; K Bretonnel Cohen; Lawrence Hunter
Journal: J Biomed Discov Collab Date: 2007-09-13
