Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup.

Alexander S Yeh, Lynette Hirschman, Alexander A Morgan.

Abstract

MOTIVATION: The biological literature is a major repository of knowledge. Many biological databases draw much of their content from a careful curation of this literature. However, as the volume of literature increases, the burden of curation increases. Text mining may provide useful tools to assist in the curation process. To date, the lack of standards has made it impossible to determine whether text mining techniques are sufficiently mature to be useful.
RESULTS: We report on a Challenge Evaluation task that we created for the Knowledge Discovery and Data Mining (KDD) Challenge Cup. We provided a training corpus of 862 journal articles curated in FlyBase, along with the associated lists of genes and gene products and the relevant data fields from FlyBase. For the test, we provided a corpus of 213 new ('blind') articles; the 18 participating groups provided systems that flagged articles for curation based on whether the article contained experimental evidence for gene expression products. We report on the evaluation results and describe the techniques used by the top-performing groups.
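
The abstract does not say how the flagged articles were scored. As a minimal sketch, assuming a yes/no curation decision per article is compared against curator judgments using precision, recall and F1 (a common choice for this kind of task, not confirmed by the text above), scoring could look like this. The function name `score_flags` and the article IDs are hypothetical and exist only for illustration.

```python
from typing import Dict, Set


def score_flags(gold: Set[str], flagged: Set[str]) -> Dict[str, float]:
    """Precision, recall and F1 for a set of articles flagged for curation.

    gold    -- IDs of articles that curators say should be curated (assumed input)
    flagged -- IDs of articles that a system flagged for curation
    """
    true_pos = len(gold & flagged)
    # Guard against empty sets so the function never divides by zero.
    precision = true_pos / len(flagged) if flagged else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical article IDs, invented for illustration only.
gold = {"a1", "a2", "a3", "a4"}
flagged = {"a2", "a3", "a5"}
scores = score_flags(gold, flagged)
print(scores)
```

In this toy case two of the three flagged articles are correct and two of the four gold articles are found, so precision and recall differ; F1 balances the two into a single number for ranking systems.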

Year: 2003 PMID: 12855478 DOI: 10.1093/bioinformatics/btg1046

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited by

42 in total

1. A statistical approach to scanning the biomedical literature for pharmacogenetics knowledge.

Authors: Daniel L Rubin; Caroline F Thorn; Teri E Klein; Russ B Altman
Journal: J Am Med Inform Assoc Date: 2004-11-23 Impact factor: 4.497

2. Biomedical language processing: what's beyond PubMed? (Review)

Authors: Lawrence Hunter; K Bretonnel Cohen
Journal: Mol Cell Date: 2006-03-03 Impact factor: 17.970

3. Enhancing text categorization with semantic-enriched representation and training data augmentation.

Authors: Xinghua Lu; Bin Zheng; Atulya Velivelli; Chengxiang Zhai
Journal: J Am Med Inform Assoc Date: 2006-06-23 Impact factor: 4.497

4. Frontiers of biomedical text mining: current progress. (Review)

Authors: Pierre Zweigenbaum; Dina Demner-Fushman; Hong Yu; Kevin B Cohen
Journal: Brief Bioinform Date: 2007-10-30 Impact factor: 11.622

5. Literature mining on pharmacokinetics numerical data: a feasibility study.

Authors: Zhiping Wang; Seongho Kim; Sara K Quinney; Yingying Guo; Stephen D Hall; Luis M Rocha; Lang Li
Journal: J Biomed Inform Date: 2009-04-02 Impact factor: 6.317

6. Recent progress in automatically extracting information from the pharmacogenomic literature. (Review)

Authors: Yael Garten; Adrien Coulet; Russ B Altman
Journal: Pharmacogenomics Date: 2010-10 Impact factor: 2.533

7. Automatically classifying sentences in full-text biomedical articles into Introduction, Methods, Results and Discussion.

Authors: Shashank Agarwal; Hong Yu
Journal: Bioinformatics Date: 2009-09-25 Impact factor: 6.937