Automating curation using a natural language processing pipeline.

Beatrice Alex, Claire Grover, Barry Haddow, Mijail Kabadjov, Ewan Klein, Michael Matthews, Richard Tobin, Xinglong Wang.

Abstract

BACKGROUND: The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can equally be employed to extract types of information from the literature that are immediately relevant to biologists in general.
RESULTS: Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average.
CONCLUSION: The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems.

Year: 2008 PMID: 18834488 PMCID: PMC2559981 DOI: 10.1186/gb-2008-9-s2-s10

Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583

Cited by (5 in total)

1. A literature search tool for intelligent extraction of disease-associated genes.

Authors: Jae-Yoon Jung; Todd F DeLuca; Tristan H Nelson; Dennis P Wall
Journal: J Am Med Inform Assoc Date: 2013-09-02 Impact factor: 4.497

2. Detecting experimental techniques and selecting relevant documents for protein-protein interactions from biomedical literature.

Authors: Xinglong Wang; Rafal Rak; Angelo Restificar; Chikashi Nobata; C J Rupp; Riza Theresa B Batista-Navarro; Raheel Nawaz; Sophia Ananiadou
Journal: BMC Bioinformatics Date: 2011-10-03 Impact factor: 3.169

3. Detection of interaction articles and experimental methods in biomedical literature.

Authors: Gerold Schneider; Simon Clematide; Fabio Rinaldi
Journal: BMC Bioinformatics Date: 2011-10-03 Impact factor: 3.169

4. Introducing meta-services for biomedical information extraction.

Authors: Florian Leitner; Martin Krallinger; Carlos Rodriguez-Penagos; Jörg Hakenberg; Conrad Plake; Cheng-Ju Kuo; Chun-Nan Hsu; Richard Tzong-Han Tsai; Hsi-Chuan Hung; William W Lau; Calvin A Johnson; Rune Saetre; Kazuhiro Yoshida; Yan Hua Chen; Sun Kim; Soo-Yong Shin; Byoung-Tak Zhang; William A Baumgartner; Lawrence Hunter; Barry Haddow; Michael Matthews; Xinglong Wang; Patrick Ruch; Frédéric Ehrler; Arzucan Ozgür; Güneş Erkan; Dragomir R Radev; Michael Krauthammer; ThaiBinh Luong; Robert Hoffmann; Chris Sander; Alfonso Valencia
Journal: Genome Biol Date: 2008-09-01 Impact factor: 13.583

5. Overview of the protein-protein interaction annotation extraction task of BioCreative II.

Authors: Martin Krallinger; Florian Leitner; Carlos Rodriguez-Penagos; Alfonso Valencia
Journal: Genome Biol Date: 2008-09-01 Impact factor: 13.583
