Rapid pattern development for concept recognition systems: application to point mutations.

J Gregory Caporaso, William A Baumgartner, David A Randolph, K Bretonnel Cohen, Lawrence Hunter.

Abstract

The primary biomedical literature is being generated at an unprecedented rate, and researchers cannot keep abreast of new developments in their fields. Biomedical natural language processing is being developed to address this issue, but building reliable systems often requires many expert-hours. We present an approach for automatically developing collections of regular expressions to drive high-performance concept recognition systems with minimal human interaction. We applied our approach to develop MutationFinder, a system for automatically extracting mentions of point mutations from text. MutationFinder achieves performance equivalent to or better than manually developed mutation recognition systems, but the generation of its 759 patterns required only 5.5 expert-hours. We also discuss the development and evaluation of our recently published high-quality, human-annotated gold standard corpus, which contains 1,515 complete point mutation mentions annotated in 813 abstracts. Both MutationFinder and the complete corpus are publicly available at http://mutationfinder.sourceforge.net/.
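
To make the idea of regular-expression-driven mutation recognition concrete, the following is a minimal Python sketch. It is not MutationFinder's code and does not reproduce any of its 759 patterns; the two patterns, the `THREE_TO_ONE` table, and the function name `find_point_mutations` are illustrative assumptions. It recognizes only the compact forms "A123T" and "Ala123Thr" and normalizes each mention to a (wild-type residue, position, mutant residue) tuple.

```python
import re

# Standard three-letter to one-letter amino acid abbreviations.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

ONE = "ACDEFGHIKLMNPQRSTVWY"       # one-letter residue codes
THREE = "|".join(THREE_TO_ONE)     # alternation of three-letter codes

# Hypothetical patterns for illustration only (not taken from MutationFinder).
PATTERNS = [
    # One-letter form, e.g. "A123T"
    re.compile(rf"\b(?P<wt>[{ONE}])(?P<pos>[1-9]\d*)(?P<mut>[{ONE}])\b"),
    # Three-letter form, e.g. "Ala123Thr"
    re.compile(rf"\b(?P<wt>{THREE})(?P<pos>[1-9]\d*)(?P<mut>{THREE})\b"),
]


def find_point_mutations(text):
    """Return normalized (wild_type, position, mutant) tuples found in text."""
    found = set()
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            wt, mut = match.group("wt"), match.group("mut")
            # Map three-letter codes to one-letter codes; leave one-letter as is.
            wt = THREE_TO_ONE.get(wt, wt)
            mut = THREE_TO_ONE.get(mut, mut)
            found.add((wt, int(match.group("pos")), mut))
    # Sort by position, then residues, for a stable result.
    return sorted(found, key=lambda t: (t[1], t[0], t[2]))


if __name__ == "__main__":
    sentence = ("The A123T and Gly42Ser substitutions reduced activity; "
                "Ala123Thr was also reported.")
    print(find_point_mutations(sentence))
    # prints [('G', 42, 'S'), ('A', 123, 'T')]
```

Because both surface forms normalize to the same tuple, "A123T" and "Ala123Thr" count as one mutation. A sketch this small will also produce false positives on strings that merely look like mutations (for example, the cell line name "T47D"), which is one reason a practical system needs a much larger and more carefully evaluated pattern set.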

Year:  2007        PMID: 18172927     DOI: 10.1142/s0219720007003144

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


  7 in total

1.  A semi-supervised approach to extract pharmacogenomics-specific drug-gene pairs from biomedical literature for personalized medicine.

Authors:  Rong Xu; Quanqiu Wang
Journal:  J Biomed Inform       Date:  2013-04-06       Impact factor: 6.317

2.  Improved mutation tagging with gene identifiers applied to membrane protein stability prediction.

Authors:  Rainer Winnenburg; Conrad Plake; Michael Schroeder
Journal:  BMC Bioinformatics       Date:  2009-08-27       Impact factor: 3.169

3.  Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research.

Authors:  Àlex Bravo; Janet Piñero; Núria Queralt-Rosinach; Michael Rautschka; Laura I Furlong
Journal:  BMC Bioinformatics       Date:  2015-02-21       Impact factor: 3.169

4.  dRiskKB: a large-scale disease-disease risk relationship knowledge base constructed from biomedical text.

Authors:  Rong Xu; Li Li; Quanqiu Wang
Journal:  BMC Bioinformatics       Date:  2014-04-12       Impact factor: 3.169

5.  Extraction of human kinase mutations from literature, databases and genotyping studies.

Authors:  Martin Krallinger; Jose M G Izarzugaza; Carlos Rodriguez-Penagos; Alfonso Valencia
Journal:  BMC Bioinformatics       Date:  2009-08-27       Impact factor: 3.169

6.  Gene mention normalization and interaction extraction with context models and sentence motifs.

Authors:  Jörg Hakenberg; Conrad Plake; Loic Royer; Hendrik Strobelt; Ulf Leser; Michael Schroeder
Journal:  Genome Biol       Date:  2008-09-01       Impact factor: 13.583

7.  Challenges for automatically extracting molecular interactions from full-text articles.

Authors:  Tara McIntosh; James R Curran
Journal:  BMC Bioinformatics       Date:  2009-09-24       Impact factor: 3.169
