Exploring species-based strategies for gene normalization.

Karin Verspoor, Christophe Roeder, Helen L Johnson, K Bretonnel Cohen, William A Baumgartner, Lawrence E Hunter.

Abstract

We introduce a system developed for the BioCreative II.5 (BCII.5) community evaluation of information extraction of proteins and protein interactions. The paper focuses primarily on the gene normalization task: recognizing protein mentions in text and mapping them to the appropriate database identifiers based on contextual clues. We outline a "fuzzy" dictionary lookup approach to protein mention detection that matches regularized text to similarly regularized dictionary entries. We describe several strategies for gene normalization that focus on species or organism mentions in the text, both globally throughout the document and locally in the immediate vicinity of a protein mention. We present the results of experiments with a series of system variations that explore the effectiveness of these normalization strategies, as well as the role of external knowledge sources. While our system was neither the best nor the worst performing system in the evaluation, the gene normalization strategies show promise, and the system affords the opportunity to explore some of the variables affecting performance on the BCII.5 tasks.
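
The two ideas named in the abstract, regularized dictionary lookup and species-based disambiguation with local and global context, can be illustrated with a small sketch. This is not the authors' implementation: the regularization rule (lowercasing and dropping non-alphanumeric characters), the toy dictionary, the identifiers, the species cue words, and the five-token window are all assumptions chosen only to make the example runnable.

```python
import re
from collections import Counter

def regularize(text):
    """Assumed regularization: lowercase, keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

# Hypothetical toy dictionary: (surface name, species, identifier).
ENTRIES = [
    ("IL-2", "human", "ID_HUMAN_IL2"),
    ("Il2", "mouse", "ID_MOUSE_IL2"),
    ("p53", "human", "ID_HUMAN_P53"),
]

# Index regularized names to {species: identifier}.
INDEX = {}
for name, species, ident in ENTRIES:
    INDEX.setdefault(regularize(name), {})[species] = ident

# Hypothetical species cue words mapped to a species label.
SPECIES_TERMS = {"human": "human", "patients": "human",
                 "mouse": "mouse", "mice": "mouse", "murine": "mouse"}

def normalize(text, window=5):
    """Return (mention, species, identifier) for each resolvable mention."""
    tokens = re.findall(r"[A-Za-z0-9-]+", text)
    cues = [(i, SPECIES_TERMS[t.lower()])
            for i, t in enumerate(tokens) if t.lower() in SPECIES_TERMS]
    # Global strategy: the most frequently mentioned species in the document.
    counts = Counter(s for _, s in cues).most_common(1)
    global_species = counts[0][0] if counts else None

    results = []
    for i, tok in enumerate(tokens):
        candidates = INDEX.get(regularize(tok))
        if not candidates:
            continue
        # Local strategy: nearest species cue within the token window.
        nearby = [(abs(j - i), s) for j, s in cues
                  if abs(j - i) <= window and s in candidates]
        if nearby:
            species = min(nearby)[1]
        elif global_species in candidates:
            species = global_species
        elif len(candidates) == 1:
            species = next(iter(candidates))
        else:
            continue  # ambiguous and no species evidence: leave unresolved
        results.append((tok, species, candidates[species]))
    return results
```

In this sketch the local cue takes priority over the global one, and an ambiguous mention with no species evidence is left unresolved; other orderings are equally possible, and the abstract does not say which the system uses.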

Year:  2010        PMID: 20671318      PMCID: PMC2929766          DOI: 10.1109/TCBB.2010.48

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


