Gene symbol disambiguation using knowledge-based profiles.

Hua Xu, Jung-Wei Fan, George Hripcsak, Eneida A Mendonça, Marianthi Markatou, Carol Friedman.

Abstract

MOTIVATION: The ambiguity of biomedical entities, particularly of gene symbols, is a major challenge for text-mining systems in the biomedical domain. Existing knowledge sources, such as Entrez Gene and the MEDLINE database, contain information about the characteristics of a particular gene that could be used to disambiguate gene symbols.
RESULTS: For each gene, we create a profile from different types of information automatically extracted from related MEDLINE abstracts and readily available annotated knowledge sources. We apply the gene profiles to the disambiguation task via an information retrieval method, which computes a similarity score between the context in which the ambiguous gene symbol is mentioned and each candidate gene profile, then ranks the candidates by that score. The gene profile with the highest similarity score is chosen as the correct sense. We evaluated the method on three automatically generated testing sets, one each for the mouse, fly and yeast organisms. The highest precision the method achieved was 93.9% for the mouse, 77.8% for the fly and 89.5% for the yeast.
AVAILABILITY: The testing data sets and disambiguation programs are available at http://www.dbmi.columbia.edu/~hux7002/gsd2006
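
The ranking step described in RESULTS can be illustrated with a minimal sketch. This is not the authors' program: the abstract does not state the term weighting or similarity measure, so the sketch assumes raw term counts and cosine similarity. The gene identifiers, profile texts, context sentence and function names are all hypothetical toy values.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[term] * b[term] for term in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_profiles(context, profiles):
    """Score every candidate profile against the context, best first.

    Assumption: raw term counts stand in for whatever weighting the
    original method uses, which the abstract does not specify.
    """
    context_vec = Counter(tokenize(context))
    scores = [
        (gene_id, cosine(context_vec, Counter(tokenize(text))))
        for gene_id, text in profiles.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy profiles for two genes that share one ambiguous symbol.
PROFILES = {
    "gene_A": "kinase signaling cell cycle phosphorylation",
    "gene_B": "muscle contraction calcium channel",
}
CONTEXT = "The symbol was linked to calcium channel activity in muscle"

ranking = rank_profiles(CONTEXT, PROFILES)
best_gene, best_score = ranking[0]
print(best_gene, round(best_score, 3))
```

The candidate at the top of `ranking` plays the role of the chosen sense. A fuller version would build each profile from MEDLINE abstracts and annotated sources, as the abstract describes, rather than from a hand-written string.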

Year:  2007        PMID: 17314123     DOI: 10.1093/bioinformatics/btm056

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  19 in total

1.  Automated non-alphanumeric symbol resolution in clinical texts.

Authors:  SungRim Moon; Serguei Pakhomov; James Ryan; Genevieve B Melton
Journal:  AMIA Annu Symp Proc       Date:  2011-10-22

2.  A literature search tool for intelligent extraction of disease-associated genes.

Authors:  Jae-Yoon Jung; Todd F DeLuca; Tristan H Nelson; Dennis P Wall
Journal:  J Am Med Inform Assoc       Date:  2013-09-02       Impact factor: 4.497

3.  A fast document classification algorithm for gene symbol disambiguation in the BITOLA literature-based discovery support system.

Authors:  Andrej Kastrin; Dimitar Hristovski
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06

4.  Exploring species-based strategies for gene normalization.

Authors:  Karin Verspoor; Christophe Roeder; Helen L Johnson; K Bretonnel Cohen; William A Baumgartner; Lawrence E Hunter
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2010 Jul-Sep       Impact factor: 3.710

5.  Combining corpus-derived sense profiles with estimated frequency information to disambiguate clinical abbreviations.

Authors:  Hua Xu; Peter D Stetson; Carol Friedman
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

6.  Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies.

Authors:  S Pakhomov; B T McInnes; J Lamba; Y Liu; G B Melton; Y Ghodke; N Bhise; V Lamba; A K Birnbaum
Journal:  J Biomed Inform       Date:  2012-05-04       Impact factor: 6.317

7.  Integrating various resources for gene name normalization.

Authors:  Yuncui Hu; Yanpeng Li; Hongfei Lin; Zhihao Yang; Liangxi Cheng
Journal:  PLoS One       Date:  2012-09-12       Impact factor: 3.240

8.  Identifying the status of genetic lesions in cancer clinical trial documents using machine learning.

Authors:  Yonghui Wu; Mia A Levy; Christine M Micheel; Paul Yeh; Buzhou Tang; Michael J Cantrell; Stacy M Cooreman; Hua Xu
Journal:  BMC Genomics       Date:  2012-12-17       Impact factor: 3.969

9.  GeneRIF indexing: sentence selection based on machine learning.

Authors:  Antonio J Jimeno-Yepes; J Caitlin Sticco; James G Mork; Alan R Aronson
Journal:  BMC Bioinformatics       Date:  2013-05-31       Impact factor: 3.169

10.  The eFIP system for text mining of protein interaction networks of phosphorylated proteins.

Authors:  Catalina O Tudor; Cecilia N Arighi; Qinghua Wang; Cathy H Wu; K Vijay-Shanker
Journal:  Database (Oxford)       Date:  2012-12-05       Impact factor: 3.451
