
Discovering biomedical semantic relations in PubMed queries for information retrieval and database curation.

Chung-Chi Huang, Zhiyong Lu.

Abstract

Identifying relevant papers from the literature is a common task in biocuration. Most current biomedical literature search systems primarily rely on matching user keywords. Semantic search, on the other hand, seeks to improve search accuracy by understanding the entities and contextual relations in user keywords. However, past research has mostly focused on semantically identifying biological entities (e.g. chemicals, diseases and genes) with little effort on discovering semantic relations. In this work, we aim to discover biomedical semantic relations in PubMed queries in an automated and unsupervised fashion. Specifically, we focus on extracting and understanding the contextual information (or context patterns) that is used by PubMed users to represent semantic relations between entities, such as 'CHEMICAL-1 compared to CHEMICAL-2'. With the advances in automatic named entity recognition, we first tag entities in PubMed queries and then use tagged entities as knowledge to recognize pattern semantics. More specifically, we transform PubMed queries into context patterns involving participating entities, which are subsequently projected to latent topics via latent semantic analysis (LSA) to avoid the data sparseness and specificity issues. Finally, we mine semantically similar contextual patterns or semantic relations based on LSA topic distributions. Our two separate evaluation experiments of chemical-chemical (CC) and chemical-disease (CD) relations show that the proposed approach significantly outperforms a baseline method, which simply measures pattern semantics by similarity in participating entities. The highest performance achieved by our approach is nearly 0.9 and 0.85 for the CC and CD tasks, respectively, when compared against the ground truth in terms of normalized discounted cumulative gain (nDCG), a standard measure of ranking quality.
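For readers unfamiliar with nDCG, the following is a minimal sketch of one common formulation (linear gain with a log2 rank discount). The abstract does not say which nDCG variant the authors used, so the function names `dcg` and `ndcg` and the graded relevance values are illustrative assumptions rather than the paper's evaluation code.

```python
import math


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(ranked_gains):
    """Normalize DCG by the DCG of the ideal (descending) ordering.

    `ranked_gains` holds the graded relevance of each returned pattern,
    in the order the system ranked them (hypothetical grades here).
    """
    ideal = dcg(sorted(ranked_gains, reverse=True))
    return dcg(ranked_gains) / ideal if ideal else 0.0


# Hypothetical relevance grades for two rankings of the same three patterns.
perfect_order = ndcg([3, 2, 1])   # best pattern ranked first
reversed_order = ndcg([1, 2, 3])  # best pattern ranked last
```

A ranking that places the most relevant patterns first scores 1.0, and any other ordering of the same items scores lower, which is why the measure suits a task that returns related patterns in ranked order.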
These results suggest that our approach can effectively identify and return related semantic patterns in a ranked order covering diverse bio-entity relations. To assess the potential utility of our automated top-ranked patterns of a given relation in semantic search, we performed a pilot study on frequently sought semantic relations in PubMed and observed improved literature retrieval effectiveness based on post-hoc human relevance evaluation. Further investigation in larger tests and in real-world scenarios is warranted. Published by Oxford University Press 2016. This work is written by US Government employees and is in the public domain in the US.
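The pipeline outlined in the abstract (tag entities, turn queries into context patterns, project the patterns to latent topics, compare them) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the queries and entity tags are invented, the pattern-by-entity-pair count matrix is one plausible way to represent "participating entities", and `to_pattern`, `build_matrix`, `lsa_vectors` and `cosine` are hypothetical helper names. LSA is carried out here with a truncated SVD from NumPy.

```python
from collections import defaultdict

import numpy as np

# Hypothetical pre-tagged queries: (query text, [(mention, entity type), ...]).
TAGGED_QUERIES = [
    ("aspirin compared to ibuprofen", [("aspirin", "CHEMICAL"), ("ibuprofen", "CHEMICAL")]),
    ("warfarin compared to heparin", [("warfarin", "CHEMICAL"), ("heparin", "CHEMICAL")]),
    ("warfarin versus heparin", [("warfarin", "CHEMICAL"), ("heparin", "CHEMICAL")]),
    ("metformin versus glipizide", [("metformin", "CHEMICAL"), ("glipizide", "CHEMICAL")]),
    ("ethanol and caffeine interaction", [("ethanol", "CHEMICAL"), ("caffeine", "CHEMICAL")]),
    ("lithium and digoxin interaction", [("lithium", "CHEMICAL"), ("digoxin", "CHEMICAL")]),
]


def to_pattern(query, entities):
    """Replace each tagged mention with a typed slot such as CHEMICAL-1."""
    counts = defaultdict(int)
    pattern = query
    for mention, etype in entities:
        counts[etype] += 1
        pattern = pattern.replace(mention, f"{etype}-{counts[etype]}", 1)
    return pattern


def build_matrix(tagged_queries):
    """Count how often each context pattern occurs with each entity tuple."""
    patterns, tuples, cells = {}, {}, defaultdict(int)
    for query, entities in tagged_queries:
        p = patterns.setdefault(to_pattern(query, entities), len(patterns))
        t = tuples.setdefault(tuple(m for m, _ in entities), len(tuples))
        cells[(p, t)] += 1
    matrix = np.zeros((len(patterns), len(tuples)))
    for (p, t), n in cells.items():
        matrix[p, t] = n
    return list(patterns), matrix


def lsa_vectors(matrix, k):
    """Project each pattern (row) onto the top-k latent topics via SVD."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0


PATTERNS, MATRIX = build_matrix(TAGGED_QUERIES)
VECTORS = lsa_vectors(MATRIX, k=2)

raw_sim = cosine(MATRIX[0], MATRIX[1])        # entity-overlap baseline
latent_sim = cosine(VECTORS[0], VECTORS[1])   # similarity in topic space
unrelated_sim = cosine(VECTORS[0], VECTORS[2])
```

In this toy data the "compared to" and "versus" patterns share only one entity pair, so their raw entity-overlap similarity is moderate, while their similarity in the two-topic space is higher; the "interaction" pattern shares no pairs with either and stays dissimilar. That is the kind of sparseness smoothing the abstract attributes to LSA, shown here only on invented examples.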

Year:  2016        PMID: 27016698      PMCID: PMC4808250          DOI: 10.1093/database/baw025

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


