A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.

Mourad Sarrouti, Said Ouatik El Alaoui.

Abstract

BACKGROUND AND OBJECTIVE: Passage retrieval, the identification of top-ranked passages that may contain the answer to a given biomedical question, is a crucial component of any biomedical question answering (QA) system. Passage retrieval in open-domain QA is a longstanding challenge that has been widely studied over recent decades. However, it still requires further effort in biomedical QA. In this paper, we present a new biomedical passage retrieval method based on Stanford CoreNLP sentence/passage length, a probabilistic information retrieval (IR) model, and UMLS concepts.
METHODS: In the proposed method, we first use our document retrieval system, based on the PubMed search engine and UMLS similarity, to retrieve documents relevant to a given biomedical question. We then take the abstracts of the retrieved documents and split them with the Stanford CoreNLP sentence splitter to produce a set of sentences, i.e., candidate passages. Finally, using stemmed words and UMLS concepts as features for the BM25 model, we compute the similarity score between the biomedical question and each candidate passage and keep the N top-ranked passages.
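The final ranking step can be illustrated with a minimal Python sketch. It is not the authors' implementation: `TOY_CONCEPTS` (with placeholder concept IDs), `crude_stem`, `features`, and `bm25_rank` are hypothetical names introduced here, the stemmer is a toy suffix stripper standing in for a real one, and the concept lookup is a hard-coded table standing in for a real UMLS concept mapper. The parameters `k1` and `b` use common textbook defaults, not values reported by the paper.

```python
import math
import re
from collections import Counter

# Hypothetical phrase-to-concept table with placeholder IDs (assumption);
# a real system would obtain UMLS concepts from a concept-mapping tool.
TOY_CONCEPTS = {
    "heart attack": "CUI_MI",
    "myocardial infarction": "CUI_MI",
    "aspirin": "CUI_ASPIRIN",
}


def crude_stem(token):
    """Toy suffix stripping; stands in for a real stemmer (assumption)."""
    for suffix in ("ing", "es", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def features(text):
    """Return stemmed word features plus concept IDs found in the text."""
    lowered = text.lower()
    feats = [crude_stem(t) for t in re.findall(r"[a-z0-9]+", lowered)]
    feats += [cui for phrase, cui in TOY_CONCEPTS.items() if phrase in lowered]
    return feats


def bm25_rank(question, passages, n=2, k1=1.2, b=0.75):
    """Score each passage against the question with BM25; keep the top n."""
    docs = [Counter(features(p)) for p in passages]
    lengths = [sum(d.values()) for d in docs]
    avg_len = sum(lengths) / len(docs)
    query_feats = set(features(question))
    scores = []
    for doc, length in zip(docs, lengths):
        score = 0.0
        for feat in query_feats:
            df = sum(1 for d in docs if feat in d)  # passages containing feat
            if df == 0:
                continue
            idf = math.log(1 + (len(docs) - df + 0.5) / (df + 0.5))
            tf = doc[feat]
            norm = tf + k1 * (1 - b + b * length / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [(passages[i], scores[i]) for i in order[:n]]


if __name__ == "__main__":
    candidates = [
        "Aspirin reduces mortality after myocardial infarction.",
        "The weather was sunny in the city.",
        "Heart attack patients often receive aspirin.",
    ]
    for passage, score in bm25_rank("What treats a heart attack?", candidates):
        print(f"{score:.3f}  {passage}")
```

In this toy example the first candidate shares no words with the question, yet it still receives a positive score because both map to the same placeholder concept, which is the motivation for adding concept features to word features.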
RESULTS: Experimental evaluations on large standard datasets provided by the BioASQ challenge show that the proposed method performs well compared with current state-of-the-art methods, significantly outperforming them by an average of 6.84% in terms of mean average precision (MAP).
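For readers unfamiliar with the metric, the sketch below computes mean average precision in its textbook form. The function names and the toy data are illustrative assumptions, and the official BioASQ evaluation may differ in details such as how the per-question average is normalized.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant)


def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Toy data (assumption): two questions with ranked passage IDs.
    runs = [
        (["a", "b", "c"], {"a", "c"}),
        (["x", "y"], {"y"}),
    ]
    print(mean_average_precision(runs))
```

Because each relevant passage contributes the precision at its own rank, the metric rewards placing relevant passages near the top of the list rather than merely retrieving them.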
CONCLUSION: We have proposed an efficient passage retrieval method that can be used to retrieve relevant passages in biomedical QA systems with high mean average precision.
Copyright © 2017 Elsevier Inc. All rights reserved.

Keywords: Biomedical informatics; Biomedical passage retrieval; Biomedical question answering system; Natural language processing; Probabilistic information retrieval model; Unified medical language system

Year:  2017        PMID: 28286031     DOI: 10.1016/j.jbi.2017.03.001

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  9 in total

1.  Resource and Response Type Classification for Consumer Health Question Answering.

Authors:  William R Kearns; Jason A Thomas
Journal:  AMIA Annu Symp Proc       Date:  2018-12-05

2.  LitSense: making sense of biomedical literature at sentence level.

Authors:  Alexis Allot; Qingyu Chen; Sun Kim; Roberto Vera Alvarez; Donald C Comeau; W John Wilbur; Zhiyong Lu
Journal:  Nucleic Acids Res       Date:  2019-07-02       Impact factor: 16.971

3.  Survey on evaluation methods for dialogue systems.

Authors:  Jan Deriu; Alvaro Rodrigo; Arantxa Otegi; Guillermo Echegoyen; Sophie Rosset; Eneko Agirre; Mark Cieliebak
Journal:  Artif Intell Rev       Date:  2020-06-25       Impact factor: 8.139

4.  Towards a unified search: Improving PubMed retrieval with full text.

Authors:  Won Kim; Lana Yeganova; Donald C Comeau; W John Wilbur; Zhiyong Lu
Journal:  J Biomed Inform       Date:  2022-09-21       Impact factor: 8.000

5.  A New Biomedical Passage Retrieval Framework for Laboratory Medicine: Leveraging Domain-specific Ontology, Multilevel PRF, and Negation Differential Weighting.

Authors:  Keejun Han; Hyoeun Shim; Mun Y Yi
Journal:  J Healthc Eng       Date:  2018-12-24       Impact factor: 2.682

6.  Deep learning with sentence embeddings pre-trained on biomedical corpora improves the performance of finding similar sentences in electronic medical records.

Authors:  Qingyu Chen; Jingcheng Du; Sun Kim; W John Wilbur; Zhiyong Lu
Journal:  BMC Med Inform Decis Mak       Date:  2020-04-30       Impact factor: 2.796

7.  List-wise learning to rank biomedical question-answer pairs with deep ranking recursive autoencoders.

Authors:  Yan Yan; Bo-Wen Zhang; Xu-Feng Li; Zhenhan Liu
Journal:  PLoS One       Date:  2020-11-09       Impact factor: 3.240

8.  Protocol for a reproducible experimental survey on biomedical sentence similarity.

Authors:  Alicia Lara-Clares; Juan J Lastra-Díaz; Ana Garcia-Serrano
Journal:  PLoS One       Date:  2021-03-24       Impact factor: 3.240

9.  HESML: a real-time semantic measures library for the biomedical domain with a reproducible survey.

Authors:  Juan J Lastra-Díaz; Alicia Lara-Clares; Ana Garcia-Serrano
Journal:  BMC Bioinformatics       Date:  2022-01-06       Impact factor: 3.169
