deepBioWSD: effective deep neural word sense disambiguation of biomedical text data.

Ahmad Pesaranghader, Stan Matwin, Marina Sokolova, Ali Pesaranghader.

Abstract

OBJECTIVE: In biomedicine, there is a wealth of information hidden in unstructured narratives such as research articles and clinical reports. To exploit these data properly, a word sense disambiguation (WSD) algorithm is needed to prevent downstream difficulties in the natural language processing pipeline. Supervised WSD algorithms largely outperform unsupervised, semisupervised, and knowledge-based methods; however, they train one separate classifier for each ambiguous term, which requires a large amount of expert-labeled training data, an unattainable goal in medical informatics. To alleviate this need, a single model that shares statistical strength across all instances and scales well with the vocabulary size is desirable.
MATERIALS AND METHODS: Built on recent advances in deep learning, our deepBioWSD model uses a single bidirectional long short-term memory network that predicts the sense of any ambiguous term. First, Unified Medical Language System sense embeddings are computed from the senses' text definitions; then, after the network is initialized with these embeddings, it is trained on all available training data collectively. The method also introduces a novel technique for automatically collecting training data from PubMed to (pre)train the network in an unsupervised manner.
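The idea of one shared model that scores the candidate senses of any ambiguous term can be illustrated with a minimal sketch. It is not the deepBioWSD architecture: the bidirectional long short-term memory encoder is replaced here by a plain average of toy word vectors, and every word vector, sense label, and definition below is a made-up placeholder rather than Unified Medical Language System content.

```python
import math

DIM = 4

# Toy word vectors (hypothetical values, not trained embeddings).
WORD_VECTORS = {
    "low": [1.0, 0.0, 0.0, 0.0],
    "temperature": [1.0, 0.0, 0.0, 0.0],
    "winter": [1.0, 0.0, 0.0, 0.0],
    "viral": [0.0, 1.0, 0.0, 0.0],
    "infection": [0.0, 1.0, 0.0, 0.0],
    "fever": [0.0, 1.0, 0.0, 0.0],
    "cough": [0.0, 1.0, 0.0, 0.0],
    "hospital": [0.0, 0.0, 1.0, 0.0],
    "release": [0.0, 0.0, 1.0, 0.0],
    "home": [0.0, 0.0, 1.0, 0.0],
    "fluid": [0.0, 0.0, 0.0, 1.0],
    "secretion": [0.0, 0.0, 0.0, 1.0],
    "wound": [0.0, 0.0, 0.0, 1.0],
}

# Hypothetical sense inventory: placeholder labels with short made-up definitions.
SENSE_DEFINITIONS = {
    "cold": {
        "cold_temperature": "low temperature",
        "common_cold": "viral infection",
    },
    "discharge": {
        "hospital_discharge": "release from hospital",
        "fluid_discharge": "fluid secretion",
    },
}


def embed(words):
    """Average the vectors of known words; unknown words are skipped."""
    known = [WORD_VECTORS[w] for w in words if w in WORD_VECTORS]
    if not known:
        return [0.0] * DIM
    return [sum(column) / len(known) for column in zip(*known)]


# Step 1: compute one embedding per sense from its text definition.
SENSE_EMBEDDINGS = {
    term: {sense: embed(text.split()) for sense, text in senses.items()}
    for term, senses in SENSE_DEFINITIONS.items()
}


def predict_sense(term, context_words):
    """Score only the candidate senses of `term` with one shared scorer."""
    context = embed(context_words)
    candidates = SENSE_EMBEDDINGS[term]
    scores = {
        sense: sum(c * v for c, v in zip(context, vector))
        for sense, vector in candidates.items()
    }
    # Softmax over the candidate senses (shifted by the max for stability).
    top = max(scores.values())
    exps = {sense: math.exp(score - top) for sense, score in scores.items()}
    total = sum(exps.values())
    probabilities = {sense: value / total for sense, value in exps.items()}
    best = max(probabilities, key=probabilities.get)
    return best, probabilities


best_sense, sense_probabilities = predict_sense("cold", ["patient", "fever", "cough"])
```

The same `predict_sense` function handles both "cold" and "discharge", so no per-term classifier is needed; only the candidate set changes with the target term.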
RESULTS: We use the MSH WSD dataset to compare WSD algorithms, with macro and micro accuracies as evaluation metrics. deepBioWSD outperforms existing models in biomedical text WSD, achieving a state-of-the-art macro accuracy of 96.82%.
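The abstract does not define the two metrics. Under the usual convention, assumed here, micro accuracy is the fraction of all test instances labeled correctly, while macro accuracy is the unweighted mean of the per-term accuracies. A minimal sketch with made-up records (the function name and the data are illustrative only):

```python
from collections import defaultdict


def macro_micro_accuracy(records):
    """Return (macro, micro) accuracy.

    records: iterable of (ambiguous_term, gold_sense, predicted_sense).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for term, gold, predicted in records:
        total[term] += 1
        correct[term] += int(gold == predicted)
    if not total:
        raise ValueError("no records to evaluate")
    # Micro: pool every instance, regardless of which term it belongs to.
    micro = sum(correct.values()) / sum(total.values())
    # Macro: average the per-term accuracies, each term weighted equally.
    macro = sum(correct[t] / total[t] for t in total) / len(total)
    return macro, micro


# Made-up predictions: "cold" has 4 instances (3 correct), "discharge" has 1 (correct).
example = [
    ("cold", "common_cold", "common_cold"),
    ("cold", "common_cold", "common_cold"),
    ("cold", "cold_temperature", "cold_temperature"),
    ("cold", "cold_temperature", "common_cold"),
    ("discharge", "fluid_discharge", "fluid_discharge"),
]
macro, micro = macro_micro_accuracy(example)
```

Here the per-term accuracies are 0.75 and 1.0, so the macro accuracy is 0.875, while the micro accuracy is 4/5 = 0.8. The two diverge whenever terms have different numbers of test instances.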
CONCLUSIONS: Apart from the disambiguation improvement and unsupervised training, deepBioWSD depends on considerably fewer expert-labeled data because it learns the target and the context terms jointly. These properties make deepBioWSD conveniently deployable in real-time biomedical applications.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Keywords:  bidirectional long short-term memory network; biomedical text mining; deep neural networks; word sense disambiguation; zero-shot learning

Year:  2019        PMID: 30811548      PMCID: PMC7787358          DOI: 10.1093/jamia/ocy189

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


  5 in total

1.  deepSimDEF: deep neural embeddings of gene products and Gene Ontology terms for functional analysis of genes.

Authors:  Ahmad Pesaranghader; Stan Matwin; Marina Sokolova; Jean-Christophe Grenier; Robert G Beiko; Julie Hussin
Journal:  Bioinformatics       Date:  2022-05-10       Impact factor: 6.931

2.  Ambiguity in medical concept normalization: An analysis of types and coverage in electronic health record datasets.

Authors:  Denis Newman-Griffis; Guy Divita; Bart Desmet; Ayah Zirikly; Carolyn P Rosé; Eric Fosler-Lussier
Journal:  J Am Med Inform Assoc       Date:  2021-03-01       Impact factor: 4.497

3.  Improving broad-coverage medical entity linking with semantic type prediction and large-scale datasets.

Authors:  Shikhar Vashishth; Denis Newman-Griffis; Rishabh Joshi; Ritam Dutt; Carolyn P Rosé
Journal:  J Biomed Inform       Date:  2021-08-12       Impact factor: 6.317

4.  A deep database of medical abbreviations and acronyms for natural language processing.

Authors:  Lisa Grossman Liu; Raymond H Grossman; Elliot G Mitchell; Chunhua Weng; Karthik Natarajan; George Hripcsak; David K Vawdrey
Journal:  Sci Data       Date:  2021-06-02       Impact factor: 6.444

Review 5.  Implementing Machine Learning in Interventional Cardiology: The Benefits Are Worth the Trouble.

Authors:  Walid Ben Ali; Ahmad Pesaranghader; Robert Avram; Pavel Overtchouk; Nils Perrin; Stéphane Laffite; Raymond Cartier; Reda Ibrahim; Thomas Modine; Julie G Hussin
Journal:  Front Cardiovasc Med       Date:  2021-12-08