Enhancing clinical concept extraction with contextual embeddings.

Yuqi Si, Jingqi Wang, Hua Xu, Kirk Roberts.

Abstract

OBJECTIVE: Neural network-based representations ("embeddings") have dramatically advanced natural language processing (NLP) tasks, including clinical NLP tasks such as concept extraction. Recently, however, more advanced embedding methods and representations (eg, ELMo, BERT) have further pushed the state of the art in NLP, yet there are no common best practices for how to integrate these representations into clinical tasks. The purpose of this study, then, is to explore the space of possible options in utilizing these new models for clinical concept extraction, including comparing these to traditional word embedding methods (word2vec, GloVe, fastText).
MATERIALS AND METHODS: Both off-the-shelf, open-domain embeddings and pretrained clinical embeddings from MIMIC-III (Medical Information Mart for Intensive Care III) are evaluated. We explore a battery of embedding methods consisting of traditional word embeddings and contextual embeddings and compare these on 4 concept extraction corpora: i2b2 2010, i2b2 2012, SemEval 2014, and SemEval 2015. We also analyze the impact of the pretraining time of a large language model like ELMo or BERT on the extraction performance. Last, we present an intuitive way to understand the semantic information encoded by contextual embeddings.
RESULTS: Contextual embeddings pretrained on a large clinical corpus achieve new state-of-the-art performance across all concept extraction tasks. The best-performing model outperforms all state-of-the-art methods with respective F1-measures of 90.25, 93.18 (partial), 80.74, and 81.65.
CONCLUSIONS: We demonstrate the potential of contextual embeddings through the state-of-the-art performance these methods achieve on clinical concept extraction. Additionally, we demonstrate that contextual embeddings encode valuable semantic information not accounted for in traditional word representations.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.
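
The RESULTS above report F1-measures, one of them under partial matching. As a rough illustration of how span-level F1 can differ between exact and partial matching, here is a minimal Python sketch. It is not the scoring code used in the study or by the shared-task organizers: the `Span` representation, the overlap rule, and the names `overlaps` and `span_f1` are assumptions made for this example, and the official scorers may define partial credit differently.

```python
from typing import List, Tuple

# Hypothetical span format: (start token, end token exclusive, concept type)
Span = Tuple[int, int, str]


def overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a concept type and at least one token."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], partial: bool = False) -> float:
    """Span-level F1 under exact matching or (assumed) overlap-based partial matching."""
    if partial:
        # A span counts as matched if it overlaps any same-type span on the other side.
        matched_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        matched_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        # Exact matching requires identical boundaries and type.
        gold_set, pred_set = set(gold), set(pred)
        matched_pred = sum(p in gold_set for p in pred)
        matched_gold = sum(g in pred_set for g in gold)

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = [(0, 2, "problem"), (5, 6, "test"), (8, 10, "treatment")]
pred = [(0, 2, "problem"), (5, 7, "test"), (12, 13, "problem")]

exact_f1 = span_f1(gold, pred)                  # only the first span matches exactly
partial_f1 = span_f1(gold, pred, partial=True)  # the "test" span also gets credit
print(f"exact F1 = {exact_f1:.4f}, partial F1 = {partial_f1:.4f}")
```

On this toy input the partial score is higher than the exact score because the predicted "test" span overlaps the gold span without matching its boundaries, which is why partial and exact F1 values are not directly comparable.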

Keywords:  clinical concept extraction; contextual embeddings; language model

Year: 2019; PMID: 31265066; PMCID: PMC6798561; DOI: 10.1093/jamia/ocz096

Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497


