
Intrinsic Evaluation of Contextual and Non-contextual Word Embeddings using Radiology Reports.

Mirza S Khan, Bennett A Landman, Stephen A Deppen, Michael E Matheny.

Abstract

Many clinical natural language processing methods rely on non-contextual word embedding (NCWE) or contextual word embedding (CWE) models. Yet few, if any, intrinsic evaluation benchmarks compare embedding representations against clinician judgment. We developed intrinsic evaluation tasks for embedding models using a corpus of radiology reports: term pair similarity for NCWEs and cloze task accuracy for CWEs. Using surveys, we quantified the agreement between clinician judgment and embedding model representations. We compared embedding models trained on a custom radiology report corpus (RRC), a general corpus, and PubMed and MIMIC-III corpora (P&MC). Cloze task accuracy was equivalent for RRC and P&MC models. For term pair similarity, P&MC-trained NCWEs outperformed all other NCWE models (Spearman ρ 0.61 vs. 0.27-0.44). Among models trained on RRC, fastText models often outperformed other NCWE models, and spherical embeddings provided overly optimistic representations of term pair similarity. ©2021 AMIA - All rights reserved.
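
The two intrinsic tasks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors, the clinician ratings, and the helper names (`cosine`, `average_ranks`, `spearman`, `cloze_accuracy`) are all hypothetical, and cosine similarity is assumed as the embedding similarity measure. Term pair similarity is scored as the Spearman correlation between model similarities and human ratings; cloze accuracy is the fraction of masked words a model fills in correctly.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def cloze_accuracy(predictions, answers):
    """Fraction of masked positions where the prediction matches the answer."""
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

# Hypothetical 3-dimensional embeddings; real models use hundreds of dimensions.
toy_vectors = {
    "nodule": [0.9, 0.1, 0.0],
    "mass": [0.8, 0.2, 0.1],
    "effusion": [0.1, 0.9, 0.2],
    "fracture": [0.0, 0.2, 0.9],
}

# Hypothetical clinician similarity ratings for term pairs (higher = more similar).
rated_pairs = [
    ("nodule", "mass", 4.5),
    ("nodule", "effusion", 2.0),
    ("nodule", "fracture", 1.0),
    ("effusion", "fracture", 1.5),
]

model_scores = [cosine(toy_vectors[a], toy_vectors[b]) for a, b, _ in rated_pairs]
human_scores = [rating for _, _, rating in rated_pairs]
print("Spearman rho:", spearman(model_scores, human_scores))

# Hypothetical cloze predictions versus the words actually masked out.
print("Cloze accuracy:", cloze_accuracy(["lobe", "left"], ["lobe", "right"]))
```

With real data, a library routine such as `scipy.stats.spearmanr` would usually replace the hand-written rank correlation; the standard-library version above only keeps the sketch self-contained.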

Year: 2022   PMID: 35308988   PMCID: PMC8861761

Source DB: PubMed   Journal: AMIA Annu Symp Proc   ISSN: 1559-4076


