Domain specific word embeddings for natural language processing in radiology.

Timothy L Chen¹, Max Emerling², Gunvant R Chaudhari³, Yeshwant R Chillakuru⁴, Youngho Seo³, Thienkhai H Vu³, Jae Ho Sohn⁵.

Abstract

BACKGROUND: There has been increasing interest in machine learning-based natural language processing (NLP) methods in radiology; however, models have often used word embeddings trained on general web corpora because no radiology-specific corpus was available.
PURPOSE: We examined the potential of Radiopaedia to serve as a general radiology corpus for producing radiology-specific word embeddings that could enhance performance on an NLP task on radiological text.
MATERIALS AND METHODS: Embeddings of dimension 50, 100, 200, and 300 were trained on articles collected from Radiopaedia using the GloVe algorithm and evaluated on analogy completion. A shallow neural network using input from either our trained embeddings or pre-trained Wikipedia 2014 + Gigaword 5 (WG) embeddings was used to label the Radiopaedia articles. Labeling performance was evaluated based on exact match accuracy and Hamming loss. McNemar's test with continuity correction and the Benjamini-Hochberg correction, together with a 5×2 cross-validation paired two-tailed t-test, were used to assess statistical significance.
RESULTS: For accuracy in the analogy task, 50-dimensional (50-D) Radiopaedia embeddings outperformed WG embeddings on tumor origin analogies (p < 0.05) and organ adjectives (p < 0.01), whereas WG embeddings tended to outperform on inflammation location and bone vs. muscle analogies (p < 0.01). The two embeddings had comparable performance on other subcategories. In the labeling task, the Radiopaedia-based model outperformed the WG-based model at 50, 100, 200, and 300-D for exact match accuracy (p < 0.001, p < 0.001, p < 0.01, and p < 0.05, respectively) and Hamming loss (p < 0.001, p < 0.001, p < 0.01, and p < 0.05, respectively).
CONCLUSION: We have developed a set of word embeddings from Radiopaedia and shown that they can preserve relevant medical semantics and augment performance on a radiology NLP task. Our results suggest that the cultivation of a radiology-specific corpus can benefit radiology NLP models in the future.
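
The labeling metrics and the significance test named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the toy label matrices below are hypothetical and were made up only to show how exact match accuracy, Hamming loss, and McNemar's chi-square statistic with continuity correction are commonly computed. The Benjamini-Hochberg step and the 5×2 cross-validation t-test are not shown.

```python
import math

def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    matches = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return matches / len(y_true)

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    wrong = sum(ti != pi for t, p in zip(y_true, y_pred) for ti, pi in zip(t, p))
    total = sum(len(t) for t in y_true)
    return wrong / total

def mcnemar_continuity(correct_a, correct_b):
    """McNemar's chi-square test (1 degree of freedom) with continuity correction.

    correct_a / correct_b are per-sample booleans: was model A / model B right?
    Returns (chi2, p_value).
    """
    # Discordant pairs: exactly one of the two models is correct.
    b = sum(1 for a, o in zip(correct_a, correct_b) if a and not o)
    c = sum(1 for a, o in zip(correct_a, correct_b) if not a and o)
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs, so no evidence of a difference
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Toy multi-label data (hypothetical): 4 articles, 3 binary labels each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
pred_a = [[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]  # stand-in for model A
pred_b = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 1, 1]]  # stand-in for model B

acc_a = exact_match_accuracy(y_true, pred_a)
acc_b = exact_match_accuracy(y_true, pred_b)
loss_a = hamming_loss(y_true, pred_a)
loss_b = hamming_loss(y_true, pred_b)

# Per-sample exact-match correctness feeds the paired test.
correct_a = [t == p for t, p in zip(y_true, pred_a)]
correct_b = [t == p for t, p in zip(y_true, pred_b)]
chi2, p_value = mcnemar_continuity(correct_a, correct_b)

print(acc_a, acc_b, loss_a, loss_b, chi2, p_value)
```

Exact match accuracy is the stricter of the two metrics because a single wrong label makes the whole sample count as a miss, whereas Hamming loss gives partial credit per label. With only four toy samples the test has almost no power, so the p-value here carries no meaning beyond showing the calculation.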

Keywords: Analogy completion; Multi-label classification; Natural language processing; Word embeddings

Year: 2020 PMID: 33333323 PMCID: PMC7856086 DOI: 10.1016/j.jbi.2020.103665

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317
