A Study of Deep Learning Methods for De-identification of Clinical Notes at Cross Institute Settings.

Xi Yang¹, Tianchen Lyu¹, Chih-Yin Lee¹, Jiang Bian¹, William R Hogan¹, Yonghui Wu¹.

Abstract

In this study, we examined deep learning methods for de-identification of clinical notes at UF Health in a cross-institute setting. We trained deep learning models on the 2014 i2b2/UTHealth corpus and evaluated them on clinical notes collected from UF Health. We compared four pre-trained word embeddings: two from the general domain and two from the clinical domain. We also explored linguistic features (word shape and part-of-speech) to further improve de-identification performance. The experimental results show that the performance of models trained on the i2b2/UTHealth corpus dropped significantly when applied to a corpus from a different institution (UF Health): strict and relaxed F1 scores fell from 0.9547 and 0.9646 to 0.8360 and 0.8870, respectively. Adding the linguistic features further improved performance in the cross-institute setting, to strict and relaxed F1 scores of 0.8527 and 0.9052.
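To make the abstract's terms concrete, the following is a minimal sketch of a word-shape feature and of strict versus relaxed F1 scoring over PHI spans. It is not the authors' implementation. The names `word_shape`, `overlaps`, and `f1` are hypothetical, the shape mapping is one common convention, and "relaxed" is assumed here to mean any character overlap between spans of the same type; the exact criteria used in the study are not stated in this record.

```python
import re
from typing import Set, Tuple

# (start offset, end offset (exclusive), PHI type); a hypothetical span format
Span = Tuple[int, int, str]


def word_shape(token: str) -> str:
    """Map a token to a coarse shape: uppercase -> 'A', lowercase -> 'a', digit -> '0'."""
    shape = re.sub(r"[A-Z]", "A", token)
    shape = re.sub(r"[a-z]", "a", shape)
    return re.sub(r"[0-9]", "0", shape)


def overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a PHI type and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def f1(gold: Set[Span], pred: Set[Span], relaxed: bool = False) -> float:
    """Span-level F1. Strict requires identical offsets and type;
    relaxed (an assumption) accepts any same-type overlap."""
    if relaxed:
        tp_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        tp_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        tp_pred = tp_gold = len(gold & pred)
    precision = tp_pred / len(pred) if pred else 0.0
    recall = tp_gold / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: the second prediction misses the first two characters.
gold = {(0, 10, "NAME"), (20, 30, "DATE")}
pred = {(0, 10, "NAME"), (22, 30, "DATE")}

print(word_shape("Mar-2014"))         # Aaa-0000
print(f1(gold, pred))                 # 0.5
print(f1(gold, pred, relaxed=True))   # 1.0
```

The gap between the two scores in this toy example mirrors why the abstract reports both: a prediction with slightly wrong boundaries counts as a miss under strict matching but as a hit under relaxed matching.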


Keywords: De-identification; Deep Learning; Natural Language Processing

Year: 2019 PMID: 31879734 PMCID: PMC6932867 DOI: 10.1109/ICHI.2019.8904544

Source DB: PubMed Journal: IEEE Int Conf Healthc Inform ISSN: 2575-2626
