An Empirical Test of GRUs and Deep Contextualized Word Representations on De-Identification.

Kahyun Lee¹, Michele Filannino¹, Özlem Uzuner¹.

Abstract

De-identification aims to remove 18 categories of protected health information from electronic health records. Ideally, de-identification systems should be reliable and generalizable. Previous research has focused on improving performance but has not examined generalizability. This paper investigates both performance and generalizability. To improve on the current state of the art, which is based on long short-term memory (LSTM) units, we introduce a system that uses gated recurrent units (GRUs) and deep contextualized word representations, neither of which has previously been applied to de-identification. We measure the performance and generalizability of each system using the 2014 i2b2/UTHealth and 2016 CEGS N-GRID de-identification datasets. We show that deep contextualized word representations improve state-of-the-art performance, while the benefit of replacing LSTM units with GRUs is not significant. The generalizability of de-identification systems improved significantly with deep contextualized word representations; in addition, the LSTM-based system is more generalizable than the GRU-based system.
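For readers unfamiliar with the two recurrent units compared above, the following minimal sketch counts the trainable parameters of a single LSTM layer and a single GRU layer in their standard textbook formulations. It is only an illustration of why GRUs are the lighter unit (three weight blocks instead of four). The function names and the example dimensions are hypothetical and are not taken from the paper, and the count assumes one bias vector per block, which some libraries do not follow.

```python
def recurrent_param_count(input_dim: int, hidden_dim: int, n_blocks: int) -> int:
    """Count weights and biases of a gated recurrent layer.

    Each block has an input-to-hidden matrix, a hidden-to-hidden matrix,
    and one bias vector (assumption: a single bias per block).
    """
    per_block = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return n_blocks * per_block


def lstm_param_count(input_dim: int, hidden_dim: int) -> int:
    # Standard LSTM: input, forget, and output gates plus the cell candidate.
    return recurrent_param_count(input_dim, hidden_dim, n_blocks=4)


def gru_param_count(input_dim: int, hidden_dim: int) -> int:
    # Standard GRU: update and reset gates plus the hidden candidate.
    return recurrent_param_count(input_dim, hidden_dim, n_blocks=3)


if __name__ == "__main__":
    # Hypothetical sizes chosen for illustration only.
    d, h = 100, 50
    print("LSTM parameters:", lstm_param_count(d, h))
    print("GRU parameters:", gru_param_count(d, h))
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, which says nothing by itself about accuracy or generalizability on de-identification.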

Entities: Disease

Keywords: Data Anonymization; Machine Learning; Natural Language Processing

Year: 2019 PMID: 31437917 DOI: 10.3233/SHTI190215

Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630

Cited

3 in total

1. Comparative Study of Various Approaches for Ensemble-based De-identification of Electronic Health Record Narratives.

2. A Context-Enhanced De-identification System.

3. Text Score Analysis under the IPE Environment Based on Improved Transformer.