Annotating German Clinical Documents for De-Identification.

Tobias Kolditz¹, Christina Lohr¹, Johannes Hellrich¹, Luise Modersohn¹, Boris Betz², Michael Kiehntopf², Udo Hahn¹.

Abstract

We devised annotation guidelines for the de-identification of German clinical documents and assembled a corpus of 1,106 discharge summaries and transfer letters with 44K annotated protected health information (PHI) items. After three iteration rounds, our annotation team finally reached an inter-annotator agreement of 0.96 on the instance level and 0.97 on the token level of annotation (averaged pair-wise F1 score). To establish a baseline for automatic de-identification on our corpus, we trained a recurrent neural network (RNN) and achieved F1 scores greater than 0.9 on most major PHI categories.
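
The agreement figure above is an averaged pair-wise F1 score. As a minimal sketch of how such a score can be computed, assume each annotator's output is a set of (start, end, label) tuples; the function names, label names, and toy data below are hypothetical illustrations, not taken from the paper. The F1 between two annotators is computed by exact match of items and then averaged over all unordered annotator pairs.

```python
from itertools import combinations


def pairwise_f1(a, b):
    """F1 between two annotators' sets of annotation items (exact match)."""
    if not a and not b:
        # Two empty annotation sets agree trivially.
        return 1.0
    # With one annotator treated as reference, F1 reduces to 2|A∩B| / (|A| + |B|),
    # which is symmetric in the two annotators.
    return 2 * len(a & b) / (len(a) + len(b))


def averaged_pairwise_f1(annotations):
    """Mean F1 over all unordered pairs of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(pairwise_f1(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data: (start offset, end offset, PHI label) per item.
toy = {
    "annotator_1": {(0, 5, "NAME"), (10, 20, "DATE"), (30, 36, "LOCATION")},
    "annotator_2": {(0, 5, "NAME"), (10, 20, "DATE")},
    "annotator_3": {(0, 5, "NAME"), (10, 18, "DATE"), (30, 36, "LOCATION")},
}

score = averaged_pairwise_f1(toy)
print(f"averaged pair-wise F1: {score:.3f}")
```

For a token-level variant, the same functions could be applied to sets of (token index, label) tuples instead of span tuples, so that partially overlapping spans earn partial credit.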

Keywords: Confidentiality; Data Anonymization; Natural Language Processing

Year: 2019 PMID: 31437914 DOI: 10.3233/SHTI190212

Source DB: PubMed Journal: Stud Health Technol Inform ISSN: 0926-9630

Cited

1 in total

1. Knowledge-based best of breed approach for automated detection of clinical events based on German free text digital hospital discharge letters.

Authors: Maximilian König; André Sander; Ilja Demuth; Daniel Diekmann; Elisabeth Steinhagen-Thiessen
Journal: PLoS One Date: 2019-11-27 Impact factor: 3.240