
Annotating German Clinical Documents for De-Identification.

Tobias Kolditz, Christina Lohr, Johannes Hellrich, Luise Modersohn, Boris Betz, Michael Kiehntopf, Udo Hahn.

Abstract

We devised annotation guidelines for the de-identification of German clinical documents and assembled a corpus of 1,106 discharge summaries and transfer letters with 44K annotated protected health information (PHI) items. After three iteration rounds, our annotation team reached an inter-annotator agreement of 0.96 on the instance level and 0.97 on the token level of annotation (averaged pairwise F1 score). To establish a baseline for automatic de-identification on our corpus, we trained a recurrent neural network (RNN) and achieved F1 scores greater than 0.9 on most major PHI categories.
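
The agreement measure named in the abstract, an averaged pairwise F1 score, can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes each annotator's output is a set of `(start, end, label)` tuples, which is a hypothetical representation, and the annotator names, offsets, and labels are invented for the example. F1 is symmetric in its two arguments, so each unordered pair of annotators is scored once and the scores are averaged.

```python
from itertools import combinations


def f1(a, b):
    """F1 between two annotation sets; symmetric in a and b."""
    if not a and not b:
        return 1.0  # two empty annotation sets agree trivially
    matches = len(a & b)  # items both annotators marked identically
    return 2 * matches / (len(a) + len(b))


def averaged_pairwise_f1(annotations):
    """Mean F1 over all unordered pairs of annotators.

    `annotations` maps an annotator name to a set of hashable items,
    e.g. (start, end, label) tuples for instance-level agreement.
    """
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("need at least two annotators")
    return sum(f1(a, b) for a, b in pairs) / len(pairs)


# Hypothetical example data: three annotators, one document.
example = {
    "annotator_1": {(0, 5, "NAME"), (10, 20, "DATE"), (30, 35, "LOCATION")},
    "annotator_2": {(0, 5, "NAME"), (10, 20, "DATE")},
    "annotator_3": {(0, 5, "NAME"), (10, 20, "DATE"), (30, 35, "LOCATION")},
}

score = averaged_pairwise_f1(example)
print(round(score, 4))
```

The same function would give a token-level figure if each set held `(token_index, label)` pairs instead of span tuples. Whether the authors matched spans exactly or computed the score per document or over the whole corpus is not stated in the abstract.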

Keywords:  Confidentiality; Data Anonymization; Natural Language Processing


Year:  2019        PMID: 31437914     DOI: 10.3233/SHTI190212

Source DB:  PubMed          Journal:  Stud Health Technol Inform        ISSN: 0926-9630



