Developing a standard for de-identifying electronic patient records written in Swedish: precision, recall and F-measure in a manual and computerized annotation trial.

Sumithra Velupillai, Hercules Dalianis, Martin Hassel, Gunnar H Nilsson.

Abstract

BACKGROUND: Electronic patient records (EPRs) contain a large amount of information written in free text. This information is considered very valuable for research but is also very sensitive since the free text parts may contain information that could reveal the identity of a patient. Therefore, methods for de-identifying EPRs are needed. The work presented here aims to perform a manual and automatic Protected Health Information (PHI)-annotation trial for EPRs written in Swedish.
METHODS: This study consists of two main parts: the initial creation of a manually PHI-annotated gold standard, and the porting and evaluation of an existing de-identification software written for American English to Swedish in a preliminary automatic de-identification trial. Results are measured with precision, recall and F-measure.
RESULTS: This study reports fairly high Inter-Annotator Agreement (IAA) results on the manually created gold standard, especially for specific tags such as names. The average IAA over all tags was 0.65 F-measure (0.84 F-measure highest pairwise agreement). For name tags the average IAA was 0.80 F-measure (0.91 F-measure highest pairwise agreement). Porting a de-identification software written for American English to Swedish directly was unfortunately non-trivial, yielding poor results.
CONCLUSION: Developing gold standard sets as well as automatic systems for de-identification tasks in Swedish is feasible. However, discussions and definitions of identifiable information are needed, as well as further development of both the tag sets and the annotation guidelines, in order to obtain a reliable gold standard. A completely new de-identification software needs to be developed.
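
The evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes annotations are represented as (start, end, tag) tuples and that two annotations agree only on an exact match; the abstract does not state the matching criterion used in the study, so this is an assumption, as are the function names and the sample data, which are hypothetical.

```python
from itertools import combinations


def precision_recall_f(gold, predicted):
    """Exact-match precision, recall and F-measure between two annotation sets.

    Assumption: each annotation is a hashable item such as (start, end, tag),
    and only identical items count as agreement.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def average_pairwise_f(annotations):
    """Mean F-measure over all pairs of annotators (one way to express IAA)."""
    pairs = list(combinations(sorted(annotations), 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    scores = [precision_recall_f(annotations[a], annotations[b])[2]
              for a, b in pairs]
    return sum(scores) / len(scores)


# Hypothetical annotations from three annotators on the same record.
annotations = {
    "A": {(0, 4, "Name"), (10, 18, "Date"), (25, 30, "Location")},
    "B": {(0, 4, "Name"), (10, 18, "Date"), (40, 45, "Name")},
    "C": {(0, 4, "Name"), (10, 18, "Date"), (25, 30, "Location")},
}

precision, recall, f_measure = precision_recall_f(annotations["A"], annotations["B"])
average_iaa = average_pairwise_f(annotations)
```

Because swapping the two sets only exchanges precision and recall, the F-measure is symmetric, so either annotator in a pair can be treated as the reference when computing agreement.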

Year:  2009        PMID: 19482543     DOI: 10.1016/j.ijmedinf.2009.04.005

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.046


  8 in total

1.  Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies. (Review)

Authors:  Clete A Kushida; Deborah A Nichols; Rik Jadrnicek; Ric Miller; James K Walsh; Kara Griffin
Journal:  Med Care       Date:  2012-07       Impact factor: 2.983

2.  Recognition and pseudonymisation of medical records for secondary use.

Authors:  Johannes Heurix; Stefan Fenz; Antonio Rella; Thomas Neubauer
Journal:  Med Biol Eng Comput       Date:  2015-06-04       Impact factor: 2.602

3.  BoB, a best-of-breed automated text de-identification system for VHA clinical documents.

Authors:  Oscar Ferrández; Brett R South; Shuying Shen; F Jeffrey Friedlin; Matthew H Samore; Stéphane M Meystre
Journal:  J Am Med Inform Assoc       Date:  2012-09-04       Impact factor: 4.497

4.  Is the Juice Worth the Squeeze? Costs and Benefits of Multiple Human Annotators for Clinical Text De-identification.

Authors:  David S Carrell; David J Cronkite; Bradley A Malin; John S Aberdeen; Lynette Hirschman
Journal:  Methods Inf Med       Date:  2016-07-13       Impact factor: 2.176

5.  A De-identification method for bilingual clinical texts of various note types.

Authors:  Soo-Yong Shin; Yu Rang Park; Yongdon Shin; Hyo Joung Choi; Jihyun Park; Yongman Lyu; Moo-Song Lee; Chang-Min Choi; Woo-Sung Kim; Jae Ho Lee
Journal:  J Korean Med Sci       Date:  2014-12-23       Impact factor: 2.153

6.  De-identifying Swedish clinical text - refinement of a gold standard and experiments with Conditional random fields.

Authors:  Hercules Dalianis; Sumithra Velupillai
Journal:  J Biomed Semantics       Date:  2010-04-12

7.  De-identification of primary care electronic medical records free-text data in Ontario, Canada.

Authors:  Karen Tu; Julie Klein-Geltink; Tezeta F Mitiku; Chiriac Mihai; Joel Martin
Journal:  BMC Med Inform Decis Mak       Date:  2010-06-18       Impact factor: 2.796

8.  Persian sentiment analysis of an online store independent of pre-processing using convolutional neural network with fastText embeddings.

Authors:  Sajjad Shumaly; Mohsen Yazdinejad; Yanhui Guo
Journal:  PeerJ Comput Sci       Date:  2021-03-05
