Proposal and evaluation of FASDIM, a Fast And Simple De-Identification Method for unstructured free-text clinical records.

Emmanuel Chazard, Capucine Mouret, Grégoire Ficheur, Aurélien Schaffar, Jean-Baptiste Beuscart, Régis Beuscart.

Abstract

PURPOSE: Medical free-text records provide rich information about patients, but they often need to be de-identified by removing Protected Health Information (PHI) whenever identifying the patient is not required. Pattern-matching techniques require predefined dictionaries, and machine-learning techniques require an extensive training set. Methods exist for French, but they either give weak results or are not freely available. The objective is to define and evaluate FASDIM, a Fast And Simple De-Identification Method for French medical free-text records.
METHODS: FASDIM consists of removing all words that are not present in an authorized word list, and removing all numbers except those that match a list of protection patterns. Both lists are extended over the successive iterations of the method. For the evaluation, the workload is estimated while the records are being de-identified. The effectiveness of the de-identification is assessed by independent medical experts on 508 discharge letters that are randomly selected and de-identified by FASDIM. Finally, the letters are encoded before and after de-identification according to 3 terminologies (ATC, ICD10, CCAM) and the codes are compared.
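The filtering rule described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tokenizer, the `[X]` mask, the function names `deidentify` and `candidate_words`, the tiny word list and the single dose pattern are all assumptions made for illustration. Words are kept only if they appear in the authorized list, and numbers are kept only if they fall inside a match of a protection pattern.

```python
import re
from collections import Counter

# Assumed tokenizer: numbers (with ., , or / separators), alphabetic words, everything else.
TOKEN_RE = re.compile(
    r"(?P<num>\d+(?:[.,/]\d+)*)|(?P<word>[^\W\d_]+)|(?P<other>\s+|.)", re.DOTALL
)

def deidentify(text, authorized_words, protection_patterns, mask="[X]"):
    """Mask every word outside the authorized list and every unprotected number."""
    # Character spans covered by at least one protection pattern.
    protected = [m.span() for pat in protection_patterns for m in pat.finditer(text)]
    out = []
    for m in TOKEN_RE.finditer(text):
        token = m.group()
        if m.lastgroup == "num":
            keep = any(s <= m.start() and m.end() <= e for s, e in protected)
        elif m.lastgroup == "word":
            keep = token.lower() in authorized_words
        else:
            keep = True  # whitespace and punctuation are left untouched
        out.append(token if keep else mask)
    return "".join(out)

def candidate_words(texts, authorized_words):
    """Count the words that would be masked, to help extend the authorized list."""
    counts = Counter()
    for text in texts:
        for m in TOKEN_RE.finditer(text):
            if m.lastgroup == "word" and m.group().lower() not in authorized_words:
                counts[m.group().lower()] += 1
    return counts

# Hypothetical lists; the real ones are built iteratively over thousands of letters.
AUTHORIZED = {"mme", "ans", "traitée", "par", "amoxicilline", "mg", "depuis", "le"}
PATTERNS = [re.compile(r"\d+(?:[.,]\d+)?\s*mg\b", re.IGNORECASE)]  # keeps doses

letter = "Mme Dupont, 78 ans, traitée par amoxicilline 500 mg depuis le 12/03/2012."
print(deidentify(letter, AUTHORIZED, PATTERNS))
# Mme [X], [X] ans, traitée par amoxicilline 500 mg depuis le [X].
```

In this sketch, `candidate_words` stands in for the iterative step: a reviewer would inspect the most frequent masked words and add the non-identifying ones to the authorized list before the next pass.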
RESULTS: The list of authorized words is built progressively: 12 h for the first 7000 letters, then 16 additional hours for 20,000 additional letters. The Recall (proportion of PHI that is removed) is 98.1%, the Precision (proportion of PHI among the removed tokens) is 79.6% and the F-measure (harmonic mean of the two) is 87.9%. On average, 30.6 terminology codes are encoded per letter, and 99.02% of those codes are preserved despite the de-identification.
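As a quick consistency check on the reported figures, the F-measure can be recomputed from the stated Precision and Recall. The function name `f_measure` is an illustrative choice; the formula is the standard harmonic mean.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the balanced F-measure)."""
    return 2 * precision * recall / (precision + recall)

# Values reported in RESULTS: Precision 79.6%, Recall 98.1%.
reported = f_measure(precision=0.796, recall=0.981)
print(f"{reported:.1%}")  # 87.9%
```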
CONCLUSION: FASDIM achieves good results in French and is freely available. It is easy to implement and does not require any predefined dictionary.
Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

Keywords:  Anonymization; Confidentiality; De-identification; Free text; Natural language processing

Year: 2013    PMID: 24370391    DOI: 10.1016/j.ijmedinf.2013.11.005

Source DB: PubMed    Journal: Int J Med Inform    ISSN: 1386-5056    Impact factor: 4.046
