State-of-the-art anonymization of medical records using an iterative machine learning framework.

György Szarvas, Richárd Farkas, Róbert Busa-Fekete.

Abstract

OBJECTIVE: The anonymization of medical records is of great importance in the human life sciences because a de-identified text can be made publicly available to non-hospital researchers as well, facilitating research on human diseases. Here the authors have developed a de-identification model that can successfully remove personal health information (PHI) from discharge records so that they conform to the guidelines of the Health Insurance Portability and Accountability Act (HIPAA).
DESIGN: We introduce here a novel, machine learning-based iterative Named Entity Recognition approach intended for use on semi-structured documents like discharge records. Our method identifies PHI in several steps. First, it labels all entities whose tags can be inferred from the structure of the text; it then uses this information to find further PHI phrases in the free-text parts of the document.
MEASUREMENTS: Following the standard evaluation method of the first Workshop on Challenges in Natural Language Processing for Clinical Data, we used token-level Precision, Recall and F(beta=1) measures for evaluation.
RESULTS: Our system achieved outstanding accuracy on the standard evaluation dataset of the de-identification challenge, with an F measure of 99.7534% for the best submitted model.
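
The token-level metrics named under MEASUREMENTS can be illustrated with a minimal sketch. It assumes one label per token and an "O" label for non-PHI tokens; the function name `token_prf` and the label names are hypothetical, and the challenge's official scorer may differ in details such as how label confusions between PHI types are counted.

```python
def token_prf(gold, pred, outside="O"):
    """Token-level precision, recall and F(beta=1) over PHI labels.

    gold, pred: equal-length lists with one label per token.
    A predicted PHI token counts as correct only if its label
    matches the gold label exactly (an assumption of this sketch).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same number of tokens")

    true_pos = sum(1 for g, p in zip(gold, pred) if p != outside and g == p)
    pred_pos = sum(1 for p in pred if p != outside)  # tokens the system tagged as PHI
    gold_pos = sum(1 for g in gold if g != outside)  # tokens that really are PHI

    precision = true_pos / pred_pos if pred_pos else 0.0
    recall = true_pos / gold_pos if gold_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example with hypothetical labels: one missed PATIENT token,
# one spurious DATE token.
gold = ["O", "PATIENT", "PATIENT", "O", "DATE", "O"]
pred = ["O", "PATIENT", "O", "O", "DATE", "DATE"]
precision, recall, f1 = token_prf(gold, pred)
```

With beta = 1 the F measure is the harmonic mean of precision and recall, so it rewards systems that neither miss PHI tokens nor over-tag ordinary text.
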
CONCLUSION: Our system is competitive with the current state-of-the-art solutions, and several of the techniques described here can be beneficial in other tasks that need to handle structured documents such as clinical records.
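
The two-step idea described under DESIGN (label what the document structure reveals, then reuse those labels in the free text) can be sketched as follows. This is only a rule-based stand-in under stated assumptions: the system in the abstract is a machine learning NER model, whereas this sketch uses plain string matching, and the `FIELD: value` header layout, the field names, and both function names are hypothetical.

```python
import re

# Assumed header layout: lines such as "PATIENT: John Smith".
# The field names are illustrative, not taken from the paper.
FIELD_PATTERN = re.compile(
    r"^(PATIENT|DOCTOR|HOSPITAL)[ \t]*:[ \t]*(.+)$", re.MULTILINE
)


def collect_structured_phi(record):
    """Pass 1: read PHI phrases whose tag follows from the header field."""
    known = {}
    for match in FIELD_PATTERN.finditer(record):
        known[match.group(2).strip()] = match.group(1)
    return known


def tag_free_text(record, known):
    """Pass 2: replace every mention of a known phrase with its tag."""
    tagged = record
    # Longest phrases first, so a short phrase never splits a longer one.
    for phrase in sorted(known, key=len, reverse=True):
        tagged = re.sub(re.escape(phrase), "[" + known[phrase] + "]", tagged)
    return tagged


record = (
    "PATIENT: John Smith\n"
    "DOCTOR: Ann Lee\n"
    "\n"
    "John Smith was seen by Ann Lee on admission."
)
known = collect_structured_phi(record)
print(tag_free_text(record, known))
```

In a learned system the phrases found in the first pass would more plausibly become features or dictionary entries for a later tagging pass rather than literal replacements; the sketch only shows how structural cues can seed the search through free text.
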

Year: 2007 PMID: 17823086 PMCID: PMC1975791 DOI: 10.1197/j.jamia.M2441

Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497

5 in total

1. Medical document anonymization with a semantic lexicon.

Authors: P Ruch; R H Baud; A M Rassinoux; P Bouillon; G Robert
Journal: Proc AMIA Symp Date: 2000

2. Identification of patient name references within medical documents using semantic selectional restrictions.

Authors: Ricky K Taira; Alex A T Bui; Hooshang Kangarloo
Journal: Proc AMIA Symp Date: 2002

3. A successful technique for removing names in pathology reports using an augmented search and replace method.

Authors: Sean M Thomas; Burke Mamlin; Gunther Schadow; Clement McDonald
Journal: Proc AMIA Symp Date: 2002

4. Replacing personally-identifying information in medical records, the Scrub system.

Authors: L Sweeney
Journal: Proc AMIA Annu Fall Symp Date: 1996

5. Evaluation of a deidentification (De-Id) software engine to share pathology reports and clinical documents for research.

Authors: Dilip Gupta; Melissa Saul; John Gilbertson
Journal: Am J Clin Pathol Date: 2004-02 Impact factor: 2.493

46 in total

1. Strategies for de-identification and anonymization of electronic health record data for use in multicenter research studies. (Review)

Authors: Clete A Kushida; Deborah A Nichols; Rik Jadrnicek; Ric Miller; James K Walsh; Kara Griffin
Journal: Med Care Date: 2012-07 Impact factor: 2.983

2. Hiding in plain sight: use of realistic surrogates to reduce exposure of protected health information in clinical text.

Authors: David Carrell; Bradley Malin; John Aberdeen; Samuel Bayer; Cheryl Clark; Ben Wellner; Lynette Hirschman
Journal: J Am Med Inform Assoc Date: 2012-07-06 Impact factor: 4.497

3. Embedding a hiding function in a portable electronic health record for privacy preservation.

Authors: Lu-Chou Huang; Huei-Chung Chu; Chung-Yueh Lien; Chia-Hung Hsiao; Tsair Kao
Journal: J Med Syst Date: 2010-06 Impact factor: 4.460

4. CRFs based de-identification of medical records.

Authors: Bin He; Yi Guan; Jianyi Cheng; Keting Cen; Wenlan Hua
Journal: J Biomed Inform Date: 2015-08-24 Impact factor: 6.317

5. Evaluating the state-of-the-art in automatic de-identification.

Authors: Ozlem Uzuner; Yuan Luo; Peter Szolovits
Journal: J Am Med Inform Assoc Date: 2007-06-28 Impact factor: 4.497

6. Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation.

Authors: Hee-Jin Lee; Yaoyun Zhang; Kirk Roberts; Hua Xu
Journal: AMIA Annu Symp Proc Date: 2018-04-16

7. Advancing the framework: use of health data--a report of a working conference of the American Medical Informatics Association.

Authors: Meryl Bloomrosen; Don Detmer
Journal: J Am Med Inform Assoc Date: 2008-08-28 Impact factor: 4.497

8. Understanding identifiability in secondary health data.

Authors: Niko Yiannakoulias
Journal: Can J Public Health Date: 2011 Jul-Aug

9. Repurposing the clinical record: can an existing natural language processing system de-identify clinical notes?

Authors: Frances P Morrison; Li Li; Albert M Lai; George Hripcsak
Journal: J Am Med Inform Assoc Date: 2008-10-24 Impact factor: 4.497

10. Resilience of clinical text de-identified with "hiding in plain sight" to hostile reidentification attacks by human readers.

Authors: David S Carrell; Bradley A Malin; David J Cronkite; John S Aberdeen; Cheryl Clark; Muqun Rachel Li; Dikshya Bastakoty; Steve Nyemba; Lynette Hirschman
Journal: J Am Med Inform Assoc Date: 2020-07-01 Impact factor: 4.497