Automatic detection of protected health information from clinic narratives.

Hui Yang, Jonathan M Garibaldi.

Abstract

This paper presents a natural language processing (NLP) system developed for the 2014 i2b2 de-identification challenge. The challenge task is to identify and classify seven main Protected Health Information (PHI) categories and 25 associated sub-categories. We propose a hybrid model that combines machine learning techniques with keyword-based and rule-based approaches to deal with the complexity inherent in PHI categories. Our approaches exploit a rich set of linguistic features, both syntactic and word surface-oriented, which are further enriched by task-specific features and regular expression template patterns to characterize the semantics of the various PHI categories. On the challenge test data, our system achieved an overall micro-averaged F-measure of 93.6%, the top-ranked result in this de-identification challenge.
Copyright © 2015 Elsevier Inc. All rights reserved.
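
As a rough illustration of two ideas named in the abstract, the sketch below tags text with regular expression patterns and scores the result with a micro-averaged F-measure. It is a minimal sketch under stated assumptions, not the authors' system: the two patterns, the category names `DATE` and `PHONE`, and the helper names `tag_phi` and `micro_f1` are hypothetical, and the score uses exact span matching, which may differ from the challenge's official scoring.

```python
import re

# Hypothetical patterns for illustration only; not the authors' templates.
PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def tag_phi(text):
    """Return the set of (start, end, category) spans matched by the patterns."""
    spans = set()
    for category, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.add((match.start(), match.end(), category))
    return spans


def micro_f1(gold_by_doc, pred_by_doc):
    """Micro-averaged F-measure over exact span matches.

    Counts of true positives, false positives and false negatives are
    pooled over all documents and categories before precision and
    recall are computed.
    """
    tp = fp = fn = 0
    for doc_id in set(gold_by_doc) | set(pred_by_doc):
        gold = gold_by_doc.get(doc_id, set())
        pred = pred_by_doc.get(doc_id, set())
        tp += len(gold & pred)
        fp += len(pred - gold)
        fn += len(gold - pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    note = "Seen on 03/14/2014, call 555-123-4567."
    gold = {"d1": {(8, 18, "DATE"), (25, 37, "PHONE")}, "d2": {(0, 4, "NAME")}}
    pred = {"d1": tag_phi(note), "d2": set()}
    print(f"micro-averaged F-measure: {micro_f1(gold, pred):.3f}")
```

Because the counts are pooled before averaging, frequent categories weigh more heavily in the micro-averaged score than rare ones. In a hybrid setup of the kind the abstract describes, pattern matches like these would be combined with the output of a trained sequence model rather than used alone.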

Keywords:  Clinical text mining; De-identification; Hybrid model; Natural language processing; Protected Health Information (PHI)

Year:  2015        PMID: 26231070      PMCID: PMC4989090          DOI: 10.1016/j.jbi.2015.06.015

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  17 in total

1.  Identification of patient name references within medical documents using semantic selectional restrictions.

Authors:  Ricky K Taira; Alex A T Bui; Hooshang Kangarloo
Journal:  Proc AMIA Symp       Date:  2002

2.  A de-identifier for medical discharge summaries.

Authors:  Ozlem Uzuner; Tawanda C Sibanda; Yuan Luo; Peter Szolovits
Journal:  Artif Intell Med       Date:  2007-11-28       Impact factor: 5.326

3.  Rapidly retargetable approaches to de-identification in medical records.

Authors:  Ben Wellner; Matt Huyck; Scott Mardis; John Aberdeen; Alex Morgan; Leonid Peshkin; Alex Yeh; Janet Hitzeman; Lynette Hirschman
Journal:  J Am Med Inform Assoc       Date:  2007-06-28       Impact factor: 4.497

4.  A system for de-identifying medical message board text.

Authors:  Adrian Benton; Shawndra Hill; Lyle Ungar; Annie Chung; Charles Leonard; Cristin Freeman; John H Holmes
Journal:  BMC Bioinformatics       Date:  2011-06-09       Impact factor: 3.169

5.  BoB, a best-of-breed automated text de-identification system for VHA clinical documents.

Authors:  Oscar Ferrández; Brett R South; Shuying Shen; F Jeffrey Friedlin; Matthew H Samore; Stéphane M Meystre
Journal:  J Am Med Inform Assoc       Date:  2012-09-04       Impact factor: 4.497

6.  The MITRE Identification Scrubber Toolkit: design, training, and assessment.

Authors:  John Aberdeen; Samuel Bayer; Reyyan Yeniterzi; Ben Wellner; Cheryl Clark; David Hanauer; Bradley Malin; Lynette Hirschman
Journal:  Int J Med Inform       Date:  2010-10-14       Impact factor: 4.046

7.  Repurposing the clinical record: can an existing natural language processing system de-identify clinical notes?

Authors:  Frances P Morrison; Li Li; Albert M Lai; George Hripcsak
Journal:  J Am Med Inform Assoc       Date:  2008-10-24       Impact factor: 4.497

8.  Large-scale evaluation of automated clinical note de-identification and its impact on information extraction.

Authors:  Louise Deleger; Katalin Molnar; Guergana Savova; Fei Xia; Todd Lingren; Qi Li; Keith Marsolo; Anil Jegga; Megan Kaiser; Laura Stoutenborough; Imre Solti
Journal:  J Am Med Inform Assoc       Date:  2012-08-02       Impact factor: 4.497

9.  Development and evaluation of an open source software tool for deidentification of pathology reports.

Authors:  Bruce A Beckwith; Rajeshwarri Mahaadevan; Ulysses J Balis; Frank Kuo
Journal:  BMC Med Inform Decis Mak       Date:  2006-03-06       Impact factor: 2.796

10.  Improved de-identification of physician notes through integrative modeling of both public and private medical text.

Authors:  Andrew J McMurry; Britt Fitch; Guergana Savova; Isaac S Kohane; Ben Y Reis
Journal:  BMC Med Inform Decis Mak       Date:  2013-10-02       Impact factor: 2.796

  22 in total

1.  Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation.

Authors:  Hee-Jin Lee; Yaoyun Zhang; Kirk Roberts; Hua Xu
Journal:  AMIA Annu Symp Proc       Date:  2018-04-16

2.  Ensemble-based Methods to Improve De-identification of Electronic Health Record Narratives.

Authors:  Youngjun Kim; Paul Heider; Stéphane Meystre
Journal:  AMIA Annu Symp Proc       Date:  2018-12-05

3.  Comparative Study of Various Approaches for Ensemble-based De-identification of Electronic Health Record Narratives.

Authors:  Youngjun Kim; Paul M Heider; Stéphane M Meystre
Journal:  AMIA Annu Symp Proc       Date:  2021-01-25

4.  Healthcare Data Breaches: Implications for Digital Forensic Readiness.

Authors:  Maxim Chernyshev; Sherali Zeadally; Zubair Baig
Journal:  J Med Syst       Date:  2018-11-28       Impact factor: 4.460

5.  Automatic prediction of coronary artery disease from clinical narratives.

Authors:  Kevin Buchan; Michele Filannino; Özlem Uzuner
Journal:  J Biomed Inform       Date:  2017-06-27       Impact factor: 6.317

6.  Practical applications for natural language processing in clinical research: The 2014 i2b2/UTHealth shared tasks.

Authors:  Özlem Uzuner; Amber Stubbs
Journal:  J Biomed Inform       Date:  2015-10-24       Impact factor: 6.317

7.  Scalable Iterative Classification for Sanitizing Large-Scale Datasets.

Authors:  Bo Li; Yevgeniy Vorobeychik; Muqun Li; Bradley Malin
Journal:  IEEE Trans Knowl Data Eng       Date:  2016-11-11       Impact factor: 6.977

8.  The UAB Informatics Institute and 2016 CEGS N-GRID de-identification shared task challenge.

Authors:  Duy Duc An Bui; Mathew Wyatt; James J Cimino
Journal:  J Biomed Inform       Date:  2017-05-03       Impact factor: 6.317

9.  De-identification of medical records using conditional random fields and long short-term memory networks.

Authors:  Zhipeng Jiang; Chao Zhao; Bin He; Yi Guan; Jingchi Jiang
Journal:  J Biomed Inform       Date:  2017-10-13       Impact factor: 6.317

10.  Clinical concept extraction: A methodology review. (Review)

Authors:  Sunyang Fu; David Chen; Huan He; Sijia Liu; Sungrim Moon; Kevin J Peterson; Feichen Shen; Liwei Wang; Yanshan Wang; Andrew Wen; Yiqing Zhao; Sunghwan Sohn; Hongfang Liu
Journal:  J Biomed Inform       Date:  2020-08-06       Impact factor: 6.317
