Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes.

Azad Dehghan, Aleksandar Kovacevic, George Karystianis, John A Keane, Goran Nenadic.

Abstract

De-identification of clinical narratives is one of the main obstacles to making healthcare free text available for research. In this paper we describe our experience in expanding and tailoring two existing tools as part of the 2016 CEGS N-GRID Shared Tasks Track 1, which evaluated de-identification methods on a set of psychiatric evaluation notes for up to 25 different types of Protected Health Information (PHI). Both methods rely on machine learning, over either a large or a small feature space, supplemented by additional strategies, including two-pass tagging and multi-class models, both of which proved beneficial. The results show that integrating the proposed methods can identify PHI as defined by the Health Insurance Portability and Accountability Act (HIPAA) with overall F1-scores of ∼90% and above. However, some classes (Profession, Organization) again proved challenging, given the variability of the expressions used to refer to this information.
Copyright © 2017. Published by Elsevier Inc.
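
The abstract reports F1-scores without spelling out the metric. As a minimal sketch, assuming strict entity-level matching (a predicted PHI span counts only if its offsets and type both match a gold span), the score could be computed as below. The function name, the span representation, and the toy spans are illustrative assumptions; this is not the shared task's official scoring script, which may apply different matching rules.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start offset, end offset, PHI type).
Span = Tuple[int, int, str]


def f1_score(gold: Set[Span], predicted: Set[Span]) -> float:
    """Strict entity-level F1: harmonic mean of precision and recall."""
    true_positives = len(gold & predicted)  # exact offset and type matches
    if true_positives == 0:
        return 0.0  # also covers empty gold or empty predicted sets
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up spans): two of three predictions match the gold set.
gold = {(0, 10, "NAME"), (25, 35, "DATE"), (50, 58, "PROFESSION")}
predicted = {(0, 10, "NAME"), (25, 35, "DATE"), (60, 70, "ORGANIZATION")}
score = f1_score(gold, predicted)
print(f"F1 = {score:.3f}")
```

In this toy case precision and recall are both 2/3, so the F1-score is also 2/3. Under strict matching, a missed Profession mention and a spurious Organization mention each lower the score, which is consistent with the abstract's remark that highly variable classes are harder.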

Keywords:  Clinical text mining; De-identification; Electronic health record; Information extraction; Named entity recognition

Year:  2017        PMID: 28602908      PMCID: PMC5705401          DOI: 10.1016/j.jbi.2017.06.005

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


