Automated systems for the de-identification of longitudinal clinical narratives: Overview of 2014 i2b2/UTHealth shared task Track 1.

Amber Stubbs¹, Christopher Kotfila², Özlem Uzuner².

Abstract

The 2014 i2b2/UTHealth Natural Language Processing (NLP) shared task featured four tracks. The first of these was the de-identification track, which focused on identifying protected health information (PHI) in longitudinal clinical narratives. The longitudinal nature of clinical narratives calls particular attention to details that, while benign on their own in separate records, can lead to identification of patients when combined across longitudinal records. Accordingly, the 2014 de-identification track addressed a broader set of entities and PHI than is covered by the Health Insurance Portability and Accountability Act (HIPAA), which was the focus of the de-identification shared task organized in 2006. Ten teams tackled the 2014 de-identification task and submitted 22 system outputs for evaluation. Each team was evaluated on its best-performing system output. Three of the 10 systems achieved F1 scores over .90, and seven of the 10 scored over .75. The most successful systems combined conditional random fields and hand-written rules. Our findings indicate that automated systems can be very effective for this task, but that de-identification is not yet a solved problem.

Keywords: Machine learning; Medical records; Natural language processing; Shared task

Year: 2015 PMID: 26225918 PMCID: PMC4989908 DOI: 10.1016/j.jbi.2015.06.007

Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317
