Óscar Ferrández, Brett R South, Shuying Shen, F Jeff Friedlin, Matthew H Samore, Stéphane M Meystre.
Abstract
In this paper, we present an evaluation of the hybrid best-of-breed automated VHA (Veterans Health Administration) clinical text de-identification system, nicknamed BoB, developed within the VHA Consortium for Healthcare Informatics Research. We also evaluate two available machine learning-based text de-identification systems: MIST and HIDE. Two different clinical corpora were used for this evaluation: a manually annotated VHA corpus and the 2006 i2b2 de-identification challenge corpus. These experiments focus on the generalizability and portability of the classification models across different document sources. BoB demonstrated good recall (92.6%), satisfactorily prioritizing patient privacy, and also achieved competitive precision (83.6%) for preserving subsequent document interpretability. MIST and HIDE reached very competitive results, in most cases with high precision (92.6% and 93.6%), although recall was sometimes lower than desired for the most sensitive PHI categories.
Year: 2012 PMID: 23304289 PMCID: PMC3540471
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076