Mehmet Kayaalp, Allen C Browne, Fiona M Callaghan, Zeyno A Dodd, Guy Divita, Selcuk Ozturk, Clement J McDonald.
Abstract
OBJECTIVE: To understand the factors that influence success in scrubbing personal names from narrative text.
Keywords: Chart Research; De-Identification; Electronic Medical Records; PHI
Year: 2013 PMID: 24026308 PMCID: PMC3994850 DOI: 10.1136/amiajnl-2013-001689
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Portion of a patient report after VTT tagging. Red signifies a patient name, pink a numeric identifier, yellow a date, and green an age. This report includes only bogus PHI for demonstration purposes.
Scrubbing tools studied and information about their origins and availability
| Name of scrubber | Source institution | Availability | Version tested |
|---|---|---|---|
| NLM-NS | National Library of Medicine, Lister Hill National Center for Biomedical Communications (Bethesda, MD) | Contact first author, Mehmet Kayaalp (mkayaalp@mail.nih.gov). | V1 |
| MITdeid | MIT (Cambridge, MA) | | |
| MIST | Mitre Corporation (Bedford, MA) | | |
| LingPipe | Alias-i, Inc. (Brooklyn, NY) | | |
| ANNIE/GATE | University of Sheffield, | | |
Study 1: experimental results at the token level tallying scrubbing success when scrubbers used their native name list
| Scrubber (token instances) | Patient tokens: sensitivity (95% CI) | Patient tokens: FN | Provider tokens: sensitivity (95% CI) | Provider tokens: FN | All tokens (non-name tokens): specificity (95% CI) | F-measure | FP |
|---|---|---|---|---|---|---|---|
| Total N | 2388 | | 20 160 | | 1 126 241 (1 103 693) | | |
| NLM-NS | 0.999 (0.997 to 1) | 2 | 0.999 (0.999 to 1) | 11 | 0.987 (0.987 to 0.987) | 0.756 | 14 510 |
| MITdeid | 0.939 (0.929 to 0.948) | 145 | 0.850 (0.845 to 0.855) | 3027 | 0.998 (0.998 to 0.998) | 0.871 | 2580 |
| MIST | 0.843 (0.828 to 0.857) | 375 | 0.999 (0.998 to 0.999) | 19 | 0.993 (0.993 to 0.993) | 0.848 | 7573 |
| LingPipe | 0.829 (0.813 to 0.844) | 409 | 0.978 (0.976 to 0.980) | 443 | 0.954 (0.953 to 0.954) | 0.456 | 50 905 |
| ANNIE | 0.825 (0.809 to 0.840) | 417 | 0.741 (0.735 to 0.747) | 5221 | 0.989 (0.989 to 0.989) | 0.659 | 11 893 |
We report false negative results for patient names and provider names separately and false positive results for the patients and providers combined.
*For measures based on unique tokens, we had to use different FN+TP values per scrubber, rather than these totals, because when comparing uniques, a given token could be a FP in one context and TP in another, and the classification could change by scrubber.
FN, false negative; FP, false positive; TN, true negative; TP, true positive.
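The table's rates can be recomputed from its raw counts. The following is a minimal sketch, assuming the standard definitions of sensitivity, specificity, and F-measure, and assuming that the F-measure pools patient and provider name tokens; the source does not spell out these formulas, and the names `ScrubberCounts` and `token_metrics` are hypothetical. Under those assumptions the MITdeid row is reproduced to three decimals.

```python
from dataclasses import dataclass


@dataclass
class ScrubberCounts:
    """Token-level error counts for one scrubber (hypothetical container)."""
    patient_fn: int   # patient name tokens the scrubber missed
    provider_fn: int  # provider name tokens the scrubber missed
    fp: int           # non-name tokens scrubbed in error


# Totals taken from the "Total N" row of the table.
PATIENT_TOKENS = 2388
PROVIDER_TOKENS = 20160
NON_NAME_TOKENS = 1103693


def token_metrics(c: ScrubberCounts) -> dict:
    """Recompute the table's rates from counts (assumed standard definitions)."""
    patient_tp = PATIENT_TOKENS - c.patient_fn
    provider_tp = PROVIDER_TOKENS - c.provider_fn
    # Assumption: precision and recall pool patient and provider tokens.
    tp = patient_tp + provider_tp
    fn = c.patient_fn + c.provider_fn
    precision = tp / (tp + c.fp)
    recall = tp / (tp + fn)
    return {
        "patient_sensitivity": patient_tp / PATIENT_TOKENS,
        "provider_sensitivity": provider_tp / PROVIDER_TOKENS,
        "specificity": (NON_NAME_TOKENS - c.fp) / NON_NAME_TOKENS,
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# MITdeid row: FN = 145 (patient), FN = 3027 (provider), FP = 2580.
mitdeid = token_metrics(ScrubberCounts(patient_fn=145, provider_fn=3027, fp=2580))
for name, value in mitdeid.items():
    print(f"{name}: {value:.3f}")
```

Because non-name tokens outnumber name tokens by roughly 49 to 1, a specificity that looks high can still correspond to thousands of false positives, which is why the F-measure separates the scrubbers more sharply than specificity does.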
Study 2: results of comparison of personal name removal when the compared systems had access to the name list derived from the Clinical Center data and the names in the HL7 header segments directly linked to the report
| Scrubber (token instances) | Patient name tokens: sensitivity (95% CI) | Patient name tokens: FN | Provider name tokens: sensitivity (95% CI) | Provider name tokens: FN | All tokens (non-name tokens): specificity (95% CI) | F-measure | FP |
|---|---|---|---|---|---|---|---|
| Total N | 2388 | | 20 160 | | 1 126 241 (1 103 693) | | |
| NLM-NS† | 0.999 (0.997 to 1) | 2 | 1 (0.999 to 1) | 6 | 0.986 (0.986 to 0.986) | 0.748 | 15 214 |
| MITdeid⋄ | 0.995 (0.992 to 0.998) | 11 | 0.999 (0.999 to 0.999) | 18 | 0.983 (0.983 to 0.983) | 0.705 | 18 835 |
| MIST‡ | 0.965 (0.957 to 0.972) | 83 | 0.999 (0.999 to 1) | 14 | 0.997 (0.997 to 0.997) | 0.938 | 2876 |
| LingPipe‡ | 0.989 (0.983 to 0.992) | 27 | 0.999 (0.998 to 0.999) | 30 | 0.935 (0.935 to 0.936) | 0.386 | 71 619 |
| ANNIE‡ | 0.944 (0.934 to 0.953) | 133 | 0.938 (0.935 to 0.941) | 1247 | 0.983 (0.983 to 0.983) | 0.676 | 18 939 |
False negative results are reported separately for patients and providers; false positive results are reported for patients and providers combined.
*For measures based on unique tokens, we had to use different FN+TP values per scrubber, rather than these totals, because when comparing uniques, a given token could be a FP in one context and TP in another, and the classification could change by scrubber.
†Access by report to patient and provider names from the HL7 header.
⋄Access by report to patient names from the HL7 header and all providers.
‡Access to all patient names in study and to provider names.
FN, false negative; FP, false positive; TN, true negative; TP, true positive.
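The source does not say how the 95% confidence intervals were computed. As an illustration only, the sketch below uses the Wilson score interval for a binomial proportion, which is an assumption on our part; the function name `wilson_interval` is hypothetical. For the MIST patient-name row (83 false negatives out of 2388 tokens) it gives bounds that round to the reported 0.957 to 0.972.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, n: int, confidence: float = 0.95):
    """Wilson score interval for a binomial proportion (assumed method)."""
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# MIST patient name tokens: 2388 total, 83 missed, so 2305 correctly scrubbed.
low, high = wilson_interval(2388 - 83, 2388)
print(f"sensitivity: {(2388 - 83) / 2388:.3f} ({low:.3f} to {high:.3f})")
```

Unlike the simple normal approximation, the Wilson interval stays within 0 and 1 for proportions near 1, which matters for rows with only a handful of false negatives.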
Study 3: token level performance of systems after adding NLM mega name list plus Clinical Center name list to the native name list of MITdeid, MIST, and ANNIE
| Scrubber (token instances) | Patient tokens: sensitivity (95% CI) | Patient tokens: FN | Provider tokens: sensitivity (95% CI) | Provider tokens: FN | All tokens (non-name tokens): specificity (95% CI) | F-measure | FP |
|---|---|---|---|---|---|---|---|
| Total N | 2388 | | 20 160 | | 1 126 241 (1 103 693) | | |
| NLM-NS† | 0.999 (0.997 to 1.000) | 2 | 1.000 (0.999 to 1.000) | 6 | 0.986 (0.986 to 0.986) | 0.748 | 15 214 |
| MITdeid⋄ | 0.997 (0.994 to 0.999) | 7 | 1.000 (1.000 to 1.000) | 3 | 0.318 (0.317 to 0.319) | 0.057 | 752 437 |
| MIST‡ | 0.871 (0.857 to 0.884) | 307 | 0.999 (0.999 to 0.999) | 17 | 0.994 (0.994 to 0.994) | 0.870 | 6338 |
| ANNIE‡ | 0.969 (0.962 to 0.976) | 73 | 0.942 (0.939 to 0.946) | 1162 | 0.498 (0.497 to 0.499) | 0.071 | 553 854 |
*For measures based on unique tokens, we had to use different FN+TP values per scrubber, rather than these totals, because when comparing uniques, a given token could be a FP in one context and TP in another, and the classification could change by scrubber.
†Access by report to patient and provider names from the HL7 header.
⋄Access by report to patient names from the HL7 header and all providers.
‡Access to all patient names in study and to provider names.
FN, false negative; FP, false positive; TN, true negative; TP, true positive.