Identification and classification of DICOM files with burned-in text content.

Petr Vcelak, Martin Kryl, Michal Kratochvil, Jana Kleckova.

Abstract

BACKGROUND: Protected health information burned into pixel data is, for various reasons, not indicated in DICOM files. This complicates the secondary use of such data. In recent years, there have been several attempts to anonymize or de-identify DICOM files, but existing approaches have different constraints and no completely reliable solution exists. For large datasets in particular, it is necessary to quickly analyse and identify files that potentially violate privacy.
METHODS: Classification is based on an adaptive-iterative algorithm designed to assign each image to one of three classes. The algorithm applies several image transformations, optical character recognition, and filters, and then makes a local decision. A local decision that is confirmed becomes the final one. The classifier was trained on a dataset of 15,334 images of various modalities.
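
The loop described in METHODS can be illustrated with a minimal sketch. This is not the authors' implementation: the class labels, the token filter, the PHI term set, and the rule that a decision counts as confirmed when two consecutive iterations agree are all assumptions made for illustration, and the OCR step is a stub passed in by the caller rather than a real OCR engine.

```python
from typing import Callable, Iterable, List, Optional, Set

Image = List[List[int]]  # grayscale rows; a stand-in for real pixel data

# Hypothetical labels: the abstract does not name the three classes.
NO_TEXT, TEXT, PHI_TEXT = "no_text", "text", "phi_text"


def local_decision(tokens: Iterable[str], phi_terms: Set[str]) -> str:
    """Filter OCR tokens, then choose a class for this iteration."""
    # Assumed filter: drop very short tokens, which are often OCR noise.
    kept = [t.lower() for t in tokens if len(t) >= 3]
    if not kept:
        return NO_TEXT
    if any(t in phi_terms for t in kept):
        return PHI_TEXT
    return TEXT


def classify(
    image: Image,
    transforms: List[Callable[[Image], Image]],
    ocr: Callable[[Image], List[str]],
    phi_terms: Set[str],
) -> str:
    """Iterate over transformations until a local decision is confirmed."""
    previous: Optional[str] = None
    for transform in transforms:
        decision = local_decision(ocr(transform(image)), phi_terms)
        # Assumed confirmation rule: two consecutive iterations agree.
        if decision == previous:
            return decision
        previous = decision
    # Assumed fallback: no confirmation, so return the last local decision.
    return previous if previous is not None else NO_TEXT


# Stub transformations and a fake OCR function, for demonstration only.
def identity(img: Image) -> Image:
    return img


def invert(img: Image) -> Image:
    return [[255 - p for p in row] for row in img]


def fake_ocr(img: Image) -> List[str]:
    # Pretends that text is readable only on a bright background.
    return ["Name", "DOE"] if img[0][0] > 127 else []


result = classify([[200]], [identity, invert, identity, identity],
                  fake_ocr, {"name"})
print(result)
```

In a real pipeline the stub would be replaced by an actual OCR call, and the sequence of transformations would presumably be chosen adaptively from earlier results, which this fixed list does not attempt to model.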
RESULTS: The false positive rate was below 4.00% in all cases, and was 1.81% for the mission-critical problem of detecting protected health information. The classifier's weighted average recall was 94.85%, the weighted average inverse recall (specificity) was 97.42%, and Cohen's Kappa coefficient was 0.920.
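
For readers unfamiliar with the metrics named in RESULTS, the sketch below computes them from a multi-class confusion matrix using their standard definitions. The matrix is invented for illustration and is not the study's data; the function name `summarize` and the row/column convention (rows are true classes, columns are predicted classes) are assumptions, and every class is assumed to occur at least once.

```python
from typing import Dict, List


def summarize(cm: List[List[int]]) -> Dict[str, object]:
    """Per-class FPR plus weighted recall, weighted inverse recall, kappa."""
    k = len(cm)
    n = sum(sum(row) for row in cm)
    support = [sum(cm[i]) for i in range(k)]  # true count per class
    predicted = [sum(cm[r][j] for r in range(k)) for j in range(k)]

    recall, inverse_recall, fpr = [], [], []
    for i in range(k):
        tp = cm[i][i]
        fn = support[i] - tp
        fp = predicted[i] - tp
        tn = n - tp - fn - fp
        recall.append(tp / (tp + fn))
        inverse_recall.append(tn / (tn + fp))  # also called specificity
        fpr.append(fp / (fp + tn))

    # Weighted averages use each class's support as its weight.
    weighted_recall = sum(s * r for s, r in zip(support, recall)) / n
    weighted_inverse = sum(s * r for s, r in zip(support, inverse_recall)) / n

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = sum(cm[i][i] for i in range(k)) / n
    p_expected = sum(support[i] * predicted[i] for i in range(k)) / (n * n)
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "fpr": fpr,
        "weighted_recall": weighted_recall,
        "weighted_inverse_recall": weighted_inverse,
        "kappa": kappa,
    }


# Made-up 3-class confusion matrix, for illustration only.
example = [
    [50, 0, 0],
    [0, 30, 10],
    [0, 0, 10],
]
metrics = summarize(example)
print(metrics)
```

Note that support-weighted recall equals overall accuracy, which is why a separate inverse recall figure and Cohen's Kappa add information beyond it.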
CONCLUSION: The proposed novel approach for classification of burned-in text is highly configurable and able to analyse images from different modalities with a noisy background. The solution was validated and is intended to identify DICOM files that need to have restricted access or be thoroughly de-identified due to privacy issues. Unlike with existing tools, the recognised text, including its coordinates, can be further used for de-identification.
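
As an illustration of how recognised text coordinates might feed a later de-identification step, the sketch below blanks rectangular regions in a pixel array. The box format `(x, y, width, height)`, the function name `mask_regions`, and the fill value are assumptions; the abstract does not specify how coordinates are reported, and a real workflow would operate on decoded DICOM pixel data rather than nested lists.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed format: (x, y, width, height)


def mask_regions(pixels: List[List[int]], boxes: Sequence[Box],
                 fill: int = 0) -> List[List[int]]:
    """Return a copy of the image with each box overwritten by `fill`."""
    out = [row[:] for row in pixels]  # copy so the input stays untouched
    height = len(out)
    width = len(out[0]) if out else 0
    for x, y, w, h in boxes:
        # Clip each box to the image bounds before filling.
        for r in range(max(0, y), min(height, y + h)):
            for c in range(max(0, x), min(width, x + w)):
                out[r][c] = fill
    return out


image = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 1, 1, 1],
]
masked = mask_regions(image, [(1, 0, 2, 2)])
print(masked)
```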
Copyright © 2019 Elsevier B.V. All rights reserved.

Keywords:  Burned-in protected health information; Classification; DICOM; De-identification; HIPAA; Text detection

Year:  2019        PMID: 31029254     DOI: 10.1016/j.ijmedinf.2019.02.011

Source DB:  PubMed          Journal:  Int J Med Inform        ISSN: 1386-5056            Impact factor:   4.046


  4 in total

1.  De-Identification of Radiomics Data Retaining Longitudinal Temporal Information.

Authors:  Surajit Kundu; Santam Chakraborty; Sanjoy Chatterjee; Syamantak Das; Rimpa Basu Achari; Jayanta Mukhopadhyay; Partha Pratim Das; Indranil Mallick; Moses Arunsingh; Tapesh Bhattacharyya; Soumendranath Ray
Journal:  J Med Syst       Date:  2020-04-02       Impact factor: 4.460

2.  A Two-Stage De-Identification Process for Privacy-Preserving Medical Image Analysis.

Authors:  Arsalan Shahid; Mehran H Bazargani; Paul Banahan; Brian Mac Namee; Tahar Kechadi; Ceara Treacy; Gilbert Regan; Peter MacMahon
Journal:  Healthcare (Basel)       Date:  2022-04-19

3.  The Ethics of Artificial Intelligence in Pathology and Laboratory Medicine: Principles and Practice.

Authors:  Brian R Jackson; Ye Ye; James M Crawford; Michael J Becich; Somak Roy; Jeffrey R Botkin; Monica E de Baca; Liron Pantanowitz
Journal:  Acad Pathol       Date:  2021-02-16

4.  Research Goal-Driven Data Model and Harmonization for De-Identifying Patient Data in Radiomics.

Authors:  Surajit Kundu; Santam Chakraborty; Jayanta Mukhopadhyay; Syamantak Das; Sanjoy Chatterjee; Rimpa Basu Achari; Indranil Mallick; Partha Pratim Das; Moses Arunsingh; Tapesh Bhattacharyya; Soumendranath Ray
Journal:  J Digit Imaging       Date:  2021-07-09       Impact factor: 4.903
