Measuring the impact of anonymization on real-world consolidated health datasets engineered for secondary research use: Experiments in the context of MODELHealth project.

Stavros Pitoglou, Arianna Filntisi, Athanasios Anastasiou, George K Matsopoulos, Dimitrios Koutsouris.

Abstract

Introduction: Electronic Health Records (EHRs) are essential data structures: they enable the sharing of valuable medical care information for a diverse patient population, and they can be reused as input to predictive models in clinical research. However, issues such as the heterogeneity of EHR data and the risk of compromising patient privacy inhibit the secondary use of EHR data in clinical research.
Objectives: This study presents the main elements of the MODELHealth project implementation and the evaluation method used to assess the efficiency of its mechanism.
Methods: The MODELHealth project was implemented as an Extract-Transform-Load system that collects data from the hospital databases, harmonizes them to the HL7 FHIR standard and anonymizes them using the k-anonymity method, before loading the transformed data into a central repository. The integrity of the anonymization process was validated with a database query tool developed for that purpose. The information loss caused by the anonymization was estimated with the metrics of generalized information loss, discernibility and average equivalence class size for various values of k.
Results: The average values of generalized information loss, discernibility and average equivalence class size obtained across all tested datasets and k values were 0.008473 ± 0.006216252886, 115,145,464.3 ± 79,724,196.11 and 12.1346 ± 6.76096647, respectively. The values of those metrics appear correlated with factors such as the k value and the dataset characteristics, as expected.
Conclusion: The experimental results of the study demonstrate that it is feasible to perform effective harmonization and anonymization on EHR data while preserving essential patient information.
© 2022 Pitoglou, Filntisi, Anastasiou, Matsopoulos and Koutsouris.
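The abstract names a k-anonymity check and three information-loss metrics without giving their formulas. The sketch below illustrates commonly used textbook forms of these quantities on a toy table. It is not the MODELHealth implementation. All function names, the interval representation of generalized values and the toy data are assumptions made for illustration. The study's exact definitions may differ: for example, average equivalence class size is sometimes additionally divided by k, and discernibility often adds a penalty for suppressed records, which is omitted here.

```python
from collections import Counter
from typing import Sequence, Tuple

# Assumption: every quasi-identifier is numeric and has been generalized
# to a closed interval, e.g. age 34 -> (30, 40).
Interval = Tuple[float, float]
Record = Tuple[Interval, ...]  # one interval per quasi-identifier


def equivalence_classes(records: Sequence[Record]) -> Counter:
    """Count records that share identical generalized quasi-identifiers."""
    return Counter(records)


def is_k_anonymous(records: Sequence[Record], k: int) -> bool:
    """True if every equivalence class holds at least k records."""
    return all(size >= k for size in equivalence_classes(records).values())


def discernibility(records: Sequence[Record]) -> int:
    """Sum of squared equivalence class sizes (no suppressed records)."""
    return sum(size * size for size in equivalence_classes(records).values())


def average_class_size(records: Sequence[Record]) -> float:
    """Number of records divided by number of equivalence classes."""
    return len(records) / len(equivalence_classes(records))


def generalized_information_loss(
    records: Sequence[Record], domains: Sequence[Interval]
) -> float:
    """Mean over all cells of interval width divided by domain width."""
    total = 0.0
    for record in records:
        for (low, high), (dmin, dmax) in zip(record, domains):
            total += (high - low) / (dmax - dmin)
    return total / (len(records) * len(domains))


# Hypothetical toy table: (age interval, postal-code interval) per record.
toy_records = [((30, 40), (100, 200))] * 3 + [((40, 60), (100, 200))] * 2
toy_domains = [(20, 60), (100, 300)]  # full range of each attribute

if __name__ == "__main__":
    print("2-anonymous:", is_k_anonymous(toy_records, 2))
    print("discernibility:", discernibility(toy_records))
    print("average class size:", average_class_size(toy_records))
    print("generalized information loss:",
          generalized_information_loss(toy_records, toy_domains))
```

Under these definitions, larger equivalence classes raise both discernibility and average class size, while wider generalization intervals raise generalized information loss. This matches the dependence on k and on dataset characteristics reported in the Results.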

Keywords:  anonymization; electronic health records; harmonization; information loss; real data

Year:  2022        PMID: 36120716      PMCID: PMC9474677          DOI: 10.3389/fdgth.2022.841853

Source DB:  PubMed          Journal:  Front Digit Health        ISSN: 2673-253X

