Generation of Realistic Synthetic Validation Healthcare Datasets Using Generative Adversarial Networks.

Eda Bilici Ozyigit, Theodoros N Arvanitis, George Despotou.

Abstract

BACKGROUND: Assurance of digital health interventions involves, among other things, clinical validation, which requires large datasets to test the application in realistic clinical scenarios. Developing such datasets is time-consuming and challenging in terms of maintaining patient anonymity and consent.
OBJECTIVE: The development of synthetic datasets that maintain the statistical properties of the real datasets.
METHOD: A generative adversarial network based on artificial neural networks was implemented and trained on numerical and categorical variables, including ICD-9 codes, from the MIMIC-III dataset to produce a synthetic dataset.
RESULTS: The synthetic dataset exhibits a correlation matrix highly similar to that of the real dataset, shows good Jaccard similarity, and passes the Kolmogorov-Smirnov (KS) test.
CONCLUSIONS: The proof of concept was successful, and the approach is promising for further work.
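
The three checks named in RESULTS (correlation-matrix similarity, Jaccard similarity, and the KS test) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the abstract does not say how each measure was computed, so every function name below is hypothetical, the Jaccard index is assumed to compare sets of codes, and the KS decision uses the common large-sample critical-value approximation rather than an exact p-value.

```python
import math
from bisect import bisect_right


def pearson(x, y):
    """Pearson correlation of two equal-length numeric columns (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a list of columns."""
    return [[pearson(ci, cj) for cj in columns] for ci in columns]


def mean_abs_corr_diff(real_cols, synth_cols):
    """Mean absolute entry-wise gap between two correlation matrices (0 = identical)."""
    r, s = correlation_matrix(real_cols), correlation_matrix(synth_cols)
    k = len(r)
    return sum(abs(r[i][j] - s[i][j]) for i in range(k) for j in range(k)) / (k * k)


def jaccard(real_codes, synth_codes):
    """Jaccard index |A & B| / |A | B| of two collections of categorical codes."""
    a, b = set(real_codes), set(synth_codes)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def ks_statistic(x, y):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, v) / len(xs) - bisect_right(ys, v) / len(ys))
        for v in xs + ys
    )


def ks_passes(x, y, alpha=0.05):
    """True if the samples are not distinguishable at level alpha.

    Assumption: uses the asymptotic critical value
    sqrt(-ln(alpha / 2) / 2) * sqrt((n + m) / (n * m)).
    """
    n, m = len(x), len(y)
    critical = math.sqrt(-math.log(alpha / 2) / 2) * math.sqrt((n + m) / (n * m))
    return ks_statistic(x, y) <= critical
```

In such a sketch, a synthetic dataset would look close to the real one when `mean_abs_corr_diff` is near 0, `jaccard` is near 1, and `ks_passes` holds for each numerical variable. The thresholds for "good" are a design choice that the abstract does not specify.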

Keywords:  Machine learning; generative adversarial networks; privacy; realistic synthetic dataset

Year:  2020        PMID: 32604667     DOI: 10.3233/SHTI200560

Source DB:  PubMed          Journal:  Stud Health Technol Inform        ISSN: 0926-9630


