Variational Autoencoder Modular Bayesian Networks for Simulation of Heterogeneous Clinical Study Data.

Luise Gootjes-Dreesbach, Meemansa Sood, Akrishta Sahay, Martin Hofmann-Apitius, Holger Fröhlich.

Abstract

In the area of Big Data, one of the major obstacles for the progress of biomedical research is the existence of data "silos" because legal and ethical constraints often do not allow for sharing sensitive patient data from clinical studies across institutions. While federated machine learning now allows for building models from scattered data of the same format, there is still the need to investigate, mine, and understand data of separate and very differently designed clinical studies that can only be accessed within each of the data-hosting organizations. Simulation of sufficiently realistic virtual patients based on the data within each individual organization could be a way to fill this gap. In this work, we propose a new machine learning approach [Variational Autoencoder Modular Bayesian Network (VAMBN)] to learn a generative model of longitudinal clinical study data. VAMBN considers typical key aspects of such data, namely limited sample size coupled with comparably many variables of different numerical scales and statistical properties, and many missing values. We show that with VAMBN, we can simulate virtual patients in a sufficiently realistic manner while providing theoretical guarantees on data privacy. In addition, VAMBN allows for simulating counterfactual scenarios. Hence, VAMBN could facilitate data sharing as well as the design of clinical trials.
Copyright © 2020 Gootjes-Dreesbach, Sood, Sahay, Hofmann-Apitius and Fröhlich.

Keywords:  Bayesian Networks; autoencoders; clinical study simulation; longitudinal data; time series data

Year:  2020        PMID: 33693390      PMCID: PMC7931863          DOI: 10.3389/fdata.2020.00016

Source DB:  PubMed          Journal:  Front Big Data        ISSN: 2624-909X


  2 in total

1.  SASC: A simple approach to synthetic cohorts for generating longitudinal observational patient cohorts from COVID-19 clinical data.

Authors:  Takoua Khorchani; Yojana Gadiya; Gesa Witt; Delia Lanzillotta; Carsten Claussen; Andrea Zaliani
Journal:  Patterns (N Y)       Date:  2022-02-09

2.  Generation of realistic synthetic data using Multimodal Neural Ordinary Differential Equations.

Authors:  Philipp Wendland; Colin Birkenbihl; Marc Gomez-Freixa; Meemansa Sood; Maik Kschischo; Holger Fröhlich
Journal:  NPJ Digit Med       Date:  2022-08-20
