
Spot the difference: comparing results of analyses from real patient data and synthetic derivatives.

Randi E Foraker, Sean C Yu, Aditi Gupta, Andrew P Michelson, Jose A Pineda Soto, Ryan Colvin, Francis Loh, Marin H Kollef, Thomas Maddox, Bradley Evanoff, Hovav Dror, Noa Zamstein, Albert M Lai, Philip R O Payne.

Abstract

BACKGROUND: Synthetic data may provide a solution to researchers who wish to generate and share data in support of precision healthcare. Recent advances in data synthesis enable the creation and analysis of synthetic derivatives as if they were the original data; this process has significant advantages over data deidentification.
OBJECTIVES: To assess a big-data platform with data-synthesizing capabilities (MDClone Ltd., Beer Sheva, Israel) for its ability to produce data that can be used for research purposes while obviating privacy and confidentiality concerns.
METHODS: We explored three use cases and tested the robustness of synthetic data by comparing analyses of synthetic derivatives with analyses of the original data, using traditional statistics, machine learning approaches, and spatial representations of the data. We designed these use cases to conduct analyses at the level of individual observations (Use Case 1), patient cohorts (Use Case 2), and populations (Use Case 3).
RESULTS: For each use case, the results of the analyses were sufficiently statistically similar (P > 0.05) between the synthetic derivative and the real data to draw the same conclusions.
DISCUSSION AND CONCLUSION: This article presents the results of each use case and outlines key considerations for the use of synthetic data, examining their role in clinical research for faster insights and improved data sharing in support of precision healthcare.
© The Author(s) 2020. Published by Oxford University Press on behalf of the American Medical Informatics Association.
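
The comparison described in METHODS and RESULTS (checking whether an analysis gives statistically similar results on real and synthetic data) can be illustrated with a minimal sketch. This is not the authors' method or the MDClone platform's API: the function name `permutation_p_value`, the choice of a two-sided permutation test on the difference in means, and the sample values are all assumptions made for illustration.

```python
import random
from statistics import mean


def permutation_p_value(real, synthetic, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in means.

    Hypothetical helper for illustration; not taken from the article.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = abs(mean(real) - mean(synthetic))
    pooled = list(real) + list(synthetic)
    n_real = len(real)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_real]) - mean(pooled[n_real:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Made-up values for illustration only (not data from the study).
real_values = [4.1, 5.0, 5.6, 6.2, 4.8, 5.3, 5.9, 4.5]
synthetic_values = [4.3, 5.1, 5.4, 6.0, 4.9, 5.2, 6.1, 4.6]

p = permutation_p_value(real_values, synthetic_values)
print(f"p = {p:.3f}")
```

Under the abstract's criterion, a value of `p` above 0.05 would be read as the real and synthetic samples being statistically similar for this variable.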


Keywords:  data analysis; electronic health records and systems; precision health care; protected health information; synthetic data

Year:  2020        PMID: 33623891      PMCID: PMC7886551          DOI: 10.1093/jamiaopen/ooaa060

Source DB:  PubMed          Journal:  JAMIA Open        ISSN: 2574-2531



