Arthur Jochems (1), Timo M Deist (2), Johan van Soest (2), Michael Eble (3), Paul Bulens (4), Philippe Coucke (5), Wim Dries (6), Philippe Lambin (2), Andre Dekker (7).
1. Department of Radiation Oncology (MAASTRO Clinic), Maastricht, The Netherlands. Electronic address: arthur.jochems@maastro.nl.
2. Department of Radiation Oncology (MAASTRO Clinic), Maastricht, The Netherlands; GROW - School for Oncology and Developmental Biology, Maastricht University Medical Centre, The Netherlands.
3. Klinik für Strahlentherapie (University clinic Aachen), Germany.
4. Department of Radiation Oncology (Jessa Hospital), Hasselt, Belgium.
5. Departement de Physique Medicale (CHU de Liège), Belgium.
6. Catharina-Hospital Eindhoven, The Netherlands.
7. Department of Radiation Oncology (MAASTRO Clinic), Maastricht, The Netherlands.
Abstract
PURPOSE: One of the major hurdles in enabling personalized medicine is obtaining sufficient patient data to feed into predictive models. Combining data from multiple hospitals is difficult because of the ethical, legal, political, and administrative barriers associated with data sharing. A distributed learning approach can avoid these issues. Distributed learning is defined as learning from data without the data leaving the hospital.

PATIENTS AND METHODS: Clinical data from 287 lung cancer patients treated with curative intent with chemoradiation (CRT) or radiotherapy (RT) alone were collected from and stored in 5 different medical institutes: 123 patients at MAASTRO (Netherlands, Dutch), 24 at Jessa (Belgium, Dutch), 34 at Liège (Belgium, Dutch and French), 48 at Aachen (Germany, German), and 58 at Eindhoven (Netherlands, Dutch). A Bayesian network model is adapted for distributed learning (watch the animation: http://youtu.be/nQpqMIuHyOk). The model predicts dyspnea, a common side effect of radiotherapy for lung cancer.

RESULTS: We show that the distributed learning approach can train a Bayesian network model on patient data originating from multiple hospitals without these data leaving the individual hospital. The AUC of the model is 0.61 (95% CI, 0.51-0.70) in 5-fold cross-validation and ranges from 0.59 to 0.71 on external validation sets.

CONCLUSION: Distributed learning allows predictive models to be learned from data originating from multiple hospitals while avoiding many of the data sharing barriers. Furthermore, the approach can be used to extract and employ knowledge from routine patient data from multiple hospitals while complying with the various national and European privacy laws.
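The abstract does not spell out the learning algorithm, so the following is only a minimal sketch of one way distributed parameter learning for a Bayesian network could work, not the authors' implementation. It assumes a hypothetical two-node network (a discretized parent variable, here called dose, pointing to dyspnea) and Laplace smoothing. Each hospital shares only aggregate counts, and a central party sums them into a conditional probability table (CPT). All names, states, and records below are made up for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# One record = (parent_state, dyspnea_state). Hypothetical two-node network.
Record = Tuple[str, str]


def local_counts(records: Iterable[Record]) -> Counter:
    """Runs inside a hospital: returns aggregate counts only, never patient rows."""
    return Counter(records)


def pooled_cpt(
    site_counts: List[Counter],
    parent_states: List[str],
    child_states: List[str],
    alpha: float = 1.0,  # Laplace smoothing constant (an assumption)
) -> Dict[str, Dict[str, float]]:
    """Runs centrally: sums per-site counts and estimates P(child | parent)."""
    total: Counter = Counter()
    for counts in site_counts:
        total.update(counts)  # adds counts key by key

    cpt: Dict[str, Dict[str, float]] = {}
    for p in parent_states:
        denom = sum(total[(p, c)] for c in child_states) + alpha * len(child_states)
        cpt[p] = {c: (total[(p, c)] + alpha) / denom for c in child_states}
    return cpt


# Made-up records for two hypothetical sites.
site_a = [("high_dose", "yes"), ("high_dose", "no"), ("low_dose", "no")]
site_b = [("high_dose", "yes"), ("low_dose", "no"), ("low_dose", "yes")]

parents = ["high_dose", "low_dose"]
children = ["yes", "no"]

distributed = pooled_cpt([local_counts(site_a), local_counts(site_b)], parents, children)
centralized = pooled_cpt([local_counts(site_a + site_b)], parents, children)
```

Because the counts are sufficient statistics for the CPT, `distributed` and `centralized` hold the same estimates in this sketch, which is the property that lets the model be learned without pooling patient-level data.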
Authors: Mohamed S Barakat; Matthew Field; Aditya Ghose; David Stirling; Lois Holloway; Shalini Vinod; Andre Dekker; David Thwaites Journal: Health Inf Sci Syst Date: 2017-12-06
Authors: Issam El Naqa; Gaurav Pandey; Hugo Aerts; Jen-Tzung Chien; Christian Nicolaj Andreassen; Andrzej Niemierko; Randall K Ten Haken Journal: Int J Radiat Oncol Biol Phys Date: 2018-10-18 Impact factor: 7.038
Authors: Barry S Rosenstein; Arvind Rao; Jean M Moran; Daniel E Spratt; Marc S Mendonca; Bissan Al-Lazikani; Charles S Mayo; Corey Speers Journal: Med Phys Date: 2018-09-18 Impact factor: 4.071
Authors: Arthur Jochems; Timo M Deist; Issam El Naqa; Marc Kessler; Chuck Mayo; Jackson Reeves; Shruti Jolly; Martha Matuszak; Randall Ten Haken; Johan van Soest; Cary Oberije; Corinne Faivre-Finn; Gareth Price; Dirk de Ruysscher; Philippe Lambin; Andre Dekker Journal: Int J Radiat Oncol Biol Phys Date: 2017-04-24 Impact factor: 7.038
Authors: Thomas J Whitaker; Charles S Mayo; Daniel J Ma; Michael G Haddock; Robert C Miller; Kimberly S Corbin; Michelle Neben-Wittich; James L Leenstra; Nadia N Laack; Mirek Fatyga; Steven E Schild; Carlos E Vargas; Katherine S Tzou; Austin R Hadley; Steven J Buskirk; Robert L Foote Journal: J Radiat Res Date: 2018-03-01 Impact factor: 2.724
Authors: Timo M Deist; A Jochems; Johan van Soest; Georgi Nalbantov; Cary Oberije; Seán Walsh; Michael Eble; Paul Bulens; Philippe Coucke; Wim Dries; Andre Dekker; Philippe Lambin Journal: Clin Transl Radiat Oncol Date: 2017-05-19
Authors: Yi Luo; Shruti Jolly; David Palma; Theodore S Lawrence; Huan-Hsin Tseng; Gilmer Valdes; Daniel McShan; Randall K Ten Haken; Issam El Naqa Journal: Phys Med Date: 2021-06-04 Impact factor: 3.119