Chia-Lun Lu1, Shuang Wang2, Zhanglong Ji1, Yuan Wu3, Li Xiong4, Xiaoqian Jiang5, Lucila Ohno-Machado1. 1. Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA. Email: challen@ucsd.edu, shw070@ucsd.edu, z1ji@ucsd.edu, x1jiang@ucsd.edu, machado@ucsd.edu. 2. Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA. 3. Department of Biostatistics & Bioinformatics, Duke University, Durham, NC, 27708, USA. Email: yuan.wu@duke.edu. 4. Department of Mathematics & Computer Science, Emory University, Atlanta, GA 30322, USA. 5. Department of Biomedical Informatics, University of California, San Diego, La Jolla, CA, 92093, USA.
Abstract
OBJECTIVE: The Cox proportional hazards model is a widely used method for analyzing survival data. Achieving sufficient statistical power in a survival analysis usually requires a large amount of data. Data sharing across institutions could be a potential workaround for providing this added power. METHODS AND MATERIALS: The authors develop a web service for distributed Cox model learning (WebDISCO), which focuses on the proof-of-concept and algorithm development for federated survival analysis. The sensitive patient-level data are processed locally, and only the less-sensitive intermediate statistics are exchanged to build a global Cox model. Mathematical derivation shows that the proposed distributed algorithm is identical to the centralized Cox model. RESULTS: The authors evaluated the proposed framework at the University of California, San Diego (UCSD), Emory, and Duke. The experimental results show that the distributed and centralized models yield near-identical model coefficients, with differences in the range [Formula: see text] to [Formula: see text]. The results confirm the mathematical derivation and show that the implementation of the distributed model can achieve the same results as the centralized implementation. LIMITATION: The proposed method serves as a proof of concept, in which a publicly available dataset was used to evaluate the performance. The authors do not intend to suggest that this method can resolve policy and engineering issues related to the federated use of institutional data, but it should serve as evidence of the technical feasibility of the proposed approach. CONCLUSIONS: WebDISCO (Web-based Distributed Cox Regression Model; https://webdisco.ucsd-dbmi.org:8443/cox/) provides a proof-of-concept web service that implements a distributed algorithm to conduct distributed survival analysis without sharing patient-level data.
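The idea of exchanging only intermediate statistics can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a Newton-Raphson fit of the Cox partial likelihood with the Breslow approximation for ties, and it assumes that sites may disclose their distinct event times to the coordinator. The function names (`local_event_summary`, `local_stats`, `fit_distributed_cox`) and the `(X, time, event)` site layout are hypothetical.

```python
import numpy as np

def local_event_summary(X, time, event, event_times):
    """Computed once per site: event counts per global event time and
    the sum of covariates over local events (independent of beta)."""
    is_event = (time[None, :] == event_times[:, None]) & (event[None, :] == 1)
    d = is_event.sum(axis=1).astype(float)      # (T,) events at each time
    sx = X[event == 1].sum(axis=0)              # (p,) covariate sum over events
    return d, sx

def local_stats(X, time, beta, event_times):
    """Computed per iteration at each site: risk-set sums for the current beta.
    Only these aggregates leave the site, never patient-level rows."""
    w = np.exp(X @ beta)                                        # (n,)
    at_risk = (time[None, :] >= event_times[:, None]).astype(float)  # (T, n)
    Xw = X * w[:, None]
    s0 = at_risk @ w                                            # (T,)
    s1 = at_risk @ Xw                                           # (T, p)
    s2 = np.einsum("tn,np,nq->tpq", at_risk, Xw, X)             # (T, p, p)
    return s0, s1, s2

def fit_distributed_cox(sites, n_iter=25, tol=1e-10):
    """sites: list of (X, time, event) tuples, one per institution (assumed layout)."""
    p = sites[0][0].shape[1]
    # Assumption: distinct event times are shared with the coordinator.
    event_times = np.unique(np.concatenate([t[e == 1] for _, t, e in sites]))
    summaries = [local_event_summary(X, t, e, event_times) for X, t, e in sites]
    d = sum(s[0] for s in summaries)
    sx = sum(s[1] for s in summaries)
    beta = np.zeros(p)
    for _ in range(n_iter):
        stats = [local_stats(X, t, beta, event_times) for X, t, _ in sites]
        s0 = sum(s[0] for s in stats)
        s1 = sum(s[1] for s in stats)
        s2 = sum(s[2] for s in stats)
        mean = s1 / s0[:, None]                                  # risk-set weighted mean
        grad = sx - (d[:, None] * mean).sum(axis=0)
        cov = s2 / s0[:, None, None] - mean[:, :, None] * mean[:, None, :]
        hess = -(d[:, None, None] * cov).sum(axis=0)
        step = np.linalg.solve(hess, grad)
        beta = beta - step                                       # Newton-Raphson update
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

Calling `fit_distributed_cox` with a single pooled site reproduces the centralized fit under the same assumptions. Because the aggregated sums are the same quantities in both cases, the two coefficient vectors should agree up to floating-point rounding.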