Rui Duan (1), Mary Regina Boland (1), Zixuan Liu (2), Yue Liu (3), Howard H Chang (4), Hua Xu (5), Haitao Chu (6), Christopher H Schmid (7), Christopher B Forrest (8), John H Holmes (1), Martijn J Schuemie (9), Jesse A Berlin (9), Jason H Moore (1), Yong Chen (1).
1. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
2. Department of Electrical Engineering, Stanford University, Stanford, California, USA.
3. Department of Statistics, Harvard University, Cambridge, Massachusetts, USA.
4. Department of Biostatistics and Bioinformatics, Emory University, Atlanta, Georgia, USA.
5. School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, Texas, USA.
6. Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, USA.
7. Department of Biostatistics, Brown University, Providence, Rhode Island, USA.
8. Division of General Pediatrics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.
9. Janssen Research and Development LLC, Titusville, New Jersey, USA.
Abstract
OBJECTIVES: We propose a one-shot, privacy-preserving distributed algorithm to perform logistic regression (ODAL) across multiple clinical sites.
MATERIALS AND METHODS: ODAL uses the information from the local site (where the patient-level data are accessible) and incorporates the first-order (ODAL1) and second-order (ODAL2) gradients of the likelihood function from other sites to construct an estimator without requiring iterative communication across sites or transferring patient-level data. We evaluated ODAL via extensive simulation studies and an application to a dataset from the University of Pennsylvania Health System. Estimation accuracy was evaluated by comparing the ODAL estimator with the estimator based on the combined individual participant data, or pooled data (ie, the gold standard).
RESULTS: In our simulation studies, the relative estimation bias of ODAL1 compared with the pooled estimates was <3%, and the ratio of standard errors was <1.25 for all scenarios. ODAL2 achieved higher accuracy (relative bias <0.1% and ratio of standard errors <1.05). In the real data analysis, we investigated the associations of 100 medications with fetal loss during pregnancy. ODAL1 provided estimates with relative bias <10% for 85% of medications, and ODAL2 provided estimates with relative bias <10% for 99% of medications. For communication cost, ODAL1 requires transferring p numbers from each site to the local site and ODAL2 requires transferring (p×p+p) numbers from each site to the local site, where p is the number of parameters in the regression model.
CONCLUSIONS: This study demonstrates that ODAL is privacy-preserving and communication-efficient with small bias and high statistical efficiency.
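The abstract does not give the estimator's formula, so the following is only a minimal sketch of how a one-shot, first-order scheme of this kind could look, not the authors' implementation. It assumes a surrogate-likelihood construction: the local site fits an initial estimate, each other site evaluates the gradient of its own log-likelihood at that estimate and returns only those p numbers, and the local site then maximizes its own log-likelihood plus a linear correction term. The function names, the Newton solver, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_hess(beta, X, y):
    """Gradient and Hessian of the average logistic log-likelihood."""
    mu = sigmoid(X @ beta)
    grad = X.T @ (y - mu) / len(y)
    hess = -(X.T * (mu * (1.0 - mu))) @ X / len(y)
    return grad, hess

def newton_maximize(grad_hess_fn, beta0, n_iter=50, tol=1e-10):
    """Newton-Raphson for a concave objective given its gradient and Hessian."""
    beta = beta0.copy()
    for _ in range(n_iter):
        g, H = grad_hess_fn(beta)
        step = np.linalg.solve(H, g)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def fit_logistic(X, y):
    return newton_maximize(lambda b: grad_hess(b, X, y), np.zeros(X.shape[1]))

def odal1_sketch(X_local, y_local, remote_sites):
    """Assumed first-order surrogate estimator (hypothetical helper)."""
    beta_bar = fit_logistic(X_local, y_local)        # initial local estimate
    g_local, _ = grad_hess(beta_bar, X_local, y_local)
    n_total = len(y_local)
    g_sum = len(y_local) * g_local
    for X_k, y_k in remote_sites:
        # In a real deployment this gradient is computed at site k and
        # only its p entries are sent back; no patient-level rows move.
        g_k, _ = grad_hess(beta_bar, X_k, y_k)
        g_sum = g_sum + len(y_k) * g_k
        n_total += len(y_k)
    shift = g_sum / n_total - g_local                # global minus local gradient

    def surrogate(b):
        g, H = grad_hess(b, X_local, y_local)
        return g + shift, H                          # linear term leaves the Hessian unchanged

    return newton_maximize(surrogate, beta_bar)

# Simulated example: 5 sites with 1000 patients each (made-up data).
rng = np.random.default_rng(0)
beta_true = np.array([-1.0, 0.5, -0.5])
sites = []
for _ in range(5):
    X = np.column_stack([np.ones(1000), rng.normal(size=(1000, 2))])
    y = rng.binomial(1, sigmoid(X @ beta_true))
    sites.append((X, y))

X_loc, y_loc = sites[0]
beta_local = fit_logistic(X_loc, y_loc)
beta_odal1 = odal1_sketch(X_loc, y_loc, sites[1:])
beta_pooled = fit_logistic(np.vstack([s[0] for s in sites]),
                           np.concatenate([s[1] for s in sites]))
print("local :", beta_local)
print("ODAL1 :", beta_odal1)
print("pooled:", beta_pooled)
```

In this sketch each remote site contributes a single length-p gradient vector, which matches the communication cost of p numbers per site stated for ODAL1. A second-order variant would additionally need each site's p×p Hessian at the initial estimate, consistent with the (p×p+p) cost stated for ODAL2.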