Qoua Her, Jessica Malenfant, Zilu Zhang, Yury Vilk, Jessica Young, David Tabano, Jack Hamilton, Ron Johnson, Marsha Raebel, Denise Boudreau, Sengwee Toh.
Abstract
BACKGROUND: A distributed data network approach combined with distributed regression analysis (DRA) can reduce the risk of disclosing sensitive individual and institutional information in multicenter studies. However, software that facilitates large-scale and efficient implementation of DRA is limited.
Keywords: PopMedNet; distributed data networks; distributed regression analysis; pharmacoepidemiology; privacy-protecting analytics
Year: 2020; PMID: 32496200; PMCID: PMC7303834; DOI: 10.2196/15073
Source DB: PubMed; Journal: JMIR Med Inform
Figure 1. Distributed regression analysis with horizontally partitioned data.
Figure 2. Three-step process to conduct distributed regression analysis with PopMedNet. CIDA: Cohort Identification and Descriptive Analysis Tool; DRA: Distributed Regression Analysis; SOC: Sentinel Operations Center.
Analytical datasets and variables.
| Regression model type | Outcome variable (within 1-year postsurgery) | Variables (exposure and confounders) |
| Linear | Change in BMI | Bariatric surgery exposure, age at surgery, sex, race and ethnicity, combined Charlson-Elixhauser comorbidity score, number of ambulatory visits, number of other ambulatory visits, number of inpatient stays, number of nonacute institutional stays, number of emergency department visits, BMI before bariatric surgery, number of days between last weight and height measurement and bariatric surgery, and data partner |
| Logistic | Weight loss ≥20% | Same as above |
| Cox | Time to weight loss ≥20% | Same as above |
Distributed linear regression vs pooled individual-level linear regression.
| Covariates | Distributed regression: parameter | Distributed regression: SE | Pooled individual-level: parameter | Pooled individual-level: SE | Difference in parameter estimate | Difference in SE |
| Intercept | 34.03935 | 0.61075 | 34.03935 | 0.61075 | 3.66 × 10^−12 | −9.14 × 10^−13 |
| Exposure | 2.04714 | 0.28723 | 2.04714 | 0.28723 | −4.15 × 10^−13 | −4.30 × 10^−13 |
| Age | −0.03334 | 0.00837 | −0.03334 | 0.00837 | −3.68 × 10^−14 | −1.25 × 10^−14 |
| Preindex BMI | −0.99983 | 0.00050 | −0.99983 | 0.00050 | −6.00 × 10^−15 | −7.44 × 10^−16 |
| Combined comorbidity score | 0.04388 | 0.06949 | 0.04388 | 0.06949 | 3.59 × 10^−15 | −1.04 × 10^−13 |
| Number of ambulatory visits | −0.03068 | 0.01008 | −0.03068 | 0.01008 | −6.59 × 10^−17 | −1.51 × 10^−14 |
| Number of emergency department visits | 0.10329 | 0.08749 | 0.10329 | 0.08749 | −2.79 × 10^−14 | −1.31 × 10^−13 |
| Number of inpatient visits | 0.88725 | 0.25976 | 0.88725 | 0.25976 | −6.51 × 10^−13 | −3.89 × 10^−13 |
| Number of nonacute institutional stays | 1.32338 | 1.79056 | 1.32338 | 1.79056 | 4.21 × 10^−13 | −2.68 × 10^−12 |
| Number of other ambulatory visits | 0.02159 | 0.00873 | 0.02159 | 0.00873 | 1.22 × 10^−14 | −1.31 × 10^−14 |
| Days between BMI measurement and index procedure | 0.01207 | 0.00567 | 0.01207 | 0.00567 | 3.92 × 10^−15 | −8.48 × 10^−15 |
| Race: Unknown | 0.94212 | 0.26841 | 0.94212 | 0.26841 | −4.16 × 10^−13 | −4.02 × 10^−13 |
| Race: American Indian or Alaska Native | −0.30948 | 0.69817 | −0.30948 | 0.69817 | −2.39 × 10^−13 | −1.04 × 10^−12 |
| Race: Asian | −0.16853 | 0.63001 | −0.16853 | 0.63001 | −4.52 × 10^−13 | −9.42 × 10^−13 |
| Race: Black or African American | 1.51961 | 0.29206 | 1.51961 | 0.29206 | −9.95 × 10^−14 | −4.37 × 10^−13 |
| Race: Native Hawaiian or Other Pacific Islander | −1.22315 | 1.04973 | −1.22315 | 1.04973 | −4.11 × 10^−13 | −1.57 × 10^−12 |
| Female | −1.22366 | 0.23205 | −1.22366 | 0.23205 | −5.33 × 10^−13 | −3.47 × 10^−13 |
| Surgery year: 2011 | 0.15150 | 0.30361 | 0.15150 | 0.30361 | −5.94 × 10^−13 | −4.54 × 10^−13 |
| Surgery year: 2012 | −0.24904 | 0.30372 | −0.24904 | 0.30372 | −6.47 × 10^−13 | −4.54 × 10^−13 |
| Surgery year: 2013 | −0.02308 | 0.30223 | −0.02308 | 0.30223 | −6.08 × 10^−13 | −4.52 × 10^−13 |
| Surgery year: 2014 | 0.32767 | 0.30609 | 0.32767 | 0.30609 | −5.93 × 10^−13 | −4.58 × 10^−13 |
| Surgery year: 2015 | −0.25767 | 0.33352 | −0.25767 | 0.33352 | −6.18 × 10^−13 | −4.99 × 10^−13 |
| Data partner site: 2 | −1.10559 | 0.31373 | −1.10559 | 0.31373 | 2.89 × 10^−15 | −4.69 × 10^−13 |
| Data partner site: 3 | −0.10990 | 0.30341 | −0.10990 | 0.30341 | −2.07 × 10^−13 | −4.54 × 10^−13 |
Reference groups: race (white), surgery year (2010), and data partner site (1).
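The near-zero differences in the linear regression comparison follow from how least squares behaves on horizontally partitioned data: the pooled normal equations depend on the rows only through the sums X'X and X'y, and those sums can be built from per-site contributions. The following is a minimal sketch of that idea on synthetic data, not the PopMedNet implementation. The function names, the three-site setup, and the simulated coefficients are assumptions made for illustration.

```python
import random

def gram(X, y):
    """Return (X'X, X'y) for a block of rows; a site would share only these sums."""
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return xtx, xty

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def simulate_site(n, rng):
    """Synthetic rows (hypothetical data): intercept, binary exposure, age."""
    X, y = [], []
    for _ in range(n):
        exposure = float(rng.random() < 0.5)
        age = rng.gauss(45, 10)
        X.append([1.0, exposure, age])
        y.append(34.0 + 2.0 * exposure - 0.03 * age + rng.gauss(0, 1))
    return X, y

rng = random.Random(0)
sites = [simulate_site(200, rng) for _ in range(3)]
p = 3

# Distributed fit: the analysis center only adds up per-site summaries.
xtx_sum = [[0.0] * p for _ in range(p)]
xty_sum = [0.0] * p
for X, y in sites:
    xtx, xty = gram(X, y)
    for i in range(p):
        xty_sum[i] += xty[i]
        for j in range(p):
            xtx_sum[i][j] += xtx[i][j]
beta_distributed = solve(xtx_sum, xty_sum)

# Pooled fit: all rows stacked in one place.
X_all = [r for X, _ in sites for r in X]
y_all = [v for _, y in sites for v in y]
beta_pooled = solve(*gram(X_all, y_all))

max_diff = max(abs(a - b) for a, b in zip(beta_distributed, beta_pooled))
print(beta_distributed, max_diff)
```

In this sketch the two fits differ only through the order in which floating-point sums are accumulated, which is the kind of discrepancy the difference columns report.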
Distributed Cox proportional hazards regression vs pooled individual-level Cox proportional hazards regression.
| Covariates | Distributed regression analysis: parameter | Distributed regression analysis: SE | Pooled individual-level analysis: parameter | Pooled individual-level analysis: SE | Difference in parameter estimate | Difference in SE |
| Exposure | −0.58160 | 0.05275 | −0.58160 | 0.05275 | 6.66 × 10^−16 | −8.33 × 10^−17 |
| Age | −0.01107 | 0.00146 | −0.01107 | 0.00146 | 1.39 × 10^−17 | −9.11 × 10^−18 |
| Preindex BMI | −0.00006 | 0.00009 | −0.00006 | 0.00009 | 2.85 × 10^−19 | −1.49 × 10^−19 |
| Combined comorbidity score | −0.00787 | 0.01205 | −0.00787 | 0.01205 | −3.64 × 10^−17 | −1.04 × 10^−17 |
| Number of ambulatory visits | 0.00584 | 0.00158 | 0.00584 | 0.00158 | −2.95 × 10^−17 | 1.08 × 10^−18 |
| Number of emergency department visits | −0.01873 | 0.01679 | −0.01873 | 0.01679 | 1.56 × 10^−16 | −2.43 × 10^−17 |
| Number of inpatient visits | −0.08587 | 0.04580 | −0.08587 | 0.04580 | −9.58 × 10^−16 | −1.25 × 10^−16 |
| Number of nonacute institutional stays | 0.06626 | 0.29266 | 0.06626 | 0.29266 | 3.75 × 10^−16 | −3.33 × 10^−16 |
| Number of other ambulatory visits | 0.00279 | 0.00134 | 0.00279 | 0.00134 | 4.03 × 10^−17 | −1.52 × 10^−18 |
| Days between BMI measurement and index procedure | −0.00221 | 0.00096 | −0.00221 | 0.00096 | 2.39 × 10^−17 | −2.17 × 10^−18 |
| Race: Unknown | −0.18898 | 0.04765 | −0.18898 | 0.04765 | 5.27 × 10^−16 | 0.00 × 10^+00 |
| Race: American Indian or Alaska Native | −0.07476 | 0.12019 | −0.07476 | 0.12019 | 1.25 × 10^−16 | 2.78 × 10^−17 |
| Race: Asian | −0.22309 | 0.10933 | −0.22309 | 0.10933 | −2.78 × 10^−17 | 6.94 × 10^−17 |
| Race: Black or African American | −0.18457 | 0.05116 | −0.18457 | 0.05116 | 1.94 × 10^−16 | −1.39 × 10^−17 |
| Race: Native Hawaiian or Other Pacific Islander | −0.19748 | 0.17333 | −0.19748 | 0.17333 | 1.42 × 10^−15 | 2.78 × 10^−17 |
| Female | −0.00887 | 0.04052 | −0.00887 | 0.04052 | −1.24 × 10^−15 | −3.47 × 10^−17 |
| Surgery year: 2011 | −0.08021 | 0.05176 | −0.08021 | 0.05176 | 8.60 × 10^−16 | 1.11 × 10^−16 |
| Surgery year: 2012 | −0.02547 | 0.05136 | −0.02547 | 0.05136 | 4.61 × 10^−16 | 7.63 × 10^−17 |
| Surgery year: 2013 | −0.09519 | 0.05195 | −0.09519 | 0.05195 | 1.17 × 10^−15 | 4.86 × 10^−17 |
| Surgery year: 2014 | −0.16866 | 0.05235 | −0.16866 | 0.05235 | 8.60 × 10^−16 | 1.18 × 10^−16 |
| Surgery year: 2015 | 0.24763 | 0.05640 | 0.24763 | 0.05640 | 3.89 × 10^−16 | 1.04 × 10^−16 |
| Data partner site: 2 | −0.15270 | 0.05188 | −0.15270 | 0.05188 | 2.11 × 10^−15 | −6.94 × 10^−18 |
| Data partner site: 3 | 0.33440 | 0.05161 | 0.33440 | 0.05161 | 8.33 × 10^−16 | 2.08 × 10^−17 |
Reference groups: race (white), surgery year (2010), and data partner site (1).
Comparison of model fit statistics between distributed regression and pooled individual-level data analysis.
| Regression model type and statistic or test | Distributed regression analysis | Pooled individual-level data analysis | Difference in model fit statistics |
| Linear: R-squared | 0.9987 | 0.9987 | 3.89 × 10^−15 |
| Linear: Akaike information criterion | 20089.6538 | 20089.6538 | −1.59 × 10^−08 |
| Linear: Sawa's Bayesian information criterion | 20091.8710 | 20091.8710 | −1.59 × 10^−08 |
| Linear: Schwarz's Bayesian information criterion | 20247.5868 | 20247.5868 | −1.59 × 10^−08 |
| Logistic: −2 log-likelihood | 5423.2491 | 5423.2491 | 1.36 × 10^−11 |
| Logistic: Akaike information criterion | 5471.2491 | 5471.2491 | 1.36 × 10^−11 |
| Logistic: Sawa's Bayesian information criterion | 5629.5265 | 5629.5265 | 1.36 × 10^−11 |
| Logistic: Area under the receiver operating characteristic (ROC) curve | 0.6591 | 0.6592 | −1.00 × 10^−04 |
| Logistic: Hosmer-Lemeshow (chi-square statistic) | 1.3405 | 1.5596 | −2.19 × 10^−01 |
| Logistic: Hosmer-Lemeshow, P value (df) | .995 (8) | .991 (8) | 3.38 × 10^−03 |
| Cox: −2 log-likelihood | 66217.7270 | 66217.7270 | 1.46 × 10^−11 |
| Cox: Akaike information criterion | 66263.7270 | 66263.7270 | 1.46 × 10^−11 |
| Cox: Schwarz's Bayesian information criterion | 66409.6070 | 66409.6070 | 1.46 × 10^−11 |
| Cox: Median time to event (days) | 184 | 184 | 0 |
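The −2 log-likelihood and Akaike information criterion rows can be cross-checked against each other with the standard definition AIC = −2 log L + 2k, where k is the number of estimated parameters. The short check below assumes k equals the number of coefficient rows in the corresponding tables (24 for the logistic model including the intercept, 23 for the Cox model, which has no intercept); that counting is an inference from the tables, not a figure stated in the source.

```python
def aic(neg2_loglik, n_params):
    """Akaike information criterion from -2 log-likelihood and parameter count."""
    return neg2_loglik + 2 * n_params

# Parameter counts are assumed from the number of coefficient rows per table.
logistic_aic = aic(5423.2491, 24)   # logistic model, intercept included
cox_aic = aic(66217.7270, 23)       # Cox model, no intercept

print(round(logistic_aic, 4), round(cox_aic, 4))
```

Both values agree with the reported AIC rows to four decimals, which suggests the fit statistics and the coefficient tables describe the same models.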
Figure 3. Comparison of receiver operating characteristic curves between distributed logistic regression (left) and pooled individual-level logistic regression (right). To better protect privacy, individual-level predicted values were summarized in bins of 6 and transferred to the analysis center for aggregation in the distributed logistic regression analysis. The bin size is user-specified. ROC: receiver operating characteristic.
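The binning described in the Figure 3 caption can explain why the distributed area under the ROC curve differs from the pooled value in the fourth decimal while the coefficients agree to machine precision. The sketch below shows one plausible way such binning could work. The exact summary the application transfers is not described here, so the choice of sending each bin's mean predicted value with its event and nonevent counts is an assumption, as are the function names and the synthetic data.

```python
import random

def bin_predictions(preds, outcomes, bin_size=6):
    """Sort one site's predictions and summarize them in consecutive bins.

    Each bin is reduced to (mean prediction, events, nonevents), so no
    individual-level predicted value leaves the site (assumed summary format).
    """
    pairs = sorted(zip(preds, outcomes))
    groups = []
    for start in range(0, len(pairs), bin_size):
        chunk = pairs[start:start + bin_size]
        events = sum(o for _, o in chunk)
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        groups.append((mean_pred, events, len(chunk) - events))
    return groups

def auc(groups):
    """AUC from (score, events, nonevents) groups; tied scores count one half."""
    merged = {}
    for score, events, nonevents in groups:
        e, n = merged.get(score, (0, 0))
        merged[score] = (e + events, n + nonevents)
    total_events = sum(e for e, _ in merged.values())
    total_nonevents = sum(n for _, n in merged.values())
    concordant = 0.0
    nonevents_below = 0
    for score in sorted(merged):
        e, n = merged[score]
        concordant += e * (nonevents_below + 0.5 * n)
        nonevents_below += n
    return concordant / (total_events * total_nonevents)

# Synthetic predicted probabilities and outcomes for three hypothetical sites.
rng = random.Random(2)
sites = []
for _ in range(3):
    preds = [rng.uniform(0.05, 0.95) for _ in range(600)]
    outcomes = [int(rng.random() < p) for p in preds]
    sites.append((preds, outcomes))

# Pooled analysis: every individual prediction is available.
exact_groups = [(p, o, 1 - o) for preds, outs in sites for p, o in zip(preds, outs)]
# Distributed analysis: only per-site bins of 6 reach the analysis center.
binned_groups = [g for preds, outs in sites for g in bin_predictions(preds, outs)]

auc_exact = auc(exact_groups)
auc_binned = auc(binned_groups)
print(auc_exact, auc_binned)
```

Binning turns within-bin orderings into ties, so the two areas are close but not identical; a smaller bin size trades some privacy protection for a closer match.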
Figure 4. Comparison of survival functions between distributed Cox proportional hazards regression (left) and pooled individual-level Cox proportional hazards regression (right). The survival curves were evaluated at the mean value of covariates for patients with events.
Operational performance of the distributed regression analysis application.
| Performance metric | Linear | Logistic | Cox | Overall |
| Required number of iterations for model convergence | 2 | 6 | 6 | N/A |
| Total run time | 440.7 | 925.5 | 1,016.0 | N/A |
| Average iteration time, mean (SE) | 91.5 (10.5) | 95 (3.1) | 113.5 (5.2) | 102.4 (3.8) |
| Analysis center: average download time, mean (SE) | 20.5 (5.4) | 20.6 (1.3) | 39.4 (4) | 28.6 (3.2) |
| Analysis center: average computation time, mean (SE) | 4.3 (2.6) | 3 (1.1) | 4.4 (0.4) | 3.8 (0.6) |
| Analysis center: average upload time, mean (SE) | 8.4 (1.1) | 10.2 (0.7) | 9.9 (0.6) | 9.8 (0.4) |
| Analysis center: average file transfer time (to data partners), mean (SE) | 10.5 (0.4) | 9.1 (0.5) | 9.4 (0.5) | 9.4 (0.3) |
| Data partners: average download time, mean (SE) | 8.6 (1.2) | 10.3 (0.6) | 10.3 (0.8) | 10.1 (0.4) |
| Data partners: average computation time, mean (SE) | 8.2 (0.8) | 7.9 (0.4) | 8 (0.3) | 8 (0.2) |
| Data partners: average upload time, mean (SE) | 15.6 (1.2) | 15.9 (0.6) | 15.1 (0.3) | 15.5 (0.3) |
| Data partners: average file transfer time (to analysis center), mean (SE) | 20 (0.8) | 21.8 (1.9) | 23.1 (1.2) | 22.1 (1.0) |
N/A: not applicable.
Distributed logistic regression vs pooled individual-level logistic regression.
| Covariates | Distributed regression analysis: parameter | Distributed regression analysis: SE | Pooled individual-level analysis: parameter | Pooled individual-level analysis: SE | Difference in parameter estimate | Difference in SE |
| Intercept | 2.11573 | 0.22833 | 2.11573 | 0.22833 | −6.22 × 10^−15 | −1.00 × 10^−14 |
| Exposure | −1.06711 | 0.09895 | −1.06711 | 0.09895 | −2.00 × 10^−15 | −1.80 × 10^−16 |
| Age | −0.01606 | 0.00316 | −0.01607 | 0.00316 | −4.51 × 10^−17 | −1.57 × 10^−16 |
| Preindex BMI | 0.00003 | 0.00020 | 0.00003 | 0.00020 | 6.51 × 10^−19 | 2.44 × 10^−19 |
| Combined comorbidity score | −0.02623 | 0.02561 | −0.02623 | 0.02561 | −6.97 × 10^−16 | −3.12 × 10^−17 |
| Number of ambulatory visits | 0.01155 | 0.00447 | 0.01155 | 0.00447 | 6.25 × 10^−17 | 1.13 × 10^−17 |
| Number of emergency department visits | −0.06230 | 0.03132 | −0.06230 | 0.03133 | 3.05 × 10^−16 | 1.39 × 10^−17 |
| Number of inpatient visits | −0.12098 | 0.08940 | −0.12098 | 0.08940 | 1.75 × 10^−15 | −2.36 × 10^−16 |
| Number of nonacute institutional stays | 0.42510 | 0.78809 | 0.42510 | 0.78809 | −2.00 × 10^−15 | −3.33 × 10^−16 |
| Number of other ambulatory visits | 0.00381 | 0.00340 | 0.00381 | 0.00340 | 3.17 × 10^−17 | −2.91 × 10^−17 |
| Days between BMI measurement and index procedure | −0.00266 | 0.00201 | −0.00266 | 0.00201 | 3.90 × 10^−17 | −4.77 × 10^−18 |
| Race: Unknown | −0.39685 | 0.09485 | −0.39685 | 0.09485 | 0.00 × 10^+00 | −2.50 × 10^−16 |
| Race: American Indian or Alaska Native | −0.13938 | 0.26230 | −0.13938 | 0.26230 | −1.11 × 10^−16 | 5.55 × 10^−17 |
| Race: Asian | −0.37257 | 0.22341 | −0.37257 | 0.22341 | −3.04 × 10^−14 | 2.78 × 10^−17 |
| Race: Black or African American | −0.29617 | 0.10507 | −0.29617 | 0.10507 | −3.33 × 10^−16 | −9.71 × 10^−17 |
| Race: Native Hawaiian or Other Pacific Islander | −0.02910 | 0.40543 | −0.02910 | 0.40543 | −6.14 × 10^−16 | 0.00 × 10^+00 |
| Female | 0.19993 | 0.08422 | 0.19993 | 0.08422 | −1.80 × 10^−15 | −3.61 × 10^−16 |
| Surgery year: 2011 | −0.10269 | 0.11683 | −0.10269 | 0.11684 | 6.37 × 10^−15 | −5.55 × 10^−17 |
| Surgery year: 2012 | 0.05547 | 0.11897 | 0.05547 | 0.11897 | 5.45 × 10^−15 | −1.67 × 10^−16 |
| Surgery year: 2013 | −0.11956 | 0.11382 | −0.11956 | 0.11382 | 6.80 × 10^−15 | −1.94 × 10^−16 |
| Surgery year: 2014 | −0.10956 | 0.11617 | −0.10956 | 0.11617 | 4.36 × 10^−15 | −1.80 × 10^−16 |
| Surgery year: 2015 | 0.03701 | 0.12798 | 0.03701 | 0.12798 | 6.47 × 10^−15 | −2.50 × 10^−16 |
| Data partner site: 2 | −0.10433 | 0.11751 | −0.10433 | 0.11751 | 4.51 × 10^−15 | −9.99 × 10^−16 |
| Data partner site: 3 | 0.75506 | 0.12577 | 0.75506 | 0.12577 | 2.11 × 10^−15 | −2.50 × 10^−16 |
Reference groups: race (white), surgery year (2010), and data partner site (1).
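Unlike linear regression, logistic regression has no closed-form solution, which is consistent with the operational performance table reporting several iterations for the logistic and Cox models. One common way to distribute the fit is Newton-Raphson: in each iteration the analysis center sends the current coefficients, every site returns its score vector and information matrix, and the center sums them to take one update step. The sketch below illustrates that scheme on synthetic data. Whether the application uses exactly this algorithm is an assumption, and all function names and simulated values are hypothetical.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def site_summaries(X, y, beta):
    """One site's score vector and information matrix at the current beta."""
    p = len(beta)
    score = [0.0] * p
    info = [[0.0] * p for _ in range(p)]
    for row, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, row))
        mu = 1.0 / (1.0 + math.exp(-eta))
        for i in range(p):
            score[i] += (yi - mu) * row[i]
            for j in range(p):
                info[i][j] += mu * (1.0 - mu) * row[i] * row[j]
    return score, info

def fit(sites, p, tol=1e-10, max_iter=25):
    """Newton-Raphson using only summed per-site summaries."""
    beta = [0.0] * p
    for iteration in range(1, max_iter + 1):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for X, y in sites:
            s, h = site_summaries(X, y, beta)
            for i in range(p):
                score[i] += s[i]
                for j in range(p):
                    info[i][j] += h[i][j]
        step = solve(info, score)
        beta = [b + d for b, d in zip(beta, step)]
        if max(abs(d) for d in step) < tol:
            return beta, iteration
    raise RuntimeError("Newton-Raphson did not converge")

def simulate_site(n, rng):
    """Synthetic rows (hypothetical data): intercept, exposure, standardized age."""
    X, y = [], []
    for _ in range(n):
        exposure = float(rng.random() < 0.5)
        z = rng.gauss(0, 1)
        eta = 0.5 - 1.0 * exposure + 0.3 * z
        X.append([1.0, exposure, z])
        y.append(float(rng.random() < 1.0 / (1.0 + math.exp(-eta))))
    return X, y

rng = random.Random(1)
sites = [simulate_site(300, rng) for _ in range(3)]
beta_distributed, n_iter = fit(sites, 3)

# Pooled fit: the same routine applied to one stacked dataset.
X_all = [r for X, _ in sites for r in X]
y_all = [v for _, y in sites for v in y]
beta_pooled, _ = fit([(X_all, y_all)], 3)

max_diff = max(abs(a - b) for a, b in zip(beta_distributed, beta_pooled))
print(beta_distributed, n_iter, max_diff)
```

Because the summed score and information matrix equal their pooled counterparts, every Newton step is the same up to floating-point rounding, which is why the distributed and pooled estimates can agree so closely.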