Khaled El Emam, Saeed Samet, Luk Arbuckle, Robyn Tamblyn, Craig Earle, Murat Kantarcioglu.
Abstract
BACKGROUND: There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak.
Year: 2012 PMID: 22871397 PMCID: PMC3628043 DOI: 10.1136/amiajnl-2011-000735
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Overview of set-up for implementing the SPARK protocol when there are only three sites. This figure is only reproduced in colour in the online version.
Size of simulated datasets (excluding intercept)
| Number of covariates | 5 | 10 | 15 | 20 |
|---|---|---|---|---|
| Number of observations | 200 | 400 | 600 | 800 |
Computation time for different datasets assuming two parties
| No of covariates | Type | Time (min) |
|---|---|---|
| 5 | iid | 0.0286 |
| 5 | Correlated | 0.0244 |
| 5 | Binary | 0.0238 |
| 10 | iid | 0.1836 |
| 10 | Correlated | 0.1395 |
| 10 | Binary | 0.1249 |
| 15 | iid | 0.6669 |
| 15 | Correlated | 0.4336 |
| 15 | Binary | 0.3935 |
| 20 | iid | 0.9804 |
| 20 | Correlated | 1.0159 |
| 20 | Binary | 1.0026 |
Time is the average across 5000 replicates.
iid, independent identically distributed.
Figure 2. Performance in seconds as the number of records increases from 100 000 to 1 million for two to five sites. This figure is only reproduced in colour in the online version.
Absolute difference between SPARK and SAS estimates for intercept and five covariates, based on a simple bootstrap of 5000 replicates*, with a recorded precision of 10e-9 for estimates
| No of covariates | Type | Estimate | Maximum absolute difference between estimates (×10e-6) | | | | |
|---|---|---|---|---|---|---|---|
| 5 | iid | 0.073 | 0.257 | 0.082 | 0.094 | 0.117 | 0.017 |
| 5 | Correlated | 0.060 | 0.229 | 0.133 | 0.061 | 0.084 | 0.158 |
| 5 | Binary | 0.071 | 0.447 | 0.110 | 0.079 | 0.233 | 0.126 |
| 10 | iid | 0.555 | 2.050 | 0.589 | 0.162 | 0.716 | 0.740 |
| 10 | Correlated | 0.025 | 0.089 | 0.032 | 0.017 | 0.036 | 0.059 |
| 10 | Binary | 0.023 | 0.072 | 0.025 | 0.024 | 0.027 | 0.027 |
| 15 | iid | 0.930 | 4.340 | 0.980 | 0.807 | 1.850 | 1.510 |
| 15 | Correlated | 0.016 | 0.075 | 0.041 | 0.028 | 0.030 | 0.027 |
| 15 | Binary | 0.049 | 0.120 | 0.034 | 0.034 | 0.042 | 0.028 |
| 20 | iid | 0.021 | 0.093 | 0.026 | 0.040 | 0.033 | 0.028 |
| 20 | Correlated | 0.041 | 0.200 | 0.087 | 0.017 | 0.094 | 0.058 |
| 20 | Binary | 0.114 | 1.330 | 0.334 | 0.220 | 0.530 | 0.360 |
| 5 | iid | 0.017 | 0.069 | 0.020 | 0.023 | 0.031 | 0.024 |
| 5 | Correlated | 0.015 | 0.057 | 0.035 | 0.018 | 0.020 | 0.037 |
| 5 | Binary | 0.032 | 0.206 | 0.029 | 0.019 | 0.107 | 0.032 |
| 10 | iid | 0.376 | 1.504 | 0.429 | 0.142 | 0.524 | 0.533 |
| 10 | Correlated | 0.006 | 0.019 | 0.007 | 0.005 | 0.007 | 0.015 |
| 10 | Binary | 0.004 | 0.017 | 0.004 | 0.004 | 0.005 | 0.005 |
| 15 | iid | 1.548 | 8.500 | 1.960 | 1.577 | 2.860 | 1.790 |
| 15 | Correlated | 0.002 | 0.018 | 0.005 | 0.003 | 0.005 | 0.004 |
| 15 | Binary | 0.005 | 0.009 | 0.003 | 0.002 | 0.004 | 0.003 |
| 20 | iid | 0.005 | 0.025 | 0.006 | 0.002 | 0.009 | 0.008 |
| 20 | Correlated | 0.009 | 0.052 | 0.023 | 0.005 | 0.026 | 0.019 |
| 20 | Binary | 0.073 | 1.239 | 0.301 | 0.205 | 0.514 | 0.340 |
*Replicates in which complete or quasi-complete separation was detected in SAS were excluded. This occurred in less than 2.5% of replicates for all but the dataset with 15 iid covariates, in which separation was detected in 16.6% of replicates.
iid, independent identically distributed.
Examples of link functions
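| Name | Function |
|---|---|
| Identity | $g(\mu) = \mu$ |
| Reciprocal | $g(\mu) = 1/\mu$ |
| Reciprocal squared | $g(\mu) = 1/\mu^{2}$ |
| Square root | $g(\mu) = \sqrt{\mu}$ |
| Log | $g(\mu) = \log(\mu)$ |
| Complementary log-log | $g(\mu) = \log(-\log(1 - \mu))$ |
| Logit | $g(\mu) = \log\left(\frac{\mu}{1 - \mu}\right)$ |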
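
The link functions named above can be sketched in a few lines of Python. This is only an illustration using the standard textbook definitions of each link, not code from the SPARK protocol. The dictionary `LINKS` and the helper `round_trip` are hypothetical names introduced here, and the sign convention for the reciprocal link is assumed to be $1/\mu$.

```python
import math

# Each entry maps a link name to a pair (g, g_inverse), where eta = g(mu).
# Standard textbook definitions are assumed; all names are illustrative only.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "reciprocal": (lambda mu: 1.0 / mu, lambda eta: 1.0 / eta),
    "reciprocal_squared": (
        lambda mu: 1.0 / mu ** 2,
        lambda eta: 1.0 / math.sqrt(eta),
    ),
    "square_root": (math.sqrt, lambda eta: eta ** 2),
    "log": (math.log, math.exp),
    "cloglog": (
        lambda mu: math.log(-math.log(1.0 - mu)),
        lambda eta: 1.0 - math.exp(-math.exp(eta)),
    ),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}


def round_trip(name, mu):
    """Apply the named link to mu, then its inverse; should return mu."""
    g, g_inv = LINKS[name]
    return g_inv(g(mu))


if __name__ == "__main__":
    for name in LINKS:
        print(name, LINKS[name][0](0.3))
```

The logit link is the one used in logistic regression: it maps a probability in (0, 1) to the whole real line, so a linear predictor can be fitted without constraining its range.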