Khaled El Emam, Saeed Samet, Jun Hu, Liam Peyton, Craig Earle, Gayatri C Jayaraman, Tom Wong, Murat Kantarcioglu, Fida Dankar, Aleksander Essex.
Abstract
INTRODUCTION: Monitoring the effectiveness of HPV vaccination in Canada may require linking multiple data registries. These registries are not always managed by the same organization, and privacy legislation or practices may restrict which record linkages can actually be performed among them. The objective of this study was to develop a secure protocol for linking data from different registries and to allow ongoing monitoring of HPV vaccine effectiveness.
Year: 2012 PMID: 22768321 PMCID: PMC3388071 DOI: 10.1371/journal.pone.0039915
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Example of a contingency table for which we want to compute a bivariate relationship.
| | Any HPV (R1) | |
| | −ve | +ve |
| Aboriginal | $n_{11}$ | $n_{12}$ |
| White | $n_{21}$ | $n_{22}$ |
Figure 1. Different architectures in the literature for linking two registries.
Example of a contingency table for which there is a high identity disclosure risk within the population.
| | Any HPV | |
| | −ve | +ve |
| Aboriginal | 1 | 11 |
| White | 50 | 15 |
Example of a contingency table for which there is a high identity disclosure risk from an external attacker.
| | Any HPV | |
| | −ve | +ve |
| Aboriginal | 0 | 5 |
| White | 50 | 15 |
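The two tables above can be screened mechanically. The following is a minimal sketch of one common heuristic, flagging any cell whose count falls below a cutoff; the function name `risky_cells` and the cutoff of 5 are illustrative assumptions, not values taken from the study.

```python
from typing import Dict, List, Tuple

# A contingency table as {row label: {column label: count}}.
Table = Dict[str, Dict[str, int]]


def risky_cells(table: Table, threshold: int = 5) -> List[Tuple[str, str, int]]:
    """Return (row, column, count) for every cell with count below `threshold`.

    The threshold is a hypothetical cutoff chosen for illustration.
    Zero cells are flagged too: when one cell in a row is zero, everyone
    in that row falls in the remaining column(s).
    """
    flagged = []
    for row, cells in table.items():
        for col, count in cells.items():
            if count < threshold:
                flagged.append((row, col, count))
    return flagged


# Cell counts copied from the two example tables above.
table_small_cell = {
    "Aboriginal": {"-ve": 1, "+ve": 11},
    "White": {"-ve": 50, "+ve": 15},
}
table_zero_cell = {
    "Aboriginal": {"-ve": 0, "+ve": 5},
    "White": {"-ve": 50, "+ve": 15},
}

print(risky_cells(table_small_cell))  # [('Aboriginal', '-ve', 1)]
print(risky_cells(table_zero_cell))   # [('Aboriginal', '-ve', 0)]
```

In both examples the only flagged cell is the Aboriginal, −ve cell, which is the cell the captions single out.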
Example of contingency table with suppressed cells.
| | Any HPV | |
| | −ve | +ve |
| Aboriginal | – | 11 |
| White | 50 | 15 |
Example of contingency table that would reveal the contents of the suppressed cells in Table 4.
| | Age | |
| | <20 | |
| Aboriginal | 6 | 6 |
| White | 50 | 15 |
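The disclosure works by simple differencing: the age table exposes the Aboriginal row total, and subtracting the published HPV cell leaves the suppressed one. A minimal sketch, assuming both tables describe the same set of individuals (the helper name `recover_suppressed` is hypothetical):

```python
from typing import List


def recover_suppressed(row_total: int, published_cells: List[int]) -> int:
    """Recover a single suppressed cell in a row from the row total."""
    return row_total - sum(published_cells)


# Row total for "Aboriginal" taken from the age table: 6 + 6.
aboriginal_total = 6 + 6

# The HPV table publishes only the +ve cell (11) for that row.
recovered = recover_suppressed(aboriginal_total, [11])
print(recovered)  # 1
```

The recovered count of 1 is the same small cell that appears in the earlier high-risk example, so suppressing it in one table offers no protection once the second table is released.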
Example of contingency table with suppressed cells.
| | Any HPV | |
| | −ve | +ve |
| Aboriginal | – | 5 |
| White | 50 | 15 |
Example of contingency table that would reveal the contents of the suppressed cells in Table 6.
| | Any HPV | |
| | −ve | +ve |
| Male | 25 | 2 |
| Female | 25 | 18 |
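Here the leak comes from the column totals rather than the row totals. Assuming the table by sex covers the same individuals as the table by ethnicity, the arithmetic can be written out as:

```latex
\begin{aligned}
\text{total } {-}\text{ve} &= 25 + 25 = 50, \\
\text{Aboriginal } {-}\text{ve} &= \text{total } {-}\text{ve} - \text{White } {-}\text{ve} = 50 - 50 = 0, \\
\text{total } {+}\text{ve} &= 2 + 18 = 20 = 5 + 15 .
\end{aligned}
```

The last line is a consistency check: the +ve column totals of the two tables agree, which supports reading them as the same population, and the suppressed cell is then forced to be 0.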
Figure 2. The public health unit sends a query for a particular cell within the desired contingency table.
Figure 3. An example showing how a registry responds to a request for counts.
A sequence of messages is generated for each patient.
Example of the matching performed by Aggregator 1 and Aggregator 2 based on the hash values that they receive.
| Aggregator 1 Matching Table | |
| Registry 1 | Registry 2 |
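Matching on hash values can be illustrated with a short sketch. Everything specific here is an assumption made for illustration: the identifiers, the shared key, and the use of HMAC-SHA256 are stand-ins, and the protocol's actual hashing or encryption scheme is not specified in this text.

```python
import hashlib
import hmac
from typing import Iterable, Set


def keyed_hash(identifier: str, key: bytes) -> str:
    """Hash a patient identifier with a secret key (HMAC-SHA256 as a stand-in)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def match(hashes_r1: Iterable[str], hashes_r2: Iterable[str]) -> Set[str]:
    """Return the hash values present in both registries' submissions.

    The party doing the matching sees only hash values, never the
    underlying identifiers.
    """
    return set(hashes_r1) & set(hashes_r2)


# Hypothetical key and identifiers, for illustration only.
shared_key = b"hypothetical-shared-secret"
registry1_ids = ["HC-1001", "HC-1002", "HC-1003"]
registry2_ids = ["HC-1002", "HC-1003", "HC-1004"]

h1 = [keyed_hash(i, shared_key) for i in registry1_ids]
h2 = [keyed_hash(i, shared_key) for i in registry2_ids]

matched = match(h1, h2)
print(len(matched))  # 2
```

A keyed hash is used in the sketch rather than a plain hash because identifiers drawn from a small space could otherwise be recovered by hashing every candidate value.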
Notation for computing statistics.
| | Any HPV | |
| | −ve | +ve |
| | $n_{11} = n_{1,11} + n_{2,11}$ | $n_{12} = n_{1,12} + n_{2,12}$ |
| | $n_{21} = n_{1,21} + n_{2,21}$ | $n_{22} = n_{1,22} + n_{2,22}$ |
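The notation splits each cell count into two addends. The sketch below reads the first subscript as indexing the party that holds each addend (an interpretation, not stated in this text) and computes the three statistics in the clear once the addends are combined. In a secure protocol the addends would not be revealed this way; the plaintext sum only illustrates the arithmetic. The share values are hypothetical, and the relative risk is oriented as risk of +ve in row 1 over row 2, which is also an assumption.

```python
from typing import Dict

Cells = Dict[str, int]  # keys "11", "12", "21", "22"


def combine(share1: Cells, share2: Cells) -> Cells:
    """n_ij = n_{1,ij} + n_{2,ij} for every cell."""
    return {cell: share1[cell] + share2[cell] for cell in share1}


def odds_ratio(n: Cells) -> float:
    return (n["11"] * n["22"]) / (n["12"] * n["21"])


def relative_risk(n: Cells) -> float:
    # Assumed orientation: risk of +ve (second column) in row 1 vs row 2.
    risk_row1 = n["12"] / (n["11"] + n["12"])
    risk_row2 = n["22"] / (n["21"] + n["22"])
    return risk_row1 / risk_row2


def chi_square(n: Cells) -> float:
    """Pearson chi-square for a 2x2 table, without continuity correction."""
    a, b, c, d = n["11"], n["12"], n["21"], n["22"]
    total = a + b + c + d
    return total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical addends held by the two parties.
party1 = {"11": 1, "12": 4, "21": 20, "22": 5}
party2 = {"11": 0, "12": 7, "21": 30, "22": 10}

n = combine(party1, party2)
print(n)  # {'11': 1, '12': 11, '21': 50, '22': 15}
print(odds_ratio(n), relative_risk(n), chi_square(n))
```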
Figure 4. The flow of information between the Aggregators and the PHU.
Summary of the computation time range for each of the statistics.
| Statistic | Total Computation Time (max → min) |
| Chi-square | |
| Odds Ratio | |
| Relative Risk | |
Summary of the inputs and outputs for the building block and analysis protocols.
| Protocol | Inputs | Outputs | Equation |
| Two-party addition | | | |
| Two-party multiplication | | | |
| Odds Ratio | | | |
| Chi-square | | | |
| Relative Risk | | | |
| Confidence Interval for Odds Ratio | | | |
| Confidence Interval for Relative Risk | | | |
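As a point of reference, the textbook forms of these statistics for a 2×2 table with cells $n_{ij}$ are sketched below. The orientation of the relative risk (risk of +ve in row 1 over row 2) and the use of log-scale Wald intervals are assumptions; the protocol's own equations may be arranged differently.

```latex
\begin{aligned}
N &= n_{11} + n_{12} + n_{21} + n_{22}, \\[4pt]
\chi^2 &= \frac{N\,(n_{11} n_{22} - n_{12} n_{21})^2}
              {(n_{11}+n_{12})(n_{21}+n_{22})(n_{11}+n_{21})(n_{12}+n_{22})}, \\[4pt]
\mathrm{OR} &= \frac{n_{11}\, n_{22}}{n_{12}\, n_{21}}, \qquad
\mathrm{CI}_{\mathrm{OR}} = \exp\!\left( \ln \mathrm{OR} \pm z_{1-\alpha/2}
   \sqrt{\frac{1}{n_{11}} + \frac{1}{n_{12}} + \frac{1}{n_{21}} + \frac{1}{n_{22}}} \right), \\[4pt]
\mathrm{RR} &= \frac{n_{12} / (n_{11}+n_{12})}{n_{22} / (n_{21}+n_{22})}, \qquad
\mathrm{CI}_{\mathrm{RR}} = \exp\!\left( \ln \mathrm{RR} \pm z_{1-\alpha/2}
   \sqrt{\frac{1}{n_{12}} - \frac{1}{n_{11}+n_{12}} + \frac{1}{n_{22}} - \frac{1}{n_{21}+n_{22}}} \right).
\end{aligned}
```

Each expression is built from sums and products of cell counts, which is consistent with two-party addition and multiplication being listed as the building blocks.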
Figure 5. The average computation times for the chi-square test when the total number of records returned by the queries from the two registries is 5,000, 10,000, 50,000, or 100,000, for 4-cell and 16-cell contingency tables.
Figure 6. The average computation times for the chi-square test as the proportion of records matching varies for different data set sizes for contingency tables with 4 and 16 cells.
Figure 7. The average computation times for the chi-square test as the number of cells varies for different data set sizes when the proportion of records matching (m) is 90% and 20%.