Chodziwadziwa W Kabudula, Benjamin D Clark, Francesc Xavier Gómez-Olivé, Stephen Tollman, Jane Menken, Georges Reniers.
Abstract
BACKGROUND: Health and Demographic Surveillance Systems (HDSS) have been instrumental in advancing population and health research in low- and middle-income countries where vital registration systems are often weak. However, the utility of HDSS would be enhanced if their databases could be linked with those of local health facilities. We assess the feasibility of record linkage in rural South Africa using data from the Agincourt HDSS and a local health facility.
Year: 2014 PMID: 24884457 PMCID: PMC4041350 DOI: 10.1186/1471-2288-14-71
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Linkage scenarios by identifiers and string comparison techniques applied to names
| Identifiers used | | | | | | |
|---|---|---|---|---|---|---|
| Routinely collected identifiers* | S1 | S2 | S3 | S4 | S5 | S6 |
| Routinely collected identifiers + household member first name | S7 | S8 | S9 | S10 | S11 | S12 |
| Routinely collected identifiers + household member first name and surname | | | S13 | S14 | S15 | |
| Deterministic linkage on National ID Number or telephone number followed by best of S1-S15** | | | | | | S16 |
| S16 + clerical review of 5%, 10%, 15%, and 20% of record pairs above and below the threshold value above which record pairs are automatically accepted as matches | S17-S20 | | | | | |
*Routinely collected identifiers = first name, last name, sex, day of birth, month of birth, year of birth and village; JW = Jaro-Winkler; DM = double metaphone code.
**The best of the 15 probabilistic linkage scenarios is the one that yields the maximum sensitivity and PPV.
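The footnote above names Jaro-Winkler (JW) as one of the string comparison techniques applied to names. As a rough illustration of what such a comparator computes, the following is a minimal sketch of the standard Jaro-Winkler similarity. It is not the study's implementation: the function names, the prefix scaling factor of 0.1, the four-character prefix limit, and the example names are all assumptions chosen for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if equal and within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    flags1 = [False] * len1
    flags2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not flags2[j] and s2[j] == ch:
                flags1[i] = flags2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    matched1 = [c for c, f in zip(s1, flags1) if f]
    matched2 = [c for c, f in zip(s2, flags2) if f]
    transpositions = sum(a != b for a, b in zip(matched1, matched2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


# Hypothetical name pair (not taken from the study data).
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))
```

In a probabilistic linkage, a score like this would typically be compared against a cut-off to decide whether two name fields agree; the cut-off used in the study is not stated in this excerpt.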
Completeness of identifiers from both sources
| Identifier | | |
|---|---|---|
| First name | 100.00 | 100.00 |
| Surname | 100.00 | 100.00 |
| Other first name | 35.57 | 6.14 |
| Sex | 100.00 | 99.95 |
| Date of birth | 100.00 | 100.00 |
| Village | 100.00 | 81.17 |
| Household member first name | 98.48 | 77.29 |
| Household member surname | 98.48 | 76.60 |
| ID number | 67.14 | 1.55 |
| Telephone number | 37.48 | 26.67 |
Figure 1. Sensitivity and positive predictive values (PPVs) in various linkage scenarios. See Table 1 for a description of the scenarios.
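Sensitivity and PPV, the two measures plotted in the figure, can be computed from counts of linked record pairs once a reference standard is available. The sketch below uses the usual definitions; the function name and the counts in the usage line are hypothetical and are not results from the study.

```python
def sensitivity_and_ppv(true_links: int, false_links: int, missed_links: int):
    """Return (sensitivity, PPV) for one linkage scenario.

    true_links   -- pairs linked by the scenario that are true matches
    false_links  -- pairs linked by the scenario that are not true matches
    missed_links -- true matches that the scenario failed to link
    """
    if true_links + missed_links == 0 or true_links + false_links == 0:
        raise ValueError("sensitivity or PPV is undefined for these counts")
    # Sensitivity: share of all true matches that the scenario found.
    sensitivity = true_links / (true_links + missed_links)
    # PPV: share of the scenario's links that are true matches.
    ppv = true_links / (true_links + false_links)
    return sensitivity, ppv


# Hypothetical counts, for illustration only.
sens, ppv = sensitivity_and_ppv(true_links=80, false_links=10, missed_links=20)
print(f"sensitivity={sens:.3f}, PPV={ppv:.3f}")
```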
Background characteristics associated with successful matching in the dataset of records matched by means of fingerprints
| | | | ||||||
|---|---|---|---|---|---|---|---|---|
| | 623 | 492 (79.0) | | 551 (88.4) | | 552 (88.6) | | |
| | | | | | | | ||
| | Female | 511 | 395 (77.3) | 1 | 445 (87.1) | 1 | 447 (87.5) | 1 |
| | Male | 112 | 97 (86.6) | 2.86 (1.41-5.82)* | 106 (94.6) | 4.38 (1.52-12.61)* | 105 (93.8) | 3.34 (1.25-8.97)* |
| | | | | | | | ||
| | 18-34 | 334 | 284 (85.0) | 1 | 308 (92.2) | 1 | 308 (92.2) | 1 |
| | 35-49 | 125 | 100 (80.0) | 0.99 (0.53-1.84) | 112 (89.6) | 0.84 (0.36-1.93) | 115 (92.0) | 1.21 (0.5-2.92) |
| | 50-64 | 89 | 66 (74.2) | 0.76 (0.35-1.66) | 78 (87.6) | 0.75 (0.27-2.14) | 77 (86.5) | 0.75 (0.27-2.12) |
| | 65+ | 75 | 42 (56.0) | 0.35 (0.15-0.85)* | 53 (70.7) | 0.21 (0.07-0.63)* | 52 (69.3) | 0.25 (0.08-0.74)* |
| | | | | | | | ||
| | Other | 96 | 67 (70.0) | 1 | 76 (79.2) | 1 | 75 (78.1) | 1 |
| | South African | 527 | 425 (80.7) | 1.3 (0.71-2.37) | 475 (90.1) | 1.82 (0.88-3.77) | 477 (90.5) | 2.1 (1.02-4.33)* |
| | | | | | | | ||
| | Permanent | 574 | 450 (78.4) | 1 | 506 (88.1) | 1 | 507 (88.3) | 1 |
| | Temporary and other | 49 | 42 (85.7) | 1.63 (0.54-4.88) | 45 (91.8) | 1.28 (0.28-5.89) | 45 (91.8) | 1.4 (0.31-6.44) |
| | | | | | | |||
| | None | 97 | 54 (55.7) | 1 | 71 (73.2) | 1 | 69 (71.1) | 1 |
| | Some primary | 191 | 144 (75.4) | 1.46 (0.76-2.83) | 164 (85.8) | 1.16 (0.51-2.63) | 166 (87.0) | 1.43 (0.64-3.22) |
| | Post primary | 302 | 267 (88.4) | 2.73 (1.18-6.36)* | 288 (95.4) | 2.62 (0.87-7.92) | 288 (95.4) | 3.05 (1.01-9.24)* |
| | | | | | | | ||
| | Not working | 514 | 413 (80.4) | 1 | 462 (89.8) | 1 | 460 (89.5) | 1 |
| | Working | 93 | 70 (75.3) | 0.68 (0.37-1.25) | 79 (85.0) | 0.53 (0.25-1.14) | 81 (87.1) | 0.71 (0.32-1.58) |
| | | | | | | | ||
| | Lowest | 44 | 28 (63.6) | 1 | 33 (75.0) | 1 | 34 (77.3) | 1 |
| | Second | 84 | 62 (73.8) | 1.48 (0.63-3.49) | 75 (89.3) | 2.42 (0.84-6.98) | 73 (86.9) | 1.63 (0.57-4.62) |
| | Middle | 125 | 100 (80.0) | 1.89 (0.82-4.37) | 108 (86.4) | 1.60 (0.6-4.25) | 110 (88.0) | 1.58 (0.58-4.36) |
| | Fourth | 172 | 136 (79.1) | 1.81 (0.8-4.11) | 152 (88.3) | 2.08 (0.78-5.54) | 150 (87.2) | 1.47 (0.55-3.93) |
| | Highest | 184 | 159 (86.4) | 2.9 (1.24-6.75)* | 174 (94.5) | 4.4 (1.51-12.84)* | 175 (95.1) | 4.03 (1.34-12.17)* |
| | | | | | | | ||
| Pseudo R2, Wald | | | 0.11, 56.89 (<0.0001) | | 0.16, 51.94 (<0.0001) | | 0.16, 53.76 (<0.0001) | |
Statistical significance: * = p-value < 0.05.
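To make the ratio columns in the table above easier to read, the sketch below computes a crude (unadjusted) odds ratio with a Wald 95% confidence interval from the counts in one pair of rows. It is an illustration only: the table reports a pseudo R2 and a Wald statistic, which suggests its ratios come from a multivariable model, so a crude value computed this way need not agree with the tabulated one. The function name is hypothetical.

```python
import math


def crude_odds_ratio(matched_grp, total_grp, matched_ref, total_ref, z=1.96):
    """Unadjusted odds ratio of being matched, group vs. reference,
    with a Wald confidence interval on the log-odds scale."""
    a = matched_grp                 # group, matched
    b = total_grp - matched_grp     # group, not matched
    c = matched_ref                 # reference, matched
    d = total_ref - matched_ref     # reference, not matched
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the first results column of the table:
# males 97 of 112 matched, females (reference) 395 of 511 matched.
or_, lo, hi = crude_odds_ratio(97, 112, 395, 511)
print(f"crude OR = {or_:.2f} ({lo:.2f}-{hi:.2f})")
```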
Distribution of background characteristics in the dataset matched by means of fingerprints compared to three datasets of records matched using conventional personal identifiers
| | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | Female | 511 (82.0) | 395 (80.3) | | 445 (80.8) | | 447 (81.0) | |
| | Male | 112 (18.0) | 97 (19.7) | 0.460 | 106 (19.2) | 0.579 | 105 (19.0) | 0.645 |
| | | | | | | | ||
| | 18-34 | 334 (53.6) | 284 (57.7) | | 308 (55.9) | | 308 (55.8) | |
| | 35-49 | 125 (20.1) | 100 (20.3) | | 112 (20.3) | | 115 (20.8) | |
| | 50-64 | 89 (14.3) | 66 (13.4) | | 78 (14.2) | | 77 (14.0) | |
| | 65+ | 75 (12.0) | 42 (8.5) | 0.240 | 53 (9.6) | 0.601 | 52 (9.4) | 0.528 |
| | | | | | | | ||
| | Other | 96 (15.4) | 67 (13.6) | | 76 (13.8) | | 75 (13.6) | |
| | South African | 527 (84.6) | 425 (86.4) | 0.401 | 475 (86.2) | 0.434 | 477 (86.4) | 0.377 |
| | | | | | | | ||
| | Permanent | 574 (92.1) | 450 (91.5) | | 506 (91.8) | | 507 (91.8) | |
| | Temporary and other | 48 (7.7) | 42 (8.5) | 0.595 | 45 (8.2) | 0.617 | 45 (8.2) | 0.618 |
| | | | | | | | ||
| | None | 97 (15.6) | 54 (11.0) | | 71 (12.9) | | 69 (12.5) | |
| | Some primary | 191 (30.7) | 144 (29.3) | | 164 (29.8) | | 166 (30.1) | |
| | Post primary | 302 (48.5) | 267 (54.3) | 0.098 | 288 (52.3) | 0.491 | 288 (52.2) | 0.426 |
| | | | | | | | ||
| | Not working | 514 (82.5) | 413 (83.9) | | 462 (83.4) | | 460 (83.3) | |
| | Working | 93 (14.9) | 70 (14.2) | 0.660 | 79 (14.3) | 0.643 | 81 (14.7) | 0.795 |
| | | | | | | | ||
| | Lowest | 44 (7.1) | 28 (5.7) | | 33 (6.0) | | 34 (6.2) | |
| | Second | 84 (13.5) | 62 (12.6) | | 75 (13.6) | | 73 (13.2) | |
| | Middle | 125 (20.1) | 100 (20.3) | | 108 (19.6) | | 110 (19.9) | |
| | Fourth | 172 (27.6) | 136 (27.6) | | 152 (27.6) | | 150 (27.2) | |
| | Highest | 184 (29.5) | 159 (32.3) | 0.753 | 174 (31.6) | 0.912 | 175 (31.7) | 0.952 |
*p-value using chi-squared test computed separately for records in each scenario compared to records matched by means of fingerprints.
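The chi-squared comparison described in the footnote can be reproduced for a two-category characteristic with a short calculation. The sketch below is a minimal Pearson chi-squared test for a 2x2 table without continuity correction, using only the standard library; the function name is hypothetical, and whether the authors applied a continuity correction is not stated in this excerpt.

```python
import math


def chi2_test_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (1 degree of freedom, no continuity
    correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Sex distribution from the table: fingerprint-matched records
# (511 female, 112 male) vs. the first comparison dataset (395 female, 97 male).
chi2, p = chi2_test_2x2(511, 112, 395, 97)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}")
```

For these counts the p-value comes out at about 0.46, in line with the 0.460 shown for sex in the table. Characteristics with more than two categories would need the general r x c form of the test.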