Christopher T Rentsch1, Georges Reniers1,2, Chodziwadziwa Kabudula2, Richard Machemba3, Baltazar Mtenga3, Katie Harron4, Paul Mee5, Denna Michael3, Redempta Natalis6, Mark Urassa3, Jim Todd1,3, Basia Zaba1.
Abstract
INTRODUCTION: Health and demographic surveillance systems (HDSS) have been an invaluable resource for monitoring the health status of populations, but they often contain self-reported data on health service utilisation, which are subject to reporting bias.
Keywords: data linkage; health and demographic surveillance systems; health facility; point-of-contact interactive record linkage; sub-Saharan Africa
Year: 2017 PMID: 30613799 PMCID: PMC6314455 DOI: 10.23889/ijpds.v2i1.408
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
Abbreviations: HDSS = health and demographic surveillance system; nM = number of matches; m-prob = match probability; TCL = ten-cell leader; HH = household member
Notes: TCL = an individual for a group of ten households; % collected = proportion of matched records with completed information; % agreement = proportion of matched records with agreeing information
| Field | Agreement condition | m-prob | First search: % collected | First search: % agreement | Matched search: % collected | Matched search: % agreement | Δ % collected (matched - first) | Δ % agreement (matched - first) |
|---|---|---|---|---|---|---|---|---|
| First name | Jaro-Winkler ≥ 0.8 | 0.87 | 100.0% | 83.8% | 100.0% | 94.1% | 0.0% | 10.3% |
| Second name | Jaro-Winkler ≥ 0.8 | 0.87 | 100.0% | 77.9% | 100.0% | 87.9% | 0.0% | 10.1% |
| Third name | Jaro-Winkler ≥ 0.8 | 0.85 | 83.4% | 5.7% | 82.0% | 5.3% | -1.4% | -0.3% |
| TCL first name | Jaro-Winkler ≥ 0.8 | 0.87 | 44.8% | 15.1% | 65.8% | 42.9% | 20.9% | 27.8% |
| TCL second name | Jaro-Winkler ≥ 0.8 | 0.87 | 39.4% | 13.6% | 60.8% | 40.9% | 21.5% | 27.3% |
| TCL third name | Jaro-Winkler ≥ 0.8 | 0.85 | 0.2% | 0.0% | 0.2% | 0.2% | 0.0% | 0.1% |
| HH first name | Jaro-Winkler ≥ 0.8 | 0.52 | 90.5% | 70.1% | 93.2% | 75.2% | 2.7% | 5.1% |
| HH second name | Jaro-Winkler ≥ 0.8 | 0.52 | 89.6% | 64.3% | 92.2% | 70.8% | 2.6% | 6.5% |
| HH third name | Jaro-Winkler ≥ 0.8 | 0.52 | 4.1% | 1.1% | 4.4% | 1.1% | 0.3% | 0.0% |
| Sex | exact match | 0.99 | 99.8% | 97.6% | 99.8% | 97.7% | 0.0% | 0.1% |
| Year of birth | within 2 years | 0.80 | 98.7% | 84.9% | 99.1% | 87.0% | 0.4% | 2.1% |
| Month of birth | exact match | 0.63 | 3.7% | 1.4% | 4.0% | 1.6% | 0.3% | 0.2% |
| Day of birth | exact match | 0.57 | 3.6% | 1.0% | 3.9% | 1.2% | 0.3% | 0.2% |
| Village | exact match | 0.89 | 90.9% | 83.3% | 93.0% | 89.4% | 2.1% | 6.1% |
| Sub-village | exact match | 0.89 | 90.9% | 67.2% | 93.0% | 78.0% | 2.1% | 10.8% |
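The agreement conditions in the table above (Jaro-Winkler ≥ 0.8 for names, exact match for sex and location, year of birth within 2 years) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the lower-casing of names before comparison, and the standard Winkler prefix scale of 0.1 are assumptions made for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity between two strings (0.0 to 1.0)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this sliding window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro-Winkler similarity: Jaro boosted for a shared prefix (max 4 chars)."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1 - j)


def name_agrees(a: str, b: str, threshold: float = 0.8) -> bool:
    """Name fields agree when Jaro-Winkler >= 0.8 (lower-casing is an assumption)."""
    return jaro_winkler(a.strip().lower(), b.strip().lower()) >= threshold


def year_of_birth_agrees(y1: int, y2: int, tolerance: int = 2) -> bool:
    """Year of birth agrees when the two years are within 2 years of each other."""
    return abs(y1 - y2) <= tolerance


def exact_agrees(a: str, b: str) -> bool:
    """Sex, village and sub-village agree only on an exact match."""
    return a == b
```

The names passed to these functions in practice would come from the clinic and HDSS records; any example spellings used to try the sketch are hypothetical.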
Supplemental Figure 1: Log frequency of match scores calculated for all pairwise comparisons using full algorithm, by true match status
Figure 1: Classification diagram of record linkage outcomes against true match status
Abbreviations: TP = true positives; FP = false positives; FN = false negatives; TN = true negatives
Common calculations: sensitivity = TP/(TP+FN); positive predictive value = TP/(TP+FP); false match rate = FP/(FP+TN)
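The three calculations above translate directly into code. The sketch below is illustrative only: the function name and the counts used in the usage line are hypothetical and are not taken from the study.

```python
import math


def linkage_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, positive predictive value and false match rate
    from the four cells of a linkage classification table."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a denominator is zero.
        return numerator / denominator if denominator else math.nan

    return {
        "sensitivity": ratio(tp, tp + fn),        # TP / (TP + FN)
        "ppv": ratio(tp, tp + fp),                # TP / (TP + FP)
        "false_match_rate": ratio(fp, fp + tn),   # FP / (FP + TN)
    }


# Hypothetical counts, for illustration only.
print(linkage_metrics(tp=80, fp=20, fn=20, tn=880))
```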
Abbreviations: CTC - HIV care and treatment centre; ANC - antenatal clinic; HTC - HIV testing and counselling clinic; HDSS - health and demographic surveillance system
Note: all statistics are given in n(%)
a Clinic differences tested for statistical significance with chi-square (χ²) tests
| Exclusion criteria | Overall (n=6,376) | CTC (n=1,318) | ANC (n=2,583) | HTC (n=2,480) | P a |
|---|---|---|---|---|---|
| Total excluded | 3,752 (58.9) | 762 (57.8) | 1,298 (50.3) | 1,692 (68.4) | <0.0001 |
| Never lived in HDSS area | 2,206 (34.6) | 642 (48.7) | 393 (15.2) | 1,171 (47.3) | <0.0001 |
| Recently born or moved into HDSS area | 1,576 (24.7) | 126 (9.6) | 915 (35.4) | 535 (21.6) | <0.0001 |
Abbreviations: CTC - HIV care and treatment centre; ANC - antenatal clinic; HTC - HIV testing and counselling clinic; HDSS - health and demographic surveillance system; IQR - interquartile range
Note: all statistics are given in n (%), unless otherwise noted
a Statistical differences tested for significance with chi-square (χ²), Fisher's Exact, or Wilcoxon Rank-Sum tests
b Recently hired fieldworker who had not yet worked in CTC
| Characteristic | Overall: Matched (n=2,206) | Overall: Not matched (n=418) | Overall: P a | CTC: Matched (n=478) | CTC: Not matched (n=78) | CTC: P a | ANC: Matched (n=1,077) | ANC: Not matched (n=208) | ANC: P a | HTC: Matched (n=651) | HTC: Not matched (n=132) | HTC: P a |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sex | | | | | | | | | | | | |
| Female | 1,769 (84.2) | 331 (15.8) | 0.7181 | 307 (85.0) | 54 (15.0) | 0.4030 | 1,053 (84.2) | 197 (15.8) | 0.0446 | 409 (83.6) | 80 (16.4) | 0.6310 |
| Male | 433 (83.6) | 85 (16.4) | | 170 (87.6) | 24 (12.4) | | 21 (70.0) | 9 (30.0) | | 242 (82.3) | 52 (17.7) | |
| Age, years | | | | | | | | | | | | |
| <15 | 131 (86.2) | 21 (13.8) | 0.0431 | 26 (81.3) | 6 (18.8) | 0.0369 | 87 (88.8) | 11 (11.2) | 0.1887 | 18 (81.8) | 4 (18.2) | 0.5896 |
| 15-49 | 1,836 (83.4) | 365 (16.6) | | 329 (84.3) | 61 (15.6) | | 985 (83.5) | 195 (16.5) | | 522 (82.7) | 109 (17.3) | |
| 50+ | 231 (89.2) | 28 (10.8) | | 122 (92.4) | 10 (7.6) | | 2 (66.7) | 1 (33.3) | | 107 (86.3) | 17 (13.7) | |
| Sub-village of residence, has road | | | | | | | | | | | | |
| Yes | 1,318 (81.4) | 302 (18.6) | <0.0001 | 227 (82.0) | 50 (18.0) | 0.0034 | 746 (82.0) | 164 (18.0) | 0.0027 | 345 (79.7) | 88 (20.3) | 0.0029 |
| No | 886 (88.9) | 111 (11.1) | | 249 (90.6) | 26 (9.5) | | 331 (88.7) | 42 (11.3) | | 306 (87.7) | 43 (12.3) | |
| Sub-village of residence, type | | | | | | | | | | | | |
| Rural | 703 (89.0) | 87 (11.0) | <0.0001 | 212 (88.3) | 28 (11.7) | 0.3595 | 237 (89.1) | 29 (10.9) | 0.0084 | 254 (89.4) | 30 (10.6) | 0.0005 |
| Peri-urban | 696 (84.6) | 127 (15.4) | | 140 (85.9) | 23 (14.1) | | 380 (84.8) | 68 (15.2) | | 176 (83.0) | 36 (17.0) | |
| Urban | 805 (80.2) | 199 (19.8) | | 124 (83.2) | 25 (16.8) | | 460 (80.8) | 109 (19.2) | | 221 (77.3) | 65 (22.7) | |
| Date of first visit | | | | | | | | | | | | |
| June - August 2015 | 845 (81.5) | 192 (18.5) | 0.0050 | 303 (86.3) | 48 (13.7) | 0.4326 | 350 (78.8) | 94 (21.2) | 0.0014 | 192 (79.3) | 50 (20.7) | 0.1513 |
| September - December 2015 | 503 (88.3) | 67 (11.8) | | 118 (88.1) | 16 (12.0) | | 228 (89.8) | 26 (10.2) | | 157 (86.3) | 25 (13.7) | |
| January - June 2016 | 503 (84.0) | 96 (16.0) | | 33 (80.5) | 8 (19.5) | | 299 (85.4) | 51 (14.6) | | 171 (82.2) | 37 (17.8) | |
| July - December 2016 | 355 (84.9) | 63 (15.1) | | 24 (80.0) | 6 (20.0) | | 200 (84.4) | 37 (15.6) | | 131 (86.8) | 20 (13.3) | |
| Fieldworker | | | | | | | | | | | | |
| 1 - originally trained | 731 (86.7) | 112 (13.3) | 0.0001 | 412 (87.1) | 61 (12.9) | <0.0001 | 196 (86.0) | 32 (14.0) | 0.3075 | 118 (86.1) | 19 (13.9) | 0.0237 |
| 2 - originally trained | 951 (84.9) | 169 (15.1) | | 46 (93.9) | 3 (6.1) | | 747 (84.1) | 141 (15.9) | | 156 (85.7) | 26 (14.3) | |
| 3 - originally trained | 387 (82.2) | 84 (17.8) | | 10 (66.7) | 5 (33.3) | | 49 (76.6) | 15 (23.4) | | 324 (83.5) | 64 (16.5) | |
| 4 - substitute | 59 (69.4) | 26 (30.6) | | 11 (52.6) | 9 (47.4) | | 9 (90.9) | 1 (9.1) | | 40 (71.4) | 16 (28.6) | |
| 5 - recently trained | 89 (78.1) | 25 (21.9) | | b | | | 75 (79.8) | 19 (20.2) | | 13 (65.0) | 7 (35.0) | |
| HIV test result at first visit | | | | | | | | | | | | |
| Positive | - | | | - | | | - | | | 106 (83.5) | 21 (16.5) | 0.9855 |
| Negative | | | | | | | | | | 529 (83.1) | 108 (17.0) | |
| Inconclusive/unknown | | | | | | | | | | 16 (84.2) | 3 (15.8) | |
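As a worked example of how a P value in the table above can be obtained, the sketch below applies a Pearson chi-square test to the overall Sex rows (Female 1,769 matched and 331 not matched; Male 433 matched and 85 not matched). It assumes no continuity correction and uses only the standard library. The function name is hypothetical, and the software and exact options the authors used are not stated in the source.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square test (no continuity correction) for the table

        [[a, b],
         [c, d]]

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Overall column, Sex rows: matched vs not matched for females and males.
stat, p = chi_square_2x2(1769, 331, 433, 85)
print(f"chi-square = {stat:.4f}, p = {p:.4f}")
```

Under these assumptions the result is p ≈ 0.718, in line with the 0.7181 reported for Sex in the Overall column. Rows with more than two categories, or with small cell counts where Fisher's Exact test applies, would need a different calculation.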
Abbreviations: CTC - HIV care and treatment centre; ANC - antenatal clinic; HTC - HIV testing and counselling clinic; HDSS - health and demographic surveillance system
Note: all statistics are given in n(%)
a Clinic differences tested for statistical significance with chi-square (χ²) tests
| Characteristic | Overall: OR (95% CI) | CTC: OR (95% CI) | ANC: OR (95% CI) | HTC: OR (95% CI) |
|---|---|---|---|---|
| Sample size (number missing) | 2,624 (22) | 556 (6) | 1,285 (10) | 783 (6) |
| Sex | ||||
| Female | 1 | 1 | 1 | 1 |
| Male | 0.89 (0.67, 1.17) | 1.34 (0.77, 2.33) | 0.32 (0.13, 0.81) | 0.92 (0.61, 1.37) |
| Age, per 5-year increase | 1.07 (1.02, 1.12) | 1.17 (1.06, 1.28) | 0.95 (0.87, 1.05) | 1.07 (0.99, 1.16) |
| Sub-village of residence, has road | ||||
| Yes | 1 | 1 | 1 | 1 |
| No | 1.44 (1.02, 2.03) | 2.69 (1.22, 5.95) | 1.39 (0.86, 2.25) | 0.95 (0.48, 1.85) |
| Sub-village of residence, type | ||||
| Rural | 1.44 (0.97, 2.14) | 0.62 (0.25, 1.52) | 1.54 (0.87, 2.74) | 2.41 (1.10, 5.31) |
| Peri-urban | 1.13 (0.89, 1.53) | 0.92 (0.47, 1.79) | 1.21 (0.83, 1.76) | 1.34 (0.78, 2.31) |
| Urban | 1 | 1 | 1 | 1 |
| Date of first visit | ||||
| June - August 2015 | 1 | 1 | 1 | 1 |
| September - December 2015 | 1.95 (1.43, 2.66) | 1.54 (0.75, 3.13) | 2.98 (1.79, 4.95) | 2.26 (1.17, 4.36) |
| January - June 2016 | 1.44 (1.09, 1.91) | 1.20 (0.39, 3.65) | 2.03 (1.30, 3.17) | 2.42 (1.17, 5.01) |
| July - December 2016 | 2.07 (1.37, 3.12) | 0.89 (0.23, 3.43) | 2.43 (1.23, 4.82) | 5.15 (2.06, 12.89) |
| Fieldworker who performed first search | ||||
| 1 - originally trained | 0.93 (0.70, 1.23) | 0.44 (0.12, 1.70) | 0.69 (0.41, 1.17) | 1.03 (0.53, 2.00) |
| 2 - originally trained | 1 | 1 | 1 | 1 |
| 3 - originally trained | 0.77 (0.56, 1.05) | 0.12 (0.02, 0.72) | 0.47 (0.23, 0.95) | 1.84 (0.90, 3.79) |
| 4 - substitute | 0.30 (0.18, 0.52) | 0.12 (0.03, 0.61) | 1.09 (0.13, 9.46) | 0.45 (0.21, 0.96) |
| 5 - recently trained | 0.36 (0.20, 0.66) | a | 0.43 (0.19, 0.97) | 0.17 (0.05, 0.53) |
| HIV test result at first visit | ||||
| Positive | - | - | - | 0.94 (0.55, 1.62) |
| Negative | | | | 1 |
| Inconclusive/unknown | | | | 0.82 (0.22, 2.99) |
Figure 2: Quality measures of a probabilistic record linkage algorithm used to link health facility and HDSS databases in rural Tanzania, first search attempt
Notes: HH = household member; TCL = ten-cell leader, an individual for a group of ten households; % collected = proportion of matched records with completed information; % agreement = proportion of matched records with agreeing information
Figure 3: Sensitivity (Se) and positive predictive value (PPV) of automated retrospective record linkage at various match score percentile thresholds, full algorithm
Abbreviations: HDSS - health and demographic sentinel surveillance
* Statistical differences tested for significance with chi-square (χ²) or Fisher's Exact tests
a This question was only given to individuals aged 15 years or older
b This question was only given to females between 15 and 49 years of age
c This question was only given to individuals between 5 and 25 years of age
| Characteristic | PIRL match: n (%) | Full algorithm, threshold=10%ile: n (%) | p-value* | Full algorithm, threshold=50%ile: n (%) | p-value* | Full algorithm, threshold=90%ile: n (%) | p-value* |
|---|---|---|---|---|---|---|---|
| Total matched (PPV) | 2,612 | 2,612 (55.1) | | 1,579 (70.3) | | 292 (84.6) | |
| Sex | | | | | | | |
| Female | 2,061 (78.9) | 2,036 (78.0) | 0.4004 | 1,185 (75.1) | 0.0038 | 213 (73.0) | 0.0191 |
| Male | 551 (21.1) | 576 (22.1) | | 394 (25.0) | | 79 (27.1) | |
| Age, in years | | | | | | | |
| <5 | 125 (4.8) | 198 (7.6) | <0.0001 | 132 (8.4) | <0.0001 | 46 (15.8) | <0.0001 |
| 5-17 | 393 (15.1) | 464 (17.8) | | 239 (15.2) | | 35 (12.0) | |
| 18-34 | 1,384 (53.0) | 1,301 (49.9) | | 770 (48.8) | | 125 (42.8) | |
| 35-49 | 522 (20.0) | 433 (16.6) | | 301 (19.1) | | 68 (23.3) | |
| 50-64 | 160 (6.1) | 162 (6.2) | | 105 (6.7) | | 15 (5.1) | |
| 65+ | 28 (1.1) | 52 (2.0) | | 30 (1.9) | | 3 (1.0) | |
| Village of residence | | | | | | | |
| Kisesa | 999 (38.3) | 982 (37.6) | 0.9340 | 586 (37.1) | 0.8100 | 111 (38.0) | 0.3320 |
| Kanyama | 521 (20.0) | 529 (20.3) | | 302 (19.1) | | 46 (15.8) | |
| Kitumba | 424 (16.2) | 444 (17.0) | | 262 (16.6) | | 48 (16.4) | |
| Isangijo | 257 (9.8) | 258 (9.9) | | 176 (11.2) | | 39 (13.4) | |
| Ihayabuyaga | 152 (5.8) | 138 (5.3) | | 89 (5.6) | | 21 (7.2) | |
| Igekemaja | 141 (5.4) | 150 (5.7) | | 94 (6.0) | | 13 (4.5) | |
| Welamasonga | 118 (4.5) | 111 (4.3) | | 70 (4.4) | | 14 (4.8) | |
| Marital status a | | | | | | | |
| Never married | 362 (24.0) | 272 (24.1) | 0.9997 | 179 (22.5) | 0.4266 | 33 (22.3) | 0.6089 |
| Married once | 724 (48.0) | 540 (47.8) | | 403 (50.6) | | 72 (48.7) | |
| Remarried | 175 (11.6) | 132 (11.7) | | 99 (12.4) | | 22 (14.9) | |
| Separated/Widowed | 249 (16.5) | 187 (16.5) | | 116 (14.6) | | 21 (14.2) | |
| Pregnant at last HDSS round b | | | | | | | |
| No | 1,057 (95.7) | 758 (95.5) | 0.8425 | 529 (95.0) | 0.5292 | 101 (98.1) | 0.3094 |
| Yes | 48 (4.3) | 36 (4.5) | | 28 (5.0) | | 2 (1.9) | |
| Enrolled in school at last HDSS round c | | | | | | | |
| No | 378 (72.0) | 282 (67.6) | 0.1454 | 185 (68.3) | 0.2725 | 25 (52.1) | 0.0038 |
| Yes | 147 (28.0) | 135 (32.4) | | 86 (31.7) | | 23 (47.9) | |
Supplemental Figure 2: Log frequency of match scores calculated for all pairwise comparisons using limited algorithm, by true match status
Supplemental Figure 3: Sensitivity (Se) and positive predictive value (PPV) of automated retrospective record linkage at various match score percentile thresholds, full (F) vs. limited (L) algorithm
Abbreviations: HDSS - health and demographic sentinel surveillance
* Statistical differences tested for significance with chi-square (χ²) or Fisher's Exact tests
a This question was only given to individuals aged 15 years or older
b This question was only given to females between 15 and 49 years of age
c This question was only given to individuals between 5 and 25 years of age
| Characteristic | PIRL match: n (%) | Full algorithm, threshold=10%ile: n (%) | p-value* | Full algorithm, threshold=50%ile: n (%) | p-value* | Full algorithm, threshold=90%ile: n (%) | p-value* | Limited algorithm, threshold=10%ile: n (%) | p-value* | Limited algorithm, threshold=50%ile: n (%) | p-value* | Limited algorithm, threshold=90%ile: n (%) | p-value* |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Total matched (PPV) | 2,612 | 2,612 (55.1) | | 1,579 (70.3) | | 292 (84.6) | | 2,602 (58.4) | | 1,514 (75.2) | | 288 (84.7) | |
| Sex | | | | | | | | | | | | | |
| Female | 2,061 (78.9) | 2,036 (78.0) | 0.4004 | 1,185 (75.1) | 0.0038 | 213 (73.0) | 0.0191 | 2,059 (79.1) | 0.8409 | 1,158 (76.5) | 0.0706 | 209 (72.6) | 0.0133 |
| Male | 551 (21.1) | 576 (22.1) | | 394 (25.0) | | 79 (27.1) | | 543 (20.9) | | 356 (23.5) | | 79 (27.4) | |
| Age, in years | | | | | | | | | | | | | |
| <5 | 125 (4.8) | 198 (7.6) | <0.0001 | 132 (8.4) | <0.0001 | 46 (15.8) | <0.0001 | 198 (7.6) | <0.0001 | 122 (8.1) | 0.0013 | 33 (11.5) | <0.0001 |
| 5-17 | 393 (15.1) | 464 (17.8) | | 239 (15.2) | | 35 (12.0) | | 453 (17.4) | | 211 (14.0) | | 34 (11.8) | |
| 18-34 | 1,384 (53.0) | 1,301 (49.9) | | 770 (48.8) | | 125 (42.8) | | 1,325 (51.0) | | 765 (50.6) | | 121 (42.0) | |
| 35-49 | 522 (20.0) | 433 (16.6) | | 301 (19.1) | | 68 (23.3) | | 437 (16.8) | | 296 (19.6) | | 74 (25.7) | |
| 50-64 | 160 (6.1) | 162 (6.2) | | 105 (6.7) | | 15 (5.1) | | 144 (5.5) | | 99 (6.5) | | 23 (8.0) | |
| 65+ | 28 (1.1) | 52 (2.0) | | 30 (1.9) | | 3 (1.0) | | 43 (1.7) | | 20 (1.3) | | 3 (1.0) | |
| Village of residence | | | | | | | | | | | | | |
| Kisesa | 999 (38.3) | 982 (37.6) | 0.9340 | 586 (37.1) | 0.8100 | 111 (38.0) | 0.3320 | 981 (37.7) | 0.6773 | 531 (35.1) | 0.3071 | 73 (25.4) | 0.0002 |
| Kanyama | 521 (20.0) | 529 (20.3) | | 302 (19.1) | | 46 (15.8) | | 527 (20.3) | | 299 (19.8) | | 59 (20.5) | |
| Kitumba | 424 (16.2) | 444 (17.0) | | 262 (16.6) | | 48 (16.4) | | 436 (16.8) | | 254 (16.8) | | 49 (17.0) | |
| Isangijo | 257 (9.8) | 258 (9.9) | | 176 (11.2) | | 39 (13.4) | | 254 (9.8) | | 177 (11.7) | | 46 (16.0) | |
| Ihayabuyaga | 152 (5.8) | 138 (5.3) | | 89 (5.6) | | 21 (7.2) | | 129 (5.0) | | 87 (5.8) | | 22 (7.6) | |
| Igekemaja | 141 (5.4) | 150 (5.7) | | 94 (6.0) | | 13 (4.5) | | 163 (6.3) | | 94 (6.2) | | 24 (8.3) | |
| Welamasonga | 118 (4.5) | 111 (4.3) | | 70 (4.4) | | 14 (4.8) | | 112 (4.3) | | 72 (4.8) | | 15 (5.2) | |
| Marital status a | | | | | | | | | | | | | |
| Never married | 362 (24.0) | 272 (24.1) | 0.9997 | 179 (22.5) | 0.4266 | 33 (22.3) | 0.6089 | 286 (25.3) | 0.8668 | 176 (22.5) | 0.7093 | 26 (16.5) | 0.0139 |
| Married once | 724 (48.0) | 540 (47.8) | | 403 (50.6) | | 72 (48.7) | | 536 (47.4) | | 391 (49.9) | | 80 (50.6) | |
| Remarried | 175 (11.6) | 132 (11.7) | | 99 (12.4) | | 22 (14.9) | | 124 (11.0) | | 95 (12.1) | | 30 (19.0) | |
| Separated/Widowed | 249 (16.5) | 187 (16.5) | | 116 (14.6) | | 21 (14.2) | | 185 (16.4) | | 121 (15.5) | | 22 (13.9) | |
| Pregnant at last HDSS round b | | | | | | | | | | | | | |
| No | 1,057 (95.7) | 758 (95.5) | 0.8425 | 529 (95.0) | 0.5292 | 101 (98.1) | 0.3094 | 769 (95.2) | 0.6166 | 531 (95.5) | 0.8862 | 93 (92.1) | 0.1310 |
| Yes | 48 (4.3) | 36 (4.5) | | 28 (5.0) | | 2 (1.9) | | 39 (4.8) | | 25 (4.5) | | 8 (7.9) | |
| Enrolled in school at last HDSS round c | | | | | | | | | | | | | |
| No | 378 (72.0) | 282 (67.6) | 0.1454 | 185 (68.3) | 0.2725 | 25 (52.1) | 0.0038 | 295 (67.7) | 0.1438 | 186 (69.9) | 0.5422 | 21 (60.0) | 0.1288 |
| Yes | 147 (28.0) | 135 (32.4) | | 86 (31.7) | | 23 (47.9) | | 141 (32.3) | | 80 (30.1) | | 14 (40.0) | |