Sean Randall, Adrian Brown, James Boyd, Rainer Schnell, Christian Borgs, Anna Ferrante.
Abstract
BACKGROUND: Record linkage is an important tool for epidemiologists and health planners. Record linkage studies generally contain some residual linkage error, where records are either incorrectly marked as belonging to the same individual or incorrectly marked as belonging to separate individuals. A key question is whether linkage errors are distributed evenly throughout the population, or whether certain subgroups exhibit higher error rates. Previous investigations of this issue have typically compared linked and unlinked records, which can conflate bias caused by record linkage error with bias caused by missing records (data capture errors).
Keywords: Bias; Methodological research; Record linkage; Sociodemographic differences
Year: 2018 PMID: 30176856 PMCID: PMC6122711 DOI: 10.1186/s12913-018-3495-x
Source DB: PubMed Journal: BMC Health Serv Res ISSN: 1472-6963 Impact factor: 2.655
Sociodemographic profile of each dataset - number and proportion of records in each sociodemographic category
| | NSW Emergency | NSW Hospital | SA Emergency | WA Hospital |
|---|---|---|---|---|
| Total | 4,304,459 (100%) | 19,874,083 (100%) | 813,839 (100%) | 6,772,949 (100%) |
| **Remoteness** | | | | |
| Major Cities | 2,868,504 (67%) | 14,351,493 (72%) | 690,399 (85%) | 5,045,362 (74%) |
| Regional | 1,367,635 (32%) | 5,263,797 (26%) | 54,665 (7%) | 1,199,149 (18%) |
| Remote | 12,250 (0%) | 137,540 (1%) | 6,333 (1%) | 498,850 (7%) |
| **Sex** | | | | |
| Male | 2,178,168 (51%) | 9,346,451 (47%) | 386,176 (47%) | 3,184,925 (47%) |
| Female | 2,125,422 (49%) | 10,526,591 (53%) | 427,645 (53%) | 3,588,021 (53%) |
| **Socioeconomic disadvantage (quintile)** | | | | |
| Most Disadvantaged | 1,155,081 (27%) | 4,133,693 (24%) | 212,613 (26%) | 1,723,482 (26%) |
| 2 | 833,405 (19%) | 3,574,267 (21%) | 172,838 (21%) | 1,468,495 (22%) |
| 3 | 974,884 (23%) | 3,220,110 (19%) | 154,871 (19%) | 1,256,879 (19%) |
| 4 | 705,006 (16%) | 3,133,389 (18%) | 129,227 (16%) | 1,128,067 (17%) |
| Least Disadvantaged | 563,326 (13%) | 3,285,911 (19%) | 103,915 (13%) | 1,149,271 (17%) |
| **Year of birth** | | | | |
| < 1950 | 1,431,527 (33%) | 9,726,134 (49%) | 269,257 (33%) | 3,164,258 (47%) |
| 1950–1979 | 1,849,952 (43%) | 6,460,872 (33%) | 353,815 (43%) | 2,535,776 (37%) |
| 1980+ | 1,022,980 (24%) | 3,686,298 (19%) | 190,767 (23%) | 1,072,915 (16%) |
Optimal overall linkage quality for each of the four datasets
| | Precision | Recall | Average |
|---|---|---|---|
| NSW Emergency | 0.993 | 0.988 | 0.991 |
| NSW Hospital | 0.986 | 0.971 | 0.979 |
| SA Emergency | 0.988 | 0.971 | 0.980 |
| WA Hospital | 0.994 | 0.987 | 0.991 |
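The record does not say how precision, recall, or the Average column were computed. The following is a minimal sketch under common assumptions: precision is the share of predicted record pairs that are true matches, recall is the share of true matches that were found, and Average is the arithmetic mean of the two. The toy record IDs and the helper names `pairwise_quality` and `mean_3dp` are hypothetical. With half-up rounding, the arithmetic mean of the rounded table values reproduces the Average column; this is a consistency check, not a statement of the authors' method.

```python
from decimal import Decimal, ROUND_HALF_UP


def pairwise_quality(predicted, gold):
    """Return (precision, recall) of predicted record pairs vs gold-standard pairs."""
    true_pos = len(predicted & gold)  # pairs linked correctly
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


def mean_3dp(precision, recall):
    """Arithmetic mean of two decimal strings, rounded half-up to 3 places."""
    mean = (Decimal(precision) + Decimal(recall)) / 2
    return mean.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


# Hypothetical toy data: each pair is an unordered set of two record IDs.
gold = {frozenset(p) for p in [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]}
predicted = {frozenset(p) for p in [("r1", "r2"), ("r3", "r4"), ("r7", "r8")]}
toy_precision, toy_recall = pairwise_quality(predicted, gold)

# Values copied from the table above: (precision, recall, reported average).
table = {
    "NSW Emergency": ("0.993", "0.988", "0.991"),
    "NSW Hospital": ("0.986", "0.971", "0.979"),
    "SA Emergency": ("0.988", "0.971", "0.980"),
    "WA Hospital": ("0.994", "0.987", "0.991"),
}

for name, (p, r, reported) in table.items():
    print(name, mean_3dp(p, r), "reported:", reported)
```

`Decimal` is used instead of floats so that values ending in exactly 5 at the fourth decimal place round predictably.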
Fig. 1 Comparison of recall scores at optimal threshold by sociodemographic category and dataset
Fig. 2 Comparison of precision scores at optimal threshold by sociodemographic category and dataset
Incidence rates (admissions/presentations per person-year) by sociodemographic category and dataset, comparing the gold-standard benchmark to the linkage results
| | NSW Emergency | | NSW Hospital | | SA Emergency | | WA Hospital | |
|---|---|---|---|---|---|---|---|---|
| | GSᵃ | Estimated | GS | Estimated | GS | Estimated | GS | Estimated |
| **Remoteness** | | | | | | | | |
| Major Cities | 0.68 | 0.68 | 0.41 | 0.41 | 0.75 | 0.76 | 0.47 | 0.48 |
| Regional | 0.83 | 0.84 | 0.42 | 0.43 | 0.53 | 0.53 | 0.46 | 0.46 |
| Remote | 0.59 | 0.59 | 0.46 | 0.47 | 0.50 | 0.50 | 0.48 | 0.52 |
| **Sex** | | | | | | | | |
| Male | 0.72 | 0.73 | 0.42 | 0.42 | 0.70 | 0.70 | 0.46 | 0.47 |
| Female | 0.72 | 0.72 | 0.41 | 0.41 | 0.75 | 0.75 | 0.47 | 0.47 |
| **Socioeconomic disadvantage (quintile)** | | | | | | | | |
| Most Disadvantaged | 0.85 | 0.86 | 0.44 | 0.45 | 0.83 | 0.84 | 0.54 | 0.56 |
| 2 | 0.64 | 0.65 | 0.41 | 0.41 | 0.75 | 0.76 | 0.48 | 0.49 |
| 3 | 0.82 | 0.83 | 0.39 | 0.38 | 0.71 | 0.72 | 0.46 | 0.47 |
| 4 | 0.65 | 0.66 | 0.38 | 0.37 | 0.66 | 0.66 | 0.43 | 0.43 |
| Least Disadvantaged | 0.59 | 0.59 | 0.38 | 0.38 | 0.60 | 0.61 | 0.41 | 0.41 |
| **Year of birth** | | | | | | | | |
| < 1950 | 0.78 | 0.79 | 0.66 | 0.69 | 0.76 | 0.76 | 0.76 | 0.79 |
| 1950–1979 | 0.68 | 0.69 | 0.36 | 0.36 | 0.70 | 0.70 | 0.42 | 0.42 |
| 1980+ | 0.73 | 0.74 | 0.29 | 0.27 | 0.75 | 0.76 | 0.31 | 0.30 |
ᵃ Results using the gold-standard benchmark
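One way to read the incidence table is to express each estimated rate as a relative difference from its gold-standard value. The sketch below does this for three cells copied from the table. The helper name `relative_difference` is hypothetical, and the choice of relative rather than absolute difference is an assumption, not something the record specifies. Because the published rates are rounded to two decimals, the resulting percentages are only approximate.

```python
def relative_difference(gold, estimated):
    """Signed difference of the estimated rate from the gold standard, as a fraction."""
    return (estimated - gold) / gold


# (dataset, category) -> (gold-standard rate, estimated rate), from the table above.
rates = {
    ("NSW Emergency", "Major Cities"): (0.68, 0.68),
    ("NSW Hospital", "1980+"): (0.29, 0.27),
    ("WA Hospital", "Remote"): (0.48, 0.52),
}

for (dataset, category), (gs, est) in rates.items():
    diff = relative_difference(gs, est)
    print(f"{dataset} / {category}: GS={gs:.2f}, estimated={est:.2f}, diff={diff:+.1%}")
```

A positive value means the linked data overstate the rate compared with the gold standard; a negative value means they understate it.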