Rosemary Karmel, Diana Rosman.
Abstract
BACKGROUND: Data linkage is a technique that has long been used to connect information across several disparate data sources, most commonly for medical and population health research. Often the purpose is to connect data for individuals over extended time periods or across different service settings, and so person-based linkage using detailed personal information is preferred. Linkage that aims to connect related events, on the other hand, requires information about the time and place of the event, as well as the person or persons involved in it, in order to make high-quality links. This paper describes the detailed process of event linkage and directly compares an event-based linkage method for identifying transition events between two care sectors in Australia with a well-established, high-quality longitudinal person-based linkage that facilitates identification of event data for individuals.
Year: 2008 PMID: 18631404 PMCID: PMC2488340 DOI: 10.1186/1472-6963-8-149
Source DB: PubMed Journal: BMC Health Serv Res ISSN: 1472-6963 Impact factor: 2.655
Identifying additional data for E linkage.
| Findings | Events for people on leave from permanent RAC (either for hospital leave or for social leave) have both start and end dates recorded. |
| E strategy adjustment | Matches to the admission, hospital leave and social leave events were carried out separately, and start and end dates for hospital and social leave events were used when matching to hospital discharges. |
| Findings | Information on usual residence (private dwelling, RAC etc.) is not reliably available on the hospital dataset. A variable on hospital 'mode of admission' shows whether or not the hospital event started with a within-sector transfer or as an admission into hospital. |
| E strategy adjustment | Hospital mode of admission data influenced the type of date comparisons made when matching to RAC leave events. |
| Findings | 'Mode of discharge' is reported on the hospital data. Categories include whether the patient transferred within the sector, died in hospital, returned to their usual residence or went to RAC. Death in hospital is well-reported but return to RAC as the usual residence and admission to RAC are not well-distinguished. |
| E strategy adjustment | Hospital episodes ending with a within-sector transfer were excluded from the linkage process (also done for the basic E linkage). Hospital discharges were divided into three groups defined by reported destination on discharge: 'to usual residence', 'died' and 'other' (predominately people reported as discharged to RAC). The three groups of RAC events (admissions, hospital leave and social leave) were then matched separately to each of these three groups, with the exception that RAC admissions were not matched to hospital discharges due to death. This reduced the incidence of within-dataset duplicates with respect to match variables and so allowed greater flexibility in choice of match variables. |
| Findings | The RAC dataset does not explicitly contain information about a client's location just prior to admission. However, data on place of assessment for aged care (including whether in hospital) and date of the assessment are available. |
| E strategy adjustment | Place and date of aged care assessment were used to assist linkage when matching using hospital region (see 6. below). |
| Findings | Hospital episodes may last less than a day (same-day episodes), so a person may have two hospital episodes starting or ending on the same day. In addition, people may go to a different RAC facility at the end of their hospital leave. Such a change in RAC facility is recorded both as a return from hospital leave and as an admission into RAC, resulting in two entry events reported for the same day relating to the same event (i.e. collision events). |
| E strategy adjustment | Because same-day hospital episodes are highly unlikely to be related to a RAC entry event, same-day hospital episodes were excluded from the linkage process (also done for the basic E linkage). To select between any duplicate matches arising from multiple events on the same day for the same person or multiple comparisons between the groups of RAC events and hospital discharge groups, likely matches from the various comparisons were given a priority ranking. Priority ranks were based on reported destination on discharge from hospital, in addition to a preference for matches to a RAC hospital leave record over a RAC admission record (as a link to the former indicates that the person was already living in RAC), with matches to social leave being least preferred. |
| Findings | Other available address-type information includes postcode of the hospital and postcode of the receiving RAC facility. |
| E strategy adjustment | Usual residence was grouped into regions of various size to allow for movement within a specified neighbourhood. Hospital region and RAC facility region as well as reported region of the person's usual residence were considered for matching. |
Adjustments to E linkage strategies based on knowledge of service systems and additional information in RAC and hospital data collections.
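The exclusion and duplicate-resolution adjustments in the table above can be illustrated with a minimal sketch. It assumes hypothetical record fields (`person_key`, `same_day_episode`, `within_sector_transfer`) and hypothetical numeric ranks; only the ordering of RAC event types (hospital leave preferred over admission, social leave least preferred) comes from the text. The additional ranking by reported discharge destination is omitted because its order is not stated here.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical ranks (lower is preferred). Only the ordering is taken
# from the text; the numeric values are illustrative.
RAC_EVENT_RANK = {"hospital_leave": 0, "admission": 1, "social_leave": 2}


@dataclass(frozen=True)
class CandidateMatch:
    person_key: str               # hypothetical key for the matched person
    discharge_date: date          # hospital discharge date of the candidate
    rac_event_type: str           # "hospital_leave", "admission" or "social_leave"
    same_day_episode: bool        # hospital episode lasted less than a day
    within_sector_transfer: bool  # hospital episode ended with a transfer


def eligible(match: CandidateMatch) -> bool:
    """Drop candidates built on hospital episodes excluded from linkage."""
    return not (match.same_day_episode or match.within_sector_transfer)


def resolve_duplicates(candidates):
    """Keep one candidate per person and discharge date, by priority rank."""
    best = {}
    for m in filter(eligible, candidates):
        key = (m.person_key, m.discharge_date)
        current = best.get(key)
        if current is None or (
            RAC_EVENT_RANK[m.rac_event_type]
            < RAC_EVENT_RANK[current.rac_event_type]
        ):
            best[key] = m
    return list(best.values())


day = date(2000, 7, 1)
candidates = [
    CandidateMatch("p1", day, "admission", False, False),
    CandidateMatch("p1", day, "hospital_leave", False, False),
    CandidateMatch("p2", day, "admission", True, False),  # same-day: excluded
]
chosen = resolve_duplicates(candidates)
```

Here `chosen` holds a single candidate for `p1`, the hospital leave record, which mirrors the case where a change of RAC facility after hospital leave is recorded as both a return from leave and an admission.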
Blocking and matching variables used in probability matching passes for E linkage strategies
| Pass | Blocking variables | Matching variables |
| 1 | region definition 1, sex, date of birth, event dates | none |
| 2 | region definition 2, sex, date of birth | event dates: one-sided variation (non-overlapping) |
| 3 | region definition 2, sex, day of birth, month of birth, event dates | year of birth |
| 4 | region definition 2, sex, year of birth, event dates | day of birth, month of birth |
| 5 | region definition 2, sex, date of birth | event dates: two-sided variation |
| 6 | region definition 3, sex, date of birth, event dates | none |
| 7 | region definition 4, sex, date of birth, event dates | none |
Note: Dates were only used as blocking variables if an exact date match was appropriate. Early investigations indicated that allowing variation in both date of birth and event dates leads to an unacceptable number of erroneous matches.
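The idea of successive passes with progressively relaxed blocking can be sketched as follows. This is a deterministic simplification, not the probability matching used in the paper: within a block it accepts a pair only when the block is unambiguous, and it ignores match weights entirely. The field names, the single `region` field (the table uses several region definitions) and the rule that linked records are removed from later passes are all assumptions made for illustration. The three passes shown loosely follow passes 1, 3 and 4 of the table.

```python
from collections import defaultdict

# Fields that must agree exactly in each pass (hypothetical names).
PASSES = [
    ("region", "sex", "dob_day", "dob_month", "dob_year", "event_date"),
    ("region", "sex", "dob_day", "dob_month", "event_date"),  # year of birth free
    ("region", "sex", "dob_year", "event_date"),  # day and month of birth free
]


def block_key(record, fields):
    """Build the blocking key for one record."""
    return tuple(record[f] for f in fields)


def link_in_passes(hospital, rac, passes=PASSES):
    """Return (hospital_index, rac_index) pairs linked over all passes."""
    links = []
    used_h, used_r = set(), set()
    for fields in passes:
        # Index the RAC events that are still unlinked by blocking key.
        index = defaultdict(list)
        for j, r in enumerate(rac):
            if j not in used_r:
                index[block_key(r, fields)].append(j)
        for i, h in enumerate(hospital):
            if i in used_h:
                continue
            block = [j for j in index.get(block_key(h, fields), [])
                     if j not in used_r]
            if len(block) == 1:  # accept only unambiguous blocks
                links.append((i, block[0]))
                used_h.add(i)
                used_r.add(block[0])
    return links


hospital = [
    {"region": "A", "sex": "F", "dob_day": 1, "dob_month": 2,
     "dob_year": 1930, "event_date": "2000-07-01"},
    {"region": "B", "sex": "M", "dob_day": 5, "dob_month": 6,
     "dob_year": 1925, "event_date": "2000-08-01"},
]
rac = [
    dict(hospital[0]),                    # agrees on every field
    {**hospital[1], "dob_year": 1926},    # differs on year of birth only
]
links = link_in_passes(hospital, rac)
```

The first pair links in the strictest pass; the second links only once year of birth is allowed to vary, while event dates are still required to agree, in line with the note that varying both date of birth and event dates produced too many erroneous matches.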
Comparing E links with N links
| | N link | No N link | |
| E link | a | b | PPV = a/(a+b) |
| No E link | c | d | NPV = d/(c+d) |
| | Sensitivity = a/(a+c) | Specificity = d/(b+d) | |
Note: The number of non links depends on whether non links to hospital episodes or RAC events are being considered. Therefore the values of NPV and specificity are different depending on whether they are derived from the point of view of hospital discharges (86200 in total) or RAC entries (19600 in total).
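As a worked example of the measures defined above, the short sketch below evaluates PPV and sensitivity for the Basic strategy, taking a = 6539, b = 154 and c = 1567 from the PPV and sensitivity table in this document. The function names are illustrative. NPV and specificity are defined but not evaluated, since d depends on whether hospital discharges or RAC entries are taken as the base.

```python
def ppv(a, b):
    """Positive predictive value: share of E links confirmed by N links."""
    return a / (a + b)


def sensitivity(a, c):
    """Share of N links that the E linkage also found."""
    return a / (a + c)


def npv(c, d):
    """Negative predictive value; d depends on the chosen base population."""
    return d / (c + d)


def specificity(b, d):
    """Specificity; d depends on the chosen base population."""
    return d / (b + d)


# Counts for the Basic strategy, as reported in the results table.
a, b, c = 6539, 154, 1567
basic_ppv = round(100 * ppv(a, b), 1)
basic_sensitivity = round(100 * sensitivity(a, c), 1)
print(f"PPV = {basic_ppv}%, sensitivity = {basic_sensitivity}%")
# prints "PPV = 97.7%, sensitivity = 80.7%"
```

Both values agree with the percentages reported for the Basic strategy, and a + c = 8106 matches the total number of N links.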
Figure 1. Types of link concordances between N and E links.
Regions used for blocking in constrained E linkage strategies
| Strategy | Pass 1 | Pass 2 | Pass 3 | Pass 4 | Pass 5 | Pass 6 | Pass 7 |
| Constrained-SLG v1 | SLG | SLG | SLG | SLG | SLG | 3-digit PC | . . |
| Constrained-SLG v2 | SLG | SLG | SLG | SLG | SLG | 3-digit PC | 2-digit PC |
| Constrained-SLG v3 | SLG | SLG | SLG | SLG | SLG | 3-digit PC | RAC PC |
| Constrained-PC v1 | PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | . . |
| Constrained-PC v2 | PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | 2-digit PC |
| Constrained-PC v3 | PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | 3-digit PC | RAC PC |
Note: Australian postcodes have four digits. 3-digit postcode indicates that the first three digits of the postcode were used for region matching; 2-digit postcode indicates that the first two digits of the postcode were used for region matching. In Western Australia in 2001, all postcodes had fewer than 5000 people of either sex. For 3-digit postcode regions, 85% had fewer than 2500 men aged 65+, 79% had fewer than 2500 older women with the most populous regions having fewer than 15000 older people of either sex. Among the nine two-digit postcode regions, the largest had fewer than 50000 older women and 40000 older men.
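The postcode-based regions described in the note amount to truncating a four-digit postcode to its leading digits. A minimal sketch follows, assuming a hypothetical helper name `postcode_region` and that postcodes may arrive as integers or strings.

```python
def postcode_region(postcode, digits):
    """Return the first `digits` digits of a four-digit Australian postcode."""
    pc = str(postcode).strip().zfill(4)  # restore any lost leading zero
    if len(pc) != 4 or not pc.isdigit():
        raise ValueError(f"not a four-digit postcode: {postcode!r}")
    if not 1 <= digits <= 4:
        raise ValueError("digits must be between 1 and 4")
    return pc[:digits]


three_digit = postcode_region("6009", 3)  # "600"
two_digit = postcode_region(6009, 2)      # "60"
```

Two records then fall in the same 3-digit region when their `postcode_region(..., 3)` values are equal, which tolerates movement between neighbouring postcodes at the cost of larger blocks.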
PPV and sensitivity of E linkage strategies. N linkage used as the reference standard.
| E linkage strategy | a | b | c | Total E links (a+b) | PPV (%) | Sensitivity (%) |
| Basic | 6539 | 154 | 1567 | 6693 | 97.7 | 80.7 |
| Constrained-SLG v1 | 7100 | 153 | 1006 | 7253 | 97.9 | 87.6 |
| Constrained-SLG v2 | 7418 | 177 | 688 | 7595 | 97.7 | 91.5 |
| Constrained-SLG v3 | 7420 | 170 | 686 | 7590 | 97.8 | 91.5 |
| Constrained-PC v1 | 6936 | 142 | 1170 | 7078 | 98.0 | 85.6 |
| Constrained-PC v2 | 7418 | 169 | 688 | 7587 | 97.8 | 91.5 |
| Constrained-PC v3 | 7335 | 156 | 771 | 7491 | 97.9 | 90.5 |
PPV and sensitivity, by RAC event type and E linkage strategy. N linkage used as the reference standard.
| PPV (%) |
| Basic | 95.5 | 97.8 | 98.5 | 89.2 | 97.7 |
| Constrained-SLG v1 | 95.1 | 98.2 | 98.7 | 95.5 | 97.9 |
| Constrained-SLG v2 | 94.8 | 97.4 | 98.7 | 95.5 | 97.7 |
| Constrained-SLG v3 | 95.1 | 98.1 | 98.6 | 95.5 | 97.8 |
| Constrained-PC v1 | 95.0 | 98.0 | 98.7 | 99.3 | 98.0 |
| Constrained-PC v2 | 95.0 | 97.2 | 98.7 | 99.3 | 97.8 |
| Constrained-PC v3 | 95.3 | 97.9 | 98.7 | 99.3 | 97.9 |
| Sensitivity (%) |
| Basic | 65.2 | 73.8 | 86.4 | 91.9 | 80.7 |
| Constrained-SLG v1 | 75.0 | 81.1 | 92.5 | 91.3 | 87.6 |
| Constrained-SLG v2 | 88.5 | 86.4 | 93.3 | 91.3 | 91.5 |
| Constrained-SLG v3 | 89.6 | 89.2 | 92.5 | 91.3 | 91.5 |
| Constrained-PC v1 | 69.5 | 78.5 | 91.7 | 90.7 | 85.6 |
| Constrained-PC v2 | 88.9 | 87.0 | 93.1 | 90.7 | 91.5 |
| Constrained-PC v3 | 88.0 | 88.0 | 91.7 | 90.7 | 90.5 |
| N links | 1723 | 852 | 5370 | 161 | 8106 |
| RAC records | 4752 | 3297 | 6878 | 4709 | 19636 |
Note: Analysis indicated that for a small number of links the event match chosen by the N linkage strategy was not the preferred link. In particular, for 18 matches (0.2% of N links) the preferred link was to a RAC hospital leave event rather than the chosen (earlier) admission event for the same person.
Figure 2. Post-hospital destination of people aged 65 and over, by movement type, by linkage strategy. Hospital separations are for Western Australia, 2000–01. E linkage strategy is PC v3.
Figure 3. Analysis example: Median length of hospital episode, by post-hospital destination of people aged 65+, by linkage strategy. Hospital separations are for Western Australia, 2000–01. E linkage strategy is PC v3.
Analysis example: care level and dementia status for RAC entries, by transition type.
| With dementia | 64.5 | 35.5 | 100.0 | 31.0 | 1711 | 65.2 | 34.8 | 100.0 | 30.7 | 1570 |
| Without dementia | 33.6 | 66.4 | 100.0 | 69.0 | 3802 | 33.4 | 66.6 | 100.0 | 69.3 | 3536 |
| With dementia | 83.1 | 16.9 | 100.0 | 45.9 | 769 | 84.3 | 15.7 | 100.0 | 45.2 | 690 |
| Without dementia | 72.6 | 27.4 | 100.0 | 54.1 | 905 | 74.0 | 26.0 | 100.0 | 54.8 | 837 |
| With dementia | 32.5 | 67.5 | 100.0 | 24.6 | 209 | 31.9 | 68.1 | 100.0 | 25.1 | 191 |
| Without dementia | 19.1 | 80.9 | 100.0 | 75.4 | 640 | 20.0 | 80.0 | 100.0 | 74.9 | 570 |
RAC entries are for Western Australia, 2000–01. E linkage strategy is PC v3.
(a) Based on linked hospital and RAC records.
Note: E linkage is PC v3. Diagnosis of dementia includes diagnoses of dementia and Alzheimer's disease (ICD-10-AM Edition 1 categories F00–F03, and G30) [31]. Table excludes 115 cases with missing care level, and all unlinked RAC hospital leave events.