Claire de Oliveira, Evgenia Gatov, Laura Rosella, Simon Chen, Rachel Strauss, Mahmoud Azimaee, Elizabeth Paterno, Astrid Guttmann.
Abstract
Background: The linkage of records across administrative databases has become a powerful tool for increasing the information available for research and analytics in a privacy-protective manner.
Objective: The objective of this paper was to describe the data integration strategy used to link the Ontario Ministry of Children, Community and Social Services (MCCSS) Social Assistance (SA) database with administrative health care data.
Keywords: Ontario; administrative health care data; administrative social assistance data; data linkage
Year: 2022 PMID: 35310557 PMCID: PMC8900651 DOI: 10.23889/ijpds.v6i1.1689
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
Figure 1: Pre-processing linkage steps for the Ministry of Children, Community and Social Services – Social Assistance input file
Figure 2: Linkage process for Ministry of Children, Community and Social Services – Social Assistance data

| Characteristic | Linked, n | Linked, % | Unlinked, n | Unlinked, % | Standardized difference |
|---|---|---|---|---|---|
| Total | 2,405,115 | 87.9 | 331,238 | 12.1 | N/A |
| | | | | | |
| Ontario Works | 1,630,744 | 67.8 | 254,119 | 76.7 | 0.20 |
| Ontario Disability Support Program | 774,371 | 32.2 | 77,119 | 23.3 | 0.20 |
| | | | | | |
| Applicant | 1,433,505 | 59.6 | 149,495 | 45.1 | 0.29 |
| Spouse | 195,181 | 8.1 | 29,156 | 8.8 | 0.02 |
| Dependent adult | 120,574 | 5.0 | 22,782 | 6.9 | 0.08 |
| Dependent child | 655,855 | 27.3 | 129,805 | 39.2 | 0.26 |
| | | | | | |
| Male | 1,210,680 | 50.3 | 127,502 | 38.5 | 0.24 |
| Female | 1,154,547 | 48.0 | 137,660 | 41.6 | 0.13 |
| Unknown | 39,888 | 1.7 | 66,076 | 19.9 | 0.62 |
| | | | | | |
| Mean (SD) | 31.01 ± 19.47 | | 25.39 ± 18.17 | | 0.30 |
| Median (IQR) | 29 (16-47) | | 22 (11-38) | | 0.30 |
| | | | | | |
| N/A (Canadian-born and long-term residents) | 1,731,689 | 72.0 | 184,830 | 55.8 | 0.34 |
| All other (immigrants and refugees) | 673,426 | 28.0 | 146,408 | 44.2 | 0.34 |
| | | | | | |
| Yes | 216,878 | 9.0 | 21,109 | 6.4 | 0.10 |
| No | 2,168,548 | 90.2 | 306,977 | 92.7 | 0.09 |
| Missing | 19,689 | 0.8 | 3,152 | 1.0 | 0.01 |
| | | | | | |
| Single without children | 1,000,286 | 41.6 | 99,077 | 29.9 | 0.25 |
| Single with children | 782,381 | 32.5 | 124,509 | 37.6 | 0.11 |
| Couples without children | 161,980 | 6.7 | 19,457 | 5.9 | 0.04 |
| Couples with children | 460,468 | 19.1 | 88,195 | 26.6 | 0.18 |
| | | | | | |
| Homeless | 20,785 | 0.9 | 3,767 | 1.1 | 0.03 |
| Not homeless | 2,384,330 | 99.1 | 327,471 | 98.9 | 0.03 |
| | | | | | |
| In SDMT only: January 2003 – October 2014 | 1,260,419 | 52.4 | 255,041 | 77.0 | 0.53 |
| In SAMS only: November 2014 – December 2016 | 263,941 | 11.0 | 28,074 | 8.5 | 0.08 |
| In both systems | 880,755 | 36.6 | 48,123 | 14.5 | 0.52 |
| | | | | | |
| Mean (SD) | 49.57 ± 50.02 | | 30.70 ± 37.02 | | 0.43 |
| Median (IQR) | 29 (10-77) | | 17 (6-38) | | 0.40 |
Legend: N/A – not applicable; SDMT – Service Delivery Model Technology; SAMS – Social Assistance Management System; SD – standard deviation; IQR – interquartile range.
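
The values in the last column of the table above are consistent with the usual absolute standardized difference between the linked and unlinked groups. The excerpt does not state the formula, so the sketch below is an assumption: it uses the common pooled-variance form for proportions and for means, and the function names are hypothetical.

```python
from math import sqrt


def std_diff_proportion(p1: float, p2: float) -> float:
    """Absolute standardized difference between two proportions (0-1 scale)."""
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    return abs(p1 - p2) / sqrt(pooled_var)


def std_diff_mean(m1: float, sd1: float, m2: float, sd2: float) -> float:
    """Absolute standardized difference between two means."""
    pooled_sd = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return abs(m1 - m2) / pooled_sd


# Ontario Works row: 67.8% of linked vs 76.7% of unlinked records.
ontario_works = std_diff_proportion(0.678, 0.767)

# First Mean (SD) row: 31.01 ± 19.47 (linked) vs 25.39 ± 18.17 (unlinked).
first_mean_row = std_diff_mean(31.01, 19.47, 25.39, 18.17)

print(round(ontario_works, 2), round(first_mean_row, 2))
```

Under this assumed formula the two rows reproduce the tabulated 0.20 and 0.30. Values above roughly 0.10 are often read as a meaningful imbalance between groups, although that threshold is a convention rather than something stated in this excerpt.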

| | Total records | Deterministic links | Probabilistic links | Total linked |
|---|---|---|---|---|
| Unique member ID, SDMT + SAMS | 2,736,353 (100%) | 2,083,864 (76.2%) | 321,251 (11.7%) | 2,405,115 (87.9%) |
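
The percentages in the row above can be checked with simple arithmetic. The sketch below reads the second and third counts as deterministic and probabilistic links; that reading is an interpretation, chosen because the two counts add up to the linked total. Variable names are illustrative only.

```python
# Counts copied from the summary row above.
total_records = 2_736_353
deterministic = 2_083_864   # assumed: records linked deterministically
probabilistic = 321_251     # assumed: records linked probabilistically

linked = deterministic + probabilistic
unlinked = total_records - linked


def pct(n: int) -> float:
    """Share of all records, in percent, rounded to one decimal place."""
    return round(100 * n / total_records, 1)


for label, n in [
    ("deterministic", deterministic),
    ("probabilistic", probabilistic),
    ("linked", linked),
    ("unlinked", unlinked),
]:
    print(f"{label}: {n:,} ({pct(n)}%)")
```

The derived unlinked count, 331,238 (12.1%), agrees with the unlinked total reported in the characteristics table.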

| Pass | Type | Linked records | | Blocking and matching variables |
|---|---|---|---|---|
| 1 | D | 1,071,584 | 983,389 | Matching on: Surname 1 + Given Name 1 + Sex + DOB, Alternate with Given Name 2 (RPDB) and Standardized Given Name (MCCSS) |
| 2 | P | 57,711 | 52,265 | Blocking on: Surname 1 first-3 characters + Given Name 1 first-3 characters + DOB + Sex. Matching on: Surname 1 + Given Name 1 + Given Name 2 + Given Name 3 |
| 3 | P | 25,625 | 25,460 | Blocking on: Surname 1 initial + Given Name 1 initial + DOB + Sex. Matching on: Surname 1 + Standardized Given Name (MCCSS)/Given Name 1 (RPDB) + Given Name 2 + Given Name 3 |
| 4 | P | 11,782 | 10,631 | Blocking on: DOB + Sex + Surname 1 initial. Matching on: Surnames + Given Names + Postal Codes |
| 5 | P | 10,753 | 10,133 | Blocking on: Surname 1 initial + Given Name 1 initial + Birth Year + Sex. Matching on: Surnames + Given Names + Birth Month + Birth Day + Postal Codes |
| 6 | P | 20,814 | 2,435 | Blocking on: Surname 1 initial + Given Name 1 initial + Birth Month + Birth Day + Sex. Matching on: Surnames + Given Names + Birth Year + Postal Codes |
| 7 | P | 6,478 | 575 | Blocking on: NYSIIS code of Surname 1 + Birth Year + Sex. Matching on: Surnames + Given Names + Birth Month + Birth Day + Postal Codes |
| 8 | D | 14,439 | 14,452 | Matching on: DOB + Surname 1 + Given Name 1 |
| 9 | P | 5,240 | 3,298 | Blocking on: DOB + Surname 1 initial + Given Name 1 initial. Matching on: Surnames + Given Names + Postal Codes |
| 10 | P | 2,784 | 67,136 | Blocking on: DOB + Sex. Matching on: Surnames + Given Names + Postal Codes |
| 11 | P | 1,220 | 394 | Blocking on: Birth Year + Sex. Matching on: Surnames + Given Names + Birth Month + Birth Day + Postal Codes |
| 12 | P | 546 | 293 | Blocking on: Birth Month + Birth Day + Sex. Matching on: Surnames + Given Names + Birth Year + Postal Codes |
| 13 | P | 1,972 | 364 | Blocking on: Surname 2 initial (MCCSS)/Surname 1 initial (RPDB) + DOB. Matching on: Surname 2 (MCCSS)/Surname 1 (RPDB) + Given Names + Postal Codes |
| 14 | P | 51 | 37 | Blocking on: Surname 2 Initial + DOB. Matching on: Surnames + Given Names + Postal Codes |
| 15 | P | 1,986 | 898 | Blocking on: Birth Year + Given Name 1 initial + NYSIIS code of Surname 1. Matching on: Surnames + Given Names + Birth Month + Birth Day + Postal Codes |
| 16 | P | 258 | 112 | Blocking on: Birth Month + Birth Day + Given Name 1 initial + NYSIIS code of Surname 1. Matching on: Surnames + Given Names + Birth Year + Postal Codes |
| Linked total | | 1,233,243 | 1,171,872 | |
| | | 2,405,115 (87.9%) | | |
Legend: D – deterministic; P – probabilistic; SAMS – Social Assistance Management System; SDMT – Service Delivery Model Technology; DOB – date of birth; RPDB – Registered Persons Database; MCCSS – Ministry of Children, Community and Social Services; NYSIIS – New York State Identification and Intelligence System.
Notes: Surnames – array variable of surnames; its elements are Surname 1 and Surname 2.
Given Names – array variable of given names; its elements are Given Name 1, Given Name 2 and Given Name 3.
Postal Codes – array variable of postal codes; its elements are the member’s first historic postal code and most recent postal code.
Standardized Given Name – standardized nickname derived from Given Name 1.
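
The pass structure in the table above (an exact-match pass followed by blocked comparison passes) can be illustrated with a minimal sketch. This is not the authors' implementation: the record field names, the `link` function, the 0.85 threshold, and the use of `difflib` string similarity in place of probabilistic match weights are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def norm(value):
    """Upper-case and trim a name field; treat missing values as empty."""
    return (value or "").strip().upper()


def deterministic_key(rec):
    # Pass-1-style key: Surname 1 + Given Name 1 + Sex + DOB (exact agreement).
    return (norm(rec["surname1"]), norm(rec["given1"]), rec["sex"], rec["dob"])


def blocking_key(rec):
    # Pass-2-style block: first 3 characters of both names + DOB + Sex.
    return (norm(rec["surname1"])[:3], norm(rec["given1"])[:3], rec["dob"], rec["sex"])


def similarity(a, b):
    # Hypothetical score: mean string similarity of Surname 1 and Given Name 1.
    fields = ("surname1", "given1")
    ratios = [SequenceMatcher(None, norm(a[f]), norm(b[f])).ratio() for f in fields]
    return sum(ratios) / len(ratios)


def index_by(records, key_fn):
    """Group records by key so candidates can be looked up directly."""
    index = {}
    for rec in records:
        index.setdefault(key_fn(rec), []).append(rec)
    return index


def link(source, registry, threshold=0.85):
    links = {}
    exact = index_by(registry, deterministic_key)
    blocks = index_by(registry, blocking_key)

    # Pass 1 (D): accept only unambiguous exact matches.
    for rec in source:
        candidates = exact.get(deterministic_key(rec), [])
        if len(candidates) == 1:
            links[rec["id"]] = (candidates[0]["id"], "D")

    # Pass 2 (P): compare still-unlinked records within their block only.
    for rec in source:
        if rec["id"] in links:
            continue
        candidates = blocks.get(blocking_key(rec), [])
        if not candidates:
            continue
        best = max(candidates, key=lambda cand: similarity(rec, cand))
        if similarity(rec, best) >= threshold:
            links[rec["id"]] = (best["id"], "P")
    return links


# Made-up example records (not from the study data).
registry = [
    {"id": "R1", "surname1": "Tremblay", "given1": "Marie", "sex": "F", "dob": "1980-02-14"},
    {"id": "R2", "surname1": "Johnson", "given1": "Robert", "sex": "M", "dob": "1975-07-01"},
]
source = [
    {"id": "S1", "surname1": "TREMBLAY", "given1": "marie", "sex": "F", "dob": "1980-02-14"},
    {"id": "S2", "surname1": "Johnston", "given1": "Robert", "sex": "M", "dob": "1975-07-01"},
    {"id": "S3", "surname1": "Nguyen", "given1": "Anh", "sex": "F", "dob": "1990-01-01"},
]
links = link(source, registry)
print(links)
```

Blocking keeps the number of pairwise comparisons small, and each later pass only sees records that earlier passes left unlinked, which is why loosening the blocking key from pass to pass picks up progressively harder cases.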
Figure 3: Deterministic linkage, probabilistic linkage and unlinked rates and percentage of unlinked records for the Ministry of Children, Community and Social Services – Social Assistance by year (2003–2016)