Nicolás Libuy1, Katie Harron1,2, Ruth Gilbert1,2, Richard Caulton3, Ellen Cameron3, Ruth Blackburn1.
Abstract
INTRODUCTION: Linkage of administrative data for universal state education and National Health Service (NHS) hospital care would enable research into the inter-relationships between education and health for all children in England.
Keywords: administrative data; bias; data linkage; educational records; hospital records; linkage error; record linkage
Year: 2021 PMID: 34568585 PMCID: PMC8445153 DOI: 10.23889/ijpds.v6i1.1671
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
Figure 1: Data flow and linkage process for linkage between the national pupil database, the personal demographic service and hospital episode statistics
Figure 2: Lexis diagram to show year of age of each cohort (y axis) and start year of each dataset (x axis)

| Identifier | | | |
|---|---|---|---|
| First name(s) | ✓ | ✓ | |
| Surname(s) | ✓ | ✓ | |
| Date of birth (e.g. 23/02/1988) | ✓ | ✓ | ✓ |
| Sex | ✓ | ✓ | ✓ |
| NHS number | ✓ | ✓ | |
| Residence postcodes* | ✓ | ✓ | ✓ |
| Residence postcodes dates** | ✓ | ✓ | ✓ |
| Anonymised Pupil Matching Reference (aPMR) | ✓ | ||
| UCL HESID | ✓ | ||
Notes: * Full postcodes (e.g. LS0 0AA) were available in NPD and PDS. For each record in NPD, a list of postcodes across academic years was available. For each patient's NHS number in PDS, a list of postcodes over time was available. ** Dates of changes in patients' postcodes over time were available in PDS. Similarly, dates of postcodes by academic year were available in NPD. UCL HESID is a unique, pseudonymised patient-level identifier that can be used to link patient-level information over time and across different modules of the UCL HES extracts. aPMR (anonymised Pupil Matching Reference) is a nationally unique, anonymised child-level identifier that can be used to link pupil-level information over time and across different modules of NPD.

| Step | | | | | |
|---|---|---|---|---|---|
| 1** | Exact | Exact | Exact | Exact | Exact |
| 2 | Soundex | Soundex | Exact | Exact | Exact |
| 3 | 1st character | Characters 1–3 | Exact | Exact | Exact |
| 4 | 1st character | Characters 1–3 | Exact | Exact | |
| 5 | Exact | Exact | Exact | ||
| 6 | Partial | Exact | Exact | ||
| 7 | Exact | Exact | Exact | Exact | |
| 8 | 1st character | Characters 1–3 | Exact | Exact |
Notes: * Full postcode (e.g. LS0 0AA). ** Step 1 was repeated by NHS Digital, this time allowing an NPD record to link to many PDS records. The objective of repeating this modified step 1 was to remove potential duplicate HESIDs for the same pupil. See details in Supplementary Appendix 4. Exact refers to exact linking; Partial refers to exact linking using month and year of birth only; Soundex refers to the Structured Query Language (SQL) algorithm that converts an alphanumeric string to a four-character code based on how the string sounds when spoken. NPD = National Pupil Database; PDS = Personal Demographic Service.
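The stepwise rules in the table above can be illustrated with a short Python sketch. It is not the code used for the linkage. The field names (`forename`, `surname`, `dob`, `sex`, `postcode`), the assignment of the two name rules to forename and surname, and the requirement that a step yields exactly one candidate are all assumptions made for illustration, and only the first three steps are encoded. The `soundex` function follows the standard American Soundex rules, which may differ in edge cases from a particular SQL implementation.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
Comparator = Callable[[str, str], bool]

# Standard Soundex letter-to-digit mapping.
_CODES = {ch: digit
          for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                                 ("L", "4"), ("MN", "5"), ("R", "6"))
          for ch in letters}


def soundex(name: str) -> str:
    """Return the four-character Soundex code of a name ("" if no letters)."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            result += digit
        prev = digit  # vowels reset the previous code
    return (result + "000")[:4]


def exact(a: str, b: str) -> bool:
    return a.strip().upper() == b.strip().upper()


def same_soundex(a: str, b: str) -> bool:
    return soundex(a) != "" and soundex(a) == soundex(b)


def first_char(a: str, b: str) -> bool:
    return bool(a) and a[:1].upper() == b[:1].upper()


def first_three(a: str, b: str) -> bool:
    return bool(a) and a[:3].upper() == b[:3].upper()


# Steps 1 to 3 only; the column-to-field mapping is an assumption.
STEPS: List[Dict[str, Comparator]] = [
    {"forename": exact, "surname": exact,
     "dob": exact, "sex": exact, "postcode": exact},
    {"forename": same_soundex, "surname": same_soundex,
     "dob": exact, "sex": exact, "postcode": exact},
    {"forename": first_char, "surname": first_three,
     "dob": exact, "sex": exact, "postcode": exact},
]


def link(record: Record, candidates: List[Record]) -> Optional[Tuple[int, Record]]:
    """Return (step number, matched candidate) for the first step that
    yields exactly one candidate, or None if no step does."""
    for step_no, rules in enumerate(STEPS, start=1):
        hits = [c for c in candidates
                if all(cmp(record[f], c[f]) for f, cmp in rules.items())]
        if len(hits) == 1:
            return step_no, hits[0]
    return None
```

Under these assumptions, `Jon Smyth` and `John Smith` with identical date of birth, sex and postcode would link at step 2, because both name pairs share a Soundex code (`J500` and `S530`).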

| Step | | | | |
|---|---|---|---|---|
| 1 | Exact | Exact | Exact | Exact |
| 2 | Exact | Exact | Exact | |
| 3 | Exact | Partial | Exact | Exact |
| 4 | Exact | Partial | Exact | |
| 5 | Exact | Exact | ||
| 6 | Exact | Exact | Exact | |
| Where NHS number does not contradict the match and date of birth is not 1 January | ||||
| 7 | Exact | Exact | Exact | |
| Where date of birth is not 1 January | ||||
Notes: * Full postcode (e.g. LS0 0AA). Exact refers to exact linking; Partial refers to exact linking using month and year of birth only.
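The "Partial" date-of-birth rule and the two conditions attached to steps 6 and 7 can be sketched as small predicates. This is an illustrative reading, not the linkage code itself: the function names are hypothetical, dates are assumed to be in the `dd/mm/yyyy` form shown in the identifier table, and "does not contradict" is interpreted here as "the two NHS numbers are not both present and different".

```python
from datetime import date, datetime
from typing import Optional


def parse_dob(text: str) -> date:
    """Parse a date of birth written as dd/mm/yyyy, e.g. 23/02/1988."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def partial_dob_match(a: date, b: date) -> bool:
    """'Partial' agreement: same month and year of birth, day ignored."""
    return (a.year, a.month) == (b.year, b.month)


def is_first_of_january(d: date) -> bool:
    return (d.month, d.day) == (1, 1)


def nhs_numbers_contradict(a: Optional[str], b: Optional[str]) -> bool:
    """Assumed reading: a contradiction needs both values present and unequal."""
    return bool(a) and bool(b) and a != b


def match_allowed(dob: date, nhs_a: Optional[str], nhs_b: Optional[str],
                  check_nhs_number: bool) -> bool:
    """Apply the conditions attached to steps 6 and 7.

    Step 6 would use check_nhs_number=True, step 7 check_nhs_number=False.
    """
    if is_first_of_january(dob):
        return False
    if check_nhs_number and nhs_numbers_contradict(nhs_a, nhs_b):
        return False
    return True
```

For example, `partial_dob_match(parse_dob("23/02/1988"), parse_dob("03/02/1988"))` is true, while a candidate pair with a date of birth of 01/01/1988 would be rejected by `match_allowed` regardless of the NHS numbers.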
Figure 3: Results of linkage at stage 1 (NPD and PDS) and stage 2 (PDS and HES) and final linkage rates
Figure 4: Cumulative percentage of records linked in stage 1 (NPD to PDS; y axis) by academic year in spring census (x axis)

| Characteristic | n (%) | n (%) | Stand. Diff. | n (%) | n (%) | Stand. Diff. |
|---|---|---|---|---|---|---|
| Region | | | | | | |
| London | 7,729 (16.1) | 68,073 (12.0) | 0.191 | 6,243 (17.7) | 71,652 (13.4) | 0.247 |
| South East | 8,000 (16.7) | 81,806 (14.5) | | 5,961 (16.9) | 75,452 (14.1) | |
| South West | 4,217 (8.8) | 52,018 (9.2) | | 3,021 (8.6) | 50,302 (9.4) | |
| West Midlands | 4,915 (10.3) | 63,013 (11.1) | | 3,392 (9.6) | 60,027 (11.2) | |
| North West | 6,200 (12.9) | 83,376 (14.7) | | 3,630 (10.3) | 77,805 (14.5) | |
| North East | 1,567 (3.3) | 29,318 (5.2) | | 1,025 (2.9) | 27,374 (5.1) | |
| Yorkshire and The Humber | 3,885 (8.1) | 57,539 (10.2) | | 2,908 (8.2) | 54,564 (10.2) | |
| East Midlands | 3,535 (7.4) | 47,096 (8.3) | | 2,769 (7.8) | 42,187 (7.9) | |
| East of England | 5,541 (11.6) | 59,686 (10.5) | | 4,525 (12.8) | 54,424 (10.1) | |
| Wales | 28 (0.1) | 38 (0.0) | | * | * | |
| Missing | 2,317 (4.8) | 23,835 (4.2) | | 1,818 (5.2) | 22,794 (4.2) | |
| Ethnic group | | | | | | |
| White | 27,692 (57.8) | 488,330 (86.3) | 0.160 | 24,452 (69.3) | 453,764 (84.6) | 0.159 |
| Asian | 2,541 (5.3) | 33,024 (5.8) | | 2,584 (7.3) | 37,654 (7.0) | |
| Black | 1,507 (3.1) | 17,047 (3.0) | | 1,429 (4.0) | 19,228 (3.6) | |
| Chinese | 278 (0.6) | 1,384 (0.2) | | 213 (0.6) | 1,439 (0.3) | |
| Other ethnic group | 498 (1.0) | 3,627 (0.6) | | 626 (1.8) | 3,951 (0.7) | |
| Mixed | 834 (1.7) | 13,808 (2.4) | | 1,278 (3.6) | 19,286 (3.6) | |
| Missing | 14,584 (30.4) | 8,578 (1.5) | | 4,717 (13.4) | 1,297 (0.2) | |
| Sex | | | | | | |
| Male | 27,334 (57.0) | 285,716 (50.5) | 0.131 | 17,014 (48.2) | 275,479 (51.3) | 0.062 |
| Female | 20,543 (42.9) | 279,520 (49.4) | | 18,268 (51.8) | 261,094 (48.7) | |
| Missing | 57 (0.1) | 562 (0.1) | | 17 (0.0) | 46 (0.0) | |
| IDACI Deciles | | | | | | |
| 1 (deprived) | 7,306 (15.2) | 54,336 (9.6) | 0.242 | 4,866 (13.8) | 50,540 (9.4) | 0.218 |
| 2 | 6,001 (12.5) | 55,606 (9.8) | | 4,247 (12.0) | 51,132 (9.5) | |
| 3 | 5,414 (11.3) | 56,149 (9.9) | | 3,811 (10.8) | 51,662 (9.6) | |
| 4 | 4,941 (10.3) | 56,600 (10.0) | | 3,738 (10.6) | 51,725 (9.6) | |
| 5 | 4,611 (9.6) | 56,620 (10.0) | | 3,444 (9.8) | 52,336 (9.8) | |
| 6 | 4,255 (8.9) | 56,927 (10.1) | | 3,310 (9.4) | 52,503 (9.8) | |
| 7 | 3,854 (8.0) | 56,891 (10.1) | | 2,936 (8.3) | 53,336 (9.9) | |
| 8 | 3,685 (7.7) | 56,122 (9.9) | | 2,914 (8.3) | 54,281 (10.1) | |
| 9 | 3,514 (7.3) | 54,875 (9.7) | | 2,851 (8.1) | 55,791 (10.4) | |
| 10 (affluent) | 3,630 (7.6) | 54,286 (9.6) | | 2,701 (7.7) | 56,355 (10.5) | |
| Missing | 723 (1.5) | 7,386 (1.3) | | 481 (1.4) | 6,958 (1.3) | |
Notes: IDACI = Income Deprivation Affecting Children Index. Stand. Diff. = standardized difference. * Value omitted to avoid risk of disclosure due to small cell count.
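As a worked illustration of a standardized difference, the sketch below applies the usual two-proportion formula, d = |p1 - p2| / sqrt((p1(1 - p1) + p2(1 - p2)) / 2), to the proportion of males in the first pair of count columns of the table above. The record does not state which formula was used for the multi-category variables, so the binary formula is an assumption here, as is the choice to collapse sex to male versus not male.

```python
from math import sqrt


def standardized_difference(p1: float, p2: float) -> float:
    """Standardized difference between two proportions (binary variable)."""
    pooled_variance = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    return abs(p1 - p2) / sqrt(pooled_variance)


# Counts taken from the Sex rows (male, female, missing), first two columns.
group_a_male, group_a_total = 27_334, 27_334 + 20_543 + 57
group_b_male, group_b_total = 285_716, 285_716 + 279_520 + 562

d_male = standardized_difference(group_a_male / group_a_total,
                                 group_b_male / group_b_total)
print(f"Standardized difference for male: {d_male:.3f}")
```

The result is about 0.13, close to the 0.131 shown for sex in the table, although that agreement alone does not confirm that this formula was the one used.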

| Characteristic | aOR | Conf. Int. | aOR | Conf. Int. |
|---|---|---|---|---|
| Ethnic group | ||||
| White | Ref | | Ref | |
| Asian | 0.69 | [0.66,0.72]** | 0.69 | [0.66,0.73]** |
| Black | 0.62 | [0.59,0.66]** | 0.67 | [0.63,0.71]** |
| Chinese | 0.29 | [0.26,0.33]** | 0.38 | [0.33,0.44]** |
| Any other ethnic group | 0.42 | [0.38,0.46]** | 0.32 | [0.30,0.35]** |
| Mixed | 0.92 | [0.85,0.98]* | 0.80 | [0.75,0.85]** |
| Missing | 0.03 | [0.03,0.03]** | 0.01 | [0.01,0.02]** |
| Sex | ||||
| Male | Ref | | Ref | |
| Female | 1.35 | [1.32,1.37]** | 0.87 | [0.85,0.89]** |
| Missing | 22.77 | [17.02,30.47]** | 10.21 | [5.77,18.07]** |
| Region | ||||
| London | Ref | | Ref | |
| South East | 1.31 | [1.26,1.36]** | 1.12 | [1.08,1.17]** |
| South West | 1.34 | [1.28,1.40]** | 1.38 | [1.31,1.45]** |
| West Midlands | 1.27 | [1.22,1.33]** | 1.37 | [1.30,1.43]** |
| North West | 1.36 | [1.30,1.41]** | 1.64 | [1.57,1.72]** |
| North East | 1.91 | [1.80,2.04]** | 1.99 | [1.85,2.14]** |
| Yorkshire and The Humber | 1.34 | [1.28,1.40]** | 1.42 | [1.35,1.49]** |
| East Midlands | 1.28 | [1.23,1.35]** | 1.22 | [1.16,1.28]** |
| East of England | 1.14 | [1.09,1.19]** | 1.00 | [0.95,1.04] |
| Wales | 0.31 | [0.16,0.59]** | 0.40 | [0.17,0.93]* |
| Missing | 1.16 | [1.10,1.23]** | 1.08 | [1.01,1.14]* |
| IDACI Deciles | ||||
| 1 (deprived) | 0.67 | [0.64,0.70]** | 0.71 | [0.67,0.74]** |
| 2 | 0.78 | [0.74,0.81]** | 0.77 | [0.73,0.81]** |
| 3 | 0.86 | [0.82,0.90]** | 0.87 | [0.83,0.92]** |
| 4 | 0.95 | [0.90,0.99]* | 0.90 | [0.85,0.94]** |
| 5 | Ref | | Ref | |
| 6 | 1.11 | [1.05,1.16]** | 1.05 | [1.00,1.11] |
| 7 | 1.26 | [1.20,1.32]** | 1.23 | [1.16,1.29]** |
| 8 | 1.31 | [1.25,1.38]** | 1.27 | [1.20,1.34]** |
| 9 | 1.31 | [1.25,1.38]** | 1.37 | [1.29,1.44]** |
| 10 (affluent) | 1.27 | [1.21,1.34]** | 1.52 | [1.44,1.61]** |
| Missing | 0.95 | [0.86,1.04] | 1.06 | [0.95,1.18] |
| Observations | 613,732 | | 571,918 | |
| Pseudo R-squared | 0.162 | | 0.093 | |
Notes: Adjusted for all other covariates listed in the table. *p < 0.05, **p < 0.01. aOR = adjusted odds ratio. Conf. Int. = confidence interval. NPD = National Pupil Database. HES = Hospital Episode Statistics. NHS = National Health Service. IDACI = Income Deprivation Affecting Children Index.
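To show what an odds ratio of linkage measures, the following sketch computes a crude (unadjusted) odds ratio for female versus male pupils from the Sex rows of the characteristics table. It assumes that the first two count columns of that table are unlinked and linked records for the cohort in the first model column; this reading is suggested by their totals (47,934 + 565,798) summing to the 613,732 observations reported above, but the column labels are not preserved in this record. The adjusted estimates in the table come from a model with all covariates, which this sketch does not attempt to reproduce.

```python
def odds_ratio(linked_group: int, unlinked_group: int,
               linked_ref: int, unlinked_ref: int) -> float:
    """Crude odds ratio of being linked: group versus reference category."""
    odds_group = linked_group / unlinked_group
    odds_ref = linked_ref / unlinked_ref
    return odds_group / odds_ref


# Female versus male (reference), counts assumed to be linked / unlinked.
crude_or_female = odds_ratio(linked_group=279_520, unlinked_group=20_543,
                             linked_ref=285_716, unlinked_ref=27_334)
print(f"Crude odds ratio, female vs male: {crude_or_female:.2f}")
```

Under these assumptions the crude odds ratio is about 1.30, compared with the adjusted value of 1.35 in the table; a difference of this kind is expected when the other covariates are associated with both sex and linkage.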