Katie Harron1, Ruth Gilbert2, David Cromwell1, Jan van der Meulen1.
Abstract
OBJECTIVE: Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England. DESIGN AND
Year: 2016 PMID: 27764135 PMCID: PMC5072610 DOI: 10.1371/journal.pone.0164667
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Completeness of potential linkage variables in maternal and baby HES records for 2012/13.
| Potential linkage variable | HES field name | % complete (baby extract) | % complete (maternal extract) | Deterministic linkage | Probabilistic linkage | % agreement in deterministically linked records |
|---|---|---|---|---|---|---|
| Provider code (3 character) | procode3 | 100.0 | 100.0 | X | X | - |
| Code of GP practice | gpprac | 93.0 | 99.6 | X | X | - |
| Ethnic category | ethnos | 92.1 | 92.6 | | X | 77.35 |
| Postcode district of patient residence | postdist | 18.0 (38.6) | 98.5 | | X | 97.18 |
| Date episode started | epistart | 100.0 | 100.0 | | X | 60.66 |
| Date episode ended | epiend | 100.0 | 100.0 | | X | 89.99 |
| Estimated delivery date | opdte / epistart | 99.4 | 99.5 | | X | 99.95 |
| Baby’s age in days | neodur | 99.5 | | | | |
| Sex of baby (or Sex in baby main HES record) | sexbaby | 99.9 | 88.3 | X | X | - |
| Birth order | birordr | 84.6 | 90.8 | X | X | - |
| Birth weight | birweit | 85.4 | 89.0 | X | X | - |
| Length of gestation | gestat | 81.7 | 85.4 | X | X | - |
| Mother’s age at delivery | matage | 85.3 | 85.1 | X | X | - |
| First antenatal assessment date | anasdate | 80.9 | 85.2 | | X | 99.89 |
| Gestational period in weeks at first antenatal assessment | anagest | 74.6 | 73.9 | | X | 99.92 |
| Delivery place (actual) | delplac | 87.6 | 89.1 | | X | 99.93 |
| Delivery place (intended) | delinten | 86.3 | 88.2 | | X | 99.22 |
| Delivery method | delmeth | 88.1 | 88.9 | | X | 99.80 |
| Method to induce labour | delonset | 87.4 | 89.4 | | X | 99.99 |
| Anaesthetic given during labour or delivery | delprean | 53.9 | 54.1 | | X | 99.99 |
| Anaesthetic given post labour or delivery | delposan | 26.1 | 26.8 | | X | 100.00 |
| Status of person conducting delivery | delstat | 85.4 | 86.9 | | X | 98.24 |
| Resuscitation method | biresus | 76.5 | 81.5 | | X | 99.99 |
| Birth status | birstat | 85.1 | 88.8 | | X | 99.99 |
| Number of previous pregnancies | numpreg | 0.01 | 71.9 | | | - |
| Delivery place change reason | delchang | 8.1 | 7.7 | | | - |
| Antenatal days of stay | antedur | 86.6 | 85.2 | | | - |
| Postnatal days of stay | postdur | 86.9 | 85.2 | | | - |
| Neonatal level of care | neocare | 66.6 | 99.9 | | | - |
| Well baby flag | well_baby | 100.00 | 100.0 | | | - |
| Number of babies | numbaby | 86.9 | 91.0 | | | - |
*Frequency-based probabilistic weights were used for these variables, allowing weights to vary according to the frequency of data values or the distance between dates.
1 Where fields were complete in both the maternity and baby extracts.
2 Estimated delivery date derived from date of relevant OPCS procedure code (mother) or episode start date (baby).
3 Completeness rose to 38.6% when postcode district was imputed from subsequent admission records.
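The completeness percentages in the table above are the share of records with a non-missing value in each field. A minimal sketch of that calculation follows, assuming records are held as Python dictionaries keyed by HES field name and that `None` or an empty string marks a missing value; the function name `percent_complete` and the record layout are illustrative assumptions, not taken from the paper.

```python
def percent_complete(records, field):
    """Return the percentage of records with a non-missing value for `field`.

    Assumption: a value of None or "" (or an absent key) counts as missing.
    """
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return 100.0 * filled / len(records)


# Hypothetical baby records, for illustration only.
babies = [
    {"gpprac": "A81001", "birweit": 3400},
    {"gpprac": "A81002", "birweit": None},
    {"gpprac": "", "birweit": 2900},
    {"gpprac": "A81004", "birweit": 3100},
]
gp_complete = percent_complete(babies, "gpprac")
bw_complete = percent_complete(babies, "birweit")
```

Both fields are 75% complete in this toy example, because one of the four records is missing a value in each case.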
Fig 1. Extract flow diagram for delivery and birth episodes captured in HES for 2012/13.
Probabilistic match weights.
| Linkage variable | HES field name | Match weight (agreement) | Match weight (disagreement) |
|---|---|---|---|
| Sex | sexbaby | 0.95 | -3.99 |
| GP practice code | gpprac | 11.68 | -3.07 |
| Maternal age | matage | 4.38 | -7.40 |
| Birthweight | birweit | 8.18 | -8.00 |
| Gestational age | gestat | 2.80 | -1.74 |
| Birth order | birordr | 0.04 | -7.29 |
| Estimated delivery date | dobbaby | 8.48 | -10.68 |
| First antenatal assessment date | anasdate | 8.37 | -3.18 |
| Gestation period in weeks at first antenatal assessment | anagest | 3.11 | -2.09 |
| Delivery method | delmeth | 1.33 | -4.21 |
| Delivery place (actual) | delplac | 0.94 | -1.38 |
| Delivery place (intended) | delinten | 5.51 | -3.50 |
| Method to induce labour | delonset | 1.12 | -3.20 |
| Anaesthetic given during labour or delivery | delprean | 1.77 | -4.99 |
| Anaesthetic given post labour or delivery | delposan | 1.11 | -9.22 |
| Status of person conducting delivery | delstat | 4.40 | -4.77 |
| Birth status | birstat | 0.14 | -6.10 |
| Resuscitation method | biresus | 0.68 | -8.55 |
| Ethnic category | ethnos | 4.26 | -1.01 |
| Postcode district of patient residence | postdist | 10.47 | -5.32 |
| Date episode started | epistart | 7.79 | -1.89 |
| Date episode ended | epiend | 8.29 | -0.79 |
*Average of frequency-based weights presented.
+Weights presented for date differences of 0 (same day) or 7 days apart.
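In probabilistic linkage, the weights above are typically combined by summing, for each candidate mother-baby pair, the agreement weight of every field that agrees and the disagreement weight of every field that disagrees. The sketch below illustrates this with four of the tabulated weights. It assumes exact-equality comparison and that a field missing on either side contributes nothing to the total; both are simplifying assumptions, and the function name `total_match_weight` is hypothetical. The frequency-based and date-distance weights mentioned in the footnotes are not modelled.

```python
# (agreement weight, disagreement weight) for a subset of the fields tabulated above.
WEIGHTS = {
    "sexbaby": (0.95, -3.99),
    "gpprac": (11.68, -3.07),
    "matage": (4.38, -7.40),
    "birweit": (8.18, -8.00),
}


def total_match_weight(mother, baby, weights=WEIGHTS):
    """Sum field-level weights for one candidate mother-baby pair.

    Assumption: fields missing (None) on either record are skipped,
    i.e. they contribute a weight of zero.
    """
    total = 0.0
    for field, (agree_w, disagree_w) in weights.items():
        m_val, b_val = mother.get(field), baby.get(field)
        if m_val is None or b_val is None:
            continue
        total += agree_w if m_val == b_val else disagree_w
    return total


# Hypothetical records: everything agrees except birth weight.
mother = {"sexbaby": 1, "gpprac": "A81001", "matage": 30, "birweit": 3200}
baby = {"sexbaby": 1, "gpprac": "A81001", "matage": 30, "birweit": 3350}
score = total_match_weight(mother, baby)
```

Here the three agreeing fields add 0.95 + 11.68 + 4.38 and the disagreeing birth weight subtracts 8.00, giving a total of about 9.01. The pair would then be accepted or rejected by comparing this total with a chosen threshold.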
Fig 2. Estimated false-match rate and sensitivity for a range of threshold weights, based on synthetic data.
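When the true match status of each candidate pair is known, as it is in synthetic data, both quantities can be computed directly for any threshold. The sketch below assumes that sensitivity is the proportion of true matches scoring at or above the threshold, and that the false-match rate is the proportion of accepted links that are not true matches. These are common definitions, but the excerpt does not state which ones the authors used. The function name and the example scores are hypothetical.

```python
def evaluate_threshold(scored_pairs, threshold):
    """Return (sensitivity, false_match_rate) for one threshold.

    `scored_pairs` is a list of (match_weight, is_true_match) tuples.
    Pairs with weight >= threshold are accepted as links.
    """
    accepted = [is_match for weight, is_match in scored_pairs if weight >= threshold]
    n_true = sum(1 for _, is_match in scored_pairs if is_match)
    true_links = sum(1 for is_match in accepted if is_match)
    sensitivity = true_links / n_true if n_true else 0.0
    false_match_rate = (len(accepted) - true_links) / len(accepted) if accepted else 0.0
    return sensitivity, false_match_rate


# Hypothetical synthetic pairs: (total match weight, true match status).
pairs = [(25.0, True), (18.0, True), (12.0, False), (5.0, True), (-3.0, False)]
results = {t: evaluate_threshold(pairs, t) for t in (0, 10, 20)}
```

Raising the threshold lowers the false-match rate at the cost of sensitivity, which is the trade-off the figure describes.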
Probability of achieving a deterministic link according to completeness of baby records.
The final row shows an increase in accuracy of variables over time: in 2001/02, deterministic links were found for 73.0% of baby records with complete values on all linkage variables compared with 77.5% in 2012/13.
| GP practice complete | Maternal age complete | Birth weight complete | Gestation complete | Birth order complete | Sex of baby complete | 2001/02: % of all baby records | 2001/02: % deterministically linked | 2001/02: % of all deterministic links | 2012/13: % of all baby records | 2012/13: % deterministically linked | 2012/13: % of all deterministic links |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||||||
| ✓ | 1.1 | 0.0 | 0.0 | 1.2 | 5.0 | 0.1 | |||||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 6.2 | 0.0 | ||||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 20.0 | 0.0 | ||||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.1 | 4.0 | 0.0 | |||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||||
| ✓ | ✓ | ✓ | 0.1 | 0.0 | 0.0 | 0.0 | 8.8 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 1.2 | 0.2 | 0.0 | 0.0 | 3.9 | 0.0 | ||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 33.3 | 0.0 | ||||
| ✓ | ✓ | ✓ | 0.0 | 1.0 | 0.0 | 0.0 | 39.1 | 0.0 | |||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 100.0 | 0.0 | |||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 0.1 | 0.1 | 0.0 | 0.0 | 58.1 | 0.1 | ||
| ✓ | ✓ | ✓ | ✓ | 0.1 | 0.4 | 0.0 | 0.1 | 62.1 | 0.1 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 100.0 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | ✓ | 4.3 | 2.9 | 0.3 | 1.0 | 64.1 | 1.5 | |
| ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 6.8 | 0.0 | |||||
| ✓ | ✓ | 9.3 | 0.0 | 0.0 | 31.1 | 4.4 | 3.1 | ||||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||||
| ✓ | ✓ | ✓ | 0.3 | 0.0 | 0.0 | 0.4 | 2.4 | 0.0 | |||
| ✓ | ✓ | ✓ | 1.4 | 0.0 | 0.0 | 1.5 | 3.1 | 0.1 | |||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 0.1 | 3.2 | 0.0 | 1.7 | 2.7 | 0.1 | ||
| ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||||
| ✓ | ✓ | ✓ | 0.9 | 0.0 | 0.0 | 0.3 | 6.1 | 0.0 | |||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 2.4 | 0.0 | 0.0 | 0.2 | 2.2 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.4 | 4.3 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | ✓ | 18.0 | 6.5 | 2.8 | 2.3 | 7.4 | 0.4 | |
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.8 | 60.8 | 1.1 | |||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 100.0 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 63.8 | 0.0 | 0.8 | 65.0 | 1.1 | ||
| ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 100.0 | 0.0 | |||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 50.9 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 14.3 | 0.0 | 0.0 | 57.1 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | ✓ | 2.2 | 1.7 | 0.1 | 4.1 | 63.5 | 5.8 | |
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 44.4 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 60.0 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | ✓ | 4.0 | 6.6 | 0.6 | 9.5 | 69.8 | 14.7 | |
| ✓ | ✓ | ✓ | ✓ | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ||
| ✓ | ✓ | ✓ | ✓ | ✓ | 2.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | |
| ✓ | ✓ | ✓ | ✓ | ✓ | 0.0 | 71.4 | 0.0 | 0.0 | 73.3 | 0.0 | |
| ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 51.8 | 77.5 | 96.1 | 44.4 | 73.0 | 71.7 |
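The dependence of deterministic linkage on completeness can be illustrated with a simple exact-match rule. The sketch below links a baby record to a maternal record only when every deterministic linkage variable is present on both sides, all values agree exactly, and exactly one maternal record shares that combination. This is an assumed rule for illustration: the excerpt does not give the authors' actual deterministic rules, and the table shows that some records with incomplete fields were still linked, which this strict sketch would not reproduce. The function name and record layout are hypothetical.

```python
from collections import defaultdict

# Fields marked for deterministic linkage in the completeness table.
DETERMINISTIC_FIELDS = (
    "procode3", "gpprac", "matage", "birweit", "gestat", "birordr", "sexbaby",
)


def deterministic_links(mothers, babies, fields=DETERMINISTIC_FIELDS):
    """Map baby id -> mother id for unambiguous exact matches on all fields.

    `mothers` and `babies` are dicts of record id -> dict of field values.
    Records with any missing (None) linkage field are never linked here.
    """
    index = defaultdict(list)
    for m_id, rec in mothers.items():
        key = tuple(rec.get(f) for f in fields)
        if None not in key:
            index[key].append(m_id)

    links = {}
    for b_id, rec in babies.items():
        key = tuple(rec.get(f) for f in fields)
        if None in key:
            continue  # incomplete baby record: left for probabilistic linkage
        candidates = index.get(key, [])
        if len(candidates) == 1:  # skip ambiguous keys shared by several mothers
            links[b_id] = candidates[0]
    return links


# Hypothetical records, for illustration only.
base = {"procode3": "RXX", "gpprac": "A81001", "matage": 30,
        "birweit": 3200, "gestat": 39, "birordr": 1, "sexbaby": 1}
mothers = {"m1": dict(base), "m2": dict(base, gpprac="A81002")}
babies = {"b1": dict(base), "b2": dict(base, gpprac="A81002", birweit=None)}
links = deterministic_links(mothers, babies)
```

In this example `b1` links to `m1`, while `b2` stays unlinked because its birth weight is missing.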
Fig 3. Contribution of each linking variable to overall match weight.
Agreement = positive contribution (solid line), disagreement = negative contribution (dashed line). The higher the value, the more information the linkage variable provides.
Comparison of linked and unlinked baby record characteristics for 2012/13.
Missing values are excluded from all categories.
| Characteristic | Linked (N = 660,401; 98%) | % | Unlinked (N = 12,654; 2%) | % | p-value |
|---|---|---|---|---|---|
| | 62157 | | 7452 | | <0.001 |
| Maternal age (years) | | | | | 0.003 |
| ≤18 | 11452 | 2.6 | 103 | 2.9 | |
| 19–24 | 87116 | 20.0 | 793 | 22.0 | |
| 25–29 | 120943 | 27.7 | 1036 | 28.8 | |
| 30–34 | 129398 | 29.7 | 898 | 24.9 | |
| 35–39 | 68916 | 15.8 | 519 | 14.4 | |
| ≥40 | 18008 | 4.1 | 252 | 7.0 | |
| Most deprived | 9269 | 19.7 | 1107 | 22.2 | <0.001 |
| 2 | 9410 | 20.0 | 970 | 19.5 | |
| 3 | 9388 | 20.0 | 1000 | 20.1 | |
| 4 | 9380 | 20.0 | 1012 | 20.3 | |
| Least deprived | 9491 | 20.2 | 894 | 17.9 | |
| White | 457181 | 75.2 | 7855 | 71.0 | <0.001 |
| Mixed | 29234 | 4.8 | 693 | 6.3 | |
| Asian | 68588 | 11.3 | 1219 | 11.0 | |
| Black | 32410 | 5.3 | 735 | 6.6 | |
| Other | 20299 | 3.3 | 567 | 5.1 | |
| Gestational age (weeks) | | | | | <0.001 |
| ≤27 | 5674 | 1.0 | 174 | 4.4 | |
| 28-<32 | 6591 | 1.2 | 84 | 2.1 | |
| 32-<37 | 31872 | 5.8 | 348 | 8.8 | |
| 37-<42 | 478042 | 87.5 | 3229 | 81.3 | |
| ≥42 | 23881 | 4.4 | 137 | 3.4 | |
| Birth weight (g) | | | | | <0.001 |
| <1000 | 2837 | 0.5 | 159 | 3.4 | |
| 1000–1499 | 3516 | 0.6 | 116 | 2.5 | |
| 1500–2499 | 33021 | 5.8 | 498 | 10.6 | |
| 2500–3999 | 465984 | 81.7 | 3513 | 74.6 | |
| ≥4000 | 64710 | 11.4 | 425 | 9.0 | |
| | 443262 | 88.8 | 2508 | 86.7 | <0.001 |
| | 146077 | 24.8 | 787 | 17.1 | <0.001 |
| | 2752 | 0.4 | 285 | 2.3 | <0.001 |
| | 20436 | 3.1 | 352 | 2.8 | 0.041 |
*IMD = Index of Multiple Deprivation
Fig 4. Distribution of birth weight by week of gestation in baby records.
Vertical lines show 3 standard deviations from the average; values above the upper limit are likely to have been miscoded as days (rather than weeks) of gestation, truncated to 2 digits.
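A three-standard-deviation rule of this kind can be sketched as a simple data-quality check. The example below groups records by recorded week of gestation and flags any record whose birth weight lies more than three sample standard deviations from the mean for that week. It is an illustration of the general idea only: the exact limits, grouping, and handling of flagged values used for the figure are not described in this excerpt, and the function name and record layout are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_implausible(records, n_sd=3.0):
    """Return sorted indices of records with outlying birth weight for their week.

    `records` is a list of (gestation_weeks, birth_weight_grams) tuples.
    Weeks with fewer than two records, or with zero spread, are skipped.
    """
    by_week = defaultdict(list)
    for i, (week, weight) in enumerate(records):
        by_week[week].append((i, weight))

    flagged = []
    for items in by_week.values():
        weights = [w for _, w in items]
        if len(weights) < 2:
            continue
        mu, sd = mean(weights), stdev(weights)
        if sd == 0:
            continue
        flagged.extend(i for i, w in items if abs(w - mu) > n_sd * sd)
    return sorted(flagged)


# Hypothetical data: nineteen typical term babies and one implausible record at 40 weeks.
records = [(40, 3400)] * 19 + [(40, 800)]
suspect = flag_implausible(records)
```

Only the last record is flagged here. In practice a flagged record could reflect a miscoded gestation rather than a wrong birth weight, as the caption notes.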
Fig 5. Representativeness of linked HES cohort in terms of maternal age, birth weight and gestational age.
Dark shade = HES, light shade = Office for National Statistics.