Noémi Kreif, Richard Grieve, Iván Díaz, David Harrison.
Abstract
For a continuous treatment, the generalised propensity score (GPS) is defined as the conditional density of the treatment, given covariates. GPS adjustment may be implemented by including it as a covariate in an outcome regression. Here, the unbiased estimation of the dose-response function assumes correct specification of both the GPS and the outcome-treatment relationship. This paper introduces a machine learning method, the 'Super Learner', to address model selection in this context. In the two-stage estimation approach proposed, the Super Learner selects a GPS and then a dose-response function conditional on the GPS, as the convex combination of candidate prediction algorithms. We compare this approach with parametric implementations of the GPS and with regression methods. We contrast the methods in the Risk Adjustment in Neurocritical care cohort study, in which we estimate the marginal effects of increasing transfer time from emergency departments to specialised neuroscience centres, for patients with acute traumatic brain injury. With parametric models for the outcome, we find that dose-response curves differ according to the choice of specification. With the Super Learner approach to both regression and the GPS, we find that transfer time does not have a statistically significant marginal effect on the outcomes.
Keywords: generalised propensity score; machine learning; programme evaluation
Year: 2015 PMID: 26059721 PMCID: PMC4744663 DOI: 10.1002/hec.3189
Source DB: PubMed Journal: Health Econ ISSN: 1057-9230 Impact factor: 3.046
Descriptive statistics of baseline covariates and outcomes
| | | |
|---|---|---|
| Dead at 6 months, n (%) | 99 | (20.3) |
| Six-month costs ( | 27 480 | (29 741) |
| IMPACT pred mort, mean (SD) | 0.23 | (0.17) |
| Age, mean (SD) | 40.33 | (17.51) |
| Major extr inj, n (%) | 185 | (37.9) |
| Severe GCS, n (%) | 265 | (54.3) |
| Motor score poor, n (%) | 226 | (46.3) |
| Any pupil unreactive, n (%) | 76 | (15.6) |
SD, standard deviation; IMPACT pred mort, predicted mortality from IMPACT risk prediction model; extr inj, extracranial injury; GCS, Glasgow Coma Score.
The following variables had missing values for n observations: dead at 6 months, n = 18; pupil unreactive, n = 41; motor score, n = 7.
Specifications of candidate algorithms in Super Learner
| GPS estimation | | | | | | |
|---|---|---|---|---|---|---|
| Candidate | Error dist | Link fn | Linear pred | | | |
| Norm 1 | normal | identity | | | | |
| Norm 2 | normal | identity | | | | |
| Norm 3 | normal | identity | | | | |
| Norm 4 | normal | identity | | | | |
| Gam 1 | gamma | log | | | | |
| Gam 2 | gamma | log | | | | |
| Gam 3 | gamma | log | | | | |
| Gam 4 | gamma | log | | | | |
| Outcome estimation, regression | | | | | | |
| | Mortality | | | Costs | | |
| Candidate | Error dist | Link fn | Linear pred | Error dist | Link fn | Linear pred |
| Linear | binomial | logit | | normal | identity | |
| Linear (costs) | | | | gamma | log | |
| Linear, int | binomial | logit | | gamma | log | |
| Quadr, int (1) | binomial | logit | | gamma | log | |
| Quadr, int (2) | binomial | logit | | gamma | log | |
| Fourth | binomial | logit | | gamma | log | |
| GAM | binomial | logit | splines of | normal | identity | splines of |
| Bayesian GLM | binomial | logit | | normal | identity | |
| Outcome estimation, GPS | | | | | | |
| | Mortality | | | Costs | | |
| Linear | binomial | logit | | normal | identity | |
| Linear (costs) | | | | gamma | log | |
| Linear, int | binomial | logit | linear, | gamma | log | |
| Quadr, int (1) | binomial | logit | | gamma | log | |
| Quadr, int (2) | binomial | logit | | gamma | log | |
| Fourth | binomial | logit | | gamma | log | |
| GAM | binomial | logit | splines of | normal | identity | splines of |
| Bayesian GLM | binomial | logit | | normal | identity | |
GPS, generalised propensity score; GAM, generalised additive models; GLM, generalised linear model. W: all covariates; A: treatment variable; r: GPS; fg: glucose; fhb: haemoglobin; motor3: motor score 3.
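To make the normal-error, identity-link GPS candidates above concrete, the following is a minimal sketch of how a GPS of that family could be computed. It assumes a single covariate and a simple least-squares fit, which is much simpler than the candidates in the table; the function names (`fit_simple_ols`, `normal_gps`) and the toy data are hypothetical and are not taken from the paper.

```python
import math

def fit_simple_ols(w, a):
    """Least-squares fit of a = b0 + b1*w + error; returns (b0, b1, sigma)."""
    n = len(a)
    w_bar = sum(w) / n
    a_bar = sum(a) / n
    sxx = sum((wi - w_bar) ** 2 for wi in w)
    sxy = sum((wi - w_bar) * (ai - a_bar) for wi, ai in zip(w, a))
    b1 = sxy / sxx
    b0 = a_bar - b1 * w_bar
    rss = sum((ai - b0 - b1 * wi) ** 2 for wi, ai in zip(w, a))
    # Residual standard deviation with a degrees-of-freedom correction (an assumption).
    sigma = math.sqrt(rss / (n - 2))
    return b0, b1, sigma

def normal_gps(a, w, b0, b1, sigma):
    """Normal density of treatment level a given covariate w, i.e. the GPS r(a, w)."""
    z = (a - (b0 + b1 * w)) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical toy data: one covariate W and a continuous treatment A.
w_obs = [0.0, 1.0, 2.0, 3.0]
a_obs = [0.0, 1.0, 1.0, 2.0]

b0, b1, sigma = fit_simple_ols(w_obs, a_obs)
# GPS evaluated at each unit's observed treatment level.
gps = [normal_gps(a, w, b0, b1, sigma) for a, w in zip(a_obs, w_obs)]
```

A gamma-error, log-link candidate would replace the normal density with a gamma density whose mean is the exponentiated linear predictor; that variant is not shown here.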
Balance unadjusted and adjusted for the GPS: differences in means and t‐statistics for the equality of means
| Variable | Unadjusted differences in means (t‐statistic) | | | Adjusted differences in means (t‐statistic) | | |
|---|---|---|---|---|---|---|
| | tx cat [1.83–5.2] | tx cat [5.2–10.1] | tx cat [10.1–23.7] | tx cat [1.83–5.2] | tx cat [5.2–10.1] | tx cat [10.1–23.7] |
| IMPACT pred mort | −0.03 (1.99) | −0.00 (−0.04) | 0.03 (1.92) | −0.01 (−0.91) | 0.00 (0.18) | 0.01 (0.81) |
| Age | −4.88 (−3.05) | 3.87 (2.26) | 0.99 (0.58) | −3.09 (−1.81) | 4.22 (2.49) | −1.09 (−0.65) |
| Major extr inj | −0.08 (−1.69) | −0.02 (−0.35) | 0.09 (1.99) | −0.02 (−0.32) | 0.01 (0.12) | 0.02 (0.47) |
| Severe GCS | 0.07 (1.36) | −0.06 (−1.25) | −0.01 (−0.10) | 0.03 (0.59) | −0.07 (−1.49) | 0.04 (0.73) |
| Motor score poor | 0.018 (0.38) | −0.069 (−1.45) | 0.051 (1.06) | 0.011 (0.23) | −0.068 (−1.42) | 0.050 (1.00) |
| Any pupil unreactive | −0.01 (−0.33) | −0.03 (−0.92) | 0.04 (1.18) | 0.01 (0.15) | −0.03 (−0.76) | 0.02 (0.61) |
GPS, generalised propensity score; tx cat, treatment category; IMPACT pred mort, predicted mortality from IMPACT risk prediction model; GCS, Glasgow Coma Score. Differences in means are reported, with t‐statistics for the equality of means in brackets. Each comparison contrasts units in a given treatment category with units in the other two treatment categories. The three groups contain 122, 123 and 123 patients, respectively.
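The unadjusted columns of the balance table compare one treatment category with the other two combined. A minimal sketch of such a comparison follows. The record does not state which t-statistic was used, so the unequal-variance (Welch) form here is an assumption, and the function name and toy values are hypothetical. The GPS-adjusted comparison is not shown.

```python
import math

def diff_in_means(group, rest):
    """Difference in means (group minus rest) and a Welch-type t-statistic."""
    n1, n0 = len(group), len(rest)
    m1 = sum(group) / n1
    m0 = sum(rest) / n0
    # Sample variances with the usual n - 1 denominator.
    v1 = sum((x - m1) ** 2 for x in group) / (n1 - 1)
    v0 = sum((x - m0) ** 2 for x in rest) / (n0 - 1)
    se = math.sqrt(v1 / n1 + v0 / n0)
    diff = m1 - m0
    return diff, diff / se

# Hypothetical covariate values for one treatment category versus the other two.
in_category = [1.0, 2.0, 3.0]
other_categories = [2.0, 3.0, 4.0]
diff, t_stat = diff_in_means(in_category, other_categories)
```

Note that under this definition the sign of the t-statistic always matches the sign of the difference in means.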
Figure 1. Dose–response functions of expected mortality at 6 months, using regression and GPS, with parametric models and the Super Learner. The rug plots demonstrate the distribution of observed transfer times. GPS, generalised propensity score.
Cross validation results and Super Learner weights
| Candidate predictor | Regression approach | | GPS approach | |
|---|---|---|---|---|
| | MSE | Weight in Super Learner | MSE | Weight in Super Learner |
| Mortality | ||||
| Linear | 0.1472 | 0.00 | 0.1621 | 0.00 |
| Linear, int | 0.1503 | 0.00 | 0.1618 | 0.00 |
| Quadratic, int (1) | 0.1553 | 0.00 | 0.1631 | 0.00 |
| Quadratic, int (2) | 0.1576 | 0.00 | 0.1644 | 0.00 |
| Fourth order | 0.1484 | 0.00 | 0.1634 | 0.00 |
| GAM (d.f.=2) | 0.1470 | 0.31 | 0.1625 | 0.00 |
| GAM (d.f.=3) | 0.1487 | 0.00 | 0.1630 | 0.00 |
| Bayesian GLM | 0.1456 | 0.69 | 0.1617 | 1.00 |
| Convex Super Learner | 0.1481 | | 0.1621 | |
| Costs | ||||
| Linear (normal) | 853058 | 0.00 | 883591 | 0.00 |
| Linear | 855597 | 0.13 | 883507 | 0.83 |
| Linear int | 864127 | 0.14 | 889775 | 0.00 |
| Quadr, int (1) | 941058 | 0.00 | 895103 | 0.17 |
| Quadr, int (2) | 897010 | 0.20 | 912055 | 0.00 |
| Fourth order | 891008 | 0.00 | 901585 | 0.00 |
| GAM (d.f.=2) | 857542 | 0.00 | 887886 | 0.00 |
| GAM (d.f.=3) | 863129 | 0.00 | 894215 | 0.00 |
| Bayesian GLM | 852746 | 0.52 | 883579 | 0.00 |
| Convex Super Learner | 859842 | | 888529 | |
GPS, generalised propensity score; MSE, mean squared error; d.f., degrees of freedom; GAM, generalised additive models; GLM, generalised linear model. MSE for costs in 1000 $.
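The weights in the table above define a convex combination of the candidates' predictions: non-negative weights that sum to one, chosen to minimise cross-validated squared error. The sketch below illustrates that idea only. It uses a Frank–Wolfe iteration with exact line search as a stand-in optimiser, which is an assumption and not necessarily the routine used in the paper; the function name `convex_weights` and the toy predictions are hypothetical.

```python
def convex_weights(cv_preds, y, max_iter=1000):
    """Weights w >= 0 with sum(w) = 1 minimising sum_i (sum_j w_j * cv_preds[i][j] - y[i])^2."""
    k = len(cv_preds[0])
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(max_iter):
        fitted = [sum(wj * zij for wj, zij in zip(w, row)) for row in cv_preds]
        resid = [f - yi for f, yi in zip(fitted, y)]
        # Gradient of the squared-error loss, up to a positive constant factor.
        grad = [sum(r * row[j] for r, row in zip(resid, cv_preds)) for j in range(k)]
        s = min(range(k), key=lambda j: grad[j])  # most promising candidate
        d = [row[s] - f for row, f in zip(cv_preds, fitted)]
        dd = sum(x * x for x in d)
        if dd == 0.0:
            break
        # Exact line search along the segment from w to the vertex of candidate s.
        gamma = max(0.0, min(1.0, -sum(r * x for r, x in zip(resid, d)) / dd))
        if gamma == 0.0:
            break  # no feasible descent direction: w is optimal
        w = [(1.0 - gamma) * wj for wj in w]
        w[s] += gamma
    return w

# Hypothetical cross-validated predictions from two candidates (columns).
y_toy = [0.0, 1.0, 2.0, 3.0]
preds_toy = [[1.0, -3.0], [0.0, 4.0], [3.0, -1.0], [2.0, 6.0]]
weights = convex_weights(preds_toy, y_toy)
```

In this toy example the two candidates err in opposite directions, so a mix of the two fits better than either alone. When a single candidate dominates, the same routine puts all the weight on it, as in the mortality GPS column where the Bayesian GLM receives a weight of 1.00.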
Figure 2. Dose–response function and marginal treatment effect function of expected mortality at 6 months, using the Super Learner. (a) Regression approach, point estimates and 95% CI, and (b) GPS approach, point estimates and 95% CI. The rug plots demonstrate the distribution of observed transfer times. GPS, generalised propensity score; CI, confidence interval.
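As a rough illustration of how a dose–response function and a marginal treatment effect function of this kind can be computed under the GPS approach, the sketch below averages, at each treatment level, the predicted outcome over the GPS evaluated for every unit's covariates. The finite-difference definition of the marginal effect, both function names, and the toy GPS and outcome models are assumptions made for illustration and are not taken from the paper.

```python
import math

def dose_response(dose_grid, covariates, gps_density, outcome_model):
    """Average predicted outcome at each dose a, using r(a, W_i) for every unit i."""
    curve = []
    for a in dose_grid:
        preds = [outcome_model(a, gps_density(a, w)) for w in covariates]
        curve.append(sum(preds) / len(preds))
    return curve

def marginal_effects(dose_grid, curve):
    """Finite-difference slope of the dose-response curve between adjacent grid points."""
    return [
        (curve[i + 1] - curve[i]) / (dose_grid[i + 1] - dose_grid[i])
        for i in range(len(curve) - 1)
    ]

# Hypothetical fitted models: a unit-variance normal GPS centred on the covariate,
# and an outcome model that is linear in the dose and the GPS.
def toy_gps(a, w):
    return math.exp(-0.5 * (a - w) ** 2) / math.sqrt(2.0 * math.pi)

def toy_outcome(a, r):
    return 0.2 + 0.01 * a + 0.1 * r

grid = [1.0, 2.0, 3.0]
curve = dose_response(grid, [1.0, 3.0], toy_gps, toy_outcome)
effects = marginal_effects(grid, curve)
```

Confidence intervals such as those in the figure would need an additional resampling step, for example a bootstrap over the whole two-stage procedure, which is not shown.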