Anton Nilsson, Carl Bonander, Ulf Strömberg, Jonas Björk.
Abstract
When analyzing effect heterogeneity, the researcher commonly opts for stratification or a regression model with interactions. While these methods provide valuable insights, their usefulness can be somewhat limited, since they typically fail to take into account heterogeneity with respect to many dimensions simultaneously, or give rise to models with complex appearances. Based on the potential outcomes framework and through imputation of missing potential outcomes, our study proposes a method for analyzing heterogeneous effects by focusing on treatment effects rather than outcomes. The procedure is easy to implement and generates estimates that take into account heterogeneity with respect to all relevant dimensions at the same time. Results are easily interpreted and can additionally be represented by graphs, showing the overall magnitude and pattern of heterogeneity as well as how this relates to different factors. We illustrate the method both with simulations and by examining heterogeneous effects of obesity on HDL cholesterol in the Malmö Diet and Cancer cardiovascular cohort. Obesity was associated with reduced HDL in almost all individuals, but effects varied with smoking, risky alcohol consumption, higher education, and energy intake, with some indications of non-linear effects. Our approach can be applied by any epidemiologist who wants to assess the role and strength of heterogeneity with respect to a multitude of factors.
Keywords: Causal inference; Heterogeneity; Imputation; Potential outcomes
Year: 2019 PMID: 31420761 PMCID: PMC6759690 DOI: 10.1007/s10654-019-00551-0
Source DB: PubMed Journal: Eur J Epidemiol ISSN: 0393-2990 Impact factor: 8.082
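The abstract describes imputing missing potential outcomes and then analyzing the resulting treatment effects, but this record does not spell out the individual steps. The sketch below is one possible reading, not the authors' implementation: it assumes a separate linear outcome model per treatment arm, imputes each person's unobserved potential outcome from the other arm's model, and finally regresses the individual effects on the covariates. The function names (`simulate`, `individual_effects`, `heterogeneity_regression`) and the toy coefficients are hypothetical.

```python
import numpy as np

def simulate(n=5000, seed=0):
    """Toy data; coefficients are illustrative, not the authors' design."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    t = rng.integers(0, 2, size=n)                 # randomized binary treatment
    effect = 3.0 + 1.0 * X[:, 0] + 0.5 * X[:, 1]   # effect varies with X1 and X2
    y = X.sum(axis=1) + t * effect + rng.normal(size=n)
    return y, t, X

def individual_effects(y, t, X):
    """Assumed steps 1-3: fit one model per arm, impute the missing
    potential outcome, and take the difference for each individual."""
    Z = np.column_stack([np.ones(len(y)), X])      # add an intercept column
    b1 = np.linalg.lstsq(Z[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Z[t == 0], y[t == 0], rcond=None)[0]
    y1 = np.where(t == 1, y, Z @ b1)               # observed if treated, else imputed
    y0 = np.where(t == 0, y, Z @ b0)               # observed if untreated, else imputed
    return y1 - y0

def heterogeneity_regression(effects, X):
    """Assumed step 4: regress the individual effects on the covariates."""
    Z = np.column_stack([np.ones(len(effects)), X])
    return np.linalg.lstsq(Z, effects, rcond=None)[0]

y, t, X = simulate()
tau = individual_effects(y, t, X)
# Intercept first, then one coefficient per covariate.
print(heterogeneity_regression(tau, X))
```

With the same covariates in every step, the point estimates from this particular sketch equal the difference between the two arm-specific coefficient vectors, that is, the interaction coefficients of a fully interacted regression. Dropping a covariate from the last step only, as in the third simulation table, breaks that equivalence.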
Simulation results: results from the imputation/prediction model in the first column and results from a standard regression model with interactions in the second column
| Model | Imputation | Interaction |
|---|---|---|
| Average | 3.005 | 3.005 |
| Average | 1.002 | 1.002 |
| Average | 0.503 | 0.503 |
| Average | −0.001 | −0.001 |
| Average standard error of | 0.179 | 0.178 |
| Average standard error of | 0.130 | 0.130 |
| Average standard error of | 0.130 | 0.131 |
| Average standard error of | 0.130 | 0.130 |
| 95% CI coverage | 0.955 | 0.955 |
| 95% CI coverage | 0.949 | 0.952 |
| 95% CI coverage | 0.951 | 0.961 |
| 95% CI coverage | 0.949 | 0.954 |
In the imputation/prediction model, bootstrap was used for statistical inference
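The note above says only that bootstrap was used for inference in the imputation/prediction model. The sketch below shows one way this could be done: a nonparametric bootstrap that resamples individuals and repeats the whole pipeline on each resample. The percentile intervals, the number of replicates, and the helper names `four_step_coefs` and `bootstrap_ci` are assumptions, and the estimator is a compact stand-in rather than the authors' code.

```python
import numpy as np

def four_step_coefs(y, t, X):
    """Hypothetical stand-in estimator: arm-specific linear models,
    imputed individual effects, then a regression of effects on X."""
    Z = np.column_stack([np.ones(len(y)), X])
    b1 = np.linalg.lstsq(Z[t == 1], y[t == 1], rcond=None)[0]
    b0 = np.linalg.lstsq(Z[t == 0], y[t == 0], rcond=None)[0]
    tau = np.where(t == 1, y - Z @ b0, Z @ b1 - y)   # observed minus imputed
    return np.linalg.lstsq(Z, tau, rcond=None)[0]

def bootstrap_ci(y, t, X, n_boot=500, alpha=0.05, seed=0):
    """Resample individuals with replacement and rerun every step, so the
    uncertainty from the imputation models carries into the final estimates."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty((n_boot, X.shape[1] + 1))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)             # bootstrap sample of row indices
        draws[b] = four_step_coefs(y[idx], t[idx], X[idx])
    se = draws.std(axis=0, ddof=1)                   # bootstrap standard errors
    lo, hi = np.quantile(draws, [alpha / 2, 1 - alpha / 2], axis=0)
    return se, lo, hi
```

Refitting all steps inside the loop matters here: ordinary standard errors from the final regression alone would treat the imputed effects as if they were observed data.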
Fig. 1 Estimated treatment effects according to simulations in the first model as well as in a model with increased heterogeneity, obtained according to steps 1–4. The average effect is indicated by a dashed line
Fig. 2 Estimated treatment effects at different levels of covariates according to the first simulation. In the first two curves, fitted quadratic curves have been inserted
Simulation results as in Table 1 but with the correlation between X1 and X2 increased to 0.9
| Model | Imputation | Interaction |
|---|---|---|
| Average | 3.001 | 3.001 |
| Average | 1.003 | 1.003 |
| Average | 0.493 | 0.493 |
| Average | −0.000 | −0.000 |
| Average standard error of | 0.182 | 0.182 |
| Average standard error of | 0.232 | 0.232 |
| Average standard error of | 0.232 | 0.232 |
| Average standard error of | 0.122 | 0.122 |
| 95% CI coverage | 0.945 | 0.949 |
| 95% CI coverage | 0.939 | 0.947 |
| 95% CI coverage | 0.952 | 0.954 |
| 95% CI coverage | 0.944 | 0.948 |
Results in the first column were obtained from the imputation/prediction model outlined in steps 1–4 using bootstrap for statistical inference, whereas results in the second column were obtained from a standard regression model with interactions
Simulation results as in Table 1 but with the correlation between X1 and X2 increased to 0.9 and X2 omitted from the heterogeneity analysis
| Model | Imputation | Interaction |
|---|---|---|
| Average | 3.013 | 2.947 |
| Average | 1.438 | 1.400 |
| Average | 0.033 | 0.012 |
| Average standard error of | 0.183 | 0.181 |
| Average standard error of | 0.143 | 0.143 |
| Average standard error of | 0.123 | 0.123 |
Results in the first column were obtained from the imputation/prediction model outlined in steps 1–4 using bootstrap for statistical inference, whereas results in the second column were obtained from a standard regression model with interactions. In the imputation/prediction model, X2 was omitted from the last step and in the regression model with interactions, no interaction between treatment and X2 was included
Descriptive statistics for obese and non-obese individuals in the sample
| Variable | Non-obese (n = 3028) | Obese (n = 357) |
|---|---|---|
| HDL (ln mmol/l) | 0.33 (0.30) | 0.19 (0.28) |
| Risky alcohol consumption | 1% | 2% |
| Smoker | 24% | 15% |
| Physical activity score | 8288 (5779) | 7170 (5124) |
| Energy intake (kcal/day) | 2342 (657) | 2303 (673) |
| Primary education | 68% | 76% |
| Secondary education | 11% | 5% |
| Tertiary education | 22% | 18% |
| Male | 40% | 38% |
| Age 45–49 | 17% | 13% |
| Age 50–59 | 52% | 50% |
| Age 60–68 | 31% | 37% |
Continuous variables are presented with means and standard deviations whereas binary variables are presented with percentages
Fig. 3 Estimated (relative) effects of obesity on HDL levels in the empirical example, obtained according to steps 1–3. The average effect is indicated by a dashed line
Relative impacts of individual characteristics on the estimated effects of having BMI ≥ 30 versus BMI < 30 on HDL (ln mmol/l), according to the proposed four-step method
| Variable | Univariate (95% CI) | Multivariate (95% CI) |
|---|---|---|
| Risky alcohol consumption | 1.142 (0.986–1.324) | 1.155 (1.000–1.334) |
| Smoker | 0.886 (0.806–0.975) | 0.890 (0.807–0.982) |
| Physical activity score (10,000 units, centered) | 1.015 (0.955–1.079) | 1.014 (0.954–1.078) |
| Energy intake (1000 kcal/day, centered) | 0.967 (0.926–1.010) | 0.949 (0.906–0.995) |
| Primary education | 1.000 (ref) | 1.000 (ref) |
| Secondary education | 1.008 (0.906–1.122) | 1.018 (0.914–1.134) |
| Tertiary education | 1.081 (1.004–1.163) | 1.077 (1.002–1.159) |
| Male | 1.019 (0.961–1.080) | 1.051 (0.987–1.120) |
| Age 45–49 | 0.995 (0.949–1.043) | 1.001 (0.967–1.036) |
| Age 50–59 | 1.000 (ref) | 1.000 (ref) |
| Age 60–68 | 1.001 (0.953–1.052) | 0.991 (0.949–1.035) |
| Constant | – | 0.855 (0.817–0.895) |
Relative impacts of individual characteristics on the estimated effects of having BMI ≥ 30 versus BMI < 30 on HDL (ln mmol/l), according to a regression model with interactions
| Variable | Univariate (95% CI) | Multivariate (95% CI) |
|---|---|---|
| Risky alcohol consumption | 0.911 (0.737–1.125) | 0.906 (0.730–1.124) |
| Smoker | 0.897 (0.825–0.975) | 0.907 (0.832–0.988) |
| Physical activity score (10,000 units, centered) | 1.013 (0.955–1.073) | 1.009 (0.952–1.070) |
| Energy intake (1000 kcal/day, centered) | 0.978 (0.935–1.023) | 0.964 (0.915–1.015) |
| Primary education | 1.000 (ref) | 1.000 (ref) |
| Secondary education | 1.011 (0.888–1.151) | 1.018 (0.893–1.161) |
| Tertiary education | 1.090 (1.009–1.177) | 1.085 (1.003–1.174) |
| Male | 1.025 (0.963–1.091) | 1.057 (0.983–1.135) |
| Age 45–49 | 0.960 (0.881–1.046) | 0.967 (0.884–1.058) |
| Age 50–59 | 1.000 (ref) | 1.000 (ref) |
| Age 60–68 | 1.012 (0.952–1.076) | 0.999 (0.936–1.065) |
| Main effect of obesity | – | 0.857 (0.810–0.906) |
Heterogeneous effects of obesity on HDL levels (ln mmol/l), based on interaction models where all variables previously used in “step 1–3” were entered as main effects and either one or all variables previously used in “step 4” were interacted with obesity. Since univariate models (including only one interaction term) were run separately, there is no common main effect of obesity to be reported. Effects are multiplicative, as the outcome is measured in logarithms and we have exponentiated coefficient estimates
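Because the outcome is ln(HDL), an exponentiated coefficient reads as a ratio of HDL levels, and interaction terms combine by multiplication. The short sketch below works through that arithmetic using two multivariate estimates from the interaction-model table (main effect of obesity 0.857, smoker 0.907); the helper names are hypothetical and the calculation ignores sampling uncertainty.

```python
import math

def relative_effect(beta):
    """Convert a coefficient on ln(HDL) into a multiplicative effect."""
    return math.exp(beta)

def percent_change(ratio):
    """Express a multiplicative effect as a percentage change."""
    return 100.0 * (ratio - 1.0)

main_obesity = 0.857   # exponentiated main effect of obesity (multivariate)
smoker = 0.907         # exponentiated obesity-by-smoker interaction (multivariate)

# Log-scale coefficients add, so exponentiated effects multiply.
beta_smoker = math.log(main_obesity) + math.log(smoker)
effect_in_smokers = relative_effect(beta_smoker)

print(percent_change(main_obesity))       # reference group
print(percent_change(effect_in_smokers))  # smokers, other covariates at reference
```

Read this way, 0.857 corresponds to HDL about 14.3% lower with obesity in the reference group, and 0.857 × 0.907 ≈ 0.777 to about 22% lower among smokers who are otherwise at the reference levels.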
Fig. 4 Estimated (relative) effects of obesity on HDL levels by different levels of physical activity or energy intake. Fitted quadratic curves have been inserted (vertically shifted in order for treatment effects at x = 0 to represent individuals in the reference group)