Peter C Austin, Nathaniel Jembere, Maria Chiu.
Abstract
Researchers are increasingly using complex population-based sample surveys to estimate the effects of treatments, exposures and interventions. In such analyses, statistical methods are essential to minimize the effect of confounding due to measured covariates, as treated subjects frequently differ from control subjects. Methods based on the propensity score are increasingly popular. Little research has been conducted on how to implement propensity score matching when using data from complex sample surveys. We used Monte Carlo simulations to examine two critical issues when implementing propensity score matching with such data. First, we examined how the propensity score model should be formulated. We considered three different formulations depending on whether or not a weighted regression model was used to estimate the propensity score and whether or not the survey weights were included in the propensity score model as an additional covariate. Second, we examined whether matched control subjects should retain their natural survey weight or whether they should inherit the survey weight of the treated subject to which they were matched. Our results were inconclusive with respect to which method of estimating the propensity score model was preferable. In general, greater balance in measured baseline covariates and decreased bias were observed when natural retained weights were used than when inherited weights were used. We also demonstrated that bootstrap-based methods performed well for estimating the variance of treatment effects when outcomes are binary. We illustrated the application of our methods by using the Canadian Community Health Survey to estimate the effect of educational attainment on lifetime prevalence of mood or anxiety disorders.
Keywords: Monte Carlo simulations; Propensity score; propensity score matching; survey
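The difference between retained and inherited weights can be illustrated with a small sketch. It assumes the matched pairs have already been formed (the matching step is not shown), that treated subjects always keep their own survey weight, and that the effect of interest is a weighted risk difference for a binary outcome. The function names, the record layout and the toy numbers are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical record layout: (survey_weight, binary_outcome).
Subject = Tuple[float, int]


def weighted_mean(values, weights):
    """Weighted average of values using the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_risk_difference(pairs: List[Tuple[Subject, Subject]], scheme: str) -> float:
    """Risk difference (treated minus control) in a matched sample.

    pairs  : list of (treated, control) matched pairs
    scheme : "retained"  -> each control keeps its own survey weight
             "inherited" -> each control takes its matched treated subject's weight
    """
    if scheme not in ("retained", "inherited"):
        raise ValueError("scheme must be 'retained' or 'inherited'")

    treated_w = [t[0] for t, _ in pairs]
    treated_y = [t[1] for t, _ in pairs]
    control_y = [c[1] for _, c in pairs]

    if scheme == "retained":
        control_w = [c[0] for _, c in pairs]
    else:
        control_w = treated_w  # controls inherit the treated subjects' weights

    return weighted_mean(treated_y, treated_w) - weighted_mean(control_y, control_w)


# Toy matched sample (made-up numbers, for illustration only).
pairs = [
    ((10.0, 1), (40.0, 0)),
    ((30.0, 0), (20.0, 1)),
    ((20.0, 1), (20.0, 0)),
]
rd_retained = weighted_risk_difference(pairs, "retained")
rd_inherited = weighted_risk_difference(pairs, "inherited")
```

In this toy sample the two schemes give different estimates because the matched controls' own weights differ from those of the treated subjects they were matched to; only the weights applied to the controls change between the two calls.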
Year: 2016 PMID: 27460539 PMCID: PMC5843030 DOI: 10.1177/0962280216658920
Source DB: PubMed Journal: Stat Methods Med Res ISSN: 0962-2802 Impact factor: 3.021
Figure 1. Balance in X1.
Figure 2. Relative bias in estimated difference in means.
Figure 3. Relative bias in estimated risk differences.
Figure 4. MSE of estimated difference in means.
Figure 5. MSE of estimated risk difference.
Figure 6. Ratio of estimated to empirical standard error (difference in means).
Figure 7. Empirical coverage rates of 95% CIs (difference in means).
Figure 8. Ratio of estimated to empirical standard error (risk difference).
Figure 9. Empirical coverage rates of 95% CIs (risk difference).
Figure 10. Results for the DuGoff/Ridgeway scenario.
Balance in baseline covariates and estimated risk differences for the six different approaches to propensity score matching in the CCHS sample.
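| Variable | Propensity score model: unweighted logistic regression; retained weights | Propensity score model: unweighted logistic regression; inherited weights | Propensity score model: weighted logistic regression; retained weights | Propensity score model: weighted logistic regression; inherited weights | Propensity score model: unweighted logistic regression + survey weight as an additional covariate; retained weights | Propensity score model: unweighted logistic regression + survey weight as an additional covariate; inherited weights |
|---|---|---|---|---|---|---|
| Standardized differences | | | | | | |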
| Age 20–29 years | −0.010 | 0.007 | −0.018 | 0.013 | −0.017 | 0.007 |
| Age 30–39 years | 0.026 | 0.001 | 0.020 | 0.001 | 0.015 | 0.009 |
| Age 40–49 years | −0.008 | −0.002 | −0.008 | 0.001 | −0.007 | 0.011 |
| Age ≥50 years | −0.011 | −0.008 | 0.007 | −0.020 | 0.011 | −0.034 |
| Two or more chronic conditions | 0.006 | −0.004 | 0.008 | 0.003 | 0.022 | 0.016 |
| Ate fruits and vegetables <3 times per day | 0.002 | 0.002 | 0.001 | 0.008 | −0.008 | −0.010 |
| Income less than $30,000 | −0.018 | 0.001 | −0.017 | 0.006 | −0.033 | −0.001 |
| Income between $30,000 and $60,000 | −0.002 | −0.005 | −0.009 | −0.008 | −0.009 | −0.015 |
| Income between $60,000 and $80,000 | 0.019 | 0.000 | 0.009 | −0.002 | 0.012 | −0.011 |
| Income greater than $80,000 | −0.001 | 0.004 | 0.016 | 0.004 | 0.027 | 0.026 |
| Male | −0.045 | −0.003 | −0.030 | −0.003 | −0.028 | 0.021 |
| Physically inactive | −0.001 | 0.002 | 0.008 | 0.004 | 0.012 | 0.100 |
| Rural dwelling | 0.001 | 0.000 | 0.003 | −0.007 | 0.012 | −0.125 |
| Poor sense of belonging | −0.010 | 0.003 | −0.011 | 0.011 | −0.007 | 0.001 |
| Current smoking | −0.013 | 0.000 | −0.026 | 0.001 | 0.015 | 0.008 |
| Non-drinker | 0.006 | 0.005 | 0.023 | 0.011 | 0.011 | 0.072 |
| Moderate drinker | −0.023 | −0.004 | −0.042 | −0.009 | −0.036 | −0.057 |
| Heavy drinker | 0.013 | −0.002 | 0.013 | −0.003 | 0.019 | −0.024 |
| Work status (unemployed in past year) | −0.008 | −0.010 | −0.020 | −0.011 | −0.005 | −0.031 |
| Effect of low education on the probability of a prevalent mood or anxiety disorder | | | | | | |
| Difference in probability of prevalent mood or anxiety disorder (95% CI) | −0.005 (−0.017, 0.006) | −0.013 (−0.025, −0.001) | −0.013 (−0.025, 0) | −0.02 (−0.031, −0.008) | −0.012 (−0.026, 0.002) | −0.018 (−0.033, −0.003) |
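For readers unfamiliar with the balance metric in the table, the following is a minimal sketch of a weighted standardized difference for a binary covariate. It uses one common definition (difference in weighted proportions divided by the pooled standard deviation); this record does not spell out the exact formula used in the paper, so the definition, the function names and the toy data are assumptions for illustration only.

```python
import math


def weighted_proportion(x, w):
    """Weighted proportion of a 0/1 covariate x with survey weights w."""
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)


def standardized_difference(x_t, w_t, x_c, w_c):
    """Weighted standardized difference (treated minus control) for a binary covariate.

    Assumed definition: (p_t - p_c) / sqrt((p_t(1 - p_t) + p_c(1 - p_c)) / 2),
    where p_t and p_c are weighted proportions in the matched sample.
    """
    p_t = weighted_proportion(x_t, w_t)
    p_c = weighted_proportion(x_c, w_c)
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    if pooled_sd == 0:
        # No variation in either group: perfectly balanced or infinitely imbalanced.
        return 0.0 if p_t == p_c else math.copysign(math.inf, p_t - p_c)
    return (p_t - p_c) / pooled_sd


# Toy data (made-up): covariate indicators and survey weights per group.
x_treated, w_treated = [1, 0, 1, 0], [3.0, 1.0, 1.0, 3.0]   # weighted proportion 0.5
x_control, w_control = [1, 0, 0, 0], [2.0, 1.0, 1.0, 4.0]   # weighted proportion 0.25
d = standardized_difference(x_treated, w_treated, x_control, w_control)
```

Passing the controls' own weights as `w_c` corresponds to retained weights, while passing the matched treated subjects' weights corresponds to inherited weights, so the same function could be used to compare balance under either scheme.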