Abstract
The propensity score is a subject's probability of treatment, conditional on observed baseline covariates. Conditional on the true propensity score, treated and untreated subjects have similar distributions of observed baseline covariates. Propensity-score matching is a popular method of using the propensity score in the medical literature. Using this approach, matched sets of treated and untreated subjects with similar values of the propensity score are formed. Inferences about treatment effect made using propensity-score matching are valid only if, in the matched sample, treated and untreated subjects have similar distributions of measured baseline covariates. In this paper we discuss the following methods for assessing whether the propensity-score model has been correctly specified: comparing means and prevalences of baseline characteristics using standardized differences; ratios comparing the variance of continuous covariates between treated and untreated subjects; comparison of higher-order moments and interactions; five-number summaries; and graphical methods such as quantile-quantile plots, side-by-side boxplots, and non-parametric density plots for comparing the distribution of baseline covariates between treatment groups. We describe methods to determine the sampling distribution of the standardized difference when the true standardized difference is equal to zero, thereby allowing one to determine the range of standardized differences that are consistent with a correctly specified propensity-score model. We highlight the limitations of some previously used methods for assessing the adequacy of the specification of the propensity-score model. In particular, methods based on comparing the distribution of the estimated propensity score between treated and untreated subjects are uninformative.
Year: 2009 PMID: 19757444 PMCID: PMC3472075 DOI: 10.1002/sim.3697
Source DB: PubMed Journal: Stat Med ISSN: 0277-6715 Impact factor: 2.373
Comparison of baseline characteristics between treated and untreated subjects in the original unmatched sample
| Variable | Statin: no | Statin: yes | P-value | Standardized difference |
|---|---|---|---|---|
| Age | 68.1±13.8 | 63.4±12.4 | <0.001 | 0.355 |
| Female | 2241 (37.0 Per cent) | 887 (29.1 Per cent) | <0.001 | 0.167 |
| Acute CHF/pulmonary edema | 316 (5.2 Per cent) | 122 (4.0 Per cent) | 0.010 | 0.057 |
| Cardiogenic shock | 46 (0.8 Per cent) | 12 (0.4 Per cent) | 0.038 | 0.046 |
| Diabetes | 1561 (25.8 Per cent) | 774 (25.4 Per cent) | 0.684 | 0.009 |
| Current smoker | 2004 (33.1 Per cent) | 1070 (35.1 Per cent) | 0.057 | 0.042 |
| Hyperlipidemia | 1138 (18.8 Per cent) | 1761 (57.8 Per cent) | <0.001 | 0.910 |
| Hypertension | 2681 (44.3 Per cent) | 1453 (47.7 Per cent) | 0.002 | 0.068 |
| Family history of CAD | 1762 (29.1 Per cent) | 1177 (38.6 Per cent) | <0.001 | 0.204 |
| Cerebrovascular disease/TIA | 610 (10.1 Per cent) | 237 (7.8 Per cent) | <0.001 | 0.079 |
| Angina | 1869 (30.9 Per cent) | 1086 (35.6 Per cent) | <0.001 | 0.102 |
| Cancer | 191 (3.2 Per cent) | 73 (2.4 Per cent) | 0.041 | 0.045 |
| Chronic CHF | 275 (4.5 Per cent) | 91 (3.0 Per cent) | <0.001 | 0.079 |
| Renal disease | 34 (0.6 Per cent) | 13 (0.4 Per cent) | 0.396 | 0.019 |
| Systolic blood pressure | 148.7±31.6 | 149.3±30.1 | 0.338 | 0.021 |
| Diastolic blood pressure | 83.6±18.6 | 84.5±18.0 | 0.033 | 0.047 |
| Heart rate | 84.6±24.3 | 81.7±23.0 | <0.001 | 0.121 |
| Respiratory rate | 21.2±5.7 | 20.3±4.8 | <0.001 | 0.167 |
| Glucose | 9.4±5.1 | 9.2±5.3 | 0.092 | 0.037 |
| White blood count | 10.3±4.9 | 10.0±4.4 | 0.003 | 0.065 |
| Hemoglobin | 137.5±19.3 | 140.6±16.9 | <0.001 | 0.167 |
| Sodium | 138.9±3.9 | 139.2±3.3 | <0.001 | 0.079 |
| Potassium | 4.1±0.6 | 4.1±0.5 | 0.006 | 0.061 |
| Creatinine | 105.7±65.5 | 99.9±50.0 | <0.001 | 0.096 |
Notes: Continuous variables are reported as mean ± standard deviation. Dichotomous variables are reported as N (Per cent).
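The standardized differences in the table can be approximated from the reported summary statistics. The sketch below assumes the conventional definitions: for a continuous covariate, the difference in means divided by the square root of the average of the two group variances; for a dichotomous covariate, the difference in prevalences divided by the square root of the average of the two binomial variances. The function names are illustrative and do not come from the paper, and the inputs are the rounded values for Age and Female from the table above.

```python
import math


def std_diff_continuous(mean_t, sd_t, mean_c, sd_c):
    """Absolute standardized difference for a continuous covariate.

    Assumes the pooled SD is sqrt((sd_t^2 + sd_c^2) / 2).
    """
    pooled_sd = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
    return abs(mean_t - mean_c) / pooled_sd


def std_diff_binary(p_t, p_c):
    """Absolute standardized difference for a dichotomous covariate.

    Assumes the pooled SD is sqrt((p_t(1 - p_t) + p_c(1 - p_c)) / 2).
    """
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    return abs(p_t - p_c) / pooled_sd


# Rounded summary statistics from the unmatched-sample table
# (treated = statin: yes, control = statin: no).
age = std_diff_continuous(mean_t=63.4, sd_t=12.4, mean_c=68.1, sd_c=13.8)
female = std_diff_binary(p_t=0.291, p_c=0.370)

print(f"Age: {age:.3f}, Female: {female:.3f}")
```

Because the inputs are rounded to one decimal place, the results agree with the tabulated values (0.355 and 0.167) only approximately.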
Comparison of baseline characteristics between treated and untreated subjects in the propensity-score matched sample
| Variable | Statin: no | Statin: yes | Standardized difference |
|---|---|---|---|
| Age | 63.4±13.2 | 63.5±12.6 | 0.011 |
| Female | 717 (29.5 Per cent) | 736 (30.3 Per cent) | 0.017 |
| Acute CHF/pulmonary edema | 94 (3.9 Per cent) | 93 (3.8 Per cent) | 0.002 |
| Cardiogenic shock | 15 (0.6 Per cent) | 12 (0.5 Per cent) | 0.017 |
| Diabetes | 609 (25.1 Per cent) | 606 (24.9 Per cent) | 0.003 |
| Current smoker | 883 (36.3 Per cent) | 872 (35.9 Per cent) | 0.009 |
| Hyperlipidemia | 1135 (46.7 Per cent) | 1142 (47.0 Per cent) | 0.006 |
| Hypertension | 1109 (45.6 Per cent) | 1118 (46.0 Per cent) | 0.007 |
| Family history of CAD | 898 (37.0 Per cent) | 912 (37.5 Per cent) | 0.012 |
| Cerebrovascular disease/TIA | 185 (7.6 Per cent) | 187 (7.7 Per cent) | 0.003 |
| Angina | 830 (34.2 Per cent) | 810 (33.3 Per cent) | 0.017 |
| Cancer | 66 (2.7 Per cent) | 61 (2.5 Per cent) | 0.013 |
| Chronic CHF | 78 (3.2 Per cent) | 70 (2.9 Per cent) | 0.019 |
| Renal disease | 11 (0.5 Per cent) | 10 (0.4 Per cent) | 0.006 |
| Systolic blood pressure | 149.1±30.2 | 149.3±30.2 | 0.007 |
| Diastolic blood pressure | 84.3±18.1 | 84.6±18.4 | 0.015 |
| Heart rate | 81.3±22.5 | 81.6±22.8 | 0.014 |
| Respiratory rate | 20.3±4.8 | 20.3±4.7 | 0.012 |
| Glucose | 9.2±5.1 | 9.3±5.5 | 0.017 |
| White blood count | 10.2±4.1 | 10.1±4.7 | 0.006 |
| Hemoglobin | 140.7±18.2 | 140.5±17.0 | 0.007 |
| Sodium | 139.2±3.7 | 139.1±3.3 | 0.025 |
| Potassium | 4.1±0.5 | 4.1±0.5 | 0.030 |
| Creatinine | 100.5±59.0 | 100.5±52.4 | 0.000 |
Notes: Continuous variables are reported as mean ± standard deviation. Dichotomous variables are reported as N (Per cent).
Figure 1. Absolute standardized differences for baseline covariates comparing treated to untreated subjects in the original and the matched sample.
Figure 2. Relationship between sample size and the standard deviation of the empirical sampling distribution of the standardized difference.
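The relationship summarized in Figure 2 can be illustrated with a small simulation. The sketch below is not the paper's procedure: it assumes two independent groups of equal size drawn from the same standard normal distribution, so the true standardized difference is zero, and it ignores the dependence that matching induces between treated and untreated subjects. The function names, number of replicates, and seed are arbitrary choices made for illustration.

```python
import math
import random


def standardized_difference(x_t, x_c):
    """Signed standardized difference between two samples."""
    mean_t = sum(x_t) / len(x_t)
    mean_c = sum(x_c) / len(x_c)
    var_t = sum((x - mean_t) ** 2 for x in x_t) / (len(x_t) - 1)
    var_c = sum((x - mean_c) ** 2 for x in x_c) / (len(x_c) - 1)
    return (mean_t - mean_c) / math.sqrt((var_t + var_c) / 2)


def null_sd(n_per_group, n_reps=1000, seed=0):
    """SD of the empirical sampling distribution when the true difference is zero."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_reps):
        # Both groups come from the same distribution, so any observed
        # difference is due to sampling variation alone.
        x_t = [rng.gauss(0, 1) for _ in range(n_per_group)]
        x_c = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diffs.append(standardized_difference(x_t, x_c))
    mean_d = sum(diffs) / n_reps
    return math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n_reps - 1))


sd_by_n = {n: null_sd(n) for n in (50, 200)}
for n, sd in sd_by_n.items():
    # Large-sample approximation for independent groups: sqrt(2 / n).
    print(f"n per group = {n}: empirical SD = {sd:.3f}, sqrt(2/n) = {math.sqrt(2 / n):.3f}")
```

Under these assumptions the empirical standard deviation shrinks as the group size grows, roughly in line with the approximation sqrt(2/n), which is why a given standardized difference is less surprising in a small matched sample than in a large one.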
Results of simulations examining balance in matched sample when propensity score model is mis-specified compared to when it is properly specified
| Statistic | Scenario A: propensity-score model incorrectly specified | Scenario A: propensity-score model correctly specified | Scenario B: propensity-score model incorrectly specified | Scenario B: propensity-score model correctly specified | Scenario C: propensity-score model incorrectly specified | Scenario C: propensity-score model correctly specified |
|---|---|---|---|---|---|---|
| | 0.01 | 0.05 | 0.04 | 0.04 | 0.02 | 0.03 |
| | 1.14 | 1.04 | 0.65 | 1.02 | 0.77 | 1.02 |
| | 0.22 | 0.07 | 0.04 | 0.04 | 0.04 | 0.04 |
| | 1.43 | 1.12 | 0.65 | 1.02 | NA | NA |
| | NA | NA | 0.45 | 0.05 | 0.23 | 0.03 |
| | NA | NA | 0.46 | 1.08 | 0.89 | 1.05 |
| Difference in | 0.08 | 0.04 | 0.35 | 0.02 | 0.11 | 0.02 |
Notes: Variance ratios are the mean ratio of the variance of a variable in treated subjects to the variance of the variable in untreated subjects.
Variances of continuous covariates in treated and untreated subjects in unmatched and matched samples
| Variable | Unmatched sample: variance (untreated subjects) | Unmatched sample: variance (treated subjects) | Unmatched sample: ratio of treated to untreated variances | Matched sample: variance (untreated subjects) | Matched sample: variance (treated subjects) | Matched sample: ratio of treated to untreated variances |
|---|---|---|---|---|---|---|
| Age | 191.71 | 153.54 | 0.80 | 174.84 | 158.08 | 0.90 |
| Heart rate | 590.84 | 527.08 | 0.89 | 508.03 | 521.28 | 1.03 |
| Systolic BP | 996.68 | 904.15 | 0.91 | 914.95 | 909.29 | 0.99 |
| Diastolic BP | 346.7 | 323.95 | 0.93 | 327.17 | 337.04 | 1.03 |
| Respiratory rate | 32.9 | 22.81 | 0.69 | 23.32 | 22.09 | 0.95 |
| WBC | 23.75 | 19.51 | 0.82 | 17.2 | 21.64 | 1.26 |
| Hemoglobin | 374.32 | 286.34 | 0.76 | 331.62 | 288.36 | 0.87 |
| Sodium | 15.37 | 10.82 | 0.70 | 13.96 | 10.75 | 0.77 |
| Glucose | 26.1 | 28.05 | 1.08 | 26.04 | 30.77 | 1.18 |
| Potassium | 0.32 | 0.26 | 0.82 | 0.28 | 0.26 | 0.94 |
| Creatinine | 4283.81 | 2503.02 | 0.58 | 3477.95 | 2745.08 | 0.79 |
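The ratios in the table are the variance in treated subjects divided by the variance in untreated subjects, so values near 1 indicate similar spread in the two groups. The sketch below recomputes a few ratios from the tabulated variances and also shows how the same quantity could be obtained from raw data. The helper name `variance_ratio` and the dictionary layout are illustrative only.

```python
import statistics


def variance_ratio(treated, untreated):
    """Ratio of the sample variance in treated to untreated subjects (raw data)."""
    return statistics.variance(treated) / statistics.variance(untreated)


# (untreated variance, treated variance), copied from the table above.
table = {
    "Age": {"unmatched": (191.71, 153.54), "matched": (174.84, 158.08)},
    "Creatinine": {"unmatched": (4283.81, 2503.02), "matched": (3477.95, 2745.08)},
}

ratios = {
    variable: {
        sample: round(var_treated / var_untreated, 2)
        for sample, (var_untreated, var_treated) in by_sample.items()
    }
    for variable, by_sample in table.items()
}

for variable, by_sample in ratios.items():
    print(variable, by_sample)
```

For both variables the ratio moves closer to 1 after matching, although creatinine remains noticeably less variable in treated subjects.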
Five-number summaries of continuous variables comparing treated and untreated subjects in both unmatched and matched samples
| Variable | Untreated: minimum | Untreated: 25th percentile | Untreated: median | Untreated: 75th percentile | Untreated: maximum | Treated: minimum | Treated: 25th percentile | Treated: median | Treated: 75th percentile | Treated: maximum |
|---|---|---|---|---|---|---|---|---|---|---|
| Unmatched sample | | | | | | | | | | |
| Age | 21 | 58 | 70 | 79 | 100 | 22 | 54 | 64 | 73 | 92 |
| Heart rate | 0 | 68 | 81 | 98 | 260 | 4 | 66 | 78 | 93 | 260 |
| Systolic BP | 0 | 128 | 148 | 170 | 266 | 0 | 130 | 149 | 169 | 270 |
| Diastolic BP | 0 | 70 | 83 | 96 | 199 | 0 | 72 | 84 | 96 | 182 |
| Respiratory rate | 5 | 18 | 20 | 23 | 60 | 5 | 18 | 20 | 20 | 58 |
| WBC | 0.3 | 7.6 | 9.5 | 12.1 | 98 | 0.76 | 7.6 | 9.4 | 11.5 | 90.1 |
| Hemoglobin | 39 | 127 | 139 | 151 | 199 | 58 | 131 | 142 | 152 | 199 |
| Sodium | 85 | 137 | 139 | 141 | 191 | 114 | 137 | 140 | 141 | 149 |
| Glucose | 1.2 | 6.4 | 7.8 | 10.7 | 89 | 1.4 | 6.3 | 7.6 | 10.2 | 92 |
| Potassium | 2 | 3.7 | 4.1 | 4.4 | 9.2 | 2.3 | 3.7 | 4 | 4.3 | 7 |
| Creatinine | 30 | 78 | 92 | 112 | 1151 | 20 | 77 | 91 | 108 | 926 |
| Matched sample | | | | | | | | | | |
| Age | 21 | 53 | 64 | 74 | 96 | 22 | 54 | 64 | 74 | 92 |
| Heart rate | 0 | 66 | 78 | 94 | 230 | 12 | 66 | 78 | 93 | 210 |
| Systolic BP | 56 | 130 | 148 | 170 | 251 | 0 | 130 | 148 | 169 | 270 |
| Diastolic BP | 8 | 72 | 84 | 96 | 189 | 0 | 72 | 84 | 97 | 182 |
| Respiratory rate | 5 | 18 | 20 | 22 | 56 | 5 | 18 | 20 | 20 | 58 |
| WBC | 0.3 | 7.6 | 9.5 | 11.8 | 69.5 | 0.76 | 7.7 | 9.4 | 11.6 | 90.1 |
| Hemoglobin | 39 | 131 | 143 | 153 | 195 | 58 | 131 | 142 | 152 | 199 |
| Sodium | 85 | 138 | 140 | 141 | 191 | 114 | 137 | 140 | 141 | 149 |
| Glucose | 1.2 | 6.3 | 7.7 | 10.4 | 89 | 1.4 | 6.3 | 7.6 | 10.2 | 92 |
| Potassium | 2.4 | 3.7 | 4 | 4.3 | 7.3 | 2.3 | 3.7 | 4 | 4.3 | 7 |
| Creatinine | 30 | 77 | 90 | 106 | 1151 | 20 | 77 | 91 | 107 | 926 |
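A five-number summary of the kind tabulated above can be computed per treatment group with the Python standard library. This is a minimal sketch: the percentile definition used in the paper is not stated here, so the "inclusive" interpolation method is an assumption, and the ages below are hypothetical toy values rather than study data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, 25th percentile, median, 75th percentile, maximum)."""
    # quantiles(..., n=4) returns the three quartile cut points; the
    # "inclusive" method treats the data as containing its own extremes.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Hypothetical ages for two small groups (not taken from the study).
untreated_ages = [52, 58, 61, 64, 70, 73, 79]
treated_ages = [50, 57, 62, 64, 69, 74, 80]

for label, ages in (("untreated", untreated_ages), ("treated", treated_ages)):
    print(label, five_number_summary(ages))
```

Comparing the two tuples element by element mirrors reading across a row of the table: similar quartiles in the two groups suggest that the matched distributions agree beyond their means.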
Figure 3. Side-by-side boxplots and Q–Q plots for age.
Figure 4. Density plots and cumulative distribution functions for age.
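The coordinates behind plots like those in Figures 3 and 4 can be derived without a plotting library. The sketch below is an illustration under assumptions, not the code used to produce the figures: `qq_points` pairs matching quantiles of two samples (points near the diagonal indicate similar distributions), and `ecdf` returns the steps of an empirical cumulative distribution function. Both function names and the toy data are hypothetical.

```python
import statistics


def qq_points(sample_a, sample_b, n_quantiles=99):
    """Pair the same quantiles of two samples for a Q-Q plot."""
    cuts_a = statistics.quantiles(sample_a, n=n_quantiles + 1, method="inclusive")
    cuts_b = statistics.quantiles(sample_b, n=n_quantiles + 1, method="inclusive")
    return list(zip(cuts_a, cuts_b))


def ecdf(sample):
    """Return (value, cumulative proportion) pairs for a step plot.

    Tied values produce repeated x coordinates; the last pair for a given
    value carries the correct cumulative proportion.
    """
    ordered = sorted(sample)
    n = len(ordered)
    return [(x, (i + 1) / n) for i, x in enumerate(ordered)]


# Hypothetical ages; the second group is shifted up by one year.
untreated = [1, 2, 3, 4, 5]
treated = [2, 3, 4, 5, 6]

print(qq_points(untreated, treated, n_quantiles=3))
print(ecdf(treated))
```

Each Q-Q pair in the toy example sits one unit off the diagonal, which is how a uniform location shift between groups would appear in such a plot.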
Figure 5. Distribution of estimated propensity score in treated and untreated subjects in different matched samples.