Sun Yeop Lee, Rockli Kim, Justin Rodgers, S V Subramanian.
Abstract
In a study that estimates the causal effect of a variable, assessing the predictive power of that variable can shed light on the heterogeneity around its average effect. Using data from the Head Start Impact Study, a randomized controlled trial of Head Start, a nationwide early childhood education program in the United States, we compare measures of average effect and of predictive power side by side for Head Start on five cognitive outcomes. One year of Head Start increased scores on all five outcomes, with effect sizes ranging from 0.12 to 0.19 standard deviations. The percent variation explained by Head Start ranged from 0.56% to 1.62%. For binary versions of the outcomes, the overall pattern remained: on average, Head Start improved the outcomes by meaningful magnitudes. In contrast, in a fully adjusted model, Head Start improved the area under the curve (AUC) by less than 1%, and its influence on the variance of predicted probabilities was negligible. The Head-Start-only model achieved AUCs ranging from only 50.22% to 55.24%. Negligible predictive power despite a significant average effect suggests that the heterogeneity in effects may be large, so the average effect estimates may not generalize well to different populations or different Head Start program settings. Assessing the predictive power of a causal variable in randomized data should be routine practice, as it can provide helpful information on the causal effect and especially its heterogeneity.
Year: 2022 PMID: 36124257 PMCID: PMC9482140 DOI: 10.1016/j.ssmph.2022.101223
Source DB: PubMed Journal: SSM Popul Health ISSN: 2352-8273
Sample characteristics at baseline by the treatment and control groups.
| Characteristic | Level | Overall | Control | Head Start | Missing (%) |
|---|---|---|---|---|---|
| N | | 4442 | 1796 | 2646 | |
| Age cohort (%) | 3 | 2449 (55.1) | 985 (54.8) | 1464 (55.3) | 0 |
| | 4 | 1993 (44.9) | 811 (45.2) | 1182 (44.7) | |
| Gender (%) | male | 2239 (50.4) | 912 (50.8) | 1327 (50.2) | 0 |
| Race/ethnicity (%) | White | 1496 (33.7) | 623 (34.7) | 873 (33.0) | 0 |
| | Black | 1348 (30.3) | 536 (29.8) | 812 (30.7) | |
| | Hispanic & others | 1598 (36.0) | 637 (35.5) | 961 (36.3) | |
| Primary language (%) | English | 3301 (74.3) | 1345 (74.9) | 1956 (73.9) | 0 |
| | Spanish | 1141 (25.7) | 451 (25.1) | 690 (26.1) | |
| Parental education (%) | more | 1274 (28.7) | 505 (28.1) | 769 (29.1) | 0 |
| | high school | 1481 (33.3) | 592 (33.0) | 889 (33.6) | |
| | less | 1687 (38.0) | 699 (38.9) | 988 (37.3) | |
| Single parent (%) | | 2239 (50.4) | 907 (50.5) | 1332 (50.3) | 0 |
| Recent immigrant (%) | | 855 (19.2) | 337 (18.8) | 518 (19.6) | 0 |
| Marital status (%) | married | 1972 (44.4) | 806 (44.9) | 1166 (44.1) | 0.1 |
| | separated & divorced & | 724 (16.3) | 290 (16.1) | 434 (16.4) | |
| | never | 1742 (39.2) | 699 (38.9) | 1043 (39.4) | |
| Special needs (%) | | 570 (12.8) | 204 (11.4) | 366 (13.8) | 0 |
| Teen mom (%) | | 752 (16.9) | 330 (18.4) | 422 (15.9) | 0 |
| Urban (%) | | 3746 (84.3) | 1513 (84.2) | 2233 (84.4) | 0 |
| Household risk (%) | low | 3383 (76.2) | 1399 (77.9) | 1984 (75.0) | 0 |
| | moderate | 741 (16.7) | 277 (15.4) | 464 (17.5) | |
| | high | 318 (7.2) | 120 (6.7) | 198 (7.5) | |
| Caregiver's age (mean (SD)) | | 28.91 (7.34) | 28.65 (7.06) | 29.08 (7.52) | 0 |
Measures of average effect and predictive power on continuous outcomes.
| Outcome | Sample size (follow-up rate) | Average effect: Model 1 regression coefficient (95% CI); Cohen's d | Predictive power: Model 1 child-level variance | Predictive power: Model 2 child-level variance | Predictive power: percent variation explained by Head Start (%) |
|---|---|---|---|---|---|
| PPVT | 3621 (82%) | 5.66 (4.05, 7.26); 0.14 | 557.74 | 565.29 | 1.34 |
| Letter-Word Identification | 3627 (82%) | 5.17 (3.78, 6.55); 0.19 | 416.47 | 423.13 | 1.57 |
| Applied Problem | 3601 (81%) | 3.38 (1.93, 4.84); 0.12 | 454.99 | 457.72 | 0.60 |
| Spelling | 3635 (82%) | 3.02 (1.72, 4.31); 0.12 | 365.38 | 367.43 | 0.56 |
| Pre-Academic Skill | 3594 (81%) | 3.82 (2.81, 4.83); 0.17 | 218.65 | 222.24 | 1.62 |
Note. CI, confidence interval; PPVT, Peabody Picture Vocabulary Test.
Model 1: a full model with Head Start.
Model 2: a full model without Head Start.
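The last column of the table is consistent with a proportional reduction in child-level variance when Head Start is added to the model, that is, (Model 2 variance − Model 1 variance) / Model 2 variance × 100. The source does not state this formula explicitly, so the following is a minimal sketch under that assumption. The function name `percent_variation_explained` is hypothetical; the variances are copied from the table.

```python
def percent_variation_explained(var_with: float, var_without: float) -> float:
    """Percent reduction in child-level variance when the treatment is added.

    Assumed formula (inferred from the table, not quoted from the paper):
    100 * (variance without treatment - variance with treatment) / variance without treatment
    """
    return 100.0 * (var_without - var_with) / var_without


# Child-level variances from the table: (Model 1 with Head Start, Model 2 without)
variances = {
    "PPVT": (557.74, 565.29),
    "Letter-Word Identification": (416.47, 423.13),
    "Applied Problem": (454.99, 457.72),
    "Spelling": (365.38, 367.43),
    "Pre-Academic Skill": (218.65, 222.24),
}

results = {
    outcome: percent_variation_explained(v_with, v_without)
    for outcome, (v_with, v_without) in variances.items()
}

for outcome, pct in results.items():
    print(f"{outcome}: {pct:.2f}%")
```

Under this assumption, the computed values round to the percentages reported in the table (1.34, 1.57, 0.60, 0.56, and 1.62).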
Measures of average effect and predictive power on binary outcomes (i.e., “high” scores).
| Outcome and threshold | Average effect: Model 3 OR (95% CI) | Predictive power: Model 3 AUC (%) | Predictive power: Model 4 AUC (%) | Model 4 → Model 3: improvement in AUC attributable to Head Start (percentage points) | Model 4 → Model 3: change in the variance of predicted probability | Predictive power: Model 5 AUC (%) |
|---|---|---|---|---|---|---|
| PPVT | | | | | | |
| >0.25 | 1.66 (1.36, 2.02) | 88.58 | 88.40 | 0.18 | 0.018 | 53.16 |
| >0.50 | 1.42 (1.18, 1.71) | 89.90 | 89.78 | 0.13 | 0.006 | 51.63 |
| >0.75 | 1.16 (0.92, 1.45) | 92.54 | 92.53 | 0.01 | 0.001 | 50.22 |
| Letter-Word Identification | | | | | | |
| >0.25 | 1.62 (1.38, 1.90) | 80.44 | 79.95 | 0.49 | 0.033 | 54.56 |
| >0.50 | 1.77 (1.50, 2.09) | 83.72 | 83.06 | 0.65 | 0.034 | 55.24 |
| >0.75 | 1.47 (1.20, 1.80) | 87.05 | 86.92 | 0.13 | 0.010 | 53.98 |
| Applied Problem | | | | | | |
| >0.25 | 1.34 (1.11, 1.60) | 85.62 | 85.50 | 0.12 | 0.008 | 52.02 |
| >0.50 | 1.28 (1.08, 1.52) | 85.87 | 85.76 | 0.12 | 0.005 | 51.56 |
| >0.75 | 1.07 (0.86, 1.34) | 90.14 | 90.14 | 0.00 | 0.000 | 50.66 |
| Spelling | | | | | | |
| >0.25 | 1.47 (1.24, 1.74) | 84.56 | 84.34 | 0.22 | 0.013 | 53.02 |
| >0.50 | 1.30 (1.09, 1.54) | 85.49 | 85.36 | 0.13 | 0.005 | 52.12 |
| >0.75 | 1.24 (1.01, 1.53) | 88.42 | 88.38 | 0.04 | 0.004 | 52.11 |
| Pre-Academic Skill | | | | | | |
| >0.25 | 1.58 (1.30, 1.90) | 86.42 | 86.21 | 0.22 | 0.017 | 53.22 |
| >0.50 | 1.56 (1.30, 1.87) | 88.21 | 87.98 | 0.23 | 0.011 | 53.23 |
| >0.75 | 1.54 (1.23, 1.92) | 91.33 | 91.20 | 0.14 | 0.009 | 53.46 |
Note. OR, odds ratio. CI, confidence intervals. AUC, area under the curve.
Model 3: a full model with Head Start.
Model 4: a full model without Head Start.
Model 5: a simple model with only Head Start.
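One way to see why a treatment-only model such as Model 5 can have an AUC close to 50% while the odds ratio is well above 1: for a single binary predictor, the ROC curve has only one interior point, so the AUC reduces to 0.5 × (1 + TPR − FPR), where TPR is the share of treated children among those with a "high" score and FPR is the share of treated children among the rest. The sketch below checks this identity against a rank-based AUC. The counts are hypothetical and are not taken from the study, the function names are illustrative, and the paper's actual model specification may differ.

```python
from itertools import product


def auc_rank(scores, labels):
    """AUC as the probability that a random positive case has a higher score
    than a random negative case, counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


def auc_binary_predictor(n11, n10, n01, n00):
    """Closed-form AUC for one binary predictor with a positive effect.

    n11: treated and high score    n10: treated and not high
    n01: control and high score    n00: control and not high
    """
    tpr = n11 / (n11 + n01)  # share treated among high scorers
    fpr = n10 / (n10 + n00)  # share treated among the rest
    return 0.5 * (1 + tpr - fpr)


# Hypothetical 2x2 counts (NOT from the study), keyed by (treated, high score)
counts = {(1, 1): 45, (1, 0): 15, (0, 1): 26, (0, 0): 14}

# Expand the counts into one record per child; the "score" is the treatment indicator
scores, labels = [], []
for (treated, high), n in counts.items():
    scores += [treated] * n
    labels += [high] * n

odds_ratio = (counts[(1, 1)] / counts[(1, 0)]) / (counts[(0, 1)] / counts[(0, 0)])
auc_closed = auc_binary_predictor(45, 15, 26, 14)
auc_ranked = auc_rank(scores, labels)

print(f"odds ratio: {odds_ratio:.2f}")
print(f"AUC (closed form): {auc_closed:.4f}")
print(f"AUC (rank-based):  {auc_ranked:.4f}")
```

With these made-up counts the odds ratio is about 1.6 while the AUC is only about 0.56, which mirrors the pattern in the table: a clear average effect can coexist with weak discrimination between individual children.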
Fig. 1. Comparison of predicted probabilities with and without Head Start. Predicted probabilities of “high” scores from Model 3 (with Head Start; right) and Model 4 (without Head Start; left) are shown. Probabilities are on the x-axis, and densities are on the y-axis.