Gabriel Estrella, Jacky Au, Susanne M. Jaeggi, Penelope Collins.
Abstract
Despite being among the fastest growing segments of the student population, English Language Learners (ELLs) have yet to attain the same academic success as their English-proficient peers, particularly in science. In an effort to support the pedagogical needs of this group, educators have been urged to adopt inquiry approaches to science instruction. Whereas inquiry instruction has been shown to improve science outcomes for non-ELLs, systematic evidence in support of its effectiveness with ELLs has yet to be established. The current meta-analysis summarizes the effect of inquiry instruction on the science achievement of ELLs in elementary school. Although an analysis of 26 articles confirmed that inquiry instruction produced significantly greater impacts on measures of science achievement for ELLs compared to direct instruction, there was still a differential learning effect suggesting greater efficacy for non-ELLs compared to ELLs. Contextual factors that moderate these effects are identified and discussed.
Keywords: English Language Learner; achievement gap; inquiry instruction; quantitative research synthesis; science education
Year: 2018 PMID: 31218240 PMCID: PMC6583889 DOI: 10.1177/2332858418767402
Source DB: PubMed Journal: AERA Open ISSN: 2332-8584
FIGURE 1. Flow diagram of study selection procedure and selection criteria.
Note. ELL = English language learner.
FIGURE 2. Estimated mean treatment effect size (difference in science achievement between English language learners in treatment and control conditions) for each study with overall mean weighted effect size. Forest plot showing treatment effect sizes with 95% confidence interval and 95% prediction interval. Studies with alphabetic superscripts refer to multiple independent effect sizes generated from the same study.
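As a rough illustration of how a 95% prediction interval differs from a 95% confidence interval, the sketch below applies the commonly used formula g ± t × sqrt(τ² + SE²) to the standard-model treatment values reported in the comparison table later in this article (g = 0.28, SE = 0.07, τ² = 0.07, 23 studies). The formula choice, the function name, and the hard-coded critical value are assumptions made for illustration; the interval plotted in the figure may have been computed differently.

```python
import math

def prediction_interval(g, se, tau2, t_crit):
    """Approximate prediction interval for the true effect in a new study.

    Combines between-study variance (tau2) with the sampling variance of
    the pooled mean (se ** 2).
    """
    half_width = t_crit * math.sqrt(tau2 + se ** 2)
    return g - half_width, g + half_width

# Two-sided 95% critical value of Student's t with k - 2 = 21 degrees of
# freedom (k = 23 studies). Hard-coded to keep the sketch dependency-free.
T_CRIT_DF21 = 2.080

low, high = prediction_interval(g=0.28, se=0.07, tau2=0.07, t_crit=T_CRIT_DF21)
print(f"95% prediction interval: [{low:.2f}, {high:.2f}]")
```

Because τ² enters the width directly, the prediction interval is considerably wider than the confidence interval around the pooled mean whenever heterogeneity is substantial.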
FIGURE 3. Estimated mean inquiry effect size (difference in science achievement between English language learners and non-English language learners in treatment condition) for each study with overall weighted effect size. Forest plot showing inquiry effect sizes with 95% confidence interval and 95% prediction interval. Studies with alphabetic superscripts refer to multiple independent effect sizes generated from the same study. The data used to calculate an overall effect size for Lee et al. (2004–2007) are based on information reported in Lee, Maerten-Rivera, Penfield, Leroy, and Secada (2008); Lee, Mahotiere, Salinas, Penfield, and Maerten-Rivera (2009); and Lee, Penfield, and Maerten-Rivera (2009).
FIGURE 4. Estimated mean traditional effect sizes (difference in science achievement between English language learners and non-English language learner students in control condition) for each study with overall weighted effect size. Forest plot showing traditional effect sizes with 95% confidence interval and 95% prediction interval.
Overall Weighted Mean Treatment Effect Size (ES) for Subgroup Analyses of Categorical Moderators
| Moderator | n | k | Treatment ES | SE | 95% CI lower | 95% CI upper | Q (test of difference) | df |
|---|---|---|---|---|---|---|---|---|
| Publication status | | | | | | | 8.19 | 1 |
| Published | 7,595 | 17 | 0.37 | 0.07 | 0.24 | 0.51 | | |
| Unpublished | 696 | 6 | −0.04 | 0.11 | −0.26 | 0.18 | | |
| Research design | | | | | | | 5.20 | 1 |
| Randomized experiment | 6,161 | 15 | 0.18 | 0.07 | 0.04 | 0.33 | | |
| Quasi-experiment | 2,130 | 8 | 0.46 | 0.10 | 0.26 | 0.65 | | |
| Measurement design | | | | | | | 0.01 | 1 |
| Pretest and posttest | 3,651 | 13 | 0.27 | 0.11 | 0.06 | 0.48 | | |
| Posttest only | 4,154 | 10 | 0.27 | 0.09 | 0.10 | 0.45 | | |
| Assessment format | | | | | | | 2.22 | 2 |
| Multiple choice | 6,248 | 14 | 0.27 | 0.09 | 0.10 | 0.44 | | |
| Constructed response | 460 | 2 | 0.58 | 0.23 | 0.13 | 1.04 | | |
| Mixed | 1,583 | 7 | 0.19 | 0.12 | −0.05 | 0.43 | | |
| Assessment type | | | | | | | 5.08 | 1 |
| Researcher-developed | 2,976 | 13 | 0.39 | 0.08 | 0.23 | 0.55 | | |
| Standardized | 4,829 | 10 | 0.12 | 0.10 | −0.06 | 0.33 | | |
| Professional development | | | | | | | 4.08 | 2 |
| Small dose (14 hours) | 1,175 | 5 | 0.19 | 0.13 | −0.06 | 0.44 | | |
| Large dose (15+ hours) | 6,822 | 16 | 0.27 | 0.07 | 0.14 | 0.40 | | |
| Not reported | 294 | 2 | 0.66 | 0.19 | 0.28 | 1.04 | | |
| Professional development | | | | | | | 8.74 | 2 |
| Focused on English language learners | 7,125 | 15 | 0.32 | 0.06 | 0.19 | 0.44 | | |
| Not focused on English language learners | 872 | 6 | 0.06 | 0.11 | −0.16 | 0.27 | | |
| Not reported | 294 | 2 | 0.67 | 0.17 | 0.30 | 1.03 | | |
| Student grade level | | | | | | | 6.77 | 5 |
| First | 420 | 2 | 0.35† | 0.21 | −0.06 | 0.76 | | |
| Second | 220 | 1 | 0.38 | 0.27 | −0.15 | 0.92 | | |
| Fourth | 601 | 3 | 0.63 | 0.16 | 0.32 | 0.94 | | |
| Fifth | 5,625 | 9 | 0.22 | 0.09 | 0.05 | 0.40 | | |
| Sixth | 1,058 | 5 | 0.24† | 0.12 | −0.01 | 0.48 | | |
| Mixed | 367 | 3 | 0.11 | 0.17 | −0.22 | 0.44 | | |
†p < .10.
p < .05.
p < .01.
p < .001.
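The test-of-difference statistic for a categorical moderator can be approximated from the subgroup estimates alone. The sketch below computes a between-subgroup Q as the inverse-variance-weighted sum of squared deviations from the pooled subgroup mean, using the rounded research design values from the table above (randomized: 0.18, SE 0.07; quasi-experiment: 0.46, SE 0.10). The function name is hypothetical, and the result only approximates the reported 5.20 because the inputs are rounded and the exact model behind the published value is not stated here.

```python
def q_between(estimates, std_errors):
    """Between-subgroup Q statistic from subgroup means and standard errors."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * g for w, g in zip(weights, estimates)) / sum(weights)
    return sum(w * (g - pooled) ** 2 for w, g in zip(weights, estimates))

# Research design subgroups: randomized experiment vs. quasi-experiment.
q_design = q_between([0.18, 0.46], [0.07, 0.10])
df_design = 2 - 1  # number of subgroups minus one
print(f"Q = {q_design:.2f} on {df_design} df")
```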
Overall Weighted Mean Inquiry Effect Size (ES) for Subgroup Analyses of Categorical Moderators
| Moderator | n | k | Inquiry ES | SE | 95% CI lower | 95% CI upper | Q (test of difference) | df |
|---|---|---|---|---|---|---|---|---|
| Publication status | | | | | | | 0.90 | 1 |
| Published | 24,383 | 21 | −0.36 | 0.08 | −0.52 | −0.20 | | |
| Unpublished | 23,776 | 9 | −0.22† | 0.12 | −0.46 | 0.02 | | |
| Research design | | | | | | | 0.41 | 1 |
| Randomized experiment | 43,408 | 23 | −0.34 | 0.08 | −0.49 | −0.19 | | |
| Quasi-experiment | 4,751 | 7 | −0.23 | 0.15 | −0.52 | 0.05 | | |
| Measurement design | | | | | | | 14.52 | 1 |
| Pretest and posttest | 31,590 | 20 | −0.17 | 0.08 | −0.31 | −0.04 | | |
| Posttest only | 16,569 | 10 | −0.66 | 0.11 | −0.87 | −0.45 | | |
| Assessment format | | | | | | | 2.29 | 3 |
| Multiple choice | 40,315 | 18 | −0.29 | 0.08 | −0.46 | −0.13 | | |
| Constructed response | 457 | 3 | −0.67 | 0.29 | −1.23 | −0.11 | | |
| Mixed | 6,670 | 8 | −0.37 | 0.13 | −0.61 | −0.12 | | |
| Other | 717 | 1 | −0.05 | 0.36 | −0.76 | 0.67 | | |
| Assessment type | | | | | | | 5.03 | 1 |
| Researcher-developed | 32,923 | 23 | −0.24 | 0.07 | −0.38 | −0.10 | | |
| Standardized | 15,236 | 7 | −0.56 | 0.12 | −0.79 | −0.32 | | |
| Professional development | | | | | | | 5.79† | 2 |
| Small dose (14 hours) | 22,310 | 11 | −0.12 | 0.11 | −0.34 | 0.10 | | |
| Large dose (15+ hours) | 24,630 | 17 | −0.46 | 0.09 | −0.63 | −0.28 | | |
| Not reported | 1,219 | 2 | −0.16 | 0.26 | −0.66 | 0.34 | | |
| Professional development | | | | | | | 3.83 | 2 |
| Focused on English language learners | 17,842 | 15 | −0.45 | 0.10 | −0.64 | −0.26 | | |
| Not focused on English language learners | 29,815 | 14 | −0.19 | 0.10 | −0.40 | 0.00 | | |
| Not reported | 502 | 1 | −0.26 | 0.35 | −0.95 | 0.42 | | |
| Grade level | | | | | | | 6.68 | 4 |
| Third | 6,299 | 2 | −0.26 | 0.25 | −0.75 | 0.24 | | |
| Fourth | 11,730 | 7 | −0.20 | 0.14 | −0.48 | 0.07 | | |
| Fifth | 21,652 | 8 | −0.54 | 0.12 | −0.78 | −0.30 | | |
| Sixth | 7,271 | 6 | −0.38 | 0.15 | −0.67 | −0.08 | | |
| Mixed | 1,207 | 7 | −0.08 | 0.15 | −0.37 | 0.21 | | |
†p < .10.
p < .05.
p < .01.
p < .001.
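The confidence limits in these subgroup tables can be sanity-checked against the reported effect size and standard error. The following sketch assumes a normal-theory (Wald) interval, which is an assumption about how the limits were produced, and uses a hypothetical helper name. Applied to the fifth-grade row (ES = −0.54, SE = 0.12), it yields limits that round to the tabled −0.78 and −0.30.

```python
from statistics import NormalDist

def wald_ci(estimate, se, level=0.95):
    """Normal-theory confidence interval: estimate +/- z * se."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

# Fifth-grade inquiry effect size and standard error from the table above.
ci_low, ci_high = wald_ci(-0.54, 0.12)
print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
```

Small discrepancies of about 0.01 in other rows are expected, since the tabled effect sizes and standard errors are themselves rounded.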
Overall Weighted Mean Traditional Effect Size (ES) for Subgroup Analyses of Categorical Moderators
| Moderator | n | k | Traditional ES | SE | 95% CI lower | 95% CI upper | Q (test of difference) | df |
|---|---|---|---|---|---|---|---|---|
| Publication status | | | | | | | 2.39 | 1 |
| Published | 11,986 | 10 | −0.58 | 0.14 | −0.85 | −0.31 | | |
| Unpublished | 3,093 | 5 | −0.21 | 0.19 | −0.59 | 0.17 | | |
| Research design | | | | | | | 1.95 | 1 |
| Randomized experiment | 11,676 | 8 | −0.60 | 0.15 | −0.76 | −0.14 | | |
| Quasi-experiment | 3,403 | 7 | −0.29† | 0.17 | −0.61 | 0.04 | | |
| Measurement design | | | | | | | 16.18 | 1 |
| Pretest and posttest | 7,606 | 9 | −0.24 | 0.11 | −0.44 | −0.03 | | |
| Posttest only | 7,473 | 6 | −0.92 | 0.13 | −1.18 | −0.66 | | |
| Assessment format | | | | | | | 0.39 | 2 |
| Multiple choice | 11,135 | 8 | −0.55 | 0.15 | −0.86 | −0.25 | | |
| Constructed response | 310 | 2 | −0.49 | 0.34 | −1.13 | 0.16 | | |
| Mixed | 3,634 | 5 | −0.40 | 0.32 | −0.78 | −0.02 | | |
| Assessment type | | | | | | | 2.92† | 1 |
| Researcher-developed | 5,320 | 9 | −0.36 | 0.13 | −0.61 | −0.11 | | |
| Standardized | 9,759 | 6 | −0.69 | 0.15 | −0.99 | −0.40 | | |
| Grade level | | | | | | | 6.83† | 3 |
| Fourth | 1,591 | 3 | −0.42† | 0.25 | −0.91 | 0.08 | | |
| Fifth | 9,761 | 5 | −0.76 | 0.18 | −1.12 | −0.40 | | |
| Sixth | 3,168 | 4 | −0.45 | 0.21 | −0.87 | −0.04 | | |
| Mixed | 559 | 3 | 0.05 | 0.25 | −0.44 | 0.54 | | |
†p < .10.
p < .05.
p < .01.
p < .001.
Meta-Regression of Continuous Variables on Overall Weighted Mean Effect Sizes (ES)
| Moderator | Treatment ES | Inquiry ES | Traditional ES |
|---|---|---|---|
| Constant | 0.613 | −0.582 | 0.652† |
| Methodological controls | | | |
| Published study | 0.433 | −0.018 | −0.216 |
| Randomized experiment | −0.204 | −0.111 | −0.024 |
| Pretest and posttest design | −0.114 | 0.516 | 0.429 |
| Continuous predictors | | | |
| Student grade level | −0.049 | 0.001 | −0.197 |
| Instruction (weeks) | −0.012 | 0.005 | −0.016 |
| Professional development (hours) | −0.001 | −0.041 | — |
| Number of studies (k) | 21 | 27 | 15 |
| Between-study variance (τ²) | 0.01 | 0.05 | 0.01 |
| Heterogeneity (I², %) | 57 | 92 | 65 |
Note. Random effects models were used in all meta-regression analyses. Random effects variance components were estimated using maximum likelihood. Effect sizes computed as Hedges' g. Reference group for controls = unpublished study, quasi-experiment, posttest-only design.
†p < .10.
p < .05.
p < .001.
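To make the meta-regression coefficients easier to read, the sketch below turns the treatment ES column into a prediction function. The study profile in the example (a published, randomized, pretest-posttest study in fifth grade with 10 weeks of instruction and 20 hours of professional development) is hypothetical, and treating grade level as a plain number is an assumption about how the predictor was coded.

```python
# Treatment ES coefficients copied from the meta-regression table.
TREATMENT_COEFS = {
    "constant": 0.613,
    "published": 0.433,
    "randomized": -0.204,
    "pre_post": -0.114,
    "grade": -0.049,
    "weeks": -0.012,
    "pd_hours": -0.001,
}

def predict_treatment_es(published, randomized, pre_post, grade, weeks, pd_hours):
    """Linear prediction of the treatment effect size for one study profile."""
    profile = {
        "constant": 1,
        "published": int(published),    # reference group: unpublished
        "randomized": int(randomized),  # reference group: quasi-experiment
        "pre_post": int(pre_post),      # reference group: posttest only
        "grade": grade,                 # assumed numeric coding (e.g., 5 = fifth)
        "weeks": weeks,
        "pd_hours": pd_hours,
    }
    return sum(TREATMENT_COEFS[name] * value for name, value in profile.items())

# Hypothetical study profile, for illustration only.
predicted = predict_treatment_es(True, True, True, grade=5, weeks=10, pd_hours=20)
print(f"Predicted treatment ES: {predicted:.3f}")
```

With every predictor set to its reference value or zero, the function returns the constant (0.613), which is how the intercept should be interpreted.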
FIGURE 5. Funnel plot used for evaluating publication bias. Average weighted treatment effect size estimated for each study (horizontal axis) plotted against corresponding standard error (vertical axis).
Study Characteristics and Key Moderators for Studies Included in the Meta-Analysis
| Study by Authors | Publication Status | Research Design | Measurement Design | Grade Level | Treatment Effect Size | Inquiry Effect Size | Traditional Effect Size |
|---|---|---|---|---|---|---|---|
| | Published | Quasi-experimental | Posttest only | Fourth | 0.25 | −0.64 | −0.27 |
| | Published | Quasi-experimental | Posttest only | Sixth | 0.82 | −0.59 | −0.71 |
| | Published | Experimental | Pre-posttest | Fifth | 0.33 | −0.25 | −0.18 |
| | Published | Experimental | Pre-posttest | Sixth | 0.16 | −0.80 | −0.66 |
| | Published | Experimental | Pre-posttest | Fourth, fifth | 0.71 | −0.04 | 0.06 |
| | Published | Experimental | Pre-posttest | Fourth, fifth | — | −0.06 | — |
| | Published | Experimental | Pre-posttest | Fourth, fifth | — | 0.04 | — |
| | Published | Experimental | Posttest only | Fifth | — | −0.22 | — |
| | Published | Experimental | Pre-posttest | Fourth | 0.92 | −0.26 | — |
| | Published | Experimental | Pre-posttest | Fourth | 0.92 | — | — |
| | Published | Experimental | Pre-posttest | Third | — | −0.12 | — |
| | Published | Experimental | Pre-posttest | Fourth | — | −0.02 | — |
| | Published | Experimental | Pre-posttest | Fifth | — | −0.04 | — |
| | Published | Experimental | Pre-posttest | Sixth | — | −0.09 | — |
| | Published | Experimental | Pre-posttest | Fourth | 0.54 | — | −0.29 |
| | Published | Experimental | Pre-posttest | Fourth | — | −0.12 | — |
| | Published | Experimental | Pre-posttest | Fourth | — | −0.23 | — |
| | Published | Experimental | Pre-posttest | Fourth | — | −0.21 | — |
| | Published | Experimental | Posttest only | Fifth, sixth | — | −0.83 | — |
| | Published | Experimental | Posttest only | Fifth | 0.24 | — | — |
| Lee et al. (2004–2007) | Published | Experimental | Pre-posttest | Fifth, sixth | — | −0.38 | — |
| | Published | Experimental | Pre-posttest | Fifth | 0.16 | −0.12 | −0.34 |
| | Unpublished | Quasi-experimental | Posttest only | First | 0.12 | — | — |
| | Published | Experimental | Posttest only | Fifth | 0.05 | −1.14 | −1.12 |
| | Published | Experimental | Posttest only | Fifth | 0.16 | −0.92 | −1.00 |
| | Published | Experimental | Posttest only | Fifth | 0.32 | −0.93 | −1.14 |
| | Unpublished | Experimental | Pre-posttest | Third, fifth | −0.07 | −0.13 | 0.20 |
| | Unpublished | Experimental | Pre-posttest | Third, fifth | −0.29 | −0.06 | −0.11 |
| | Unpublished | Experimental | Pre-posttest | Sixth | 0.11 | −0.34 | −0.17 |
| | Unpublished | Experimental | Pre-posttest | Sixth | −0.35 | −0.23 | 0.05 |
| | Unpublished | Experimental | Posttest only | Sixth | 0.38 | −0.95 | −1.06 |
| | Published | Experimental | Pre-posttest | Fourth | — | −0.04 | — |
| | Published | Experimental | Posttest only | Fifth | — | −0.05 | — |
| | Published | Experimental | Pre-posttest | Third, sixth | — | 0.07 | — |
| | Published | Experimental | Posttest only | Fifth | 0.72 | — | — |
| | Published | Quasi-experimental | Pre-posttest | Fifth | 0.11 | — | — |
| | Published | Quasi-experimental | Pre-posttest | Fifth | 0.36 | — | — |
| | Published | Quasi-experimental | Pre-posttest | First | 0.49 | — | — |
| | Published | Quasi-experimental | Pre-posttest | Second | 0.38 | — | — |
Note. Studies with alphabetic superscripts refer to multiple independent effect sizes generated from the same study. The data used to calculate an overall effect size for Lee et al. (2004–2007) are based on information reported in Lee, Maerten-Rivera, Penfield, Leroy, and Secada (2008); Lee, Mahotiere, Salinas, Penfield, and Maerten-Rivera (2009); and Lee, Penfield, and Maerten-Rivera (2009).
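The effect sizes listed for each study are reported as Hedges' g. As a minimal sketch of how such a value is typically obtained from group summary statistics, the function below computes the standardized mean difference, applies the usual small-sample correction factor J = 1 − 3/(4df − 1), and returns g with its standard error. The function name and the input numbers are hypothetical and are not taken from any study in the table.

```python
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Hedges' g and its standard error from two-group summary statistics."""
    df = n_t + n_c - 2
    pooled_sd = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / pooled_sd      # Cohen's d
    j = 1 - 3 / (4 * df - 1)               # small-sample correction factor
    var_d = (n_t + n_c) / (n_t * n_c) + d ** 2 / (2 * (n_t + n_c))
    return j * d, math.sqrt(j ** 2 * var_d)

# Hypothetical example: treatment mean 12, control mean 10, SD 4, 30 per group.
g_example, se_example = hedges_g(12.0, 10.0, 4.0, 4.0, 30, 30)
print(f"g = {g_example:.3f}, SE = {se_example:.3f}")
```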
Comparison of Mean Effect Sizes (ES) From Standard and Robust Variance Estimation Meta-Analyses
| Statistic | Treatment ES (Standard) | Treatment ES (RVE) | Inquiry ES (Standard) | Inquiry ES (RVE) | Traditional ES (Standard) | Traditional ES (RVE) |
|---|---|---|---|---|---|---|
| Hedges' g | 0.28 | 0.31 | −0.31 | −0.31 | −0.46 | −0.46 |
| Standard error | 0.07 | 0.08 | 0.07 | 0.07 | 0.12 | 0.12 |
| 95% CI low estimate | 0.15 | 0.13 | −0.45 | −0.44 | −0.70 | −0.71 |
| 95% CI high estimate | 0.41 | 0.48 | −0.18 | −0.18 | −0.22 | −0.20 |
| Degrees of freedom (df) | — | 20.9 | — | 28.2 | — | 14 |
| Heterogeneity (I², %) | 82 | 85 | 92 | 93 | 95 | 96 |
| Between-study variance (τ²) | 0.07 | 0.10 | 0.12 | 0.13 | 0.21 | 0.25 |
| Number of studies (k) | 23 | 45 | 30 | 59 | 15 | 20 |
| Correlation (ρ) | — | 0.80 | — | 0.80 | — | 0.80 |
Note. To achieve independence, standard meta-analyses were conducted using synthetic effect sizes, whereas RVE meta-analyses used correlated effects models with small-sample bias corrections. RVE = robust variance estimation.
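A synthetic effect size collapses several dependent effect sizes from one study into a single value. The sketch below shows one common way to do this: take the simple mean of the effect sizes and compute its variance under an assumed common correlation between outcomes. Using ρ = 0.80 here mirrors the value listed for the RVE models and is an assumption; the correlation applied when the authors built their synthetic effect sizes is not stated in this note, and the inputs are hypothetical.

```python
import math

def synthetic_effect(effects, variances, rho=0.80):
    """Mean of dependent effect sizes and the variance of that mean.

    Var(mean) = (1 / m^2) * (sum of variances
                             + sum over i != j of rho * sqrt(v_i * v_j))
    """
    m = len(effects)
    mean_es = sum(effects) / m
    total = sum(variances)
    for i in range(m):
        for j in range(m):
            if i != j:
                total += rho * math.sqrt(variances[i] * variances[j])
    return mean_es, total / m ** 2

# Hypothetical study reporting two correlated science outcomes.
syn_es, syn_var = synthetic_effect([0.30, 0.50], [0.04, 0.04])
print(f"synthetic ES = {syn_es:.2f}, variance = {syn_var:.3f}")
```

With a high assumed correlation, averaging reduces the variance only slightly (0.036 versus 0.04 for each outcome alone), which is why treating dependent effect sizes as independent would overstate precision.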
Examining the Effectiveness of Inquiry Instruction for English Language Learner Students by Overall Study Quality Ratings
| Quality Rating | k | Treatment ES | SE | 95% CI lower | 95% CI upper | p | I² (%) | Q (test of difference) | df | p (test of difference) |
|---|---|---|---|---|---|---|---|---|---|---|
| Individual studies | | | | | | | | 3.11 | 2 | .21 |
| Strong | 6 | 0.20 | 0.11 | −0.02 | 0.42 | 0.074 | 21 | | | |
| Moderate | 11 | 0.23 | 0.10 | 0.04 | 0.42 | 0.020 | 76 | | | |
| Weak | 6 | 0.46 | 0.12 | 0.23 | 0.69 | 0.000 | 91 | | | |
| Combined studies | | | | | | | | 3.27 | 1 | .07 |
| High quality | 17 | 0.22 | 0.07 | 0.07 | 0.35 | 0.003 | 66 | | | |
| Low quality | 6 | 0.46 | 0.12 | 0.23 | 0.69 | 0.000 | 91 | | | |
Note. Appraisal of study quality measured using the Quality Assessment Tool for Quantitative Studies. High-quality group composed of studies with overall quality ratings of strong and moderate. Low-quality group composed of studies with overall quality ratings of weak.
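The p values attached to the two tests of difference follow from the chi-square distribution with the stated degrees of freedom. The sketch below uses closed-form upper-tail probabilities that hold for 1 and 2 degrees of freedom only, so it needs nothing beyond the standard library; the function name is hypothetical. Applied to the tabled statistics (3.11 on 2 df and 3.27 on 1 df), it gives values that round to the reported .21 and .07.

```python
import math

def chi2_upper_tail(q, df):
    """Upper-tail chi-square probability for df = 1 or df = 2."""
    if df == 1:
        # A chi-square(1) variable is a squared standard normal.
        return math.erfc(math.sqrt(q / 2))
    if df == 2:
        # A chi-square(2) variable is exponential with mean 2.
        return math.exp(-q / 2)
    raise ValueError("closed form implemented only for df = 1 or df = 2")

p_individual = chi2_upper_tail(3.11, 2)  # strong vs. moderate vs. weak
p_combined = chi2_upper_tail(3.27, 1)    # high vs. low quality
print(f"p = {p_individual:.2f}, p = {p_combined:.2f}")
```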