Natalie Bau, Jishnu Das, Andres Yi Chang.
Abstract
Using a unique longitudinal dataset collected from primary school students in Pakistan, we document four new facts about learning in low-income countries. First, children's test scores increase by 1.19 SD between Grades 3 and 6. Second, going to school is associated with greater learning. Children who drop out have the same test score gains prior to dropping out as those who do not, but experience no improvements after dropping out. Third, there is significant variation in test score gains across students, but test scores converge over the primary schooling years. Students with initially low test scores gain more than those with initially high scores, even after accounting for mean reversion. Fourth, conditional on past test scores, household characteristics explain little of the variation in learning. In order to reconcile our findings with the literature, we introduce the concept of "fragile learning," where progression may be followed by stagnation or reversals. We discuss the implications of these results for several ongoing debates in the literature on education from Low- and Middle-Income Countries (LMICs).
Keywords: Dropouts; Learning profiles; Primary schools; Teaching at right level; Test scores
Year: 2021 PMID: 34239224 PMCID: PMC8246518 DOI: 10.1016/j.ijedudev.2021.102430
Source DB: PubMed Journal: Int J Educ Dev ISSN: 0738-0593
Fig. A1. Vertical equating by subject.
Notes: This figure shows the results of a vertical equating exercise by subject. First, item parameters are estimated using year 1 data only. Then, the item parameters are held fixed and used to re-estimate new θ’s for children from their patterns of responses to common items in year 4. The solid line in each graph is the item characteristic/response curve, which represents the expected pattern of responses for each θ. The actual patterns of responses against θ for 40 quantiles are then plotted against it. If the expected and actual patterns of responses match, this implies that children are moving along a fixed item characteristic curve and that the curve itself is not shifting across years. For 9/24 English questions, the Pearson’s Chi2 test of differences between the observed and expected frequencies of answering correctly is significant when dividing the sample into 1000 quantiles by subject theta, for a total sample of 10,067 or about 10 students per quantile.
For 6/28 Math questions, the Pearson’s Chi2 test of differences between the observed and expected frequencies of answering correctly is significant when dividing the sample into 1000 quantiles by subject theta, for a total sample of 10,067 or about 10 students per quantile.
For 4/28 Urdu questions, the Pearson’s Chi2 test of differences between the observed and expected frequencies of answering correctly is significant when dividing the sample into 1000 quantiles by subject theta, for a total sample of 10,067 or about 10 students per quantile.
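The comparison described in these notes can be sketched in a few lines. This is an illustration only, not the authors' code: it assumes a two-parameter logistic (2PL) item response function, which the notes do not specify, and the names `icc_2pl`, `observed_vs_expected`, `a` (discrimination) and `b` (difficulty) are hypothetical. The data are synthetic and are generated from the same fixed item parameters, so the observed and expected shares should line up, which is the pattern the figure looks for.

```python
import math
import random


def icc_2pl(theta, a, b):
    """Expected probability of a correct answer under an assumed 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def observed_vs_expected(thetas, responses, a, b, n_bins=40):
    """Sort children by theta, split them into n_bins quantile bins, and
    return (mean theta, observed share correct, expected share) per bin."""
    order = sorted(range(len(thetas)), key=lambda i: thetas[i])
    rows = []
    for k in range(n_bins):
        idx = order[k * len(order) // n_bins:(k + 1) * len(order) // n_bins]
        if not idx:
            continue
        mean_theta = sum(thetas[i] for i in idx) / len(idx)
        observed = sum(responses[i] for i in idx) / len(idx)
        expected = sum(icc_2pl(thetas[i], a, b) for i in idx) / len(idx)
        rows.append((mean_theta, observed, expected))
    return rows


# Synthetic illustration only (not LEAPS data).
rng = random.Random(0)
a_fixed, b_fixed = 1.2, 0.3  # hypothetical item parameters held fixed from "year 1"
thetas = [rng.gauss(0.5, 1.0) for _ in range(10_000)]
responses = [1 if rng.random() < icc_2pl(t, a_fixed, b_fixed) else 0 for t in thetas]

rows = observed_vs_expected(thetas, responses, a_fixed, b_fixed)
max_gap = max(abs(obs - exp) for _, obs, exp in rows)  # small when the curve is stable
```

If the item curve had shifted between years, the observed shares would drift away from the expected ones in a systematic way, and a per-item test of observed against expected frequencies would tend to reject.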
Comparing IRT test scores with model using restricted questions and fixed year 1 parameters.
| Variable | (1) IRT 4yrs: N | (1) IRT 4yrs: Mean [SE] | (2) IRT 4yrs Restricted Qs: N | (2) IRT 4yrs Restricted Qs: Mean [SE] | (3) IRT 4yrs All Qs Fixed Yr 1 & Varying Params: N | (3) IRT 4yrs All Qs Fixed Yr 1 & Varying Params: Mean [SE] | t-test Difference (1)-(2) | t-test Difference (1)-(3) |
|---|---|---|---|---|---|---|---|---|
| Combined Theta Year 1 | 12,109 | −0.550 [0.009] | 12,109 | −0.535 [0.009] | 12,109 | −0.108 [0.010] | −0.015 | −0.441*** |
| Combined Theta Year 2 | 12,806 | −0.339 [0.009] | 12,806 | −0.341 [0.009] | 12,806 | 0.115 [0.010] | 0.003 | −0.453*** |
| Combined Theta Year 3 | 12,123 | 0.235 [0.008] | 12,123 | 0.231 [0.008] | 12,123 | 0.735 [0.009] | 0.004 | −0.500*** |
| Combined Theta Year 4 | 10,067 | 0.567 [0.010] | 10,067 | 0.554 [0.010] | 10,067 | 1.091 [0.011] | 0.013 | −0.524*** |
| Combined Learning (2006−03) | 7355 | 1.129 [0.010] | 7355 | 1.102 [0.011] | 7355 | 1.213 [0.011] | 0.027* | −0.084*** |
| English Theta Year 1 | 12,109 | −0.528 [0.011] | 12,109 | −0.517 [0.011] | 12,109 | −0.102 [0.012] | −0.011 | −0.426*** |
| English Theta Year 2 | 12,806 | −0.318 [0.010] | 12,806 | −0.323 [0.010] | 12,806 | 0.118 [0.010] | 0.005 | −0.436*** |
| English Theta Year 3 | 12,123 | 0.201 [0.009] | 12,123 | 0.198 [0.009] | 12,123 | 0.658 [0.009] | 0.003 | −0.457*** |
| English Theta Year 4 | 10,067 | 0.542 [0.011] | 10,067 | 0.534 [0.011] | 10,067 | 1.016 [0.011] | 0.008 | −0.473*** |
| Math Theta Year 1 | 12,109 | −0.502 [0.009] | 12,109 | −0.478 [0.010] | 12,109 | −0.087 [0.011] | −0.024* | −0.414*** |
| Math Theta Year 2 | 12,806 | −0.335 [0.010] | 12,806 | −0.339 [0.010] | 12,806 | 0.090 [0.011] | 0.004 | −0.426*** |
| Math Theta Year 3 | 12,123 | 0.268 [0.009] | 12,123 | 0.261 [0.009] | 12,123 | 0.761 [0.010] | 0.007 | −0.493*** |
| Math Theta Year 4 | 10,067 | 0.526 [0.011] | 10,067 | 0.505 [0.011] | 10,067 | 1.045 [0.012] | 0.021 | −0.519*** |
| Urdu Theta Year 1 | 12,109 | −0.619 [0.011] | 12,109 | −0.611 [0.011] | 12,109 | −0.135 [0.012] | −0.009 | −0.484*** |
| Urdu Theta Year 2 | 12,806 | −0.362 [0.011] | 12,806 | −0.362 [0.011] | 12,806 | 0.137 [0.012] | −0.001 | −0.499*** |
| Urdu Theta Year 3 | 12,123 | 0.237 [0.009] | 12,123 | 0.234 [0.009] | 12,123 | 0.788 [0.010] | 0.003 | −0.551*** |
| Urdu Theta Year 4 | 10,067 | 0.632 [0.010] | 10,067 | 0.623 [0.010] | 10,067 | 1.213 [0.011] | 0.009 | −0.581*** |
Notes: This table shows the differences between the IRT-estimated test score levels and gains used throughout the paper (column 1) and two alternatives: those recomputed after dropping the 19 questions where vertical equating seems to fail (column 2), and those recomputed using all questions, with year 1 parameters held fixed for all items except the 19 where vertical equating seems to fail, whose parameters are allowed to vary (column 3). There is no appreciable difference in test scores or gains between (1) and (2). Column (3) shows significant differences from (1) in yearly test score levels but very similar yearly gains, with only slightly larger 4-year learning. The values displayed for the t-tests are the differences in means across the groups. Standard errors are robust. ***, **, and * indicate significance at the 1, 5, and 10 percent critical levels.
Sample of children by number of years observed, child and household characteristics and mean learning.
| N Years Observed | N Child-Year Obs. | % Obs. | N Unique Children | Female Proportion | Age (2003) | Avg Days Absent (last 30 days) | % Fathers w/ Primary Edu. or Less | % Mothers w/ Primary Edu. or Less | HH Assets PCA | Avg. Annual Learning |
|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 24,152 | 51.27 | 6038 | 0.48 | 9.58 | 1.82 | 44.24 | 76.13 | 0.11 | 0.39 |
| 3 | 14,280 | 30.32 | 4760 | 0.44 | 9.69 | 1.97 | 51.25 | 79.07 | −0.01 | 0.40 |
| 2 | 6088 | 12.92 | 3044 | 0.38 | 9.90 | 2.10 | 48.50 | 78.32 | −0.09 | 0.37 |
| 1 | 2585 | 5.49 | 2585 | 0.38 | 9.83 | 2.42 | 50.75 | 78.25 | −0.48 | – |
Notes: This table uses the full unbalanced sample. The “number of years observed” categories are exclusive. Thus, children observed for 1 year are not counted again in other categories. Age in 2003 is estimated for those not observed in that year. Average annual learning is defined as the mean of learning between every year a child is observed. If there are 2- or 3-year gaps, then learning is divided by the number of years in the gap. Father and mother education groups (used to construct % of fathers and mothers with primary education or less) and household assets are not available for every child, as these data were only collected for a subsample of children that have test scores. The household assets PCA is the average of all years observed, ignoring missing data. The household assets PCA index is very highly correlated (corr = .96) with an index constructed using IRT on the same household assets (see Appendix Fig. A5 for details on how these two measures compare). Fathers’ education, mothers’ education, and household asset information is from the school survey and was cleaned to make it stable across years (see Appendix Table A8 for details on how these variables were cleaned).
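The definition of average annual learning in these notes can be made concrete with a short sketch. It reflects one reading of the notes: each gain between consecutive observed years is divided by the length of the gap, and the annualized gains are then averaged. The function name and the input layout (a dictionary from survey year to test score) are assumptions made for illustration.

```python
def average_annual_learning(scores_by_year):
    """Mean annualized gain across consecutive observed years for one child.

    scores_by_year maps a survey year (e.g. 2003) to the child's test score.
    Returns None for children observed only once, matching the dash in the table.
    """
    years = sorted(scores_by_year)
    if len(years) < 2:
        return None
    annual_gains = []
    for prev, curr in zip(years, years[1:]):
        gap = curr - prev  # 1 for adjacent rounds, 2 or 3 when rounds were missed
        annual_gains.append((scores_by_year[curr] - scores_by_year[prev]) / gap)
    return sum(annual_gains) / len(annual_gains)


# Hypothetical child observed in 2003, 2004 and 2006 (missing in 2005).
child = {2003: -0.5, 2004: -0.2, 2006: 0.6}
avg_gain = average_annual_learning(child)  # gains of 0.3 and 0.8 / 2 = 0.4
```

For this hypothetical child the two annualized gains are 0.3 and 0.4, so the average annual learning is about 0.35.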
Proportion of correct answers by subject for anchoring items across grades 3-6.
| Item | Round 1 Grade 3 2003 | Round 2 Grade 4 2004 | Round 3 Grade 5 2005 | Round 4 Grade 6 2006 |
|---|---|---|---|---|
| N | 12,109 | 12,806 | 12,123 | 10,067 |
| English | | | | |
| Eng 6: Listen to word, write word (boy) | 0.39 | 0.52 | 0.65 | 0.74 |
| Eng 7: Listen to word, write word (girl) | 0.20 | 0.24 | 0.32 | 0.46 |
| Eng 8: Alphabet order, fill in blank letter (e) | 0.70 | 0.78 | 0.88 | 0.90 |
| Eng 9: Alphabet order, fill in blank letter (m) | 0.59 | 0.67 | 0.79 | 0.82 |
| Eng 10: Alphabet order, fill in blank letter (s,t) | 0.50 | 0.58 | 0.69 | 0.71 |
| Eng 11: Alphabet order, fill in blank letter (n) | 0.32 | 0.41 | 0.54 | 0.60 |
| Eng 12: Match picture with word (banana) | 0.61 | 0.71 | 0.82 | 0.85 |
| Eng 13: Match picture with word (book) | 0.70 | 0.80 | 0.89 | 0.93 |
| Eng 16: Fill missing letter for picture (ball) | 0.45 | 0.49 | 0.64 | 0.71 |
| Eng 18: Fill missing letter for picture (cat) | 0.67 | 0.71 | 0.80 | 0.83 |
| Eng 19: Fill missing letter for picture (flag) | 0.28 | 0.28 | 0.46 | 0.53 |
| Eng 20: Fill in blank letters of word w/ picture (elephant) | 0.17 | 0.17 | 0.25 | 0.34 |
| Eng 22: Fill in blank letters of word w/ picture (fruit) | 0.09 | 0.07 | 0.10 | 0.11 |
| Eng 27: Check antonym of word (rough) | 0.29 | 0.34 | 0.41 | 0.49 |
| Eng 29: Fill missing word in sentence (his) | 0.30 | 0.34 | 0.51 | 0.61 |
| Eng 30: Fill missing word in sentence (show) | 0.27 | 0.32 | 0.43 | 0.51 |
| Eng 40: Construct sentence with word (school) | 0.11 | 0.15 | 0.29 | 0.44 |
| Eng 41: Construct sentence with word (doctor) | 0.07 | 0.09 | 0.21 | 0.37 |
| Eng 43: Construct sentence with word (deep) | 0.01 | 0.01 | 0.03 | 0.10 |
| Eng 44: Construct sentence with word (play) | 0.02 | 0.03 | 0.10 | 0.20 |
| Eng 45: Read passage and answer questions | 0.27 | 0.35 | 0.52 | 0.67 |
| Eng 46: Read passage and answer questions | 0.21 | 0.30 | 0.40 | 0.53 |
| Eng 48: Read passage and answer questions | 0.17 | 0.24 | 0.39 | 0.51 |
| Eng 50: Read passage and answer questions | 0.10 | 0.14 | 0.18 | 0.21 |
| Math | | | | |
| Math 1: Count and write number (8) | 0.60 | 0.65 | 0.78 | 0.73 |
| Math 2: Count and check number (2) | 0.46 | 0.51 | 0.69 | 0.78 |
| Math 9: Add, subtract (3 + 4) | 0.89 | 0.90 | 0.94 | 0.93 |
| Math 11: Add, subtract (9 + 9 + 9) | 0.74 | 0.79 | 0.86 | 0.86 |
| Math 12: Multiply (4 × 5) | 0.58 | 0.60 | 0.73 | 0.79 |
| Math 13: Fill in blank multiply (2x_ = 20) | 0.38 | 0.42 | 0.52 | 0.61 |
| Math 15: Write word from number (113) | 0.26 | 0.27 | 0.47 | 0.55 |
| Math 16: Write number for word (18) | 0.51 | 0.62 | 0.79 | 0.84 |
| Math 18: Read and write time (12-h clock showing 3:40) | 0.24 | 0.28 | 0.47 | 0.53 |
| Math 19: Word problem, find information and use | 0.39 | 0.47 | 0.66 | 0.75 |
| Math 20: Word problem, find information and use | 0.35 | 0.44 | 0.59 | 0.67 |
| Math 22: Word problem, find information and use | 0.47 | 0.58 | 0.74 | 0.79 |
| Math 23: Word problem, find information and use | 0.09 | 0.12 | 0.20 | 0.29 |
| Math 24: Add and subtract advanced (36 + 61) | 0.84 | 0.86 | 0.91 | 0.92 |
| Math 25: Add and subtract advanced (678 + 923) | 0.54 | 0.56 | 0.69 | 0.72 |
| Math 26: Add and subtract advanced (5.9 + 4.3) | 0.20 | 0.35 | 0.55 | 0.58 |
| Math 27: Add and subtract advanced (98−55) | 0.69 | 0.73 | 0.81 | 0.84 |
| Math 28: Add and subtract advanced (238−129) | 0.32 | 0.38 | 0.48 | 0.51 |
| Math 30: Multiply and divide (32 × 4) | 0.50 | 0.53 | 0.68 | 0.73 |
| Math 31: Multiply and divide (417 × 27) | 0.13 | 0.15 | 0.30 | 0.36 |
| Math 32: Multiply and divide (384/6) | 0.19 | 0.23 | 0.43 | 0.51 |
| Math 33: Multiply and divide (352/20) | 0.01 | 0.02 | 0.16 | 0.23 |
| Math 34: Cost of necklace, simple algebra | 0.10 | 0.14 | 0.24 | 0.27 |
| Math 37: Add and subtract fractions (1/2 + 3/2) | 0.18 | 0.07 | 0.05 | 0.11 |
| Math 38: Add and subtract fractions (7/5−3/4) | 0.01 | 0.01 | 0.03 | 0.09 |
| Math 39: Convert fractions and percentages (7/3) | 0.02 | 0.04 | 0.06 | 0.13 |
| Math 40: LCM (needed for adding with different denominator) | 0.01 | 0.12 | 0.14 | 0.26 |
| Math 42: Read scale and compare numbers | 0.12 | 0.16 | 0.29 | 0.42 |
| Urdu | | | | |
| Urdu 1: Alphabet order, fill in blank letter (Cheeh) | 0.57 | 0.61 | 0.68 | 0.70 |
| Urdu 2: Alphabet order, fill in blank letter (Meem) | 0.75 | 0.81 | 0.88 | 0.88 |
| Urdu 3: Match picture with word (Kitaab) | 0.71 | 0.78 | 0.90 | 0.93 |
| Urdu 4: Match picture with word (Kaila) | 0.71 | 0.78 | 0.89 | 0.93 |
| Urdu 5: Match picture with word (Ghar) | 0.52 | 0.57 | 0.67 | 0.74 |
| Urdu 6: Dejoin letters of word into indiv letters (Mashraq) | 0.46 | 0.56 | 0.69 | 0.75 |
| Urdu 7: Dejoin letters of word into indiv letters (Sooraj) | 0.56 | 0.65 | 0.77 | 0.81 |
| Urdu 9: Dejoin letters of word into indiv letters (Abdul Majeed) | 0.19 | 0.24 | 0.31 | 0.45 |
| Urdu 10: Combine letters into joined word (Kaam) | 0.72 | 0.76 | 0.85 | 0.88 |
| Urdu 12: Combine letters into joined word (Maalik) | 0.36 | 0.41 | 0.52 | 0.59 |
| Urdu 13: Combine letters into joined word (Maheena) | 0.09 | 0.14 | 0.24 | 0.35 |
| Urdu 16: Check correct word to fill in sentence (Gehri) | 0.41 | 0.49 | 0.67 | 0.76 |
| Urdu 17: Check correct word to fill in sentence (Saaf) | 0.53 | 0.65 | 0.82 | 0.87 |
| Urdu 19: Antonyms (Bara) | 0.42 | 0.47 | 0.65 | 0.77 |
| Urdu 20: Antonyms (Geila) | 0.35 | 0.45 | 0.60 | 0.67 |
| Urdu 22: Antonyms (Buzdil) | 0.22 | 0.27 | 0.35 | 0.45 |
| Urdu 23: Antonyms (Shikushat) | 0.20 | 0.24 | 0.42 | 0.54 |
| Urdu 24: Antonyms (Mukhtasir) | 0.24 | 0.28 | 0.40 | 0.48 |
| Urdu 26: Write plurals of singular words (Aadat) | 0.12 | 0.18 | 0.26 | 0.33 |
| Urdu 28: Write plurals of singular words (Haraf) | 0.13 | 0.23 | 0.37 | 0.48 |
| Urdu 29: Write plurals of singular words (Sajar) | 0.03 | 0.03 | 0.07 | 0.18 |
| Urdu 30: Write plurals of singular words (Shaer) | 0.01 | 0.02 | 0.03 | 0.06 |
| Urdu 32: Construct a sentence with a given word (Karigar) | 0.15 | 0.20 | 0.33 | 0.45 |
| Urdu 34: Construct a sentence with a given word (Ghosila) | 0.23 | 0.26 | 0.41 | 0.51 |
| Urdu 36: Complete passage for grammar (Key) | 0.28 | 0.35 | 0.53 | 0.65 |
| Urdu 37: Complete passage for grammar (Chuka) | 0.30 | 0.37 | 0.55 | 0.65 |
| Urdu 43: Read passage and answer questions | 0.21 | 0.32 | 0.56 | 0.66 |
| Urdu 45: Read passage and answer questions | 0.08 | 0.16 | 0.30 | 0.47 |
Notes: This table uses the full unbalanced sample and shows the proportion of correct answers for each item by subject and in each year (columns). Only anchoring items asked every year are included in the table. Questions left unanswered are marked as wrong and counted in the proportion. Note that while each year roughly corresponds to a primary grade, the sample tracks children who were observed in previous years even when they are not in their expected grade (e.g. children held back, double-promoted, etc.).
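The scoring rule in these notes, where unanswered questions stay in the denominator and count as wrong, can be written out as a small sketch. The function name and the use of `None` to represent a blank answer are assumptions made for illustration.

```python
def proportion_correct(responses):
    """Share of correct answers for one anchoring item in one round.

    Each entry is 1 (correct), 0 (incorrect), or None (asked but left blank).
    Blank answers remain in the denominator, so they count as wrong.
    """
    if not responses:
        return None
    return sum(1 for r in responses if r == 1) / len(responses)


# Hypothetical responses from five children to a single item.
example = [1, 0, None, 1, None]
share = proportion_correct(example)  # 2 correct out of 5 children asked
```

Dropping the blank answers instead would give 2 out of 3 for the same hypothetical item, so the rule matters most for harder items that many children skip.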
Fig. A5. Scatter plot and linear fit of household assets PCA and IRT indices.
Correlates variables definitions.
| Variable | Variable Name | Definition |
|---|---|---|
| Asset Index (School-level data) | sc_pca_4years | Predicted first Principal Component index scaled to have mean 0 across all years. It uses assets collected from a random sample of children at each school. The PCA uses assets data from 2003−06. |
| Assets Index (HH-level data) | hh_pca_4years | Predicted first Principal Component index scaled to have mean 0 across all years. It uses assets collected at the household. The PCA uses assets data from 2003−06. |
| Mother and Father Education Groups (School-level data) | sc_mother_educ, sc_father_educ | Mean of available parental education groups in 2003−06 at the school level. Missing values are ignored when estimating the mean, and results are rounded to the closest unit. Parental groups follow this definition: 1 = No Education; 2 = Less than Primary (less than Grade 5, did not pass Grade 5 exams); 3 = Greater than Primary to Higher Secondary (greater than or equal to Grade 5 and less than or equal to Grade 12); 4 = Higher Secondary or higher (greater than Grade 12). |
| Mother and Father Education Groups (HH-level data) | hh_mother_educ, hh_father_educ | First, years of parental education is defined as the average across all observed years (ignoring missing values) of the highest grade of formal schooling completed by each parent. When above 12, the following assumptions are made: (i) BA/BSC/B.Ed = 15; and (ii) MA/MSC/M.Ed/MBA = 17. Then, parental education groups are constructed from years of parental education using the following definition: 1 = No Education; 2 = Less than Primary (less than Grade 5, did not pass Grade 5 exams); 3 = Greater than Primary to Higher Secondary (greater than or equal to Grade 5 and less than or equal to Grade 12); 4 = Higher Secondary or higher (greater than Grade 12). |
| Test Items | eng_item*, math_item*, urdu_item* | Each variable assumes that unanswered questions that were asked are marked as wrong. Only typos (i.e. values other than 0 for incorrect and 1 for correct) are set to missing if the question was asked. Variables take the value of missing if the question is NOT asked in a given year. |
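The household-level construction of the parental education groups can be illustrated with a short sketch. It follows the definitions in the table above, but the function names are hypothetical, and the handling of fractional averages (for example, treating any average strictly between 0 and 5 years as "Less than Primary") is an assumption, since the table does not say how non-integer averages are classified.

```python
# Years assigned to degrees above Grade 12, as stated in the table.
DEGREE_YEARS = {"BA": 15, "BSC": 15, "B.Ed": 15,
                "MA": 17, "MSC": 17, "M.Ed": 17, "MBA": 17}


def mean_years(yearly_values):
    """Average years of schooling across survey rounds, ignoring missing rounds."""
    observed = [v for v in yearly_values if v is not None]
    return sum(observed) / len(observed) if observed else None


def education_group(years_of_schooling):
    """Map years of formal schooling to the four parental education groups."""
    if years_of_schooling is None:
        return None  # no information in any round
    if years_of_schooling <= 0:
        return 1     # No Education
    if years_of_schooling < 5:
        return 2     # Less than Primary
    if years_of_schooling <= 12:
        return 3     # Grade 5 through Grade 12
    return 4         # greater than Grade 12


# Hypothetical parent reported in three of four rounds.
group = education_group(mean_years([None, 4, 6, 5]))  # mean of 5 years -> group 3
```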
Learning between Grades 3-6, top vs. bottom 25 % in PK, YL countries and FL.
| Country | Ages (t0 - t1) | Mean Score Difference t1 – t0 (4 Years Learning) | | | 75th Percentile at t0 | 25th Percentile at t1 | Percentage in Correct Grade at t1 |
|---|---|---|---|---|---|---|---|
| | 9.7−12.8 | 1.08 | 1.29 | 1.19 | 0.13 | 0.07 | 81% in Grade 6 |
| | 9.2−12.2 | 1.04 | 0.99 | 0.99 | 0.01 | −0.01 | 83% in Grade 6 |
| | 8.1−12.1 | 0.88 | 1.10 | 0.99 | 0.70 | 0.29 | 38% in Grades 4−6 |
| | 8.0−12.0 | 0.98 | 1.17 | 1.08 | 0.04 | 0.57 | 54% in Grades 5−7 |
| | 8.0−11.9 | 1.12 | 1.42 | 1.27 | 0.72 | 1.08 | 32% in Grades 5−7 |
| | 8.1−12.2 | 1.11 | 1.27 | 1.19 | 1.00 | 1.41 | 70% in Grades 5−7 |
Sources: LEAPS, micro-data from the Young Lives (YL) Surveys provided by Abhijeet Singh, and analytical results using Florida administrative data facilitated by David Figlio.
Notes: This table shows the mean test score gains between t1 and t0 by subject and the 75th and 25th percentiles at t0 and t1 respectively for a range of countries/territories where panel data with equated test scores are available. For Pakistan and Florida, t0 = 2003 and t1 = 2006, for YL countries, t0 = 2009 and t1 = 2013. Language refers to receptive vocabulary for YL countries, reading for Florida, and Urdu for Pakistan. For YL countries, combined refers to the mean of Math and Language average scores as the sample of tested children did not always complete both subjects. For Pakistan and Florida, combined refers to the average score across Math and Urdu/reading, respectively. Pakistan and Florida are panels observed first at the school in Grade 3, while YL numbers come from household surveys where children are first tracked at age 5 and then followed at age 8 and 12. Children tested at home in Pakistan are excluded for comparability purposes with Florida. YL uses EAP IRT theta estimates standardized with respect to age 5 test scores. Attrition is low in all countries.
Sample of children by number of years observed, child and household characteristics and mean learning (Household sample).
| N Years Observed | N Child-Year Obs. | % Obs. | N Unique Children | Female Proportion | Age (2003) | Avg Days Absent (last 30 days) | % Fathers w Primary Edu. or Less | % Mothers w Primary Edu. or Less | HH Assets PCA | Avg Annual Learning |
|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 2556 | 71.62 | 639 | 0.45 | 9.6 | 1.8 | 48.3 | 77.25 | 0.04 | 0.38 |
| 3 | 741 | 20.76 | 247 | 0.45 | 9.7 | 2.0 | 46.9 | 74.23 | −0.03 | 0.34 |
| 2 | 214 | 6.00 | 107 | 0.47 | 10.0 | 1.8 | 44.1 | 73.53 | −0.17 | 0.28 |
| 1 | 58 | 1.63 | 58 | 0.34 | 9.8 | 1.7 | 59.4 | 90.63 | −0.49 | – |
Notes: This table uses the full household sample. The number of years categories are exclusive, so children observed for 1 year are not counted again in other categories. Age in 2003 is estimated for those not present in that year. Average annual learning is defined as the mean of learning between every year present. If there are 2- or 3-year gaps, then learning is divided by the number of years. Father and mother education groups (used to construct % of fathers and mothers with primary education or less) and household assets are not available for every child, as these data were only collected for a subsample of children that have test scores. The household assets PCA is the average of all years observed, ignoring missing data. The household assets PCA index is very highly correlated (corr = .96) with an index constructed using IRT on the same household assets (see Appendix Fig. A5 for details on how these two measures compare). The information on fathers, mothers, and household assets is taken from the school survey to make it comparable with Table 2 and was cleaned to make it stable across years (see Appendix Table A8 for details on how these variables were cleaned).
Fig. 1. Learning trajectories for 2005 dropouts, non-dropouts and difference – combined test scores.
Notes: Panel A shows test scores in every round for two groups of students in the full unbalanced school panel. The red solid line shows students who were enrolled in every year, while the dotted blue line shows students who eventually dropped out in the transition from primary to middle school. The last score for the dropout group reflects their scores when they were tested at home after having been out of school for one year. 95% confidence intervals are displayed for each year-group combination. The percentage of dropouts in 2006 is 11.89%. Panel B shows the difference in test scores between the two groups for each year and its corresponding 95% confidence interval. Test scores refer to the mean across Urdu, English and Mathematics.
Fig. A2. Learning trajectories for 2006 dropouts and non-dropouts – combined test scores (Household sample).
Notes: Panel A shows test scores in every round for two groups of students in the household panel. The red line shows students who were enrolled in every year, while the blue line shows students who eventually dropped out in the transition from primary to middle school. The last score for the dropout group reflects their scores when they were tested at home after having been out of school for one year. 95% confidence intervals are displayed for each year-group combination. The percentage of dropouts in 2006 is 19.84%. Panel B shows the difference in test scores between the two groups for each year and its corresponding 95% confidence interval. Test scores refer to the mean across Urdu, English and Mathematics.
Test scores gains over the years, (imperfect) learning persistence, and dropouts.
| Dep. Var: Mean Test Scores | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| 2004 Indicator | 0.21*** | 0.22*** | | |
| | (0.035) | (0.036) | | |
| 2005 Indicator | 0.79*** | 0.79*** | 0.35*** | 0.35*** |
| | (0.028) | (0.028) | (0.0096) | (0.010) |
| 2006 Indicator | 1.18*** | 1.17*** | 0.36*** | 0.35*** |
| | (0.042) | (0.041) | (0.011) | (0.011) |
| Dropout Indicator 2005−06 | −0.45*** | | −0.35*** | |
| | (0.055) | | (0.031) | |
| Dropout Group | | −0.095* | | −0.057 |
| | | (0.044) | | (0.027) |
| 2004 # Dropout Group | | −0.016 | | |
| | | (0.039) | | |
| 2005 # Dropout Group | | −0.040 | | 0.0055 |
| | | (0.044) | | (0.036) |
| 2006 # Dropout Group | | −0.36*** | | −0.29*** |
| | | (0.057) | | (0.042) |
| Test Score at (t-1) | | | 0.71*** | 0.71*** |
| | | | (0.0059) | (0.0060) |
| Constant | −0.78*** | −0.78*** | 0.018 | 0.022 |
| | (0.056) | (0.056) | (0.0098) | (0.0099) |
| District Fixed-Effects | Yes | Yes | Yes | Yes |
| Observations | 47,099 | 47,099 | 28,898 | 28,898 |
| Adjusted R-squared | 0.208 | 0.208 | 0.612 | 0.613 |
Notes: This table uses the full unbalanced school sample and is the regression analog of Fig. 1 (although controlling for district fixed effects, so estimates differ slightly). It shows the regression results of test scores in year t on year indicators and dropout indicators. Test scores refer to the mean across Urdu, English and Mathematics. The four specifications estimated differ in how persistence in learning and dropouts are treated. “Dropout Indicator 05−06” is an indicator variable equal to 1 if the child dropped out between years 2005 and 2006 and t is 2006, and equal to 0 otherwise, including for all other years. “Dropout Group” is a time-invariant indicator variable equal to 1 for children who dropped out between 2005 and 2006. Columns 1 and 2 present the association between dropping out and test score levels. Columns 3 and 4 re-run these regressions but allow the test score levels in time t to depend on test scores in t-1 using the value-added specification. Standard errors clustered at the village level are in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001.
Fig. 2. Four-year learning gains/losses by learning deciles.
Notes: This figure plots test score gains from 2003 to 2006 by deciles of test score gains. Test score gains are defined as the difference in test scores between Grades 3 and 6. 95% confidence intervals are also shown for each point but are very small. The bars show the test score in 2003 by decile, with higher baseline test scores for those experiencing learning losses (i.e. decile 1). The red dashed line represents the overall mean test score gain. Test scores refer to the mean across Urdu, English and Mathematics.
Fig. 3. Convergence: Learning trajectories by percentile group from initial combined test scores.
Notes: This figure shows learning trajectories by groups of baseline levels of test score performance during Grades 3–6 using the unbalanced full sample but restricting the graph to those who were observed in Grade 3 (2003). The graph shows averaged test scores across the three subjects tested (Appendix Fig. A3 shows the patterns for the 3 different subjects) for children at different test score levels in 2003. That is, we have divided the children based on their baseline test scores in 2003 into six groups, as explained in the legend. Each line represents a group’s mean test scores over the rounds of testing.
Fig. A3. Convergence: Learning trajectories by percentile group from initial test scores by subject.
Notes: This figure shows learning trajectories by groups of baseline levels of test score performance during Grades 3–6 using the unbalanced full sample but restricting the graph to those who were observed in Grade 3 (2003). The graph shows the patterns for the 3 different subjects (i.e. Math, Urdu, and English) for children at different test score levels in 2003. That is, we have divided the children based on their baseline test scores in 2003 into six groups, as explained in the legend for each subject. Each line represents a group’s mean test scores over the rounds of testing.
Test scores over time and learning by quintile.
| Quintiles by Test Score 2003 | Stat | Test Score 2003 | Test Score 2004 | Test Score 2005 | Test Score 2006 | Learning (2006−04) | Learning (2006−03) |
|---|---|---|---|---|---|---|---|
| Quintile 1 | Mean | −2.01 | −1.34 | −0.64 | −0.26 | 1.10 | 1.75 |
| | N | 1471 | 1314 | 1249 | 1471 | 1314 | 1471 |
| Quintile 2 | Mean | −0.86 | −0.55 | 0.00 | 0.35 | 0.92 | 1.22 |
| | N | 1471 | 1347 | 1275 | 1471 | 1347 | 1471 |
| Quintile 3 | Mean | −0.38 | −0.14 | 0.37 | 0.67 | 0.83 | 1.05 |
| | N | 1471 | 1353 | 1312 | 1471 | 1353 | 1471 |
| Quintile 4 | Mean | 0.04 | 0.19 | 0.67 | 0.95 | 0.77 | 0.91 |
| | N | 1471 | 1383 | 1332 | 1471 | 1383 | 1471 |
| Quintile 5 | Mean | 0.62 | 0.58 | 1.09 | 1.33 | 0.77 | 0.71 |
| | N | 1471 | 1358 | 1336 | 1471 | 1358 | 1471 |
| All | Mean | −0.52 | −0.24 | 0.31 | 0.61 | 0.88 | 1.13 |
| | N | 7355 | 6755 | 6504 | 7355 | 6755 | 7355 |
Notes: This table uses the full unbalanced sample but is restricted to children observed in 2003, since new children in years 2004−06 cannot be classified in quintiles by 2003 test scores. Test scores refers to the mean across Urdu, English and Mathematics. Quintiles by test scores in 2003 are estimated only for those observed in 2006 as their test score in 2006 is needed to estimate their learning. For each quintile, the gains between 2004−06 and 2003−06 are shown. The table shows that measurement error alone does not explain why children who are initially low performers report higher test score gains in our data.
Learning convergence: IV correction for misassignment and measurement error.
| Dep. Var.: Mean Test Score Gains 2004−06 | (1) | (2) | (3) |
|---|---|---|---|
| Test Score Quintiles 2004 = 1 | 1.32*** | | |
| | (0.067) | | |
| Test Score Quintiles 2004 = 2 | 0.92*** | −0.39 | |
| | (0.23) | (0.29) | |
| Test Score Quintiles 2004 = 3 | 0.87*** | −0.45** | |
| | (0.16) | (0.14) | |
| Test Score Quintiles 2004 = 4 | 0.73*** | −0.59*** | |
| | (0.10) | (0.15) | |
| Test Score Quintiles 2004 = 5 | 0.81*** | −0.51*** | |
| | (0.062) | (0.12) | |
| Test Score in 2004 | | | −0.18*** |
| | | | (0.021) |
| Constant | | 1.32*** | 0.87*** |
| | | (0.067) | (0.024) |
| Mauza Fixed-Effects | Yes | Yes | Yes |
| Observations | 6755 | 6755 | 6755 |
| Adjusted R-squared | 0.163 | 0.163 | 0.191 |
Notes: This table shows the regression results of 3-year test score gains (2004−06) on test scores or quintiles by test score in year 2 (2004) but instrumenting them with test scores or quintiles by test scores in year 1 (2003). Test scores refers to the mean across Urdu, English and Mathematics. Quintiles are estimated only for those observed in 2006 who had test scores in 2003 and 2004 respectively. Column 1 omits the constant to obtain the average gain by quintile, and Column 2 shows gains for each quintile relative to the omitted category, Quintile 1. Column 3 estimates the continuous version of this equation. The negative coefficient of Test Scores in 2004 from Column 3 implies test scores are converging over time. Convergence is evidenced across specifications with children with higher test scores in 2003 learning less between 2004 and 2006. Standard errors clustered at the village-level appear in parentheses. * p < 0.05, ** p < 0.01, *** p < 0.001.
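The logic behind instrumenting the 2004 score with the 2003 score can be illustrated with a small simulation. This is a sketch under stated assumptions, not a replication: the data are synthetic, the true convergence coefficient of −0.2 and all variances are invented for illustration, fixed effects are omitted, and only the continuous (Column 3 style) specification is shown. Because the noisy 2004 score enters both the regressor and, with opposite sign, the gain, OLS overstates convergence; an earlier score with independent measurement error removes that mechanical correlation.

```python
import random


def cov(xs, ys):
    """Sample covariance of two equal-length lists."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Just-identified IV slope: y on x, instrumenting x with z."""
    return cov(z, y) / cov(z, x)


rng = random.Random(42)
TRUE_BETA = -0.2   # hypothetical true convergence coefficient
NOISE_SD = 0.4     # hypothetical test measurement error

z, x, y = [], [], []
for _ in range(20_000):
    s0 = rng.gauss(0.0, 1.0)                          # true ability, 2003
    s1 = 0.3 + 0.9 * s0 + rng.gauss(0.0, 0.3)         # true ability, 2004
    s2 = s1 + 0.9 + TRUE_BETA * s1 + rng.gauss(0.0, 0.3)  # true ability, 2006
    score03 = s0 + rng.gauss(0.0, NOISE_SD)           # observed scores = truth + noise
    score04 = s1 + rng.gauss(0.0, NOISE_SD)
    score06 = s2 + rng.gauss(0.0, NOISE_SD)
    z.append(score03)
    x.append(score04)
    y.append(score06 - score04)                       # observed gain, 2004 to 2006

beta_ols = ols_slope(x, y)    # biased toward more convergence
beta_iv = iv_slope(z, x, y)   # close to TRUE_BETA in this simulation
```

In this setup the OLS slope is noticeably more negative than the true coefficient, while the IV slope recovers it closely. A negative IV coefficient therefore indicates convergence that is not an artifact of measurement error in the baseline score.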
Test scores on lagged test scores, child and household characteristics and fixed effects.
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| | Test score t | Test score t | Test score t | Test score t |
| Test Score t-1 | 0.74*** | 0.73*** | 0.71*** | 0.66*** |
| | (0.0058) | (0.0060) | (0.0064) | (0.0076) |
| Father Educ: <Primary | | −0.0043 | −0.0030 | −0.0058 |
| | | (0.013) | (0.012) | (0.012) |
| Father Educ: >Primary to Higher Secondary | | 0.070*** | 0.074*** | 0.064*** |
| | | (0.0097) | (0.0097) | (0.0094) |
| Father Educ: Higher Secondary or Higher | | 0.13*** | 0.13*** | 0.11*** |
| | | (0.016) | (0.016) | (0.016) |
| Mother Educ: <Primary | | 0.0053 | 0.0047 | −0.0044 |
| | | (0.011) | (0.011) | (0.010) |
| Mother Educ: >Primary to Higher Secondary | | 0.044*** | 0.042*** | 0.018 |
| | | (0.0098) | (0.0100) | (0.0098) |
| Mother Educ: Higher Secondary or Higher | | 0.085** | 0.080** | 0.020 |
| | | (0.027) | (0.027) | (0.028) |
| Average PCA Asset Index across Years | | 0.015*** | 0.016*** | 0.0046 |
| | | (0.0026) | (0.0026) | (0.0025) |
| Age in 2003 | | −0.0093*** | −0.0099*** | −0.012*** |
| | | (0.0027) | (0.0027) | (0.0028) |
| Dropout Group Indicator | | −0.13*** | −0.13*** | −0.075*** |
| | | (0.015) | (0.015) | (0.017) |
| Female Indicator | | 0.034*** | 0.040*** | 0.050*** |
| | | (0.0075) | (0.0075) | (0.012) |
| Constant | 0.26*** | 0.29*** | 0.031 | −0.63* |
| | (0.0070) | (0.029) | (0.080) | (0.27) |
| Mauza Fixed-Effects | No | No | Yes | No |
| School Fixed-Effects | No | No | No | Yes |
| District Fixed-Effects | Yes | Yes | No | No |
| Observations | 23,992 | 23,992 | 23,992 | 23,990 |
| Adjusted R-squared | 0.59 | 0.59 | 0.60 | 0.64 |
| Within Adjusted R-squared | 0.56 | 0.57 | 0.54 | 0.43 |
Notes: This table uses the full unbalanced panel to regress test scores in year t on lagged test scores at t-1 along with parental education groups, average wealth across rounds, baseline age, sex and whether the child dropped out in 2005−06. Test scores refers to the mean across Urdu, English and Mathematics. Specifications across columns include a full set of district, village (mauza) or school fixed effects to capture potential differences by geography and schools. The Within Adjusted R-squared measures the explanatory power net of the district, mauza or school fixed effects included in each column. Household and child characteristics explain very little of the variation in test score gains after accounting for fixed effects. Standard errors clustered at the village level are in parentheses. The base category for the father and mother education groups is no education. The average wealth measure ignores missing data in any given year. * p < 0.05, ** p < 0.01, *** p < 0.001.
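The distinction between the overall and the within R-squared can be illustrated with a minimal sketch. It is not the authors' estimation code: the simulated data, the single covariate, and the function names are assumptions, and the sketch reports unadjusted rather than adjusted R-squared. The fixed effects are absorbed by demeaning the outcome and the covariate within each group; the within R-squared then compares the residual variation with the variation that remains after demeaning, while the overall R-squared compares it with the total variation.

```python
import random
from collections import defaultdict

def demean_within(values, groups):
    """Subtract each group's mean from its members (absorbs group fixed effects)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]

def r2_overall_and_within(y, x, groups):
    """Unadjusted overall and within R-squared of y on x with group fixed effects."""
    y_dm = demean_within(y, groups)
    x_dm = demean_within(x, groups)
    slope = sum(a * b for a, b in zip(x_dm, y_dm)) / sum(a * a for a in x_dm)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x_dm, y_dm))
    y_bar = sum(y) / len(y)
    ss_total = sum((v - y_bar) ** 2 for v in y)   # all variation in y
    ss_within = sum(b * b for b in y_dm)          # variation left after fixed effects
    return 1 - ss_res / ss_total, 1 - ss_res / ss_within

def simulate(n_groups=200, per_group=50, seed=1):
    """Hypothetical data: large group effects, one weak covariate."""
    rng = random.Random(seed)
    y, x, groups = [], [], []
    for g in range(n_groups):
        group_effect = rng.gauss(0, 1)
        for _ in range(per_group):
            xi = rng.gauss(0, 1)
            y.append(group_effect + 0.2 * xi + rng.gauss(0, 0.5))
            x.append(xi)
            groups.append(g)
    return r2_overall_and_within(y, x, groups)

overall_r2, within_r2 = simulate()
print(f"overall R2: {overall_r2:.2f}, within R2: {within_r2:.2f}")
```

In this hypothetical setup most of the overall fit comes from the group effects, so the within R-squared is far lower than the overall one. This is the sense in which the within measure isolates what the covariates explain.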
Robustness to scaling transformations and likely transformation for wealth and gender gaps.
| Groups | Original | Gap Growth: Min | Gap Growth: Max | Correlation: Max | R-square: Max |
|---|---|---|---|---|---|
| | −.0792 | −.1185 | .0829 | −.1184 | −.0094 |
| | .0097 | .0047 | .0277 | .0049 | .0176 |
Notes: This table compares the original 4-year learning gap for wealth and gender to those obtained from extreme monotonic transformations that maximize and minimize these gaps. The gap is defined as the coefficient on the variable for wealth or gender in the regression for year 1 minus the same coefficient in year 4. Specifications are similar to those in Table 7 and control for district fixed effects, age in 2003, parental education, and a dropout group indicator. Additionally, the wealth gap controls for gender, and the gender gap controls for wealth. The table also provides likely transformations, those that maximize correlation and R-square of test scores in year 1 and 4, to help benchmark the results. Wealth quartiles are constructed from the PCA of mean household assets across years. Max and Min Gap Growth discard very unlikely transformations, specifically those with skewness outside [-2,2] and/or kurtosis outside [0,10]. For computational speed and efficiency reasons, convergence is assumed after 15 iterations, and monotonicity is only checked up to a finite number of possibilities (46,735 different IRT scores that came from 4 rounds of surveys). Furthermore, the program allows the gap to reverse. This efficiency gain and flexibility might yield, in rare instances, results in the opposite direction of the intended max/min optimization. These unlikely results are discarded for this exercise.
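A toy calculation shows why the gap can depend on the scale of the test score. The sketch below is an assumption-laden illustration rather than the authors' optimization routine: the scores are hypothetical, the gap is computed as a raw difference in group means rather than as a regression coefficient, and scores are not re-standardized after each transformation. It follows the sign convention in the notes (year 1 gap minus year 4 gap) and applies one convex and one concave monotonic transformation.

```python
import math

def mean(values):
    return sum(values) / len(values)

def gap(low, high, transform):
    """Difference in mean transformed scores between two groups."""
    return mean([transform(s) for s in high]) - mean([transform(s) for s in low])

def gap_change(y1_low, y1_high, y4_low, y4_high, transform=lambda s: s):
    """Year 1 gap minus year 4 gap, the sign convention used in the notes."""
    return gap(y1_low, y1_high, transform) - gap(y4_low, y4_high, transform)

# Hypothetical scores: both groups gain 2 points, so the raw gap is constant.
y1_low, y1_high = [-1.0, 0.0], [0.0, 1.0]
y4_low, y4_high = [1.0, 2.0], [2.0, 3.0]

original = gap_change(y1_low, y1_high, y4_low, y4_high)
convex = gap_change(y1_low, y1_high, y4_low, y4_high, transform=math.exp)
concave = gap_change(y1_low, y1_high, y4_low, y4_high,
                     transform=lambda s: -math.exp(-s))

print(original)  # zero: the gap is unchanged on the original scale
print(convex)    # negative: the gap widens when high scores are stretched
print(concave)   # positive: the gap narrows when high scores are compressed
```

Both transformations preserve the ranking of every score, yet they imply opposite conclusions about whether the gap grew. This motivates bounding the gap over a family of plausible monotonic transformations.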
Fig. A4. Linear fit of 4-year gains and baseline test scores.
Notes: High and low parental education groups are constructed from the maximum level of education across father and mother for each child. High education groups are those where the most educated parent has completed more than primary school, and low education groups are families where the most educated parent reports 0 years of schooling. In our data, 22 % of parents fall in the first category, and 29 % fall in the second category. For each group, the figure shows 20 quantiles and the corresponding linear fit, using the binscatter command of Cattaneo et al. (2019) in Stata. For clarity, we have excluded the 2 % of observations with test scores higher than +2 SD or lower than -3 SD. We also report coefficients from the value-added and the gains specifications, where all data are included. The value-added specification shows that children who started off at the same score in 2003 gained 0.27 SD extra if they were in “high” education households. The gains specification shows that, on average, children from high education households gained an additional 0.11 SD between Grades 3 and 6.
Learning patterns by subject and items.
Notes: This table shows the proportion of students from the balanced panel (i.e. those observed every year; N = 6038) who, for each anchoring question asked every year, can be classified, based on the pattern of their correct/incorrect answers, as: (i) always learners: those who always answered an item correctly; (ii) never learners: those who never answered an item correctly; (iii) robust learners: those whose trajectories show (weakly) monotonic progression starting from a point where they could not answer the question; and (iv) fragile learners: those whose trajectories show regression at some point. The ratio of fragile to robust learners is shown at the bottom of the table. Non-multiple-choice questions are highlighted in orange.
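The four categories defined in the notes above can be expressed as a short classification rule. The sketch below is one reading of those definitions, not the authors' code: the function names and the 0/1 encoding of a student's yearly answers to a single item are assumptions.

```python
from collections import Counter

def classify(pattern):
    """Classify one student's answers to one item across survey years.

    pattern: sequence of 0/1 values (incorrect/correct), ordered by year.
    """
    answers = [bool(a) for a in pattern]
    if all(answers):
        return "always"
    if not any(answers):
        return "never"
    # Regression: a correct answer followed by an incorrect one the next year.
    if any(prev and not cur for prev, cur in zip(answers, answers[1:])):
        return "fragile"
    # Remaining patterns start incorrect and never regress once correct.
    return "robust"

def fragile_to_robust_ratio(patterns):
    """Ratio of fragile to robust learners for one item."""
    counts = Counter(classify(p) for p in patterns)
    return counts["fragile"] / counts["robust"]

# Hypothetical answer patterns over four years for five students.
example = [(0, 0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 1), (0, 0, 0, 0), (1, 0, 0, 0)]
print([classify(p) for p in example])
print(fragile_to_robust_ratio(example))
```

Under this reading, any trajectory that is not constant and never moves from correct to incorrect must begin with an incorrect answer and end with a correct one, which matches the definition of robust learners in the notes.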
Fig. 4. Proportion of fragile and robust learners by subject.
Notes: This figure examines the proportion of students from the balanced panel (i.e. those observed every year, N = 6038) who, for each anchoring question asked every year, can be classified, based on the pattern of their correct/incorrect answers, as: (i) robust learners: those whose trajectories show (weakly) monotonic progression starting from a point where they could not answer the question; and (ii) fragile learners: those whose trajectories show regression at some point. The ratio of fragile to robust learners is shown at the top of each bar. An asterisk before the question indicates that the item was a multiple-choice question (MCQ). The missing proportion corresponds to always or never learners, those who always or never answered a given item correctly.
Learning simulation for random absence and misclassification.
| Present in | Actual Presence in our Sample (%) | Presence w/ Random Absence 10 % | Presence w/ Random Absence 15 % | Presence w/ Random Absence 20 % | Random Absence 10 %, Misclass 3% | Random Absence 10 %, Misclass 5% | Random Absence 10 %, Misclass 7% | Random Absence 15 %, Misclass 3% | Random Absence 15 %, Misclass 5% | Random Absence 15 %, Misclass 7% | Random Absence 20 %, Misclass 3% | Random Absence 20 %, Misclass 5% | Random Absence 20 %, Misclass 7% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 22.4 | 2.7 | 5.8 | 9.7 | 6.7 | 9.4 | 12.1 | 10.1 | 13.0 | 15.9 | 14.3 | 17.4 | 20.4 |
| | 27.4 | 24.3 | 32.6 | 38.7 | 29.3 | 32.0 | 34.2 | 36.2 | 38.1 | 39.6 | 41.1 | 42.3 | 43.1 |
| | 50.2 | 73.0 | 61.6 | 51.6 | 64.0 | 58.6 | 53.7 | 53.7 | 48.9 | 44.5 | 44.5 | 40.3 | 36.5 |
Notes: Simulations use a total sample size of 16,428 across all years (the same as the number of unique students across all 4 rounds in the LEAPS sample). For simulating misclassification, this sample size grows every round. The average of 1000 simulations is shown above. Absence selection is random and independent each year. Misclassification selection in year t + 1 is random and conditional on being observed in year t and being observed at least once prior to year t (otherwise, it would not be misclassification but rather a “newly” observed student). Misclassified individuals are duplicated as new individuals, assigned a new unique ID, and are marked as not being observed every year prior to the misclassification period. Their original record for the year they were misclassified is then corrected to not observed. Misclassification rates are applied over the full sample size (not only those eligible based on being observed in year t and having been observed at least once before).
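The random-absence step described in the notes can be sketched as follows. This is a simplified illustration and not the authors' simulation: it assumes every student is expected in all four rounds, omits the misclassification step, and runs a single draw rather than averaging 1000 simulations, so it is not intended to reproduce the figures in the table. The function name and its parameters are hypothetical.

```python
import random
from collections import Counter

def simulate_presence(n_students=16428, n_years=4, absence_rate=0.10, seed=0):
    """Share of students observed in 0, 1, ..., n_years rounds.

    Absence is drawn independently for each student and year.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_students):
        years_present = sum(rng.random() >= absence_rate for _ in range(n_years))
        counts[years_present] += 1
    return {k: counts[k] / n_students for k in range(n_years + 1)}

shares = simulate_presence()
for years, share in shares.items():
    print(f"present in {years} round(s): {100 * share:.1f} %")
```

Under these assumptions the share observed in every round is close to (1 − absence rate) raised to the number of rounds, about 66 % for a 10 % absence rate. The sketch thus shows how quickly independent absences alone erode a balanced panel.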
Learning for unbalanced, balanced, and household panels.
| Variable | (1) Unbalanced Panel: N | (1) Unbalanced Panel: Mean/SE | (2) Balanced Panel: N | (2) Balanced Panel: Mean/SE | (3) Household Panel: N | (3) Household Panel: Mean/SE | t-test Difference (1)-(2) | t-test Difference (1)-(3) | t-test Difference (2)-(3) |
|---|---|---|---|---|---|---|---|---|---|
| | 7355 | 1.129 | 6038 | 1.155 | 1406 | 1.101 | −0.026* | 0.028 | 0.054* |
| | | [0.010] | | [0.011] | | [0.025] | | | |
| | 8470 | 0.889 | 6038 | 0.884 | 1417 | 0.833 | 0.006 | 0.056** | 0.051** |
| | | [0.008] | | [0.009] | | [0.022] | | | |
| | 8796 | 0.355 | 6038 | 0.344 | 1412 | 0.305 | 0.011 | 0.050** | 0.040* |
| | | [0.007] | | [0.008] | | [0.019] | | | |
| | 8829 | 0.799 | 6038 | 0.811 | 1404 | 0.813 | −0.012 | −0.014 | −0.003 |
| | | [0.008] | | [0.009] | | [0.019] | | | |
| | 10,212 | 0.537 | 6038 | 0.539 | 1427 | 0.542 | −0.002 | −0.004 | −0.003 |
| | | [0.006] | | [0.007] | | [0.015] | | | |
| | 9890 | 0.258 | 6038 | 0.272 | 1450 | 0.284 | −0.014 | −0.026 | −0.012 |
| | | [0.008] | | [0.010] | | [0.020] | | | |
Notes: This table compares learning for 3 different samples: (i) a balanced sample of 6038 children who were present in every year; (ii) the full unbalanced sample of children present in different years; and (iii) the household sample, where the balanced proportion is higher. Test score gains are very similar across all three samples; differences between the unbalanced or balanced panel and the household panel are statistically significant only when learning includes year 4. This is likely caused by the inclusion of dropouts tested at home and the fact that the balanced panel itself is a (slightly) selected group of children. The values displayed for t-tests are the differences in means across the groups. Standard errors are robust and shown in brackets. ***, **, and * indicate significance at the 1, 5, and 10 percent critical level.
Regression of test score in year t on yearly quintiles of performance and test score lags in year t-1.
| Dep. Var.: Test Score in Year t | (1) | (2) |
|---|---|---|
| Test Score (t-1) | 0.55*** | 0.56*** |
| | (0.031) | (0.032) |
| Quintile from Test Score Performance (t-1) == 2 | 0.12*** | 0.12*** |
| | (0.030) | (0.031) |
| Quintile from Test Score Performance (t-1) == 3 | 0.23*** | 0.23*** |
| | (0.046) | (0.047) |
| Quintile from Test Score Performance (t-1) == 4 | 0.34*** | 0.34*** |
| | (0.059) | (0.061) |
| Quintile from Test Score Performance (t-1) == 5 | 0.43*** | 0.44*** |
| | (0.067) | (0.069) |
| Year 2005 Indicator | 0.38*** | 0.38*** |
| | (0.034) | (0.035) |
| Year 2006 Indicator | 0.45*** | 0.44*** |
| | (0.038) | (0.039) |
| Dropout Group Indicator | −0.15*** | −0.15*** |
| | (0.020) | (0.019) |
| Constant | −0.54*** | −0.28*** |
| | (0.061) | (0.070) |
| Mauza Fixed-Effects | Yes | No |
| District Fixed-Effects | No | Yes |
| Observations | 28,898 | 28,898 |
| Adjusted R-squared | 0.623 | 0.615 |
Notes: This table replicates the value-added specification of Muralidharan, Singh and Ganimian (2019): $Y_{i,t} = \alpha + \gamma Y_{i,t-1} + \sum_q \delta_q Q_q + \varepsilon_{i,t}$, where q sums over the quartiles of lagged test scores within a grade, and $Q_q$ is an indicator variable equal to 1 if a student is in quartile q. To maintain comparability with other tables in this paper, we use quintiles rather than quartiles. This regression uses IRT scores from all years for which a lag is available. Quintiles are constructed within each year, so a given student's quintile may vary across years. As in Muralidharan, Singh and Ganimian (2019), identification is achieved because the lagged test score is computed across years, while quintiles are year specific. The omitted category is Quintile 1, the bottom quintile.
* p < 0.05, ** p < 0.01, *** p < 0.001.
Yearly forward test score gains by test score and quintiles in previous year.
| Groups | Stats | Quintile 1 | Quintile 2 | Quintile 3 | Quintile 4 | Quintile 5 | Total |
|---|---|---|---|---|---|---|---|
| [-5 to -1) | Mean gain | 0.80 | 0.32 | 0 | 0 | 0 | 0.76 |
| | N | 2687 | 237 | 0 | 0 | 0 | 2924 |
| | Mean gain | 0.54 | 0.44 | 0.21 | 0 | 0 | 0.45 |
| | N | 658 | 1670 | 176 | 0 | 0 | 2504 |
| | Mean gain | 0.42 | 0.47 | 0.34 | 0.15 | 0 | 0.37 |
| | N | 279 | 1171 | 1822 | 416 | 0 | 3688 |
| [0 to 0.5) | Mean gain | 0 | 0.39 | 0.36 | 0.32 | 0.05 | 0.31 |
| | N | 0 | 546 | 1265 | 1883 | 488 | 4182 |
| | Mean gain | 0 | 0 | 0.32 | 0.28 | 0.22 | 0.26 |
| | N | 0 | 0 | 358 | 1248 | 1425 | 3031 |
| | Mean gain | 0 | 0 | 0 | 0.14 | 0.12 | 0.12 |
| | N | 0 | 0 | 0 | 77 | 1708 | 1785 |
Notes: This table shows the forward yearly test score gains (i.e. $Y_{i,t+1} - Y_{i,t}$) for students given their test score at time t (rows) and their test score performance quintile at time t (columns). The number of observations in each group is also shown. Specifically, we first score children on a common linked scale as described in the text. Then, we construct within-grade quintiles, so that children with the same score may be in different quintiles depending on what grade they were in. For instance, children with very low scores [-5 to -1) are mostly in the bottom quintile (Quintile 1), with less than 10 % in the 2nd Quintile. Children with scores between [0 to 0.5) are distributed across Quintiles 2 to 5. If, for instance, this score was observed in Grade 3, they would likely be in Quintile 4 or 5, but if this score was observed in Grade 5, they may be in Quintile 2 or 3. We then show the average gain in test scores in the following year for each test-score interval and quintile. For instance, the 546 children who scored between [0 to 0.5) but were in the 2nd quintile gained an average of 0.39SD, but the 488 children who scored between [0 to 0.5) but were in the top quintile gained 0.