Alexander Robitzsch, Oliver Lüdtke, Frank Goldhammer, Ulf Kroehne, Olaf Köller.
Abstract
International large-scale assessments, such as the Program for International Student Assessment (PISA), are conducted to provide information on the effectiveness of education systems. In PISA, the target population of 15-year-old students is assessed every 3 years. Trends show whether competencies have changed in the countries between PISA cycles. In order to provide valid trend estimates, it is desirable to retain the same test conditions and statistical methods in all PISA cycles. In PISA 2015, however, the test mode changed from paper-based to computer-based tests, and the scaling method was changed. In this paper, we investigate the effects of these changes on trend estimation in PISA using German data from all PISA cycles (2000–2015). Our findings suggest that the change from paper-based to computer-based tests could have a severe impact on trend estimation but that the change of the scaling model did not substantially change the trend estimates.
Keywords: educational measurement; large-scale assessment; linking; mode effects; scaling
Year: 2020 PMID: 32528352 PMCID: PMC7264417 DOI: 10.3389/fpsyg.2020.00884
Source DB: PubMed Journal: Front Psychol ISSN: 1664-1078
Results of the German field test data 2014 for the mode effect, based on the 1PL Model.
| Domain | N | N | All items | | | | | Invariant items | | | | |
| | PBA | CBA | #Items | Mode effect | SE | SD | SE | #Items | Mode effect | SE | SD | SE |
| Science | 340 | 338 | 77 | | 0.08 | 0.17 | 0.05 | 56 | | 0.08 | 0.13 | 0.06 |
| Mathematics | 345 | 340 | 66 | | 0.07 | 0.31 | 0.05 | 36 | −0.09 | 0.07 | 0.27 | 0.08 |
| Reading | 349 | 334 | 82 | −0.13 | 0.10 | 0.43 | 0.05 | 47 | −0.06 | 0.09 | 0.25 | 0.07 |
Sample sizes of PISA studies used for linking study.
| Domain | Study | Mode | N | #Items | #Link items (all) | #Link items (invariant) |
| Science | 2006 | PBA | 4881 | 103 | 77 | 56 |
| | 2009 | PBA | 3477 | 53 | 52 | 40 |
| | 2012 | PBA | 3505 | 52 | 52 | 40 |
| | 2014 | PBA | 340 | 91 | 77 | 56 |
| | 2014 | CBA | 338 | 91 | 77 | 56 |
| | 2015 | CBA | 6501 | 181 | 77 | 56 |
| Mathematics | 2003 | PBA | 4656 | 84 | 31 | 16 |
| | 2006 | PBA | 3795 | 48 | 31 | 16 |
| | 2009 | PBA | 3503 | 35 | 31 | 16 |
| | 2012 | PBA | 4971 | 84 | 66 | 36 |
| | 2014 | PBA | 345 | 70 | 66 | 36 |
| | 2014 | CBA | 340 | 68 | 66 | 36 |
| | 2015 | CBA | 2739 | 69 | 66 | 36 |
| Reading | 2000 | PBA | 5060 | 128 | 35 | 19 |
| | 2003 | PBA | 2555 | 27 | 24 | 13 |
| | 2006 | PBA | 2701 | 28 | 24 | 13 |
| | 2009 | PBA | 4975 | 100 | 82 | 47 |
| | 2012 | PBA | 3470 | 43 | 42 | 23 |
| | 2014 | PBA | 349 | 85 | 82 | 47 |
| | 2014 | CBA | 334 | 85 | 82 | 47 |
| | 2015 | CBA | 2746 | 87 | 82 | 47 |
FIGURE 1. Marginal trend estimation for science without consideration (left) and with consideration (right) of the data of the German field test study of 2014.
Overview of different linking approaches.
| | | IRT Model | | Scaling | | Linking | |
| | Method | 1PL | 2PL | conc | sep | Haber | chain |
| Without field test (all items) | C1 | x | | x | | | |
| | C2 | | x | x | | | |
| | H1 | x | | | x | x | |
| | H2 | | x | | x | x | |
| | S1 | x | | | x | | x |
| Without field test (invariant items) | C1I | x | | x | | | |
| | C2I | | x | x | | | |
| | H1I | x | | | x | x | |
| | H2I | | x | | x | x | |
| With field test | C1F | x | | x | | | |
| | H1F | x | | | x | x | |
| | S1F | x | | | x | | x |
Item difficulties of link items from the 1PL Model.
| Domain | Item group | #Items | 2000 PBA | 2003 PBA | 2006 PBA | 2009 PBA | 2012 PBA | 2015 CBA | 2014 PBA vs. CBA |
| Science | S2A | 52 | − | − | −0.40 | −0.41 | −0.49 | −0.34 | −0.24 |
| | S2B | 25 | − | − | −0.29 | − | − | −0.13 | −0.23 |
| Mathematics | M1A | 31 | − | 0.01 | −0.11 | −0.14 | −0.18 | −0.07 | −0.10 |
| | M1B | 35 | − | − | − | − | −0.08 | 0.17 | −0.25 |
| Reading | R1A | 24 | −0.34 | −0.32 | −0.41 | −0.62 | − | −0.53 | −0.34 |
| | R1B | 11 | −0.97 | − | − | −1.09 | − | −1.10 | −0.37 |
| | R2A | 39 | − | − | − | −0.57 | −0.66 | −0.56 | −0.07 |
| | R2B | 8 | − | − | − | 0.00 | − | −0.18 | 0.12 |
Trend estimation for science in Germany.
| Method | | 2006 | 2009 | 2012 | 2015 | Trend 2012 → 2015 | SE | SE | SE |
| Original | | 516 | 520 | 524 | 509 | −15 | 5.6 | 4.0 | 3.9 |
| Without field test (all items) | C1 | 516 | 519 | 523 | 508 | −15 | 5.8 | 3.0 | 5.0 |
| | C2 | 516 | 518 | 524 | 508 | −16 | 5.8 | 3.0 | 5.0 |
| | H1 | 516 | 515 | 522 | 506 | −16 | 6.7 | 4.2 | 5.2 |
| | H2 | 516 | 516 | 522 | 501 | −21 | 7.0 | 4.2 | 5.5 |
| | S1 | 516 | 517 | 524 | 511 | −13 | 6.7 | 4.2 | 5.2 |
| Without field test (invariant items) | C1I | 516 | 519 | 523 | 513 | −10 | 5.8 | 3.0 | 5.0 |
| | C2I | 516 | 519 | 524 | 513 | −11 | 5.8 | 3.0 | 5.0 |
| | H1I | 516 | 517 | 523 | 513 | −10 | 7.1 | 4.2 | 5.7 |
| | H2I | 516 | 518 | 524 | 506 | −18 | 6.8 | 4.2 | 5.3 |
| With field test | C1F | 516 | 520 | 524 | 528 | 4 | 5.8 | 3.0 | 5.0 |
| | H1F | 516 | 516 | 522 | 528 | 6 | 8.3 | 4.2 | 7.2 |
| | S1F | 516 | 517 | 524 | 531 | 7 | 5.8 | 3.0 | 5.0 |
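The trend column in the science table appears to be the 2015 mean minus the 2012 mean. The following is a minimal sketch of that arithmetic for a few rows, assuming the reported trends are differences of the rounded means shown. The names `science_means` and `trend` are illustrative and do not come from the paper.

```python
# Science means for Germany as (2012 mean, 2015 mean), copied from the
# science trend table. Only a handful of methods are included here.
science_means = {
    "Original": (524, 509),
    "C1": (523, 508),
    "H2": (522, 501),
    "C1F": (524, 528),
    "S1F": (524, 531),
}


def trend(mean_2012, mean_2015):
    """Change in the country mean from 2012 to 2015, in PISA scale points."""
    return mean_2015 - mean_2012


# Assumption: the table's trend is the simple difference of the two means.
trends = {method: trend(*means) for method, means in science_means.items()}

for method, value in trends.items():
    print(f"{method}: {value:+d}")
```

For these rows the differences match the tabulated trends: negative for the original estimate and for the approaches without the field test, positive for the approaches that use it.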
Trend estimation for mathematics in Germany.
| Method | | 2003 | 2006 | 2009 | 2012 | 2015 | Trend 2012 → 2015 | SE | SE | SE |
| Original | | 503 | 504 | 513 | 514 | 506 | −8 | 5.4 | 4.1 | 3.5 |
| Without field test (all items) | C1 | 503 | 513 | 515 | 522 | 510 | −12 | 5.8 | 3.0 | 5.0 |
| | C2 | 503 | 515 | 517 | 524 | 511 | −13 | 5.8 | 3.0 | 5.0 |
| | H1 | 503 | 512 | 514 | 521 | 505 | −16 | 6.0 | 4.2 | 4.2 |
| | H2 | 503 | 517 | 521 | 528 | 515 | −13 | 6.2 | 4.2 | 4.5 |
| | S1 | 503 | 512 | 514 | 518 | 503 | −15 | 6.1 | 4.2 | 4.4 |
| Without field test (invariant items) | C1I | 503 | 513 | 515 | 522 | 521 | −1 | 5.8 | 3.0 | 5.0 |
| | C2I | 503 | 515 | 517 | 524 | 522 | −2 | 5.8 | 3.0 | 5.0 |
| | H1I | 503 | 512 | 514 | 521 | 512 | −9 | 6.4 | 4.2 | 4.8 |
| | H2I | 503 | 516 | 521 | 528 | 524 | −4 | 7.0 | 4.2 | 5.6 |
| With field test | C1F | 503 | 512 | 515 | 516 | 518 | 2 | 5.8 | 3.0 | 5.0 |
| | H1F | 503 | 512 | 514 | 514 | 515 | 1 | 7.5 | 4.2 | 6.2 |
| | S1F | 503 | 512 | 514 | 518 | 517 | −1 | 5.8 | 3.0 | 5.0 |
Trend estimation for reading in Germany.
| Method | | 2000 | 2003 | 2006 | 2009 | 2012 | 2015 | Trend 2012 → 2015 | SE | SE | SE |
| Original | | 484 | 491 | 495 | 497 | 508 | 509 | 1 | 6.7 | 4.1 | 5.3 |
| Without field test (all items) | C1 | 484 | 479 | 488 | 504 | 510 | 504 | −6 | 5.8 | 3.0 | 5.0 |
| | C2 | 484 | 484 | 492 | 502 | 506 | 501 | −5 | 5.8 | 3.0 | 5.0 |
| | H1 | 484 | 478 | 487 | 501 | 509 | 502 | −7 | 6.6 | 4.2 | 5.0 |
| | H2 | 484 | 473 | 478 | 491 | 504 | 495 | −9 | 9.9 | 4.2 | 9.0 |
| | S1 | 484 | 482 | 490 | 510 | 516 | 505 | −11 | 8.0 | 4.2 | 6.8 |
| Without field test (invariant items) | C1I | 484 | 478 | 488 | 507 | 513 | 523 | 10 | 5.8 | 3.0 | 5.0 |
| | C2I | 484 | 483 | 491 | 506 | 511 | 521 | 10 | 5.8 | 3.0 | 5.0 |
| | H1I | 484 | 477 | 486 | 504 | 512 | 517 | 5 | 7.3 | 4.2 | 5.9 |
| | H2I | 484 | 473 | 477 | 496 | 508 | 510 | 2 | 11.1 | 4.2 | 10.2 |
| With field test | C1F | 484 | 480 | 489 | 499 | 501 | 512 | 11 | 5.8 | 3.0 | 5.0 |
| | H1F | 484 | 479 | 488 | 499 | 505 | 516 | 11 | 9.2 | 4.2 | 8.2 |
| | S1F | 484 | 482 | 490 | 510 | 516 | 528 | 12 | 5.8 | 3.0 | 5.0 |
FIGURE 2. Original trend estimates and several marginal trend estimation approaches for science, mathematics, and reading. For the marginal approaches, the average of all approaches is displayed (e.g., for “with field test”: average of means of methods C1F, H1F, and S1F). The vertical gray bars are defined by the corresponding minimum and maximum.
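The summary described in the Figure 2 caption (a group average plus a minimum-to-maximum bar) can be sketched as follows for the 2015 science means of the "with field test" group. The input values come from the science trend table. The function name `summarize` and the dictionary layout are illustrative assumptions, not the authors' code.

```python
from statistics import mean

# 2015 science means for the "with field test" methods (science trend table).
with_field_test_2015 = {"C1F": 528, "H1F": 528, "S1F": 531}


def summarize(group_means):
    """Return (average, minimum, maximum) of the method means in one group."""
    values = list(group_means.values())
    return mean(values), min(values), max(values)


# The average is the plotted point; minimum and maximum bound the gray bar.
average, lowest, highest = summarize(with_field_test_2015)
print(f"average={average}, bar=[{lowest}, {highest}]")
```

The same function would apply to any other group and year by swapping in the corresponding column of a trend table.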
FIGURE 3. Trends for paper-based assessment (PBA), computer-based assessment (CBA), and officially reported adjusted trends. (A) Time-constant mode effect; (B) disappearing mode effect by performance increase in CBA; (C) disappearing mode effect by performance decrease in PBA.