OBJECTIVE: To determine the effectiveness of single-point benchmarking versus longitudinal benchmarking for inter-school educational evaluation.

METHODS: We carried out a mixed longitudinal, cross-sectional study using data from 24 annual measurement moments (4 tests × 6 year groups) over 4 years, for 4 annual progress tests assessing the graduation-level knowledge of all students from 3 co-operating medical schools. Participants were approximately 5000 undergraduate medical students from the 3 schools. The main outcome measures were between-school comparisons of progress test results based on the different benchmarking methods.

RESULTS: Variations in relative school performance across tests and year groups indicate that single-point benchmarking is unstable and of low reliability, being subject to distortion by school-test and year group-test interaction effects. Deviations of school means from the overall mean follow an irregular, noisy pattern that obscures systematic between-school differences. Longitudinal benchmarking suppresses this noise and reveals the systematic differences: the pattern of a school's cumulative deviations per year group gives a credible reflection of the relative performance of its year groups.

CONCLUSIONS: Even with highly comparable curricula, single-point benchmarking can distort the results of comparisons. If longitudinal data are available, the information contained in a school's cumulative deviations from the overall mean can be used; in that case, the mean test score across schools is a useful benchmark for cross-institutional comparison.
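The core of the longitudinal method described above can be illustrated with a minimal sketch: at each measurement moment, compute each school's deviation from the overall (cross-school) mean, then accumulate those deviations over time so that random school-test interaction noise averages out while a systematic difference grows roughly linearly. The data below are synthetic (three hypothetical schools with assumed systematic offsets and noise); the specific numbers and the `true_effect` values are illustrative assumptions, not figures from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

n_moments = 16  # hypothetical: one year group followed across 16 measurement moments
# assumed systematic between-school differences, in percentage points
true_effect = {"A": 1.0, "B": 0.0, "C": -1.0}

# simulate per-moment school means: systematic effect plus noisy
# school-test interaction (noise SD chosen arbitrarily for illustration)
school_means = {s: e + rng.normal(0.0, 2.0, n_moments)
                for s, e in true_effect.items()}

# the benchmark: mean test score across schools at each moment
overall_mean = np.mean(list(school_means.values()), axis=0)

# single-point deviations are noisy; their cumulative sum is the
# longitudinal benchmark, which drifts with the systematic difference
cum_dev = {s: np.cumsum(school_means[s] - overall_mean)
           for s in true_effect}

for s, c in cum_dev.items():
    print(s, np.round(c[-1], 2))
```

Because the benchmark is defined as the mean of the school means, the deviations (and hence the cumulative deviations) sum to zero across schools at every moment; plotting `cum_dev` per school would show near-linear divergence of the systematically stronger and weaker schools, whereas the raw per-moment deviations fluctuate irregularly.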