Abstract
Previously we showed that weekly, written, timed, and peer-graded practice exams help increase student performance on written exams and decrease failure rates in an introductory biology course. Here we analyze the accuracy of peer grading, based on a comparison of student scores to those assigned by a professional grader. When students graded practice exams by themselves, they were significantly easier graders than a professional; overall, students awarded ≈25% more points than the professional did. This difference represented ≈1.33 points on a 10-point exercise, or 0.27 points on each of the five 2-point questions posed. When students graded practice exams as a group of four, the same student-expert difference occurred. The student-professional gap was wider for questions that demanded higher-order versus lower-order cognitive skills. Thus, students not only have a harder time answering questions on the upper levels of Bloom's taxonomy, they have a harder time grading them. Our results suggest that peer grading may be accurate enough for low-risk assessments in introductory biology. Peer grading can help relieve the burden on instructional staff posed by grading written answers, making it possible to add practice opportunities that increase student performance on actual exams.
Year: 2010 PMID: 21123695 PMCID: PMC2995766 DOI: 10.1187/cbe.10-03-0017
Source DB: PubMed Journal: CBE Life Sci Educ ISSN: 1931-7913 Impact factor: 3.325
Figure 1. Students grade practice exams more leniently than a professional. Bars represent mean scores for the practice exam indicated; horizontal lines represent SEs of the mean. Results of statistical tests are reported in Table 1.
Results of paired t tests: Practice exams graded by individual students versus a professional grader
| Practice exam | Mean score: peer grading | Mean score: professional grading | n | t | p |
|---|---|---|---|---|---|
| Spring 2005: Week 5 | 6.90 ± 0.20 | 6.00 ± 0.20 | 96 | −4.95 | <<0.0001 |
| Spring 2005: Week 10 | 6.45 ± 0.25 | 4.64 ± 0.21 | 94 | −10.43 | <<0.0001 |
| Autumn 2005: Week 2 | 6.71 ± 0.18 | 6.25 ± 0.20 | 100 | −3.73 | 0.0003 |
| Autumn 2005: Week 5 | 6.45 ± 0.18 | 4.84 ± 0.16 | 100 | −11.45 | <<0.0001 |
| Autumn 2005: Week 9 | 7.3 ± 0.16 | 5.4 ± 0.20 | 100 | −9.02 | <<0.0001 |
Ten points were possible on each exercise; means are reported with SEs.
Results of unpaired t tests: Differences in practice exam scores graded by individual students versus a professional grader and student groups versus a professional grader
| Practice exam | Mean difference: individual students versus professional | Mean difference: student groups versus professional | n (individuals, groups) | t | p |
|---|---|---|---|---|---|
| Spring 2005: Week 5 | 0.89 ± 0.18 | 0.76 ± 0.23 | 96, 43 | −0.47 | 0.64 |
| Spring 2005: Week 10 | 1.81 ± 0.17 | 1.38 ± 0.24 | 94, 43 | −1.43 | 0.15 |
Means are reported with SEs.
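The unpaired comparisons above can be roughly cross-checked from the reported summary values alone. The following is a minimal sketch, not the authors' analysis: it assumes an unpooled (Welch-style) standard error built from the two reported SEs, and the helper name `t_from_summary` is hypothetical. Because the inputs are rounded and the original test may have pooled variances, the results should agree with the table only approximately, and only in magnitude (the sign depends on the order of subtraction).

```python
import math


def t_from_summary(mean_a, se_a, mean_b, se_b):
    """Approximate unpaired t statistic from two means and their standard errors.

    Assumption: unpooled standard error, sqrt(se_a^2 + se_b^2).
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Values copied from the table above:
# (individual mean difference, SE, group mean difference, SE)
rows = {
    "Spring 2005: Week 5": (0.89, 0.18, 0.76, 0.23),
    "Spring 2005: Week 10": (1.81, 0.17, 1.38, 0.24),
}

for label, (m_ind, se_ind, m_grp, se_grp) in rows.items():
    t = t_from_summary(m_ind, se_ind, m_grp, se_grp)
    print(f"{label}: |t| is about {abs(t):.2f}")
```

Under these assumptions the sketch gives magnitudes of about 0.45 and 1.46, close to the reported 0.47 and 1.43, which is consistent with the conclusion that group grading did not differ significantly from individual grading.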
Do differences between student and professional grading vary with time in the term?
| a. Spring 2005 | Week 5 | Week 10 | t | p |
|---|---|---|---|---|
| Mean difference: individual students versus professional | 0.89 ± 0.18 (96) | 1.81 ± 0.17 (94) | −3.64 | 0.0003 |
Means are reported with SEs; sample sizes (numbers of practice exams graded) are in parentheses; the results are based on an unpaired t test in part (a) and an ANOVA in part (b).
Differences between student and professional grading vary with level on Bloom's taxonomy
| | Lower-order cognitive skills | Higher-order cognitive skills | t | p |
|---|---|---|---|---|
| Mean difference: individual students versus professional | 0.15 ± 0.02 (688) | 0.31 ± 0.02 (1760) | −5.96 | <<0.0001 |
Means are reported with SEs; sample sizes (number of student answers) are in parentheses; the t statistic is from an unpaired test. For a definition of lower-order and higher-order thinking skills, see Methods.
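The per-question differences in this table can be tied back to the overall figures quoted in the abstract (0.27 points per question, ≈1.33 points per 10-point exercise). The sketch below is an illustrative consistency check, not a calculation taken from the paper: it assumes the overall per-question difference is the sample-size-weighted mean of the two groups, and the helper name `weighted_mean` is hypothetical.

```python
def weighted_mean(groups):
    """Sample-size-weighted mean of (mean, n) pairs."""
    total_n = sum(n for _, n in groups)
    return sum(mean * n for mean, n in groups) / total_n


# (mean student-minus-professional difference per question, number of answers)
groups = [
    (0.15, 688),   # lower-order cognitive skills
    (0.31, 1760),  # higher-order cognitive skills
]

per_question = weighted_mean(groups)
per_exercise = 5 * per_question  # each exercise posed five 2-point questions

print(f"pooled per-question difference: {per_question:.2f}")
print(f"implied per-exercise difference: {per_exercise:.2f}")
```

With the rounded table values, this yields about 0.27 points per question and about 1.33 points per exercise, matching the abstract.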
Relationship between time-in-term and average Bloom's level of practice exam questions
| a. Spring 2005 | Week 5 | Week 10 | t | p |
|---|---|---|---|---|
| Average Bloom's level | 2.6 ± 0.40 (5) | 3.2 ± 0.66 (5) | −0.77 | 0.46 |
Means are reported with SEs; sample sizes (number of different questions asked) are in parentheses.