| Literature DB >> 34007668 |
Michael J Peeters1, M Kenneth Cor2, Erik D Maki3.
Abstract
DESCRIPTION OF THE PROBLEM: High-stakes decision-making should rest on sound validation evidence, and reliability is vital to that evidence. Within didactic courses, a short exam may not be very reliable on its own, so supplementing it with quizzes might help. But by how much? This study's objective was to understand how much reliability (for overall module-grades) could be gained by adding quiz data to traditional exam data in a clinical-science module.
THE INNOVATION: Quizzes are a common instructional strategy in didactic coursework, though individual contexts and instructors vary in using them formatively and/or summatively. Second-year PharmD students took a clinical-science course in which a 5-week module focused on cardiovascular therapeutics. Using Generalizability Theory (G-Theory), the seven quizzes leading up to an exam were combined with it into one module-level reliability estimate, based on a model in which students were crossed with items nested within eight fixed testing occasions (analyzed with mGENOVA). Furthermore, G-Theory decision-studies were planned to illustrate how module-grade reliability changes as the number of quiz-items and the relative weighting of quizzes are altered.
CRITICAL ANALYSIS: One hundred students took seven quizzes and one exam. Individually, the exam had 32 multiple-choice questions (MCQs; KR-20 reliability = 0.67), while the quizzes totaled 50 MCQs (5-9 MCQs each), with most individual quiz KR-20s at or below 0.54. After combining the quizzes and exam using G-Theory, the estimated reliability of module-grades was 0.73, an improvement over the exam alone. Doubling the quiz weight from the syllabus' 18% quizzes / 82% exam increased the composite reliability of module-grades to 0.77. A reliability of 0.80 was achieved with equal weighting of quizzes and exam.
NEXT STEPS: As expected, more items yielded higher reliability. However, using quizzes predominantly formatively had little impact on reliability, while using them more summatively (i.e., increasing their relative weight in the module-grade) improved reliability further. Thus, depending on their use, quizzes can add to a course's rigor. © Individual authors.
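Since KR-20 coefficients are central to the comparisons above, a minimal sketch of how such a coefficient is computed from a students × items matrix of dichotomous (0/1) item scores may help. This is illustrative only and not the authors' analysis code; note that conventions differ on whether population or sample variance is used for the total-score variance (population variance is used here).

```python
def kr20(responses):
    """KR-20 reliability for a list of student rows, each a list of 0/1 item scores."""
    n = len(responses)            # number of students
    k = len(responses[0])         # number of items
    # Sum of item variances p*(1-p), where p = proportion correct per item
    item_var = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n
        item_var += p * (1 - p)
    # Population variance of total scores
    totals = [sum(row) for row in responses]
    mean = sum(totals) / n
    total_var = sum((t - mean) ** 2 for t in totals) / n
    return (k / (k - 1)) * (1 - item_var / total_var)

# Toy data: two items that always agree yield perfect internal consistency,
# while two items that agree for only half the students yield zero.
consistent = [[1, 1], [1, 1], [0, 0], [0, 0]]
inconsistent = [[1, 1], [1, 0], [0, 1], [0, 0]]
```

With the toy matrices above, `kr20(consistent)` is 1.0 and `kr20(inconsistent)` is 0.0, bracketing the range within which the course's observed coefficients (0.00 to 0.67) fall.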
Keywords: generalizability theory; quizzes; reliability; validation
Year: 2021 PMID: 34007668 PMCID: PMC8102960 DOI: 10.24926/iip.v12i1.2235
Source DB: PubMed Journal: Innov Pharm ISSN: 2155-0417
G-Study variance component estimates by testing occasion, along with KR-20 coefficients for comparison
| Variance component | Quiz 1 | Quiz 2 | Quiz 3 | Quiz 4 | Quiz 5 | Quiz 6 | Quiz 7 | Exam | Module composite |
| Number of items | 9 | 9 | 9 | 5 | 5 | 5 | 8 | 32 | 82 |
| Persons (p) | 0.004 (3%) | 0 (0%) | 0.017 (9%) | 0.011 (7%) | 0.014 (9%) | 0.011 (8%) | 0.014 (9%) | 0.007 (5%) | 0.006 (6%) |
| Items:occasion (i:o) | 0.038 (25%) | 0.035 (21%) | 0.032 (18%) | 0.013 (9%) | 0.013 (8%) | 0.002 (2%) | 0.029 (19%) | 0.025 (18%) | 0.017 (16%) |
| Residual (pi:o,e) | 0.112 (73%) | 0.128 (79%) | 0.128 (73%) | 0.129 (84%) | 0.136 (84%) | 0.124 (90%) | 0.11 (72%) | 0.111 (78%) | 0.082 (78%) |
| KR-20 | 0.26 | 0.00 | 0.54 | 0.30 | 0.34 | 0.30 | 0.51 | 0.67 | |
These Classical Test Theory KR-20 coefficients are shown only for comparison with the G-Study's g-coefficient of 0.73; they were not part of the G-Theory analysis.
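To connect the variance components above to the decision-studies in Figure 1: for a single fixed testing occasion in this persons × (items:occasion) design, a generalizability coefficient can be computed from the person and residual components, and a decision-study simply re-evaluates it for hypothetical item counts. The sketch below is illustrative only; the published composite of 0.73 and the quiz-weighting results came from the authors' full mGENOVA analysis, not this single-occasion formula.

```python
def g_coefficient(var_person, var_residual, n_items):
    """Single-occasion generalizability coefficient for relative decisions:
    var_p / (var_p + var_res / n_items)."""
    return var_person / (var_person + var_residual / n_items)

# Exam components from the table: var_p = 0.007, var_res = 0.111, 32 items
exam_g = g_coefficient(0.007, 0.111, 32)         # ~0.67, matching the exam's KR-20

# Quiz 3 components: var_p = 0.017, var_res = 0.128, 9 items
quiz3_g = g_coefficient(0.017, 0.128, 9)         # ~0.54, matching its KR-20

# Decision-study: hypothetically doubling the exam to 64 items
exam_g_doubled = g_coefficient(0.007, 0.111, 64)  # rises to ~0.80
```

This recovers the tabled KR-20s from the variance components (to rounding) and shows why, as the abstract notes, more items lend higher reliability: the residual term shrinks as 1/n while the person variance is untouched.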
Figure 1. Decision-studies for the number of items and weight of quizzes/exam during a second-year PharmD course