Michael J Peeters, M Kenneth Cor, Sai Hs Boddu, Jerry Nesamony.
Abstract
DESCRIPTION OF THE PROBLEM: Reliability is critical validation evidence on which to base high-stakes decision-making. Often, one exam in a didactic course is not acceptably reliable on its own. But how much might multiple exams add when combined?

THE INNOVATION: To improve validation evidence for high-stakes decision-making, Generalizability Theory (G-Theory) can combine reliabilities from multiple exams into one composite reliability (using G_String IV software). Further, G-Theory decision-studies can illustrate how course-grade reliability changes with the number of exams and exam items.

CRITICAL ANALYSIS: A total of 101 first-year PharmD students took two midterm exams and one final exam in a pharmaceutics course. Individually, Exam1 had 50 MCQs (KR-20 = 0.69), Exam2 had 43 MCQs (KR-20 = 0.65), and Exam3 had 67 MCQs (KR-20 = 0.67). After combining exam occasions using G-Theory, the composite reliability was 0.71 for overall course grades, better than any exam alone. Remarkably, with more exam occasions, fewer items per exam and fewer items over all exams were needed to obtain an acceptable composite reliability. Acceptable reliability could be achieved with different combinations of the number of MCQs on each exam and the number of exam occasions.

IMPLICATIONS: G-Theory provided reliability evidence, a critical part of validation, for high-stakes decision-making. Final course grades appeared quite reliable after combining multiple course exams, though this reliability could and should be improved. Notably, more exam occasions allowed fewer items per exam and fewer items over all the exams. Thus, one added benefit of more exam occasions is that educators need to develop fewer items per exam and fewer items overall.

© Individual authors.
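The per-exam KR-20 values quoted in the abstract come from a standard formula for dichotomously scored items. The following is a minimal sketch of that formula, not the authors' calculation, which the abstract does not describe. The function name `kr20` and the choice of population variance for total scores are assumptions made for illustration.

```python
import numpy as np

def kr20(responses):
    """Kuder-Richardson 20 for a students x items matrix of 0/1 scores.

    Assumption: total-score variance uses the population form (ddof=0),
    which matches the p * (1 - p) item variances used below.
    """
    x = np.asarray(responses, dtype=float)
    n_items = x.shape[1]
    p = x.mean(axis=0)                      # proportion correct per item
    sum_item_var = (p * (1.0 - p)).sum()    # sum of item variances
    total_var = x.sum(axis=1).var(ddof=0)   # variance of students' total scores
    return (n_items / (n_items - 1)) * (1.0 - sum_item_var / total_var)
```

Some software uses the sample variance (ddof=1) for total scores instead, so small differences from reported values are possible.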
Keywords: course; generalizability theory; occasion; reliability; validation
Year: 2021 PMID: 34007682 PMCID: PMC8102975 DOI: 10.24926/iip.v12i1.2925
Source DB: PubMed Journal: Innov Pharm ISSN: 2155-0417
Decision-studies of estimated reliability (via G-coefficients) for various numbers of items and various numbers of exam occasions for a first-year PharmD basic-science course
| Exam occasions | 20 items | 30 items | 40 items | 50 items | 60 items | 70 items | 80 items |
|---|---|---|---|---|---|---|---|
| 1 | 0.29 | 0.36 | 0.41 | 0.44 | 0.47 | 0.49 | 0.51 |
| 2 | 0.45 | 0.53 | 0.58 | 0.61 | 0.64 | 0.66 | 0.67 |
| 3 | 0.55 | 0.63 | 0.67 | 0.70 | 0.73 | 0.74 | 0.76 |
| 4 | 0.62 | 0.69 | 0.73 | 0.76 | 0.78 | 0.79 | |
| 5 | 0.67 | 0.74 | 0.77 | 0.79 | | | |
| 6 | 0.71 | 0.77 | | | | | |
Note: Bold marks values that meet the acceptable threshold of 0.80 [1,3]
Figure 1. Course-grade reliability as a function of number of testing occasions and items nested in occasion
Note: The line at 0.80 marks the threshold for acceptable reliability (for high-stakes decision-making) [1,3]
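The decision-study pattern above can be sketched with the textbook relative G-coefficient for persons crossed with items nested in occasions, the design the Figure 1 caption suggests. The abstract does not report variance components, so this sketch works with two ratios (each error variance divided by person variance) back-solved from cells of the table. The function names, the assumed design, and the back-solving step are illustrative assumptions, not the authors' G_String IV analysis.

```python
import numpy as np

def g_coefficient(n_occasions, n_items, ratio_po, ratio_pio):
    """Relative G-coefficient for an assumed p x (i:o) design.

    ratio_po  : person-by-occasion variance / person variance
    ratio_pio : person-by-item-within-occasion (residual) variance / person variance
    n_items   : items per occasion
    """
    relative_error = ratio_po / n_occasions + ratio_pio / (n_occasions * n_items)
    return 1.0 / (1.0 + relative_error)

def fit_ratios(cells):
    """Least-squares fit of the two ratios to (n_occasions, n_items, G) cells.

    Uses 1/G - 1 = ratio_po / n_o + ratio_pio / (n_o * n_i), which is linear
    in the two unknown ratios.
    """
    design = np.array([[1.0 / n_o, 1.0 / (n_o * n_i)] for n_o, n_i, _ in cells])
    target = np.array([1.0 / g - 1.0 for _, _, g in cells])
    (ratio_po, ratio_pio), *_ = np.linalg.lstsq(design, target, rcond=None)
    return ratio_po, ratio_pio

# Two cells from the one-occasion row of the table (values rounded in the source).
ratio_po, ratio_pio = fit_ratios([(1, 20, 0.29), (1, 80, 0.51)])

# Compare the sketch against other reported cells, e.g. 3 occasions x 50 items.
estimate = g_coefficient(3, 50, ratio_po, ratio_pio)
```

Because the table values are rounded to two decimals, estimates from this sketch can differ from the reported cells in the second decimal place.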