Sandra Monteiro, Debra Sibbald, Karen Coetzee.
Abstract
INTRODUCTION: Tablet-based assessments offer benefits over scannable-paper assessments; however, little is known about their impact on the variability of assessment scores.
Keywords: Generalizability; OSCE; Psychometrics; Tablet based assessment
Year: 2018 PMID: 29488098 PMCID: PMC5889381 DOI: 10.1007/s40037-018-0410-4
Source DB: PubMed Journal: Perspect Med Educ ISSN: 2212-2761
Fig. 1 Image of the scannable sheet typically used by Touchstone Institute. Performance rating anchors are described at the top of the page, and raters are instructed to select the corresponding letter for each competency assessed. In this test sheet, 11 competencies are shown. In this study, scores for 10 competencies are analyzed, as Physical Examination is not evaluated in all stations.
Fig. 2 Screenshot of the current tablet interface for raters. This image shows a hypothetical candidate receiving scores such as ‘Borderline’ or ‘Not Acceptable’ for 11 competencies. As with the scan sheet data, only scores for 10 competencies were analyzed in the current study.
Summary of the study designs, analyses and main results. Overall, there were 3 facets: station, modality and candidate. Candidate was always the facet of differentiation
| | Study 1 | Study 2a | Study 2b | Control |
|---|---|---|---|---|
| | 154 | 46 | 44 | 116 |
| Mean total score on a 5-point scale (SD) | 3.19 (0.52) | 3.40 (0.46) | 3.48 (0.42) | 3.19 (0.47) |
| Change in average scores | 0.4 | 0.1 | – | |
| Reliability—Cronbach’s alpha | 0.88 | 0.82 | 0.84 | 0.86 |
| Reliability—McDonald’s omega | 0.88 | 0.82 | 0.84 | 0.87 |
| Internal consistency of 12 stations—McDonald’s omega | 0.91 to 0.96 | 0.89 to 0.98 | 0.88 to 0.97 | 0.90 to 0.96 |
| G-Study design | 2 facets crossed, candidate nested | 2 facets crossed, station nested | 2 facets crossed, station nested | 2 facets crossed |
| Facets | | | | |
| – Station (Random) | 0.03 | 0.04 | 0.05 | 0.05 |
| – Station nested in Modality | – | 0.22 | 0.03 | – |
| – Candidate (Differentiation) | – | 0.17 | 0.14 | 0.20 |
| – Candidate nested in Modality | 0.20 | – | – | – |
| – Modality (Random) | 0.07 | 0.02 | 0.0001 | – |
| – Competency | 0.02 | 0.01 | 0.01 | 0.02 |
| – Residual | 0.35 | 0.36 | 0.32 | 0.34 |
| Absolute g‑coefficient for 10 competency scores as repeated measures across 12 stations | 0.65 | 0.73 | 0.75 | 0.83 |
| Absolute g‑coefficient using 12 station means | 0.66 | 0.74 | 0.81 | 0.85 |
| Inter-station generalizability | 0.3 | 0.3 | 0.3 | |
| Inter-modality generalizability | – | 0.8 | – | |
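
The table reports Cronbach's alpha as one of its reliability indices. As a minimal sketch of how that index is computed in general, the function below applies the standard alpha formula to a candidates-by-stations score matrix. The function name `cronbach_alpha`, the matrix layout, and the synthetic scores are illustrative assumptions; they are not the study's data or the authors' analysis code, and the result is not meant to reproduce any value in the table.

```python
import numpy as np


def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (candidates x stations) score matrix.

    Hypothetical helper for illustration: rows are candidates,
    columns are stations (the "items").
    """
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                         # number of stations
    item_vars = scores.var(axis=0, ddof=1)      # variance of each station
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)


# Synthetic example (assumed values, not study data):
# 150 candidates, 12 stations, scores centred near 3.2 on a 5-point scale.
rng = np.random.default_rng(0)
ability = rng.normal(3.2, 0.5, size=(150, 1))  # candidate effect
noise = rng.normal(0.0, 0.6, size=(150, 12))   # station-level noise
scores = ability + noise

alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
```

Alpha rises as the candidate effect grows relative to the noise, which is the same signal-to-noise idea that the variance components in the table express facet by facet.
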
Fig. 3 Twelve Rasch-calculated station difficulties compared for Modality 1 and Modality 2. This scatterplot indicates that the 12 stations were similar in difficulty for both modalities 1 and 2 (i.e. tablet and scan sheet).