Nancy Sturman, Remo Ostini, Wai Yee Wong, Jianzhen Zhang, Michael David.
Abstract
BACKGROUND: Robust and defensible clinical assessments attempt to minimise differences in student grades which are due to differences in examiner severity (stringency and leniency). Unfortunately, there is little evidence to date that examiner training and feedback interventions are effective; "physician raters" have indeed been deemed "impervious to feedback". Our aim was to investigate the effectiveness of a general practitioner examiner feedback intervention, and explore examiner attitudes to this.
Year: 2017 PMID: 28587597 PMCID: PMC5461633 DOI: 10.1186/s12909-017-0929-9
Source DB: PubMed Journal: BMC Med Educ ISSN: 1472-6920 Impact factor: 2.463
Feedback to examiners on Case Ratings^a
| Item | Information provided |
|---|---|
| 1 | Overall mean rating, standard deviation (SD), and range of each examiner’s Diagnostic^b Case ratings |
| 2 | Mean differences between each examiner’s mean rating and the mean ratings of every other examiner, highlighting significant differences |
| 3 | Overall mean rating and SD for each Diagnostic^b case |
| 4 | Mean ratings, SDs, and range of ratings for each examiner on each Diagnostic^b case examined |
^a Information for the 2012 and 2013 examinations was presented separately
^b The equivalent information was also presented for Management Cases
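The first two feedback items can be sketched in a few lines of Python. This is only an illustration: the `ratings` data are hypothetical (not study data), the function names are invented for this example, and the significance testing mentioned in item 2 is not reproduced here.

```python
from statistics import mean, stdev

# Hypothetical ratings for illustration only (not study data):
# examiner label -> list of case ratings given by that examiner
ratings = {
    "A": [5.0, 5.5, 6.0, 5.5],
    "B": [4.0, 4.5, 4.0, 3.5],
    "C": [5.0, 4.5, 5.5, 5.0],
}

def examiner_summary(scores):
    """Mean, sample SD, and (min, max) range of one examiner's ratings (item 1)."""
    return {
        "mean": mean(scores),
        "sd": stdev(scores),
        "range": (min(scores), max(scores)),
    }

def pairwise_mean_differences(all_ratings):
    """Difference between each examiner's mean and every other examiner's mean (item 2)."""
    means = {name: mean(scores) for name, scores in all_ratings.items()}
    return {
        (a, b): means[a] - means[b]
        for a in means
        for b in means
        if a != b
    }

summaries = {name: examiner_summary(scores) for name, scores in ratings.items()}
differences = pairwise_mean_differences(ratings)
```

A positive entry such as `differences[("A", "B")]` means examiner A rated higher on average (more leniently) than examiner B in this toy data.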
Fig. 1 Data collection timelines (a) and analysis schema (b)
Demographics and Experience of General Practitioner Examiner Participants (N = 15)
| Characteristic | Value |
|---|---|
| Gender | Female: 9; Male: 6 |
| General practice experience (range in years) | 1.5–35 |
| Medical student teaching experience (range in years) | 0.5–33 |
| Number of general practice clinical case examinations previously examined by examiner (range) | 2–30 |
| Previous participation in informal discussions about assessment with other examiners | Yes: 15; No: 0 |
| Previous participation in an examiner training session | Yes: 13; No: 2 |
| Previous experience of co-marking with another examiner | Yes: 12; No: 3 |
Examiner overall ratings
| | Pre-Intervention | Post-Intervention |
|---|---|---|
| Diagnostic Case Mean | 5.035 | 5.129 |
| Diagnostic Case Standard Deviation | 0.941 | 0.881 |
| Diagnostic Case Range | 3.00–7.00 | 2.5–7.00 |
| Number of outlier examiners in Diagnostic Case Examinations^a | 8 (out of 11 examiners) | 7 (out of 13 examiners) |
| Management Case Mean | 4.874 | 4.990 |
| Management Case Standard Deviation | 1.104 | 0.954 |
| Management Case Range | 2.75–6.75 | 3.75–7.00 |
| Number of outlier examiners in Management Case Examinations^a | 5 (out of 9 examiners) | 4 (out of 13 examiners) |
^a Examiners whose mean rating was more than 3 standard errors from the overall mean
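The outlier rule in the footnote can be illustrated with a short sketch. The source does not say which standard error is used, so this example assumes the standard error of each examiner's own mean (sample SD divided by the square root of the number of ratings). The data and the function name `flag_outlier_examiners` are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def flag_outlier_examiners(all_ratings, threshold=3.0):
    """Return examiners whose mean is more than `threshold` standard errors
    from the overall mean.

    Assumption (not stated in the source): the standard error is that of the
    examiner's own mean, i.e. sample SD / sqrt(n). Each examiner needs at
    least two ratings for the sample SD to be defined.
    """
    pooled = [r for scores in all_ratings.values() for r in scores]
    overall_mean = mean(pooled)
    flagged = []
    for name, scores in all_ratings.items():
        standard_error = stdev(scores) / sqrt(len(scores))
        if abs(mean(scores) - overall_mean) > threshold * standard_error:
            flagged.append(name)
    return flagged

# Hypothetical ratings for illustration only (not study data)
example_ratings = {
    "A": [6.5, 6.0, 6.5, 7.0, 6.5, 6.0],  # consistently high (lenient)
    "B": [5.0, 4.0, 6.0, 5.5, 4.5, 5.0],  # close to the overall mean
    "C": [3.5, 3.0, 3.5, 4.0, 3.5, 3.0],  # consistently low (stringent)
}
outliers = flag_outlier_examiners(example_ratings)
```

With few ratings per examiner the standard error is itself noisy, so a rule of this kind is probably best read as a screening heuristic rather than a formal test.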
Changes in examiner ratings
(a) Analysis of Effectiveness of Intervention

Outlier Analysis
| | Diagnostic Case Type | Management Case Type |
|---|---|---|
| Pre-intervention | 8 (out of 11 examiners) | 5 (out of 9 examiners) |
| Post-intervention | 7 (out of 13 examiners) | 4 (out of 13 examiners) |

Bootstrapping Analysis
| | Diagnostic Case Type | Management Case Type |
|---|---|---|
| Change in lenient examiner ratings | −0.18 (95% CI −0.52 – +0.17) | −0.18 (95% CI −0.79 – +0.34) |
| Change in stringent examiner ratings | +0.37 (95% CI +0.14 – +0.28) | +0.17 (95% CI −0.35 – +0.64) |
| Difference in change between lenient and stringent examiners | +0.55 (95% CI +0.05 – +0.68) | +0.35 (95% CI −0.37 – +1.07) |

Multivariable Linear Regression
| | Diagnostic Case Type | Management Case Type |
|---|---|---|
| Full model | F (4,3) = 4.96; | F (5,1) = 0.52; |
| Intervention effect | t = 0.54; | t = −0.16; |

(b) Analysis of Historical Stability of Examiner Severity

Intra-class Correlation Analysis
| | Diagnostic Case Type | Management Case Type |
|---|---|---|
| Intra-examiner correlation between average ratings 2012 and pre-intervention 2013 | 0.420 | 0.179 |
| Intra-examiner correlation between average ratings pre- and post-intervention 2013 | 0.665 | 0.578 |
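As a rough illustration of how a bootstrap confidence interval for a change in mean rating might be obtained, the sketch below uses a simple percentile bootstrap. The study's actual resampling scheme is not described in this record, so the independent resampling of pre- and post-intervention ratings, the function name, and the data are all assumptions made for the example.

```python
import random
from statistics import mean

def bootstrap_mean_change_ci(pre, post, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(post) - mean(pre).

    Assumption: pre and post ratings are resampled independently with
    replacement (an unpaired design); the study may have used another scheme.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    changes = []
    for _ in range(n_resamples):
        pre_sample = rng.choices(pre, k=len(pre))
        post_sample = rng.choices(post, k=len(post))
        changes.append(mean(post_sample) - mean(pre_sample))
    changes.sort()
    lower = changes[int((alpha / 2) * n_resamples)]
    upper = changes[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(post) - mean(pre), (lower, upper)

# Hypothetical ratings for one group of examiners (not study data)
pre_ratings = [4.0, 4.5, 4.0, 5.0, 4.5]
post_ratings = [4.5, 5.0, 4.5, 5.0, 5.0]
estimate, (ci_low, ci_high) = bootstrap_mean_change_ci(pre_ratings, post_ratings)
```

An interval that spans zero, as several in the table do, would be read as no clear evidence of a change in that group's ratings.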
Examiner attitudes to examiner feedback (N = 14)
| The comparative examiner marking feedback was useful to me in informing me about my leniency or stringency as an examiner | 4.3 (3–5) |
| The comparative examiner marking feedback was easy to understand | 3.6 (1–5) |
| I am interested in receiving comparative examiner marking feedback in future | 4.4 (3–5) |
| Comparative marking feedback is effective in improving the reliability of our examinations | 4.0 (3–5) |