Achim Mortsiefer (1), André Karger (2), Thomas Rotthoff (3), Bianca Raski (4), Michael Pentzek (5)
1. Heinrich-Heine-University Düsseldorf, Medical Faculty, Institute of General Practice, Werdener Str. 7, 40227 Düsseldorf, Germany.
2. Heinrich-Heine-University Düsseldorf, Medical Faculty, Clinical Institute of Psychosomatic Medicine and Psychotherapy, Moorenstr. 5, 40225 Düsseldorf, Germany. Electronic address: Andre.Karger@med.uni-duesseldorf.de.
3. Heinrich-Heine-University Düsseldorf, Medical Faculty, Deanery of Study and Department for Endocrinology, Diabetes, and Rheumatology, Moorenstr. 5, 40225 Düsseldorf, Germany. Electronic address: Rotthoff@med.uni-duesseldorf.de.
4. Heinrich-Heine-University Düsseldorf, Medical Faculty, Deanery of Study and Clinical Institute of Psychosomatic Medicine and Psychotherapy, Moorenstr. 5, 40225 Düsseldorf, Germany. Electronic address: Bianca.Raski@med.uni-duesseldorf.de.
5. Heinrich-Heine-University Düsseldorf, Medical Faculty, Institute of General Practice, Werdener Str. 7, 40227 Düsseldorf, Germany. Electronic address: Pentzek@med.uni-duesseldorf.de.
Abstract
OBJECTIVE: To identify inter-individual examiner factors associated with interrater reliability in a summative communication OSCE in the fourth year of study.
METHODS: The OSCE consists of 4 stations assessed with a 4-item, 5-point global rating instrument. A bivariate secondary analysis of interrater reliability in relation to 4 examiner factors (gender, profession, OSCE experience, examiner training) was conducted. Intraclass correlation coefficients (ICC) were calculated and compared between examiner dyads whose members were similar or dissimilar on each factor.
RESULTS: 169 pairwise ratings from 19 different examiners in 16 dyads were analysed. Interrater reliability was significantly higher in examiner dyads of the same vs. different gender (ICC=0.76 (95% CI 0.65-0.83) vs. ICC=0.41 (95% CI 0.21-0.57)), in dyads of two clinicians vs. non-clinical/mixed professions (ICC=0.72 (95% CI 0.56-0.83) vs. ICC=0.57 (95% CI 0.41-0.69)), and in dyads with high vs. low/mixed OSCE experience (ICC=0.73 (95% CI 0.50-0.87) vs. ICC=0.56 (95% CI 0.41-0.69)). Participation in recent examiner training had no influence on ICCs.
CONCLUSION: Better concordance of ratings between clinically active examiners may point to the context specificity of good communication. Higher interrater reliability between examiners of the same gender may indicate gender-specific concepts of communication.
PRACTICE IMPLICATIONS: Medical faculties introducing summative assessment of communication competence should pay attention to the influence of examiner characteristics on interrater reliability.
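For readers unfamiliar with the statistic, the following is a minimal sketch of how an intraclass correlation coefficient can be computed for paired examiner ratings. The abstract does not state which ICC form was used; the sketch assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc_2_1` and the example scores are hypothetical and are not taken from the study.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per examinee, one score per examiner.
    Assumption: this ICC form is chosen for illustration only; the
    abstract does not say which form the authors used.
    """
    n = len(ratings)        # number of examinees
    k = len(ratings[0])     # number of examiners (2 for a dyad)
    grand = sum(sum(row) for row in ratings) / (n * k)

    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between examinees
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between examiners
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    # Mean squares
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical sum scores (4 items x 5 points, so 4 to 20) from one dyad
dyad_scores = [[16, 15], [12, 14], [18, 18], [9, 11], [14, 13]]
print(round(icc_2_1(dyad_scores), 2))
```

Comparing ICCs between groups of dyads, as the study does, would mean computing such a coefficient separately for each group (for example same-gender vs. mixed-gender dyads) together with a confidence interval, which this sketch does not cover.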