Danielle Blouin. Department of Emergency Medicine, Queen's University, Kingston, Ontario, Canada. blouind@kgh.kari.net
Abstract
BACKGROUND: Interviews are most important in resident selection. Structured interviews are more reliable than unstructured ones. PURPOSE: We sought to measure the interrater reliability of a newly designed structured interview used in the selection process for an Emergency Medicine residency program. METHODS: The critical incident technique was used to extract the desired dimensions of performance. The interview tool consisted of 7 clinical scenarios and 1 global rating. Three trained interviewers rated each candidate on all scenarios without discussing the candidates' responses. Interitem consistency and estimates of variance were computed. RESULTS: Twenty-eight candidates were interviewed. The generalizability coefficient was 0.67. Removing the central tendency ratings increased the coefficient to 0.74. Coefficients of interitem consistency ranged from 0.64 to 0.74. CONCLUSIONS: The structured interview tool provided good, although suboptimal, interrater reliability. Increasing the number of scenarios improves reliability, as does applying differential weights to the rating scale anchors. The latter would also facilitate the identification of candidates with extreme ratings.
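The claim that more scenarios improve reliability can be illustrated with the Spearman-Brown prophecy formula. The sketch below is not the analysis reported in the abstract: it assumes the coefficient of 0.67 behaves like a classical reliability coefficient and that added scenarios would be comparable to the existing 7. The function names and the target of 0.80 are hypothetical choices made for illustration.

```python
import math


def spearman_brown(reliability: float, length_factor: float) -> float:
    """Projected reliability when the instrument is lengthened by length_factor."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


def length_factor_for(reliability: float, target: float) -> float:
    """Lengthening factor needed to move from reliability to target."""
    return target * (1 - reliability) / (reliability * (1 - target))


# Values taken from the abstract.
observed = 0.67
n_scenarios = 7

# Illustrative assumptions (not from the abstract): doubling the tool,
# and a target coefficient of 0.80.
projected_if_doubled = spearman_brown(observed, 2)
factor = length_factor_for(observed, 0.80)
scenarios_needed = math.ceil(factor * n_scenarios)

print(f"Projected coefficient with {2 * n_scenarios} scenarios: {projected_if_doubled:.2f}")
print(f"Scenarios needed for 0.80: {scenarios_needed}")
```

A full generalizability (D-study) analysis would instead use the estimated variance components for candidates, raters, and scenarios, which the abstract does not report.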