Aileen Faherty, Tim Counihan, Thomas Kropmans, Yvonne Finn.
Abstract
BACKGROUND: The reliability of clinical assessments is known to vary considerably, with inter-rater reliability a key contributor. Many of the mechanisms that contribute to inter-rater reliability, however, remain largely unexplained and unclear. While research in other fields suggests that the personality of raters can affect ratings, few studies have examined personality factors in clinical assessments. Many schools pair examiners in clinical assessments and ask them to come to an agreed score. Little is known, however, about what occurs when these paired examiners interact to generate a score. Could personality factors have an impact?
Keywords: Clinical assessments; Examiner factors; Examiner variability; Reliability
Year: 2020 PMID: 32393228 PMCID: PMC7212618 DOI: 10.1186/s12909-020-02009-4
Source DB: PubMed Journal: BMC Med Educ ISSN: 1472-6920 Impact factor: 2.463
Overall scores for the good, average and weak candidates, comparing the scores given by single examiners when examining alone with the agreed consensus score when in pairs. The middle column of each group shows what the average score would have been for each examiner pair.
| Examiner | Good Candidate Overall Score | | | Average Candidate Overall Score | | | Weak Candidate Overall Score | | |
|---|---|---|---|---|---|---|---|---|---|
| | Alone | Paired (Avg) | Paired (consensus) | Alone | Paired (Avg) | Paired (consensus) | Alone | Paired (Avg) | Paired (consensus) |
| 1 | 64 | 64 | 64 | 44 | 41 | 48 | 34 | 27 | 26 |
| 3 | 74 | 78 | 78 | 50 | 49 | 46 | 36 | 31 | 24 |
| 5 | 64 | 64 | 64 | 38 | 41 | 48 | 20 | 27 | 26 |
| 6 | 64 | 79 | 82 | 44 | 51 | 56 | 24 | 18 | 18 |
| 7 | 68 | 69 | 64 | 42 | 49 | 52 | 34 | 37 | 34 |
| 9 | 80 | 85 | 88 | 44 | 46 | 48 | 28 | 29 | 28 |
| 10 | 80 | 83 | 80 | 34 | 42 | 44 | 28 | 31 | 30 |
| 11 | 82 | 78 | 78 | 48 | 49 | 46 | 26 | 31 | 24 |
| 12 | 70 | 69 | 64 | 56 | 49 | 52 | 40 | 37 | 34 |
| 14 | 94 | 79 | 82 | 58 | 51 | 56 | 12 | 18 | 18 |
| 16 | 90 | 85 | 88 | 48 | 46 | 48 | 30 | 29 | 28 |
| 17 | 86 | 83 | 80 | 50 | 42 | 44 | 34 | 31 | 30 |
| Mean (SD) | 76.33 (10.54) | 76.33 (8.19) | 76 (9.87) | 46.33 (6.86) | 45 (6.41) | 49 (4.33) | 28.83 (7.69) | 28.83 (6.27) | 34 (5.46) |
| Range | 30 | 21 | 24 | 24 | 17 | 12 | 28 | 19 | 16 |

Avg: average
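
The summary rows of the table can be checked directly from the individual scores. The sketch below recomputes the figures for the "Alone" column of the good candidate. It assumes the value in parentheses is the sample standard deviation (n − 1 denominator) and that the last row is the range (maximum minus minimum); neither label is stated explicitly in the table, and the variable names are illustrative.

```python
from statistics import mean, stdev

# "Alone" scores for the good candidate, copied from the table above
# (examiners 1, 3, 5, 6, 7, 9, 10, 11, 12, 14, 16, 17)
good_alone = [64, 74, 64, 64, 68, 80, 80, 82, 70, 94, 90, 86]

avg = mean(good_alone)
sd = stdev(good_alone)  # assumption: sample SD, n - 1 denominator
score_range = max(good_alone) - min(good_alone)

print(f"{avg:.2f} ({sd:.2f}), range {score_range}")
# prints "76.33 (10.54), range 30"
```

The same three calls can be applied to any other column of the table.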
Fig. 1 Box and whisker plots showing the variability of overall scores for the weak performance using single and paired examiners
Fig. 2 Box and whisker plots showing the variability of overall scores for the average performance using single and paired examiners
Fig. 3 Box and whisker plots showing the variability of overall scores for the good performance using single and paired examiners
Analysis of variance of the main facets of the assessment for the 12 single examiners, computed with EduG. Negative variance was set to zero.
| Source | df | MS | Components: Random | Components: % | Components: SE |
|---|---|---|---|---|---|
| O | 11 | 0.13392 | 0.04254 | 87.1 | 0.01752 |
| S | 2 | 0.00630 | 0.00000 | 0.0 | 0.00040 |
| SO | 22 | 0.00630 | 0.00630 | 12.9 | 0.00182 |
| Total | 35 | | | 100 | |
df degrees of freedom, MS mean square, SE standard error, O Observations, S Scenarios, SO interaction of scenario and observation
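
The random variance components in the table can be recovered from the mean squares. The sketch below uses the standard expected-mean-square equations for a fully crossed two-facet design with one score per cell. It assumes the rows with 11, 2 and 22 degrees of freedom are O, S and SO respectively, so that there are 12 observations and 3 scenarios. It is not a description of how EduG computes these values internally.

```python
# Mean squares from the table above
ms_o, ms_s, ms_so = 0.13392, 0.00630, 0.00630

n_observations = 12  # assumption: df = 11 for O implies 12 observations
n_scenarios = 3      # assumption: df = 2 for S implies 3 scenarios

# Expected mean squares for a crossed O x S design, one score per cell:
#   E[MS_SO] = var_SO
#   E[MS_O]  = var_SO + n_scenarios * var_O
#   E[MS_S]  = var_SO + n_observations * var_S
var_so = ms_so
var_o = max((ms_o - ms_so) / n_scenarios, 0.0)     # negative estimates set to zero
var_s = max((ms_s - ms_so) / n_observations, 0.0)  # negative estimates set to zero

total = var_o + var_s + var_so
for name, var in [("O", var_o), ("S", var_s), ("SO", var_so)]:
    print(f"{name}: {var:.5f} ({100 * var / total:.1f}%)")
```

Under these assumptions the estimates agree with the Random and % columns to the precision shown.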
Reliability Statistics for the Assessments using both Single and Paired examiners
| Examiners | Cronbach’s Alpha | Measure | Intraclass Correlation | 95% CI Lower Bound | 95% CI Upper Bound | F Test with True Value 0: Value | df1 | df2 | Sig |
|---|---|---|---|---|---|---|---|---|---|
| Single | 0.99 | Single Measures | 0.887 | .648 | .997 | 98.97 | 2 | 22 | .000 |
| | | Average Measures | 0.990 | .957 | 1.00 | 98.97 | 2 | 22 | .000 |
| Paired | 0.983 | Single Measures | 0.925 | .700 | .998 | 60.533 | 2 | 10 | .000 |
| | | Average Measures | 0.987 | .933 | 1.00 | 60.533 | 2 | 10 | .000 |
df degrees of freedom
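
Cronbach's alpha and the F statistic in the table are closely related. If F is the ratio of the between-candidate mean square to the residual mean square (an assumption; the table does not define it), then alpha equals 1 − 1/F. The sketch below applies that identity to the two F values reported above. The function name is illustrative, and the intraclass correlation columns are not recomputed here because they may follow a different model definition.

```python
def alpha_from_f(f_value: float) -> float:
    """Cronbach's alpha from F = MS_between / MS_residual.

    alpha = (MS_between - MS_residual) / MS_between = 1 - 1 / F
    """
    return 1.0 - 1.0 / f_value


# F values copied from the table above (df2 = 22 and df2 = 10 rows)
for f_value in (98.97, 60.533):
    print(f"F = {f_value}: alpha = {alpha_from_f(f_value):.3f}")
```

Both results agree with the Cronbach's alpha column after rounding.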
Changes in examiners’ marks when they moved from examining alone to examining in a pair
| | Pair A | | Pair B | | Pair C | | Pair D | | Pair E | | Pair F | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Examiner | 1 | 5 | 3 | 11 | 7 | 12 | 6 | 14 | 9 | 16 | 10 | 17 |
| Good | 0 | 0 | 4 | −4 | −4 | −6 | 18 | −12 | 8 | −2 | 0 | −6 |
| Average | 4 | 10 | −4 | −2 | 10 | −4 | 12 | −2 | 4 | 0 | 10 | −6 |
| Weak | −8 | 6 | −12 | −2 | 0 | −6 | −6 | 6 | 0 | −2 | 2 | −4 |
| Sum of absolute changes | 12 | 16 | 20 | 8 | 14 | 16 | 36 | 20 | 12 | 4 | 12 | 16 |
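
Each change in this table appears to be the agreed paired score minus the score the examiner gave alone, and the final row appears to be the sum of the absolute changes across the three candidates. The sketch below checks both readings against the good-candidate scores in the overall scores table. The dictionary and list names are illustrative.

```python
# Examiner order as in the table above (pairs A to F)
examiners = [1, 5, 3, 11, 7, 12, 6, 14, 9, 16, 10, 17]

# Good candidate: score given alone and agreed paired score, per examiner,
# copied from the overall scores table
good_alone = {1: 64, 5: 64, 3: 74, 11: 82, 7: 68, 12: 70,
              6: 64, 14: 94, 9: 80, 16: 90, 10: 80, 17: 86}
good_agreed = {1: 64, 5: 64, 3: 78, 11: 78, 7: 64, 12: 64,
               6: 82, 14: 82, 9: 88, 16: 88, 10: 80, 17: 80}

# Assumed definition: change = agreed paired score - score given alone
good_change = [good_agreed[e] - good_alone[e] for e in examiners]

# Change rows as printed in the table above
good_row = [0, 0, 4, -4, -4, -6, 18, -12, 8, -2, 0, -6]
average_row = [4, 10, -4, -2, 10, -4, 12, -2, 4, 0, 10, -6]
weak_row = [-8, 6, -12, -2, 0, -6, -6, 6, 0, -2, 2, -4]

# Assumed definition of the final row: sum of absolute changes per examiner
total_abs = [abs(g) + abs(a) + abs(w)
             for g, a, w in zip(good_row, average_row, weak_row)]

print(good_change == good_row)
print(total_abs)
```

Examiner 6, for instance, moved from 64 alone to an agreed 82, a change of 18.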
Relationship between the amount of change in examiners’ scores and personality. Only ‘Extroversion’ contributed significantly to the variation in marks per examiner.
| Personality factor | Spearman’s correlation coefficient (rho) | p value |
|---|---|---|
| | 0.352 | 0.262 |
| Extroversion | −0.808 | 0.001 |
| | −0.185 | 0.565 |
| | −0.501 | 0.097 |
| | −0.451 | 0.141 |
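
For readers unfamiliar with Spearman's rho, the following is a minimal standard-library sketch of the calculation: rank both variables (tied values share the mean of their ranks) and take the Pearson correlation of the ranks. The function names are illustrative. The examiners' personality scores are not listed in this section, so only the ranking of the total changes is shown on data from the source.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Sum of absolute changes per examiner, from the preceding table
total_change = [12, 16, 20, 8, 14, 16, 36, 20, 12, 4, 12, 16]
print(average_ranks(total_change))
```

Calling `spearman_rho(total_change, scores)` with one personality score per examiner, in the same examiner order, would give the kind of coefficient reported in the table.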