Josepha Kuhn, Pieter van den Berg, Silvia Mamede, Laura Zwaan, Patrick Bindels, Tamara van Gog
Abstract
When physicians do not estimate their diagnostic accuracy correctly, i.e. show inaccurate diagnostic calibration, diagnostic errors or overtesting can occur. A previous study showed that physicians' diagnostic calibration for easy cases improved after they received feedback on their previous diagnoses. We investigated whether diagnostic calibration would also improve from this feedback when cases were more difficult. Sixty-nine general-practice residents were randomly assigned to one of two conditions. In the feedback condition, they diagnosed a case, rated their confidence in their diagnosis, their invested mental effort, and case complexity, and were then shown the correct diagnosis (feedback). This was repeated for 12 cases. Participants in the control condition did the same without receiving feedback. We analysed calibration in terms of (1) absolute accuracy (absolute difference between diagnostic accuracy and confidence), and (2) bias (confidence minus diagnostic accuracy). There was no difference between the conditions on either measure of calibration (absolute accuracy, p = .204; bias, p = .176). Post-hoc analyses showed that on correctly diagnosed cases (on which participants are either accurate or underconfident), calibration in the feedback condition was less accurate than in the control condition, p = .013. This study shows that feedback on diagnostic performance did not improve physicians' calibration for more difficult cases. One explanation could be that participants were confronted with their mistakes and thereafter lowered their confidence ratings even if cases were diagnosed correctly. This shows how difficult it is to improve diagnostic calibration, which is important to prevent diagnostic errors or maltreatment.
Keywords: Calibration; Diagnostic error; Feedback; General practice; Instructional design; Self-assessment
Year: 2021 PMID: 34739632 PMCID: PMC8938348 DOI: 10.1007/s10459-021-10080-9
Source DB: PubMed Journal: Adv Health Sci Educ Theory Pract ISSN: 1382-4996 Impact factor: 3.853
Overview of the chief symptoms and medical conditions that were described in the 12 cases
| Chief symptom | Correct diagnosis |
|---|---|
| Diarrhoea | Chronic pancreatitis |
| Shortness of breath | Heart failure |
| Palpitation | Panic disorder |
| Turn dizziness | Benign paroxysmal positional vertigo |
| Rash/eczema | Scarlet fever |
| Lower back pain | Spondylodiscitis |
| Amenorrhea | Pregnancy |
| Pain in legs | Spinal canal stenosis |
| Tremor in hand | Multiple sclerosis |
| Facial paralysis | Bell's palsy |
| Rash in the face | Rosacea |
| Vaginal discharge | Bacterial vaginosis |
Demographics and prior experience ratings
| | No-feedback condition | Feedback condition | Total |
|---|---|---|---|
| Sample size | 35 | 34 | 69 |
| Gender | 27 female | 27 female | 54 female |
| Age, mean (SD) | 29.23 (2.31) | 29.35 (2.73) | 29.29 (2.51) |
| Prior experience with diagnoses, mean (SD) | 2.38 (.52) | 2.43 (.61) | 2.41 (.57) |
| Prior experience with symptoms, mean (SD) | 3.21 (.55) | 3.24 (.64) | 3.22 (.59) |
Prior experience was rated on a 5-point Likert scale ranging from 1 (I have never seen a patient with this condition, symptom, or complaint) to 5 (I have seen many patients with this condition, symptom, or complaint).
Means and standard deviations for all outcome measures: diagnostic accuracy, confidence in the diagnosis, mental effort, case complexity, and the two measures of calibration (absolute accuracy and bias)
| | No-feedback condition (n = 35) | | Feedback condition (n = 34) | | Total (n = 69) | |
|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD |
| Diagnostic accuracy | 0.42 | 0.14 | 0.43 | 0.12 | 0.42 | 0.13 |
| Confidence rating | 5.82 | 0.80 | 5.43 | 0.79 | 5.63 | 0.82 |
| Mental effort rating | 5.02 | 1.04 | 5.12 | 0.90 | 5.07 | 0.97 |
| Complexity rating | 5.64 | 0.82 | 5.40 | 0.89 | 5.52 | 0.86 |
| Absolute accuracy | 0.42 | 0.12 | 0.46 | 0.09 | 0.44 | 0.11 |
| Bias | 0.22 | 0.21 | 0.15 | 0.20 | 0.18 | 0.21 |
Diagnostic accuracy was scored as either 0 (incorrect), 0.5 (partially correct) or 1 (correct). Confidence and complexity were rated on a 9-point Likert scale ranging from 1 (very, very low) to 9 (very, very high). Absolute accuracy ranges from 0 to 1. Bias ranges from −1 to +1.
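The two calibration measures can be illustrated with a minimal Python sketch. It follows the definitions in the abstract (absolute accuracy as the absolute difference between accuracy and confidence, bias as confidence minus accuracy). The record does not say how the 1 to 9 confidence ratings were mapped onto the 0 to 1 accuracy scale, so the linear rescaling in `rescale_confidence` is an assumption, as are the function names and the example data, and the per-case averaging may differ from the authors' procedure.

```python
from statistics import mean


def rescale_confidence(rating, low=1, high=9):
    """Map a 1-9 rating linearly onto 0-1.

    Assumption: the source does not state the rescaling that was used.
    """
    return (rating - low) / (high - low)


def calibration(accuracy_scores, confidence_ratings):
    """Return (absolute accuracy, bias) averaged over one participant's cases."""
    conf = [rescale_confidence(r) for r in confidence_ratings]
    diffs = [c - a for a, c in zip(accuracy_scores, conf)]
    absolute_accuracy = mean(abs(d) for d in diffs)  # 0 = perfectly calibrated
    bias = mean(diffs)  # > 0 overconfident, < 0 underconfident
    return absolute_accuracy, bias


# Hypothetical participant with four cases (not data from the study).
accuracy = [1, 0, 0.5, 0]   # scored 0, 0.5 or 1
confidence = [5, 9, 5, 3]   # rated 1-9
abs_acc, bias = calibration(accuracy, confidence)
print(f"absolute accuracy = {abs_acc}, bias = {bias}")
```

Because positive and negative differences cancel in the bias but not in the absolute accuracy, a participant can have a bias near zero and still be poorly calibrated on individual cases.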
Post hoc analysis of confidence and calibration, split up for the cases that were diagnosed correctly or incorrectly
| | No-feedback condition | | Feedback condition | | Total | |
|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD |
| Incorrectly diagnosed cases | | | | | | |
| Confidence rating | 5.30 | 1.07 | 5.11 | .86 | 5.20 | .97 |
| Bias | .54 | .19 | .52 | .16 | .53 | .18 |
| Correctly diagnosed cases | | | | | | |
| Confidence rating | 6.49 | .80 | 5.90 | 1.03 | 6.20 | .96 |
| Bias | −.25 | .15 | −.35 | .19 | −.30 | .17 |
The number of correct or incorrect cases on which the means are based differs for each participant, depending on their performance
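The opposite signs of the bias for incorrectly and correctly diagnosed cases follow directly from the definition of bias. As a short worked step, write $c_i$ for the confidence on case $i$ rescaled to $[0, 1]$ and $a_i$ for the scored accuracy; both symbols are introduced here for illustration, and the source does not say how partially correct cases ($a_i = 0.5$) were assigned in this split.

$$
\text{bias}_i = c_i - a_i, \qquad
a_i = 1 \;\Rightarrow\; \text{bias}_i = c_i - 1 \le 0, \qquad
a_i = 0 \;\Rightarrow\; \text{bias}_i = c_i \ge 0
$$

On correctly diagnosed cases a participant can therefore only be accurate or underconfident, and on incorrectly diagnosed cases only accurate or overconfident.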