Morten Jørgensen, Lars Konge, Yousif Subhi.
Abstract
BACKGROUND: The contrasting groups' standard setting method is commonly used for consequences analysis in validity studies for performance in medicine and surgery. The method identifies a pass/fail cut-off score, from which it is possible to determine false positives and false negatives based on observed numbers in each group. Since groups in validity studies are often small, e.g., due to a limited number of experts, these analyses are sensitive to outliers on the normal distribution curve.
Keywords: Contrasting groups; False negatives; False positives; Medical education; Messick’s validity framework; Standard setting
Year: 2018 PMID: 29556423 PMCID: PMC5845294 DOI: 10.1186/s41077-018-0064-7
Source DB: PubMed Journal: Adv Simul (Lond) ISSN: 2059-0628
Fig. 1 Illustration of the contrasting groups’ method. Blue represents the novice group. Orange represents the experienced group. The black vertical line goes through the identified intercept of the curves, representing the pass/fail cut-off score.
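The intercept described in Fig. 1 can be located numerically. The following is a minimal sketch, assuming both groups' scores are modelled as normal distributions defined by their mean and SD; the function names `normal_intercepts` and `cutoff_between_means` are illustrative and do not come from the paper. Setting the two log-densities equal gives a quadratic in the score, and the root lying between the two group means is taken as the pass/fail cut-off.

```python
import math
from statistics import NormalDist


def normal_intercepts(mu1, sd1, mu2, sd2):
    """Return the score(s) at which two normal density curves cross."""
    if math.isclose(sd1, sd2):
        # Equal SDs: the curves cross once, midway between the means.
        return [(mu1 + mu2) / 2]
    # Equating the log-densities yields a*x^2 + b*x + c = 0.
    a = 1 / (2 * sd1**2) - 1 / (2 * sd2**2)
    b = mu2 / sd2**2 - mu1 / sd1**2
    c = mu1**2 / (2 * sd1**2) - mu2**2 / (2 * sd2**2) + math.log(sd1 / sd2)
    root = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


def cutoff_between_means(mu1, sd1, mu2, sd2):
    """Pick the intercept that lies between the two group means."""
    lo, hi = sorted([mu1, mu2])
    candidates = [x for x in normal_intercepts(mu1, sd1, mu2, sd2) if lo <= x <= hi]
    return candidates[0] if candidates else None


# Novices 93.1 (73.4) and experts 459.7 (147.5), as in the Fig. 3 example.
cutoff = cutoff_between_means(93.1, 73.4, 459.7, 147.5)
print(round(cutoff, 1))
```

With unequal SDs the curves always cross twice, so selecting the root between the means is a design choice of this sketch, not something the source states.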
Fig. 2 Illustration of the theoretical false positives and theoretical false negatives using the normal distribution curves of group-specific scores. Top, the black area represents the cumulative probability corresponding to the false positives. This is defined as the area under the blue curve that lies to the right of the intercept point of the curves, i.e., the pass/fail cut-off score. Bottom, the black area represents the cumulative probability corresponding to the false negatives. This is defined as the area under the orange curve that lies to the left of the intercept point of the curves, i.e., the pass/fail cut-off score.
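The two shaded areas in Fig. 2 are tail probabilities of the group-specific normal distributions and can be computed from the normal cumulative distribution function. The sketch below is one way to do this with Python's standard library; the function name `theoretical_rates` is hypothetical, and inferring the score direction from the group means (experts scoring higher means "higher is better") is an assumption of this example.

```python
from statistics import NormalDist


def theoretical_rates(novice_mean, novice_sd, expert_mean, expert_sd, cutoff):
    """Return (theoretical FP, theoretical FN) as proportions.

    FP: share of the novice distribution on the passing side of the cut-off.
    FN: share of the expert distribution on the failing side of the cut-off.
    """
    novices = NormalDist(novice_mean, novice_sd)
    experts = NormalDist(expert_mean, expert_sd)
    if expert_mean >= novice_mean:
        # Higher scores are better: passing means scoring above the cut-off.
        fp = 1 - novices.cdf(cutoff)
        fn = experts.cdf(cutoff)
    else:
        # Lower scores are better (e.g., time or errors): passing is below.
        fp = novices.cdf(cutoff)
        fn = 1 - experts.cdf(cutoff)
    return fp, fn


# Values from the Fig. 3 example: novices 93.1 (73.4), experts 459.7 (147.5),
# cut-off 235. The paper reports 2.7% and 6.4% for this case.
fp, fn = theoretical_rates(93.1, 73.4, 459.7, 147.5, 235)
print(f"theoretical FP = {fp:.1%}, theoretical FN = {fn:.1%}")
```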
Fig. 3 Example of contrasting groups’ method with data on theoretical false positives and theoretical false negatives. In this case, the authors of the study observed that one novice passed (observed false positive rate of 9.1%) and that no experts failed (observed false negative rate of 0.0%). The theoretical false positives and theoretical false negatives suggest that if the groups had been much larger, 2.7% of the novices would have passed the test and 6.4% of the experts would have failed the test.
Fig. 4 Example of contrasting groups’ method with data on theoretical false positives and theoretical false negatives. In this case, the authors of the study observed that one novice passed (observed false positive rate of 6.7%) and that one expert failed (observed false negative rate of 10.0%). The theoretical false positives and theoretical false negatives suggest that if the groups had been much larger, 0.0% of the novices would have passed the test and 0.0% of the experts would have failed the test.
All data extracted from studies examined in this paper. In addition, we have calculated theoretical false positives (FP) and theoretical false negatives (FN) and provided the absolute difference between observed and theoretical FP and FN.
| Ref. | Number of novices | Novices’ score, mean (SD) | Number of experts | Experts’ score, mean (SD) | Pass/fail cut-off score | Novices passed (observed FP), n (%) | Calculated theoretical FP, % | Absolute difference in FP | Experts failed (observed FN), n (%) | Calculated theoretical FN, % | Absolute difference in FN |
|---|---|---|---|---|---|---|---|---|---|---|---|
| [ | Data only available for one group | ||||||||||
| [ | 20 | 244 (88) | 20 | 446 (52) | 358 | 2 (10.0%) | 9.8% | 0.2% | 2 (10.0%) | 4.5% | *5.5%* |
| [ | 13 | 38.6 (27.3) | 13 | 0 (9.1) | 15.5 | 2 (15.4%) | 19.9% | 4.5% | 1 (7.7%) | 4.4% | 3.3% |
| [ | 10 | 1.5 (0.4) | 10 | 4.4 (0.4) | 3 | 0 (0.0%) | 0.0% | 0.0% | 0 (0.0%) | 0.0% | 0.0% |
| [ | 10 | 1.8 (0.2) | 10 | 3.9 (0.5) | 2.5 | 0 (0.0%) | 0.0% | 0.0% | 0 (0.0%) | 0.3% | 0.3% |
| [ | 14 | 0.27 (0.065) | 14 | 0.65 (0.117) | 0.42 | 0 (0.0%) | 1.1% | 1.1% | 0 (0.0%) | 2.5% | 2.5% |
| [ | No numbers on pass/fail | ||||||||||
| [ | No numbers on pass/fail | ||||||||||
| [ | Data only available as median and range | ||||||||||
| [ | 11 | 93.1 (73.4) | 10 | 459.7 (147.5) | 235 | 1 (9.1%) | 2.7% | *6.4%* | 0 (0.0%) | 6.4% | *6.4%* |
| [ | 11 | 41.4 (35.5) | 10 | 106.9 (102.5) | 93 | 1 (9.1%) | 7.2% | 1.9% | 7 (70.0%) | 44.6% | *25.4%* |
| [ | 26 | 333 (96) | 11 | 497 (52) | 422 | 5 (19.2%) | 17.7% | 1.5% | 1 (9.1%) | 7.5% | 1.6% |
| [ | 15 | 7.2 (1.1) | 10 | 27 (3.2) | 15.5 | 1 (6.7%) | 0.0% | *6.7%* | 1 (10.0%) | 0.0% | *10.0%* |
| [ | 15 | 0.32 (0.31) | 10 | 2.48 (1.09) | 0.79 | 1 (6.7%) | 6.5% | 0.2% | 0 (0.0%) | 6.1% | *6.1%* |
| [ | 10 | 30 (32) | 10 | 76 (10) | 58 | 0 (0.0%) | 19.1% | *19.1%* | 1 (10.0%) | 3.6% | *6.4%* |
| [ | 8 | 0.098 (0.074) | 6 | 0.240 (0.037) | 0.19 | 1 (12.5%) | 10.7% | 1.8% | 0 (0.0%) | 8.8% | *8.8%* |
| [ | Did not use contrasting groups | ||||||||||
| [ | 11 | 2.7127 (2.25645) | 10 | 0.7890 (0.39156) | 1.51 | 5 (45.5%) | 29.7% | *15.8%* | 0 (0.0%) | 3.3% | 3.3% |
| [ | No numbers on pass/fail | ||||||||||
| [ | No numbers on pass/fail | ||||||||||
Abbreviations: SD standard deviation, FP false positives, FN false negatives
Absolute differences of > 5% between the observed and theoretical FP or FN are marked in italics
a Transabdominal novices
b Transvaginal novices
c Data from case 1
d Data from case 2
e Data from the virtual reality model
f Data from the physical model
g Mean and standard deviation are estimated from median and interquartile range
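The absolute difference columns and the > 5% marking rule amount to simple arithmetic on the tabulated values. The sketch below reproduces that arithmetic for a single table cell; the function name `flag_difference` is hypothetical, and rounding to one decimal before subtracting is an assumption made to match the precision shown in the table.

```python
def flag_difference(n_observed, group_size, theoretical_pct, threshold=5.0):
    """Return (observed %, absolute difference in % points, exceeds threshold)."""
    observed_pct = round(100 * n_observed / group_size, 1)
    diff = round(abs(observed_pct - theoretical_pct), 1)
    return observed_pct, diff, diff > threshold


# One of 11 novices passed, with a theoretical FP of 2.7%.
print(flag_difference(1, 11, 2.7))
# Two of 20 novices passed, with a theoretical FP of 9.8%.
print(flag_difference(2, 20, 9.8))
```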
Fig. 5 Group size (x axis) in relation to the absolute difference between observed and theoretical false positives/negatives (y axis). Trend (black line) suggests relationship between group size and precision of the observed false positives/negatives.
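The relationship in Fig. 5 can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of the authors' analysis: scores are assumed to be exactly normally distributed, the novice parameters and cut-off are borrowed from the Fig. 3 example, and the function name `mean_abs_difference`, the number of trials, and the seed are arbitrary choices. For each group size, it repeatedly draws a group, computes the observed false positive rate, and averages its absolute distance from the theoretical rate.

```python
import random
from statistics import NormalDist, mean


def mean_abs_difference(dist, cutoff, group_size, trials=2000, seed=0):
    """Average |observed rate - theoretical rate| above the cut-off."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    theoretical = 1 - dist.cdf(cutoff)
    diffs = []
    for _ in range(trials):
        # Count simulated group members scoring above the cut-off.
        passed = sum(
            rng.gauss(dist.mean, dist.stdev) > cutoff for _ in range(group_size)
        )
        diffs.append(abs(passed / group_size - theoretical))
    return mean(diffs)


novices = NormalDist(93.1, 73.4)  # novice mean (SD) from the Fig. 3 example
for n in (5, 10, 20, 50, 200):
    diff = mean_abs_difference(novices, cutoff=235, group_size=n)
    print(f"group size {n:>3}: mean absolute difference {diff:.1%}")
```

Under these assumptions, the average gap between observed and theoretical rates shrinks as the group grows, which is consistent with the trend the figure describes.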