Henriette Pisani Sundhagen, Stian Kreken Almeland, Emma Hansson.
Abstract
BACKGROUND: In recent years, emphasis has been placed on the requirement that medical students demonstrate pre-practice/pre-registration core procedural skills to ensure patient safety. Nonetheless, the formal teaching and training of basic suturing skills for medical students have received relatively little attention, and there is no standard for what should be tested and how. The aim of this study was to develop and validate, using scientific methods, a tool for the assessment of medical students' suturing skills, measuring both micro- and macrosurgical qualities.
Keywords: Assessment tool; Microsurgery; Plastic surgery; Surgical education; Suturing skills; Technical skills assessment; Undergraduate training
Year: 2017 PMID: 29606802 PMCID: PMC5871634 DOI: 10.1007/s00238-017-1378-8
Source DB: PubMed Journal: Eur J Plast Surg ISSN: 0930-343X
Variables investigated and statistical tests used
| Concept and definition | Methodology | Statistical test |
|---|---|---|
| Content validity | | |
| Extent to which a test measures the intended content | Review of the literature on previous assessment tools | |
| Construct validity | | |
| Extent to which a test is able to differentiate between a good and a bad performer | Difference in scores (1) between subjects pre- and post-course and (2) between post-course subjects and expert controls | (1) Paired t test; (2) two-sample t test |
| Concurrent validity | | |
| Extent to which the results of a test correlate with gold-standard tests known to measure the same domain | Correlation of subjects’ in-house tool scores with their OSATS and UWOMSA scores | Spearman non-parametric correlation (ρ) |
| Inter-rater reliability | | |
| Extent of agreement between two or more assessors | Correlation of the scores given by the three different assessors | Intraclass correlation coefficient (ICC) estimated from variance components |
| Inter-item reliability | | |
| Extent to which different components of a test correlate | Correlation of in-house scores with the global “able to suture” assessment | Logistic regression with AUC. The regression estimates the likelihood of an “able to suture” judgment by an assessor, and the AUC estimates the probability that a subject categorized as “able to suture” has the higher score when randomly compared to a subject “not able to suture” |
| Inter-test reliability | | |
| Ability of a test to generate similar results when applied at two different time points | Comparison of scores given by the same assessor at two different time points | Repeatability coefficient (CR) and intraclass correlation coefficient (ICC). The CR is computed from the mean variance of all subjects as scored by all three assessors; 95% of repeated scores of the same subject can be expected to differ by less than the calculated value (Vaz, Falkmer et al. 2013) |
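Two of the reliability statistics in the table above lend themselves to a short illustration. The sketch below is not from the paper: the scores are invented, and the function names (`repeatability_coefficient`, `auc_by_pairwise_comparison`) are illustrative. It computes a Bland-Altman-style CR (1.96 × √2 × within-subject SD, assuming one test-retest pair per subject) and demonstrates the AUC interpretation quoted in the table: the probability that a randomly chosen “able to suture” subject outscores a randomly chosen “not able” subject.

```python
import math

def repeatability_coefficient(first, second):
    """Bland-Altman-style CR from one test-retest score pair per subject.

    The within-subject variance is estimated as mean(d^2)/2 over the paired
    differences d; CR = 1.96 * sqrt(2 * within-subject variance), so 95% of
    repeated scores of the same subject are expected to differ by less than CR.
    """
    diffs = [a - b for a, b in zip(first, second)]
    within_var = sum(d * d for d in diffs) / (2 * len(diffs))
    return 1.96 * math.sqrt(2 * within_var)

def auc_by_pairwise_comparison(scores_pos, scores_neg):
    """AUC as P(a random 'able to suture' subject outscores a random
    'not able' subject); ties count as half a win."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical in-house scores for six subjects at two time points
t1 = [2.0, 5.1, 6.3, 8.0, 9.3, 10.8]
t2 = [2.4, 4.8, 6.9, 7.5, 9.9, 10.6]
print(f"CR  = {repeatability_coefficient(t1, t2):.2f}")  # small CR -> good repeatability

# Hypothetical scores split by the global "able to suture" judgment
able, not_able = [6.3, 8.0, 9.3, 10.8], [1.4, 2.0, 3.2]
print(f"AUC = {auc_by_pairwise_comparison(able, not_able):.2f}")
```

With the invented scores above, every “able” subject outscores every “not able” subject, so the AUC is 1.0; any overlap between the two groups would pull it toward 0.5.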
Fig. 1 Distribution of in-house scores given by all assessors in the different groups
Fig. 2 Distribution of subjects' times to complete the task. The 67% cutoff is marked as a dashed line at 378 s
Summary statistics on performance by study group
| Study group | Time to complete task (s) |||| Number of errors registered* |||| In-house score* ||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Range | Median | Mean | SD | Range | Median | Mean | SD | Range | Median | Mean | SD |
| Pre-course subjects | 279–559 | 372 | 387 | 112 | 5–6 | 6 | 6 | 0.3 | −1.2 to 3.2 | 2.0 | 1.4 | 1.7 |
| Post-course subjects | 172–607 | 324 | 351 | 116 | 0–5 | 2 | 2 | 1.5 | 0.8–9.3 | 6.3 | 5.3 | 2.7 |
| Expert controls | 97–177 | 123 | 126 | 32 | 0–2 | 1 | 1 | 0.6 | 8.6–11.1 | 10.8 | 10.3 | 1.1 |
SD standard deviation
*For simplicity, presented as the average of the scores from all three assessors
Fig. 3 Differences in in-house scores between the different groups. Individual scores from all three assessors are plotted
Summary of statistical tests of validity and reliability
| Statistical test and subgroups of analysis | Output | P | SD | 95% CI |
|---|---|---|---|---|
| Construct validity | md | | | |
| Paired t test | 4.9 | 0.03 | 3.3 | 0.8–9.0 |
| Two-sample t test | 5.0 | < 0.01 | 4.0 | 3.9–6.0 |
| Concurrent validity | ρ | | | |
| Spearman non-parametric correlation | | | | |
| OSATS correlation with in-house score | | | | |
| Assessor 1 | 0.89 | < 0.01 | | |
| Assessor 2 | 0.88 | < 0.01 | | |
| Assessor 3 | 0.86 | < 0.01 | | |
| Combined average | 0.90 | < 0.01 | | |
| UWOMSA correlation with in-house score | | | | |
| Assessor 1 | 0.91 | < 0.01 | | |
| Assessor 2 | 0.87 | < 0.01 | | |
| Assessor 3 | 0.86 | < 0.01 | | |
| Combined average | 0.91 | < 0.01 | | |
| Inter-rater reliability | ICC | | | |
| Intraclass correlation (ICC) coefficients | | | | |
| Pre-course subjects | 0.83 | < 0.01 | | 0.43–0.98 |
| Post-course subjects | 0.80 | < 0.01 | | 0.60–0.92 |
| Expert controls | 0.65 | < 0.01 | | 0.15–0.95 |
| All groups combined | 0.92 | < 0.01 | | 0.84–0.96 |
| Inter-item reliability | OR (AUC) | | | |
| Logistic regression with AUC | | | | |
| Assessor 1 | 2.68 (0.94) | 0.01 | | 1.23–5.84 |
| Assessor 2 | 1.71 (0.91) | 0.01 | | 1.15–2.55 |
| Assessor 3 | 2.96 (0.97) | 0.04 | | 1.07–8.22 |
| Inter-test reliability | CR (SEM) | | | |
| Repeatability coefficient (CR) | 2.7 (0.98) | | | |
| | ICC | | | |
| Intraclass correlation (ICC) coefficients | 0.93 | < 0.01 | | 0.79–0.99 |
SD standard deviation, CI confidence interval, OR odds ratio, md mean difference, AUC area under the curve, ρ Spearman's correlation coefficient, P p value, SEM standard error of measurement (intra-observer standard deviation)
Fig. 4 Matched improvement in pre- and post-course performance as measured by the in-house scoring tool
Fig. 5 Correlation of subjects’ in-house scores with their OSATS and UWOMSA scores. O = pre- and post-course subjects, X = expert controls
Fig. 6 In-house scores given by the three assessors plotted against the average score
Fig. 7 Repeatability of the in-house score. Single-assessor scores are plotted against the average score from the three assessors. Arrows represent scores of the same subject by the same assessor at two different time points: the tail of the arrow is the score at the first assessment and the arrowhead the score at the second. Large differences between the first and second scores by the same assessor appear as elongated arrows; only an arrowhead is shown when the two scores are equal. The average score is indicated as a numeric value on the graph