D A McGill, C P M van der Vleuten, M J Clarke.
Abstract
BACKGROUND: Evaluations of clinical assessments that use judgement-based methods have frequently shown them to have sub-optimal reliability and internal validity evidence for their interpretation and intended use. The aim of this study was to enhance that validity evidence by an evaluation of the internal validity and reliability of competency constructs from supervisors' end-of-term summative assessments for prevocational medical trainees.
Year: 2015 PMID: 26715145 PMCID: PMC4696206 DOI: 10.1186/s12909-015-0520-1
Source DB: PubMed Journal: BMC Med Educ ISSN: 1472-6920 Impact factor: 2.463
Fig. 1 Optimal Model, Parameter Estimates and Error Estimates (Residual variances). (See Model Structure in the text for an explanation of the diagram)
Descriptive statistics, correlations, and reliability results for the competency items, and the standardised estimates and reliability results of the modelled constructs
The diagonal cells contain percent variance for the score due to the trainee; all remaining variance is considered error variance; p < 0.001 for all correlations
All 2-tailed p-values <0.000; (see Fig. 1 for factor structure)
a Standardised Estimates of constructs with the items defining those constructs (SE) in shaded areas
Model Fit Indexes for alternative non-nested models
| Model | Chi-squared (χ2) | Ratio of χ2 to df | Akaike information criterion (AIC) | Bayes information criterion (BIC) | Tucker–Lewis index (TLI) | Comparative fit index (CFI) | Root mean square error of approximation (RMSEA) (95 % CI) | Standardised root mean square residual (SRMR) | Weighted root mean square residual (WRMR) |
|---|---|---|---|---|---|---|---|---|---|
| Ideal Benchmark (a) | Non-significant | <3 | Smaller the better; for model comparison (non-nested) | Smaller the better; for model comparison (non-nested) | ≥ 0.95 ideal; <0.90 reject | ≥ 0.95 ideal; <0.90 reject | <0.06 ideal; <0.08 acceptable; and with narrow 95 % confidence intervals | ≤ 0.08 | < 0.90 |
| 3 Factor Model 1 (b) | 116.563 | 2.8 | 3879 | 4018 | 0.93 | 0.95 | 0.07 (0.057–0.088) | 0.039 | 0.93 |
| 3 Factor Model 3 (c) | 223.258 | 3.0 | 4732 | 4906 | 0.89 | 0.91 | 0.08 (0.067–0.090) | 0.048 | 1.14 |
| 3 Factor Model 4 (d) | 121.571 | 3.0 | 3884 | 4023 | 0.92 | 0.94 | 0.08 | 0.041 | 1.07 |
| 3 Factor Model (e) | 211.42 | 2.85 | 4711 | 4884 | 0.90 | 0.92 | 0.07 (0.062–0.085) | 0.045 | 1.06 |
| 1 Factor (f) | 170.483 | 3.9 | 3955 | 4082 | 0.87 | 0.91 | 0.09 | 0.050 | 1.24 |
| 2 Factor (g) | 139.489 | 3.2 | 3910 | 4041 | 0.91 | 0.93 | 0.08 (0.066–0.095) | 0.043 | 1.11 |
| 1 Factor OC Model (h) | 46.586 | 5.1 | 2103 | 2172 | 0.92 | 0.95 | 0.109 (0.080–0.141) | 0.037 | 0.882 |
a From (Schreiber et al., 2006)
b 3 Factor Model 1 = Factor structure from SPSS EFA identifying a possible general job performance factor as Factor 1
c 3 Factor Model 3 = Factor structure from EFA using the a priori defined competency domains as 3 proposed Factors
d 3 Factor Model 4 = Factor structure from SPSS EFA using the a priori defined competency domains as 3 proposed Factors but with potentially redundant items removed (Procedural, emergency and teach and learn)
e 3 Factor model from original EFA with all 14 items
f 1 Factor model with all 14 items
g 2 Factor model with all 14 items
h 1 Factor model with only those items within the “operational competence” construct and no other items
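As a reading aid, the cut-offs in the Ideal Benchmark row can be applied to any model row mechanically. The following is a minimal sketch, not part of the original analysis: the function name `classify_fit` and the labels "between cut-offs", "meets benchmark" and "outside benchmark" are assumptions introduced here for illustration, and only the numeric thresholds come from the table.

```python
def classify_fit(ratio, tli, cfi, rmsea, srmr, wrmr):
    """Compare one model's fit indexes with the 'Ideal Benchmark' row."""

    def incremental(value):
        # TLI and CFI share the same cut-offs in the benchmark row.
        if value >= 0.95:
            return "ideal"
        if value < 0.90:
            return "reject"
        return "between cut-offs"  # assumed label, not from the table

    def simple(passed):
        # Assumed labels for indexes that have a single threshold.
        return "meets benchmark" if passed else "outside benchmark"

    if rmsea < 0.06:
        rmsea_label = "ideal"
    elif rmsea < 0.08:
        rmsea_label = "acceptable"
    else:
        rmsea_label = "outside benchmark"  # assumed label

    return {
        "chi2/df": simple(ratio < 3),
        "TLI": incremental(tli),
        "CFI": incremental(cfi),
        "RMSEA": rmsea_label,
        "SRMR": simple(srmr <= 0.08),
        "WRMR": simple(wrmr < 0.90),
    }


# Values taken from the "3 Factor Model 1" row.
model_1 = classify_fit(ratio=2.8, tli=0.93, cfi=0.95,
                       rmsea=0.07, srmr=0.039, wrmr=0.93)
for index, label in model_1.items():
    print(f"{index}: {label}")
```

AIC and BIC are left out of the sketch because the benchmark row gives no absolute threshold for them; they are only meaningful when comparing models.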
Reliability for Competency Items
| Competency Item | Variance Components | Variances SEM (a) | Percent of Total Variance of trainees’ scores | Individual item Reliability Coefficient | NAAMAR (b) |
|---|---|---|---|---|---|
| Overall Rating | 0.084 | 0.016 | 29.4 | 0.676 | 10 |
| Communication | 0.104 | 0.017 | 36.4 | 0.741 | 7 |
| Teamwork Skills | 0.067 | 0.014 | 26.9 | 0.648 | 11 |
| Professional Responsibility | 0.043 | 0.013 | 18.1 | 0.557 | 19 |
| Time Management Skills | 0.071 | 0.018 | 24.6 | 0.620 | 13 |
| Medical Records | 0.054 | 0.015 | 21.5 | 0.578 | 15 |
| Knowledge Base | 0.045 | 0.015 | 18.2 | 0.527 | 17 |
| Clinical Skills | 0.051 | 0.014 | 17.9 | 0.522 | 18 |
| Clinical Judgement | 0.105 | 0.022 | 29.6 | 0.678 | 10 |
| Awareness of Limitations | 0.046 | 0.013 | 18.3 | 0.561 | 18 |
| Professional Obligations | 0.049 | 0.012 | 19.8 | 0.543 | 16 |
| Competency Domain Construct 1 | 2.465 | | 40.0 | 0.769 | 6 |
| Competency Domain Construct 2 | 0.579 | | 31.2 | 0.664 | 11 |
| Competency Domain Construct 3 | 0.180 | | 22.3 | 0.589 | 13 |
a Standard Error of the Measurement
b NAAMAR = Number (rounded to a whole number) of assessments needed for a minimum acceptable reliability level of R = 0.80, calculated from the formula $R = \sigma^2_{\text{subjects}} / (\sigma^2_{\text{subjects}} + \sigma^2_{\text{error}}/n)$, where R is the reliability coefficient and n is the number of assessments needed per trainee to attain the desired reliability coefficient
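The NAAMAR formula can be solved for n. Writing p for the share of total variance due to the trainee (so the error share is 1 − p), the formula rearranges to n = R(1 − p) / (p(1 − R)). The sketch below illustrates that rearrangement; the function name `assessments_needed` is hypothetical, and the inputs are the rounded percentages from the table, so results may differ slightly from the published NAAMAR column.

```python
def assessments_needed(trainee_share: float, target: float = 0.80) -> float:
    """Number of assessments n at which R reaches `target`.

    trainee_share is the proportion (0-1) of total score variance
    attributable to the trainee; the remainder is treated as error,
    as stated in the table notes.
    """
    if not 0.0 < trainee_share < 1.0:
        raise ValueError("trainee_share must lie strictly between 0 and 1")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    error_share = 1.0 - trainee_share
    # Solve R = p / (p + e / n) for n.
    return target * error_share / (trainee_share * (1.0 - target))


# Rounded percentages from the table, expressed as proportions.
for label, share in [("Competency Domain Construct 1", 0.400),
                     ("Communication", 0.364),
                     ("Overall Rating", 0.294)]:
    print(label, round(assessments_needed(share), 2))
```

With a 40.0 % trainee share the result is 6 assessments, in line with the Competency Domain Construct 1 row.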
Measurement invariance for nested model comparisons of major sub-groups (a)
| Grouping | Model | df | χ2 (b) | χ2/df | RMSEA (90 % CI) | CFI | TLI | SRMR | ∆χ2 | p (∆χ2) | ∆CFI | ∆TLI | ∆SRMR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Female and Male Supervisors | Unconstrained | 107 | 302.01 | 2.82 | 0.072 (0.063–0.082) | 0.914 | 0.912 | 0.0746 | | | | | |
| | All factor loadings constrained equal | 118 | 323.37 | 2.74 | 0.071 (0.062–0.080) | 0.910 | 0.916 | 0.0739 | 21.36 | 0.030 | 0.004 | −0.004 | 0.0007 |
| Female and Male Trainees | Unconstrained | 107 | 296.97 | 2.775 | 0.072 (0.062–0.081) | 0.916 | 0.914 | 0.0599 | | | | | |
| | All factor loadings constrained equal | 118 | 304.27 | 2.579 | 0.067 (0.058–0.077) | 0.918 | 0.924 | 0.0601 | 7.299 | 0.774 | 0.002 | −0.010 | 0.0002 |
| Overseas (OTDs) and Australian Trained Doctors (ATDs) | Unconstrained | 107 | 283.35 | 2.648 | 0.069 (0.059–0.079) | 0.922 | 0.919 | 0.0718 | | | | | |
| | All factor loadings constrained equal | 118 | 301.60 | 2.556 | 0.067 (0.058–0.076) | 0.918 | 0.924 | 0.0710 | 18.248 | 0.076 | 0.004 | −0.004 | 0.0008 |
a Assuming models unconstrained to be correct
b All p-values <0.000 for the model χ2
χ2: minimum fit function chi-square; RMSEA: root mean square error of approximation; CFI: comparative fit index; TLI: Tucker-Lewis index; SRMR: standardized root mean square residual; Δ: parameter difference between constrained and unconstrained model
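The nested comparisons in the table rest on a chi-square difference test: ∆χ2 is the constrained model's χ2 minus the unconstrained model's χ2, with degrees of freedom equal to the difference in df (118 − 107 = 11). The value reported next to each ∆χ2 (0.030, 0.774, 0.076) is consistent with the upper-tail p-value of that test. The sketch below is an illustration only, not the authors' software: the helper names `chi2_sf` and `chi2_difference` are hypothetical, and the tail probability is computed with the standard library using the recurrence for the regularized upper incomplete gamma function, which is valid for integer df.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X > x) for chi-square with integer df."""
    if df < 1:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed-form starting points: df = 1 (odd case) or df = 2 (even case).
    if df % 2:
        a, q = 0.5, math.erfc(math.sqrt(half))
    else:
        a, q = 1.0, math.exp(-half)
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_difference(chi2_free, df_free, chi2_constrained, df_constrained):
    """Return (delta chi-square, delta df, p-value) for two nested models."""
    delta = chi2_constrained - chi2_free
    delta_df = df_constrained - df_free
    return delta, delta_df, chi2_sf(delta, delta_df)


# Female and male supervisors: unconstrained vs. loadings constrained equal.
delta, delta_df, p = chi2_difference(302.01, 107, 323.37, 118)
print(f"delta chi2 = {delta:.2f}, delta df = {delta_df}, p = {p:.3f}")
```

Where SciPy is available, `scipy.stats.chi2.sf(delta, delta_df)` gives the same tail probability without the hand-written helper.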