Kathryn Hodwitz, William Tays, Rhoda Reardon.
Abstract
This paper describes the use of Kane's validity framework to redevelop a workplace-based assessment (WBA) program for practicing physicians administered by the College of Physicians and Surgeons of Ontario. The developmental process is presented according to the four inferences in Kane's model. Scoring was addressed through the creation of specialty-specific assessment criteria and global, narrative-focused reports. Generalization was addressed through standardized sampling protocols and assessor training and consensus-building. Extrapolation was addressed through the use of real-world performance data and an external review of the scoring tools by practicing physicians. Implications were theoretically supported through adherence to formative assessment principles and will be assessed through an evaluation accompanying the implementation of the redeveloped program. Kane's framework was valuable for guiding the redevelopment process and for systematically collecting validity evidence throughout to support the use of the assessment for its intended purpose. As the use of WBA programs for physicians continues to increase, practical examples are needed of how to develop and evaluate these programs using established frameworks. The dissemination of comprehensive validity arguments is vital for sharing knowledge about the development and evaluation of WBA programs and for understanding the effects of these assessments on physician practice improvement.
Year: 2018 PMID: 30140344 PMCID: PMC6104320
Source DB: PubMed Journal: Can Med Educ J
Interpretation/Use Arguments for the Peer Assessment program
| Inference | Definition | Interpretation/Use Argument |
|---|---|---|
| Scoring | The way in which performance is measured or scored during an assessment | Assessors will accurately and consistently provide scores (ratings and feedback) that are formatively valuable for physicians and informative for committee members. |
| Generalization | The degree to which the sample of performance assessed relates to performance in other situations or domains | Assessors will review a representative sample of a physician’s performance and reliably make judgements about the physician’s practice. |
| Extrapolation | The degree to which assessment performance reflects real-world performance | Assessment data sources reflect actual practice; assessed physicians find the assessment criteria to be acceptable; physicians agree with assessors’ interpretation of their performance. |
| Implications | The accuracy of interpretations and decisions resulting from an assessment and the effects of those decisions on stakeholders | Committee members have the information they need to confidently make decisions; decision making is consistent and credible; assessed physicians find the assessment to be fair, educational, and motivating for engaging in self-directed quality improvement (QI). |
Figure 1. Redevelopment process and alignment with Kane’s validity framework
Assessment tools
| Tool | Description |
|---|---|
| Patient Record Selection Protocols | Standardized criteria for how patient records are selected and reviewed |
| Interview Discussion Guides | Instructions on how to conduct the interview and discussion themes for promoting quality improvement |
| Scoring Rubrics | For each assessment domain, elements of high quality patient care and examples of care trends for each score (collectively, the assessment criteria); see |
| Reporting Templates | Templates for recording raw data and documenting global scores and narrative feedback |
| Quality Improvement Resources | Brief summaries of specific conditions, patient presentations, or therapeutic modalities, including references and resources for further information, to serve as educational material for physicians |

Elements of high-quality care: pertinent positive and negative findings; physical measurements and vital signs, where appropriate; relevant descriptive information (e.g., dimensions indicating spread of cellulitis at presentation, quality of respiratory sounds, description of rash); illustrations of conditions, where appropriate (e.g., location of rash, laceration, abdominal tenderness); Mental Status Examinations (MSEs) (e.g., mood and affect (including risk of harm to self/others), appearance, attitude, behavior, speech, thought process, thought content, perception, cognition, insight and judgment); interplay of psychological and physiological factors; scoring flow sheets (e.g., PHQ-9, mini-mental state exam, pain scale).

| Score | Examples of care trends |
|---|---|
| 1 | Examinations sometimes included components not relevant to the presenting complaints. Mental status examinations were present but could be expanded upon. |
| 2 | Descriptions of general appearance, level of alertness, and comfort level were minimal. Relevant physical measurements were not consistently present (e.g., height, weight, and BMI for preventive care and other assessments). Physical examinations were often not thorough enough to fully assess current presentations (e.g., repeated diabetic assessments with no evidence of a foot examination). Important, relevant descriptive information (e.g., dimensions indicating spread of cellulitis at presentation) was often not included. Illustrated/described conditions (e.g., location of rash, laceration, abdominal tenderness) were often not included when appropriate. Observations tended to be poorly described. Key elements of examinations (e.g., pertinent positive and negative findings) were often not documented. |
| 3 | Pertinent vital signs (e.g., temperature and weight in child with infectious complaint) were consistently not documented. Mental status examinations were often not included when relevant. |