Mostafa Dehghani Poudeh, Aeen Mohammadi, Rita Mojtahedzadeh, Nikoo Yamani, Ali Delavar.
Abstract
BACKGROUND: Kane's validity framework examines the validity of the interpretation of a test at four levels: scoring, generalization, extrapolation, and implications. No model has yet been proposed to apply this framework to an entire system of assessment. This study provides a model for validating the internal medicine residents' assessment system, based on Kane's framework.
Keywords: Educational measurement; Kane's framework; graduate; internship and residency; medical; reliability and validity; validity of results
Year: 2021 PMID: 34912922 PMCID: PMC8641708 DOI: 10.4103/jehp.jehp_1500_20
Source DB: PubMed Journal: J Educ Health Promot ISSN: 2277-9531
Figure 1: Flowchart of the systematic review of Kane's validity studies
Figure 2: Flowchart of the study method
Initial assumptions extracted from the literature for the assessment system of internal medicine residents at the four inferences of Kane's validity framework
| Validity level | Assumptions | Sub-assumptions |
|---|---|---|
| Scoring | The test is properly designed and executed, and the scores are a true and accurate representation of the observations. In other words, the observations must be made according to sound and correct methods, and the translation of the observations into scores must also be carried out correctly. | The designers of the various test questions have received the necessary training on the characteristics of each test |
| | | The assessment system has a comprehensive plan and an overall blueprint |
| | | Tests cover different inferences of competence |
| | | Each test has a blueprint, and the questions are formulated accordingly |
| | | The minimum passing score in the assessment system under study is determined by coherent, logical methods based on scientific principles |
| | | Each of the tests has good internal consistency |
| | | The design of the questions and the administration of each test have proceeded according to scientific principles |
| Generalization | The tests evaluate appropriate samples of the competencies expected from the residents, and their results can be generalized to all competencies. | Tests are a good sample of the different levels of Miller's pyramid of competencies |
| | | Test items are a good sample of the content to be evaluated |
| | | The tests have good reliability |
| | | The tests have little measurement error |
| | | Differences between residents' test scores are due to real differences in their abilities, not to other factors |
| | | Residents' final tests and scores have an acceptable generalizability coefficient |
| Extrapolation | Tests of different inferences of competence are correlated with each other and also predict each other well. | The test results distinguish residents in the senior years from residents in the junior years |
| | | The questions, scenarios, and patient problems raised in the tests correspond to real-world conditions |
| | | There is a good correlation between the scores of corresponding competencies in different tests |
| | | Scores at lower competency levels predict scores at higher levels |
| Implications | Permitting residents to sit the promotion and board exams is consistent with their actual workplace performance throughout the year and with their promotion and board exam results. | There is a correlation between the scores obtained in the departmental exams and the scores of the regional or national promotion exam and the board exam |
| | | The scores obtained in the exams by residents are correlated with their medical orders |
| | | Test scores show the trend of increasing experience and ability of residents across residency years |
| | | Test scores show the trend of increasing experience and ability of residents over the course of study |
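The "internal consistency" sub-assumption under Scoring is usually checked with Cronbach's alpha. A minimal sketch in plain Python, under that assumption; the item-score matrix is hypothetical, not data from the study:

```python
def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per examinee, one column per item.

    alpha = k/(k-1) * (1 - sum(item variances) / variance of total scores)
    """
    k = len(scores[0])  # number of items

    def var(xs):  # sample variance (n-1 denominator)
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [var([row[i] for row in scores]) for i in range(k)]
    total_var = var([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 0/1 item scores for five residents on four items
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(scores), 3))  # → 0.696
```

Values of roughly 0.7 or above are conventionally read as adequate internal consistency, though the acceptable threshold depends on the stakes of the test.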
Proposed methods extracted from the literature to evaluate the validity of the assessment system of internal medicine residents, based on Kane's framework. The columns Scoring through Implications list the methods and measures required to validate each test at the corresponding level of the framework.

| Type of test | Scoring | Generalization | Extrapolation | Implications |
|---|---|---|---|---|
| Written exams | | | | |
| Objective (pre-progress test) | Checking the training of item designers | Reviewing the results of test analysis | Investigating the difference in scores across residency years | Checking the correlation of scores with the results of the progress test |
| Essay | Examining the quality of the questions | Checking the sampling of the questions | Checking the correlation of corresponding questions in different tests | Checking the correlation of scores with the results of the progress test |
| OSCE | Checking the training of question designers | Examining how the curriculum is sampled to determine the stations | Investigating the correlation between station scores and corresponding tests | Checking the correlation of scores with the pre-progress test |
| Mini-CEX | Checking the training of question designers | Checking the frequency of test components (patient type, test setting, disease complexity, type of test focus) | Reviewing the progress of scores across months | Investigating the correlation of scores with the pre-progress test |
| Intra-ward score | Checking that tests are held according to the comprehensive schedule of residents' exams | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| Professional behavior score | Checking how the tool is completed | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Checking the correlation between corresponding items in different tests |
| Logbook | Checking how the residents complete the logs | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| Record-writing score | Checking how the test tools were designed | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| Final scores | Checking that the score calculation conforms to the regulations | Investigating the sources of variation in scores | Checking the correlation between the scores of different tests | Investigating how minimum pass levels are applied for each year |
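Many of the Extrapolation and Implications checks in this table reduce to a score correlation, for example between departmental exam scores and the promotion exam. A minimal Pearson-correlation sketch; the paired scores below are hypothetical illustrations, not study data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical paired scores: departmental exam vs. promotion exam
dept = [62, 71, 55, 80, 68, 74]
promo = [118, 131, 102, 150, 125, 140]
r = pearson_r(dept, promo)
```

A strong positive r here would support the implication claim that departmental scores track promotion-exam performance; a weak or negative r would count as disconfirming evidence at this level of the framework.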
Assumptions in the assessment system of internal medicine residents at the four inferences of Kane's validity framework

| Validity level | Assumptions | Sub-assumptions |
|---|---|---|
| Scoring | The test is properly designed and executed, and the scores are a true and accurate representation of the observations. In other words, the observations must be made according to sound and correct methods, and the translation of the observations into scores must also be carried out correctly. | The designers of the various test questions have received the necessary training on the characteristics of each test |
| | | The assessment system has a comprehensive plan and an overall blueprint |
| | | Tests cover different inferences of competence |
| | | The tests were held according to the planned schedule |
| | | Each test has a blueprint, and the questions are formulated accordingly |
| | | The minimum passing score in the assessment system under study is determined by coherent, logical methods based on scientific principles |
| | | Each of the tests has good internal consistency |
| | | The design of the questions and the administration of each test have proceeded according to scientific principles |
| Generalization | The tests evaluate appropriate samples of the competencies expected from the residents, and their results can be generalized to all competencies. | Tests are a good sample of the different levels of Miller's pyramid of competencies |
| | | Test items are a good sample of the content to be evaluated |
| | | The tests have good reliability (i.e., a low error rate) |
| | | Differences between residents' test scores are due to real differences in their abilities, not to other factors |
| | | Residents' final tests and scores have an acceptable generalizability coefficient |
| Extrapolation | Tests of different inferences of competence are correlated with each other and also predict each other well. | The test results distinguish residents in the senior years from residents in the junior years |
| | | The questions, scenarios, and patient problems raised in the tests correspond to real-world conditions |
| | | There is a good correlation between the scores of corresponding competencies in different tests |
| | | Scores at lower competency levels predict scores at higher levels |
| Implications | Permitting residents to sit the promotion and board exams is consistent with their actual workplace performance throughout the year and with their promotion and board exam results. | There is a correlation between the scores obtained in the group exams and the scores of the regional or national promotion exam and the board exam |
| | | The scores obtained by residents in the group exams are correlated with the professors' general opinions of each resident |
| | | Test scores show the trend of increasing experience and ability of residents across residency years |
| | | Test scores show the trend of increasing experience and ability of residents over the course of a single year |
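The "acceptable generalizability coefficient" assumption comes from generalizability theory. For a fully crossed person × item design with one random facet, the coefficient is Eρ² = σ²_p / (σ²_p + σ²_res / n_i), with the variance components estimated from ANOVA mean squares. A sketch under those assumptions; the score matrix is hypothetical:

```python
def g_coefficient(scores):
    """Generalizability coefficient for a crossed person x item design.

    scores: one row per person, one column per item (no missing data).
    One-facet random-effects ANOVA estimates:
        sigma2_p   = (MS_persons - MS_residual) / n_items
        E(rho^2)   = sigma2_p / (sigma2_p + MS_residual / n_items)
    """
    n_p, n_i = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_p * n_i)
    p_means = [sum(row) / n_i for row in scores]
    i_means = [sum(row[j] for row in scores) / n_p for j in range(n_i)]

    # Sums of squares for the two-way layout without replication
    ss_p = n_i * sum((m - grand) ** 2 for m in p_means)
    ss_i = n_p * sum((m - grand) ** 2 for m in i_means)
    ss_tot = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_tot - ss_p - ss_i

    ms_p = ss_p / (n_p - 1)
    ms_res = ss_res / ((n_p - 1) * (n_i - 1))

    sigma2_p = max((ms_p - ms_res) / n_i, 0.0)  # clamp negative estimates
    return sigma2_p / (sigma2_p + ms_res / n_i)

# Hypothetical ratings for four residents on three items
scores = [[3, 4, 3], [2, 3, 2], [4, 5, 4], [1, 2, 2]]
g = g_coefficient(scores)
```

For this crossed design the coefficient coincides with Cronbach's alpha; the value of generalizability theory is that the same machinery extends to raters, stations, and occasions as additional facets.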
Tools and methods used to evaluate the validity of the assessment system of internal medicine residents, based on Kane's framework. The columns Scoring through Implications list the methods and measures required to validate each test at the corresponding level of the framework; lettered markers refer to the footnotes below.

| Competency level | Type of test | Scoring | Generalization | Extrapolation | Implications |
|---|---|---|---|---|---|
| Knows and knows how | Written exams: multiple-choice, essay tests, and PMP (a) | Checking the training of item designers | Reviewing the results of test analysis | Investigating the difference in scores across residency years | Checking the correlation of scores with the results of the progress test (d) |
| Shows how | OSCE | Checking the quantity and quality of the training of question designers | Examining how the curriculum is sampled to determine the stations | Investigating the correlation between station scores and corresponding tests (g) | Checking the correlation of scores with progress test results |
| Shows | Mini-CEX | Checking the quantity and quality of the training of question designers | Checking the frequency of test components (patient type, test setting, disease complexity, type of test focus) | Reviewing the progress of scores across months | Investigating the correlation of scores with the results of the progress test |
| Does | Intra-ward score and 360-degree assessment | Checking that tests are held according to the comprehensive schedule of residents' exams | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| | Professional behavior score | Checking how the test tools were compiled | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Checking the correlation between corresponding items in different tests |
| | Logbook | Checking how the residents complete the logs | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| | Record-writing score | Checking how the test tools were designed | Checking the reliability of the scores | Checking the correlation between corresponding items in different tests | Assessing the correlation between test results and the professors' general opinions |
| | Final scores | Checking that the score calculation conforms to the regulations | Investigating the sources of variation in scores between residents; checking the overall generalizability coefficient (b) | Checking the correlation between corresponding items in different tests | Investigating the effect of residency year on scores |
(a) Patient Management Problems (PMP) is a written test assessing problem-solving ability or clinical reasoning. (b) The aim is to statistically estimate how far the results generalize to the full set of expected results for the examinee. (c) The first part of each question in medical exams, describing the main situation and context of the problem for the questions that follow. (d) The progress test is a written test held at the end of each residency year to grant permission to enter the next year. (e) Comments made at the end of each year by the professors, on a subjective basis, about each resident. (f) The reliability of a test shows the degree of reproducibility of scores or test results; it is calculated as the correlation between scores obtained from a repeated test or from two halves of one test. (g) Corresponding tests or competencies are tests or competencies that measure a common construct. OSCE = Objective Structured Clinical Examination; Mini-CEX = Mini Clinical Evaluation Exercise; PMP = Patient Management Problems.
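Footnote f defines reliability as the correlation between two halves of a test. A split-half sketch with the Spearman–Brown correction to full test length; the item scores below are hypothetical:

```python
import math

def split_half_reliability(scores):
    """Split-half reliability: correlate odd-item and even-item half scores,
    then apply the Spearman-Brown correction for full test length:
        r_full = 2 * r_half / (1 + r_half)
    scores: one row of item scores per examinee.
    """
    odd = [sum(row[0::2]) for row in scores]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in scores]  # items 2, 4, 6, ...
    n = len(scores)
    mo, me = sum(odd) / n, sum(even) / n
    cov = sum((o - mo) * (e - me) for o, e in zip(odd, even))
    so = math.sqrt(sum((o - mo) ** 2 for o in odd))
    se = math.sqrt(sum((e - me) ** 2 for e in even))
    r_half = cov / (so * se)
    return 2 * r_half / (1 + r_half)

# Hypothetical 0/1 item scores for five residents on four items
scores = [
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
r = split_half_reliability(scores)
```

The odd/even split is one of many possible halvings; alternatives such as first-half/second-half splits, or averaging over all splits, give related estimates of the same reproducibility that footnote f describes.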