
Internal medicine resident perspectives on scoring USMLE as pass/fail.

Sara L Wallach1,2,3, Christopher Williams4, Robert T Chow5,6, Nagesh Jadhav7,8, Sapna Kuehl6,9, Jaya M Raj10, Richard Alweis8,11,12.   

Abstract

BACKGROUND: The scoring rubric on the USMLE Step 1 examination will be changing to pass/fail in January 2022. This study elicits internal medicine resident perspectives on USMLE pass/fail scoring at the national level.
OBJECTIVE: To assess internal medicine resident opinions regarding USMLE pass/fail scoring and examine how variables such as gender, scores on USMLE Step 1 and Step 2, PGY status, and type of medical school are associated with these results.
METHODS: In the fall of 2019, the authors surveyed current internal medicine residents via an online tool distributed through their program directors. Respondents indicated their Step 1 and Step 2 Clinical Knowledge scores from five categorical ranges. Questions on medical school type, year of training, and gender were included. The results were analyzed using Pearson chi-square testing and multivariable logistic regression.
RESULTS: 4012 residents responded, reflecting 13% of internal medicine residents currently training in the USA. Fifty-five percent of respondents disagreed/strongly disagreed with pass/fail scoring and 34% agreed/strongly agreed. Group-based differences were significant for gender, PGY level, Step 1 score, and medical school type; a higher percentage of males, those training at the PGY1 level, and graduates of international medical schools (IMGs) disagreed with pass/fail reporting. In addition, high scorers on Step 1 were more likely to disagree with pass/fail reporting than low-scoring residents.
CONCLUSION: Our results suggest that a majority of internal medicine residents currently training in the USA prefer that USMLE numerical scoring be retained rather than changed to pass/fail.
© 2020 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group on behalf of Greater Baltimore Medical Center.

Keywords:  Graduate medical education; United States medical licensing examination scoring; recruitment

Year:  2020        PMID: 33235666      PMCID: PMC7671726          DOI: 10.1080/20009666.2020.1796366

Source DB:  PubMed          Journal:  J Community Hosp Intern Med Perspect        ISSN: 2000-9666


Introduction

The United States Medical Licensing Examination (USMLE), a three-step standardized examination administered by the Federation of State Medical Boards (FSMB) and the National Board of Medical Examiners (NBME), is one of the key components used by state medical boards in the USA to assess candidacy for licensure [1]. Since its inception, many stakeholders in medical education have adapted the examination for their own purposes and needs [2,3]. Medical schools have increasingly forgone reporting grades and GPA, especially in the preclinical years, and shifted to using the USMLE to evaluate individual student performance [3] as well as to assess the overall medical school curriculum [2,4-6]. Additionally, in the last decade, graduate medical education (GME) training programs in the US have seen an exponential rise in applications without a commensurate increase in staffing or resources, leading many to increasingly use USMLE results as objective and standardized measures of applicants' medical knowledge [2,7]. USMLE scores are often used as an initial screen to differentiate among applicants from an increasingly diverse undergraduate medical education training milieu [2,8]. Critics cite a number of problems with this extrapolated use of the examination, particularly the USMLE Step 1 examination: lack of correlation with physicians' clinical abilities, outsized emphasis on these examinations in medical school curricula at the expense of other essential competencies, emphasis by GME programs on scores over other metrics, and potential cultural bias [9]. Amid considerable controversy over the value of reporting USMLE scores, the decision was made in August 2019 to change Step 1 to a pass/fail result rather than the traditional three-digit numeric score [10]. Internal medicine (IM) constitutes one-quarter of the total number of active residents and fellows among specialties accredited by the Accreditation Council for Graduate Medical Education [11].
Internal medicine trainees represent the largest share of US allopathic, osteopathic, and international medical school graduates [12]. Although IM has 15,000 more residents than the second most populous specialty, little is known about IM residents' perspectives on changes to USMLE reporting. To our knowledge, no study has systematically surveyed current IM trainees to understand perspectives on USMLE Step 1 policy changes. We sought to assess support for pass/fail reporting, attitudes regarding the value of the examination, and group-based differences by gender, post-graduate year (PGY) level, and type of UME training.

Methods

The authors adapted, with permission, a previously published survey instrument that assesses medical student and resident opinions on pass/fail USMLE scoring [13]. In our study, eight questions, using a five-item Likert scale of agreement (Appendix A – Survey), were administered to residents training in internal medicine in the fall of 2019. Respondents indicated their Step 1 and Step 2 Clinical Knowledge scores from five categorical ranges. Two open-ended questions gathered feedback on the pass/fail scoring schema for Steps 1 and 2. Finally, questions on medical school type, year of training, and gender assessed representativeness. The authors solicited resident survey participation using convenience sampling by posting on the single largest discussion forum of internal medicine program directors, using a survey methodology that has been frequently cited [14,15]. The e-mail invitation included an anonymous hyperlink to the web-based survey (Survey Monkey, San Mateo, CA). An initial request was sent in October 2019, followed by five follow-up requests. The survey closed in December 2019. The Institutional Review Board of St. Francis Medical Center, Trenton, NJ approved this study.

Quantitative Data Analysis

Frequencies and percentages were used for descriptive statistics. Only PGY1 through PGY3 resident responses were included in the analysis; responses were excluded if the responder did not identify their PGY status or identified as PGY4. Likert scales were collapsed to three-item scales for analysis. The Pearson chi-square test was used to report p-values for group-based differences between categorical variables. Population estimates for post-graduate year (PGY) level were available from the Accreditation Council for Graduate Medical Education (ACGME), and gender and medical school type were publicly available from the American Board of Internal Medicine [11,16]. When Pearson chi-square comparisons were significant, post-hoc comparisons were performed with Bonferroni correction for each combination of independent categories to control type I error inflation, and the probability value for each categorical comparison was evaluated against the adjusted alpha. Gender, PGY level, Step 1 score, Step 2 score, and medical school type were used to assess differences in agreement. Data were analyzed using IBM SPSS Statistics for Windows, version 26 (IBM Corp., Armonk, NY, USA). We used a hierarchical modeling approach and performed a multivariate logistic regression for agreement with USMLE pass/fail reporting. The model incorporates self-reported categorical Step 1 and Step 2 CK scores, attitudes on whether each respective exam reflected knowledge at the time of examination, and medical school type. Data were weighted for medical school type. The alpha level was set at 0.05.
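The omnibus test and Bonferroni-corrected post-hoc procedure described above can be sketched in a few lines. This is a minimal illustration assuming SciPy is available, using the medical-school counts for the pass/fail question (disagree/neutral/agree) reported in Table 2; it is not the authors' SPSS workflow.

```python
# Sketch: Pearson chi-square omnibus test followed by Bonferroni-corrected
# pairwise post-hoc comparisons, as in the Quantitative Data Analysis section.
from itertools import combinations
from scipy.stats import chi2_contingency

# Rows: medical school type; columns: disagree / neutral / agree
# (counts from Table 2, "Step 1 and Step 2 Should be Pass/Fail")
table = {
    "US Allopathic":  [826, 220, 706],
    "US Osteopathic": [192, 32, 141],
    "International":  [1040, 148, 405],
}

# Omnibus Pearson chi-square across all three groups
chi2, p, dof, expected = chi2_contingency(list(table.values()))
print(f"omnibus chi2={chi2:.1f}, p={p:.3g}, dof={dof}")

# Post-hoc pairwise tests judged against a Bonferroni-adjusted alpha,
# performed only because the omnibus comparison is significant
pairs = list(combinations(table, 2))
alpha_adj = 0.05 / len(pairs)  # controls type I error inflation
for a, b in pairs:
    _, p_pair, _, _ = chi2_contingency([table[a], table[b]])
    flag = "significant" if p_pair < alpha_adj else "ns"
    print(f"{a} vs {b}: p={p_pair:.3g} ({flag} at alpha={alpha_adj:.4f})")
```

With three groups there are three pairwise comparisons, so each is evaluated at alpha = 0.05/3 ≈ 0.0167 rather than 0.05.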

Results

Four thousand twelve respondents participated in the survey; 265 were removed under the PGY exclusion criteria. Total respondents represented 13% (3,747/29,418) of all PGY1, PGY2, and PGY3 IM residents training in the USA. The distribution of gender and PGY level did not differ from the overall population of IM residents (Table 1). Post-hoc analysis, however, showed a small but significant under-representation of osteopathic graduates and a slight over-representation of international medical graduates in our respondent group (p = 0.00001) compared to the overall IM trainee population (Table 1).
Table 1.

Respondents versus total population based on gender, PGY, USMLE Step 1 and Step 2 CK scores, and medical school type.

 
                    Male n (%)     Female n (%)   p-value   PGY1 n (%)    PGY2 n (%)   PGY3 n (%)   p-value
Total Population    15,497 (58)    11,315 (42)    0.095     11,114 (39)   8,839 (31)   8,467 (30)   0.491
Respondents         2,089 (56)     1,618 (44)               1,469 (39)    1,131 (30)   1,142 (31)

                    US Allopathic n (%)   US Osteopathic n (%)   International n (%)   p-value
Total Population    12,860 (45)           4,499 (16)*a           11,061 (39)           <0.00001*
Respondents         1,756 (47)            366 (9.8)*a            1,594 (43)*a

* <0.05.

aPost-analysis, Bonferroni correction.

Two thousand seventy-three respondents (55%) either 'strongly disagreed' or 'disagreed' that Step 1 and Step 2 should be reported as pass/fail, while 34% (1,259) agreed and 11% (403) gave neutral responses. Group-based differences were significant for gender, Step 1 score, Step 2 score, and medical school type. A higher percentage of males and IMGs disagreed with the pass/fail decision. In addition, high scorers on Step 1 and Step 2 were more likely to disagree with pass/fail reporting than low-scoring residents. In post-hoc analysis, each gender category and most categories for Step 1, Step 2, and medical school type were significant. Most respondents agreed that 1) the USMLE accurately estimates knowledge (1,584, 42%) and that 2) Step 1 (1,793, 48%) and 3) Step 2 CK (2,041, 55%) reflected their fund of knowledge at the time of examination (Table 2). Significant between-group differences for Step 1 scores, Step 2 scores, and medical school type were found across all three of these questions on initial analysis and were confirmed by post-hoc analysis. Post-hoc comparisons were significant for Step 1 score categories, Step 2 score categories, and many medical school types. Gender differences were significant for USMLE knowledge estimation and knowledge reflection of Step 1, but not for Step 2 CK. After Bonferroni correction, agreement and disagreement counts for the USMLE overall and Step 1 were statistically significant (Table 2).
Table 2.

Agreement with pass/fail USMLE reporting and attitudes about USMLE based on applicant characteristics.

(1) Step 1 and Step 2 should be pass/fail

                    Disagree       Neutral        Agree          p-value
Gender^a                                                         <0.001*
  Male              1237 (59)**    190 (9)**      658 (32)**
  Female            818 (51)**     210 (13)**     588 (36)**
Residency                                                        0.008*
  PGY1              841 (57.3)     126 (8.6)**    500 (34.1)
  PGY2              604 (53.5)     135 (11.9)     391 (34.6)
  PGY3              628 (55.2)     142 (12.5)     368 (32.3)
Step 1^b                                                         <0.001*
  <200              11 (14)**      11 (14)        59 (73)**
  200–220           209 (30)**     85 (12)        398 (58)**
  221–240           685 (50)**     171 (13)       506 (37)**
  241–260           952 (76)**     105 (8)*       193 (15)**
  >260              146 (84)**     11 (6)         17 (10)**
Step 2 CK^c                                                      <0.001*
  ≤220              84 (31)**      28 (10)        162 (59)**
  221–240           505 (43)**     146 (12)       523 (44)**
  241–260           1011 (63)**    167 (10)       420 (26)**
  >260              413 (79)**     48 (9)         59 (11)**
Medical School                                                   <0.001*
  US Allopathic     826 (47)**     220 (13)**     706 (40)**
  US Osteopathic    192 (53)       32 (9)         141 (39)
  International     1040 (66)**    148 (9)        405 (25)**
Overall             2073 (55)      403 (11)       1259 (34)

(2) USMLE or standardized exam provides an accurate estimate of knowledge

                    Disagree       Neutral        Agree          p-value
Gender^a                                                         <0.001*
  Male              710 (34)**     424 (20)       952 (46)**
  Female            664 (41)**     332 (21)       620 (38)**
Residency                                                        0.446
  PGY1              531 (36)       302 (21)       633 (43)
  PGY2              446 (40)       219 (19)       465 (41)
  PGY3              413 (36)       240 (21)       486 (43)
Step 1^b                                                         <0.001*
  <200              64 (79)**      10 (12)        7 (9)**
  200–220           409 (59)**     138 (20)       146 (21)**
  221–240           571 (42)**     306 (23)       483 (36)**
  241–260           240 (19)**     246 (20)       765 (61)**
  >260              20 (12)**      26 (15)        128 (74)**
Step 2 CK^c                                                      <0.001*
  ≤220              171 (62)**     47 (17)        56 (20)**
  221–240           561 (48)**     245 (21)       367 (31)**
  241–260           490 (31)**     333 (21)       776 (48)**
  >260              79 (15)**      107 (21)       334 (64)**
Medical School                                                   <0.001*
  US Allopathic     759 (43)**     386 (22)       608 (35)**
  US Osteopathic    145 (40)       80 (22)        140 (38)
  International     481 (30)**     289 (18)       822 (52)**
Overall             1390 (37)      761 (20)       1584 (42)

(3) Step 1 accurately reflected my fund of knowledge at examination

                    Disagree       Neutral        Agree          p-value
Gender^a                                                         <0.001*
  Male              700 (34)**     325 (16)       1062 (51)**
  Female            627 (39)**     271 (17)       717 (44)**
Residency                                                        0.004*
  PGY1              548 (37)       203 (14)**     716 (49)
  PGY2              412 (37)       210 (19)       507 (45)
  PGY3              382 (34)       188 (17)       570 (50)
Step 1^b                                                         <0.001*
  <200              67 (83)**      7 (9)          7 (9)**
  200–220           467 (67)**     109 (16)       117 (17)**
  221–240           576 (42)**     287 (21)**     499 (37)**
  241–260           137 (11)**     153 (12)**     961 (77)**
  >260              8 (5)**        10 (6)**       156 (90)**
Step 2 CK^c                                                      <0.001*
  ≤220              176 (64)**     32 (12)        66 (24)**
  221–240           565 (48)**     214 (18)       396 (34)**
  241–260           455 (28)**     255 (16)       889 (56)**
  >260              57 (11)**      59 (11)**      404 (78)**
Medical School                                                   <0.001*
  US Allopathic     691 (39)*      295 (17)       768 (44)*
  US Osteopathic    152 (42)*      65 (18)        147 (40)
  International     495 (31)*      239 (15)       859 (54)*
Overall             1342 (36)      601 (16)       1793 (48)

(4) Step 2 accurately reflected my fund of knowledge at examination

                    Disagree       Neutral        Agree          p-value
Gender^a                                                         0.432
  Male              563 (27)       366 (18)       1152 (55)
  Female            430 (27)       310 (19)       871 (54)
Residency                                                        0.572
  PGY1              393 (27)       252 (17)       818 (56)
  PGY2              304 (27)       222 (20)       601 (53)
  PGY3              307 (27)       207 (18)       622 (55)
Step 1^b                                                         <0.001*
  <200              38 (48)**      20 (25)        22 (28)**
  200–220           304 (44)**     136 (20)       249 (36)**
  221–240           417 (31)**     289 (21)**     652 (48)**
  241–260           166 (13)**     175 (14)**     909 (73)**
  >260              11 (6)**       10 (6)**       153 (88)**
Step 2 CK^c                                                      <0.001*
  ≤220              185 (67)**     40 (15)        49 (18)**
  221–240           470 (40)**     279 (24)**     426 (36)**
  241–260           250 (16)**     271 (17)       1078 (67)**
  >260              29 (6)**       43 (8)**       447 (86)**
Medical School                                                   0.014*
  US Allopathic     477 (27)       335 (19)       942 (54)
  US Osteopathic    117 (33)       67 (19)        171 (48)
  International     404 (25)       277 (17)       911 (57)
Overall             1004 (27)      681 (18)       2041 (55)

* <0.05.

** Significant, post-analysis, Bonferroni correction.

aExcludes ‘non-binary’, ‘other’, and ‘I do not wish to respond’.

bExcludes ‘I prefer not to respond’.

cExcludes ‘I don’t wish to respond’.

Our stepwise logistic regression model (Table 3) showed that the lowest-performing groups on USMLE Step 1 and Step 2 CK were, respectively, 2.4 and 1.6 times more likely to agree with pass/fail reporting than the top-performing groups in the same category. All analyses were significant at the 0.05 level except for US osteopathic graduates. Compared with respondents who agreed that Step 1 or Step 2 CK reflected their knowledge at the time of examination, those who disagreed were, respectively, 4.6 and 2.3 times more likely to agree with pass/fail reporting. US allopathic medical school graduates were twice as likely to agree with pass/fail reporting as international medical school graduates.
Table 3.

Multivariate logistic regression with agreement with USMLE pass/fail reporting as the outcome^a

Model (R² = 0.24)

                               OR      95% CI       p-value
USMLE Step 1                                        <0.001**
  ≤220                         2.37    1.90–2.95    <0.001**
  221–240                      1.57    1.32–1.87    <0.001**
  ≥241 (reference)
USMLE Step 2 CK                                     <0.001**
  ≤220                         1.64    1.28–2.12    <0.001**
  221–240                      1.31    1.12–1.53    0.001**
  ≥241 (reference)
Step 1 Reflects Knowledge                           <0.001**
  Disagree                     4.56    3.79–5.50    <0.001**
  Neutral                      2.07    1.68–2.56    <0.001**
  Agree (reference)
Step 2 Reflects Knowledge                           <0.001**
  Disagree                     2.25    1.87–2.70    <0.001**
  Neutral                      1.89    1.56–2.28    <0.001**
  Agree (reference)
Medical School                                      <0.001**
  US Allopathic                2.06    1.79–2.37    <0.001**
  US Osteopathic               1.10    0.88–1.38    0.394
  International (reference)

aWeighted for medical school.

* p < 0.05

** p < 0.01.

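As a rough sanity check on how the odds ratios in Table 3 are read, an unadjusted odds ratio can be computed by hand from the Step 1 rows of Table 2, collapsing <200 with 200–220 and 241–260 with >260, and (as an illustrative choice, not the authors' coding) folding neutral responses into non-agreement. This crude contrast is much larger than the adjusted OR of 2.37 in Table 3, because the weighted multivariate model additionally adjusts for Step 2 scores, attitudes, and medical school type.

```python
# Sketch: unadjusted odds ratio for agreeing with pass/fail reporting,
# low (<=220) vs high (>=241) Step 1 scorers, from Table 2 counts.
import math

# counts: [agree, disagree_or_neutral]
low  = [59 + 398, (11 + 11) + (209 + 85)]    # Step 1 <200 plus 200-220
high = [193 + 17, (952 + 105) + (146 + 11)]  # Step 1 241-260 plus >260

odds_low = low[0] / low[1]
odds_high = high[0] / high[1]
or_unadjusted = odds_low / odds_high

# Approximate 95% CI on the log-odds-ratio scale (Woolf method)
se = math.sqrt(sum(1 / n for n in low + high))
ci = [math.exp(math.log(or_unadjusted) + z * se) for z in (-1.96, 1.96)]
print(f"unadjusted OR = {or_unadjusted:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
```

The gap between this crude figure and the adjusted 2.37 illustrates why the model's covariate adjustment matters when interpreting Table 3.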

Discussion

This is the first study to systematically assess IM resident perspectives on the change in USMLE reporting from continuous to pass/fail scoring. A majority of resident respondents (55%) disagreed with the decision to change USMLE Step 1 to pass/fail scoring. Residents' preferences for USMLE reporting correlated with examination performance, attitudes about the test's ability to reflect knowledge, and medical school type in the regression model. Low Step 1 scores, US allopathic status, and disagreement that the USMLE is an accurate assessment of knowledge were significant predictors of agreement with pass/fail reporting. The majority (66%) of international medical graduates disagreed with pass/fail scoring.

This study has several limitations. The survey responses reflect the opinions of 13% of all internal medicine residents in the USA, and residents from only one specialty were surveyed. The model's R² was low at 0.24, meaning it does not account for most of the variation in the data. Although the demographic distribution of respondents was similar to that of the total population of internal medicine residents in terms of gender, PGY level, and percentage of US allopathic graduates, international graduates were slightly over-represented and osteopathic graduates under-represented. Future studies should further explore the needs and perspectives of internationally and osteopathically trained residents, who constitute an essential part of our physician workforce.

Conclusion

The scoring rubric on the USMLE Step 1 examination will be changing to pass/fail in January 2022. Our aggregated results suggest that a majority of internal medicine residents may disagree with this decision. Further decisions regarding the scoring of USMLE Step 2 should consider these viewpoints.
References (12 in total; 10 listed)

1.  Numerical Versus Pass/Fail Scoring on the USMLE: What Do Medical Students and Residents Want and Why?

Authors:  Catherine E Lewis; Jonathan R Hiatt; Luann Wilkerson; Areti Tillou; Neil H Parker; O Joe Hines
Journal:  J Grad Med Educ       Date:  2011-03

2.  The relationship between the National Board of Medical Examiners' prototype of the Step 2 clinical skills exam and interns' performance.

Authors:  Marcia L Taylor; Amy V Blue; Arch G Mainous; Mark E Geesey; William T Basco
Journal:  Acad Med       Date:  2005-05

3.  Selection criteria for residency: results of a national program directors survey.

Authors:  Marianne Green; Paul Jones; John X Thomas
Journal:  Acad Med       Date:  2009-03

4.  The relationship between internal medicine residency graduate performance on the ABIM certifying examination, yearly in-service training examinations, and the USMLE Step 1 examination.

Authors:  Cynthia Kay; Jeffrey L Jackson; Michael Frank
Journal:  Acad Med       Date:  2015-01

5.  Internal Medicine Residency Program Directors' Screening Practices and Perceptions About Recruitment Challenges.

Authors:  Steven V Angus; Christopher M Williams; Emily A Stewart; Michelle Sweet; Michael Kisielewski; Lisa L Willett
Journal:  Acad Med       Date:  2020-04

6.  Use of NBME and USMLE examinations to evaluate medical education programs.

Authors:  R G Williams
Journal:  Acad Med       Date:  1993-10

7.  Use of the USMLE to select residents. (Review)

Authors:  E S Berner; C M Brooks; J B Erdmann
Journal:  Acad Med       Date:  1993-10

8.  Background essential to the proper use of results of step 1 and step 2 of the USMLE.

Authors:  M J O'Donnell; S S Obenshain; J B Erdmann
Journal:  Acad Med       Date:  1993-10

9.  The USMLE, the NBME subject examinations, and assessment of individual academic achievement.

Authors:  K I Hoffman
Journal:  Acad Med       Date:  1993-10

10.  The impact of United States Medical Licensing Exam (USMLE) step 1 cutoff scores on recruitment of underrepresented minorities in medicine: A retrospective cross-sectional study.

Authors:  Myia Williams; Eun Ji Kim; Karalyn Pappas; Omolara Uwemedimo; Lyndonna Marrast; Renee Pekmezaris; Johanna Martinez
Journal:  Health Sci Rep       Date:  2020-04-20
