Dr. Charles Zhang and Dr. Alyssa Ehrlich
On February 12, 2020, it was first announced that the United States Medical Licensing Examination (USMLE) Step 1 would change the score reporting from a three-digit numeric score to a pass/fail result, with this change set to take effect no earlier than January 1, 2022. More recently, on September 15th, it was announced on the USMLE website that:
“USMLE Step 1 score reporting will transition from a numeric score and pass/fail outcome to pass/fail only for exams taken on or after January 26, 2022. The USMLE program views this change as an important first step toward facilitating broader, system-wide changes to improve the transition from undergraduate to graduate medical education.”
In this announcement, it was again confirmed that scores for all Step 1 exams taken before January 26th will continue to be reported as a numeric score on all USMLE transcripts (i.e., residency programs will still be able to see a numeric score, as long as you take your exam by January 25th, 2022). This major shift in score reporting will clearly have huge implications for the residency application process. Such a large change has resulted in uncertainty and confusion for many medical students. Here, we offer our answers to some of the most common questions our students ask about the score reporting transition.
Why did this happen?
The primary argument in favor of shifting to pass/fail was to relieve the psychological stress imposed on medical students by residency programs’ overemphasis on Step 1 results. In the program director surveys conducted by the NRMP over many years, program directors have consistently ranked Step 1 as the number one factor in deciding whether to extend interview offers to applicants. The more competitive the specialty, the more weight is typically placed on an applicant’s Step 1 score. Accordingly, medical students have embarked on a never-ending arms race to improve their scores in an effort to attain those limited, coveted interview spots.
This came at a price, with some students sacrificing important academic and non-academic endeavors unrelated to Step 1. Medical students were forced to devote an increasingly disproportionate share of their preclinical years to preparing for a single test that would likely dictate which fields of medicine they could ultimately pursue. This singular emphasis brought an increased sense of distress and mental anguish to many students, particularly those who struggle with standardized testing. In connection with this decision, the AMA acknowledged that the current residency application system is “causing significant distress for our students.” In our experience as tutors, we have certainly seen this to be the case, with many students experiencing significant psychological suffering throughout the Step 1 preparation period.
How did we get here?
Step 1 was never intended to be used as a screening tool for residency applications. It’s a multiple-choice test of basic science knowledge that was designed for medical licensure. At some point, residency directors noticed that applicants with higher scores had higher pass rates on their respective specialty board exams. While many publications support this argument, most demonstrate diminishing returns after reaching a plateau of 210 (which, if you account for score inflation, would likely equate to a score of 215-220). Although certain specialties such as Dermatology and Plastic Surgery use hard cutoffs of 240 or higher, there is little data to suggest that students who clear those cutoffs are any more likely to pass their specialty board exams than students scoring a 220 on Step 1.
Then why have program directors relied on Step 1 scores so heavily for screening applicants? Simply put, Step 1 scores seemed to be the most objective and convenient measure by which to reduce the number of applications requiring a thorough review. Residency program directors have to sift through hundreds of applications while balancing their roles as clinicians, educators, and administrators. If you speak to any residency director, it’s clear they believe that every prospective applicant should be evaluated holistically, with a thorough evaluation of their entire application. However, the time required makes a holistic review of every application essentially impossible for any program to achieve. Consider, for example, that in 2019 a total of 47,012 applicants submitted an average of 92 applications each, for a total of over 4.3 million applications.
Furthermore, the number of applications continues to rise every year. With such a massive number of applications, cutoffs are more than just convenient in the residency application review process; they are necessary. It follows, then, that changing Step 1 to a pass/fail reporting system will not eliminate filters in the application process; rather, other metrics, such as the Step 2 CK score or medical school ranking, will simply take the place of the Step 1 score. Even if programs did review more applications individually as a result of the change in score reporting, program directors would likely have to spend less time on each one.
Where does this change fall short?
One of the major arguments against changing the score reporting for Step 1 is that this change doesn’t solve the problem that resulted in such a strong emphasis being placed on Step 1 scores in the first place. Beyond the practical limitations of the time that program directors have to review applications, there is a mismatch between the limited number of spots available in residency programs (especially in competitive specialties), and the number of US and IMG applicants who wish to apply for those spots. For most US allopathic seniors applying to residency, there is a limited number of programs in one’s desired specialty, which is further limited by geographical and other preferences. For IMGs, the situation is even more challenging, as not all residency programs sponsor visas. Program directors will still have as many applications to review, and there will still be a strong need for a seemingly objective metric that can help reduce the number of applications requiring holistic review.
While it can be argued that there will likely be increased emphasis on more holistic components of the application, such as clinical grades, research, and volunteer activity, Step 2 CK remains an appealing, standardized metric that can be easily implemented as a screening tool. As long as Step 2 CK results are reported as three-digit scores, it’s very likely that Step 2 CK will replace Step 1 as the de facto screening tool for residency interviews. In addition to this likely outcome (“Step 2 is the new Step 1”), the screening process may place a larger emphasis on which medical school a student attends. While this is good news for those who attend top-40 medical schools, it is disadvantageous for IMGs and those attending lower-ranked schools.
Where does this leave us?
Step 2 CK becoming “the new Step 1” has both potential benefits and downsides. Some of the negativity expressed by medical students towards Step 1 stems in part from a perceived overemphasis on minutiae that will likely have minimal clinical relevance. It can be argued that having students dedicate more effort towards a clinically oriented exam such as Step 2 CK could resolve this dissonance and even result in physicians beginning residency with a stronger clinical knowledge base. Many patients, for example, would prefer a clinician who spent more time learning about the management of a CHF exacerbation over one who can recite the Krebs cycle from memory.
Unfortunately, it’s not clear that this change in emphasis will translate into better clinical care. Like Step 1, Step 2 CK was designed to be a licensure exam. It also consists of multiple-choice questions in which patients are presented in a linear stem and five or more potential answer choices are fed to the student. The way the test is structured places emphasis on a specific set of skills, and some literature suggests that there is only a limited correlation between scores on this test and performance in clinical situations such as ACLS management. In many ways, the real-life process of clinical decision-making requires a very different cognitive approach than does a multiple-choice exam. Rather than emphasizing a broad differential diagnosis, multiple-choice exams can reward thought patterns like anchoring and premature closure, cognitive errors that can be detrimental in clinical practice.
Furthermore, it can be argued that the change in Step 1 score reporting is not actually advancing the NBME’s purported goal of reducing medical student anxiety. Whereas medical students previously had two opportunities to earn a high numeric score (Step 1 and Step 2 CK), now the Step 2 CK score must stand on its own, immensely increasing the pressure on the outcome of a single day of testing. Previously, those who did poorly on Step 1 had another chance to demonstrate their academic prowess by showing improvement on Step 2 CK. By making Step 2 CK the one and only opportunity for students to prove themselves via an exam score, we run the risk of students neglecting clinical learning opportunities during their clerkship year and instead using this time to focus on Step 2 CK exam preparation. Moreover, the singular emphasis on Step 2 CK is likely more distressing for those who suffer from test-taking anxiety, and it is often a more challenging exam than Step 1 for those who struggle with effectively implementing advanced test-taking strategies.
What lies in the future?
The announcement of Step 1 becoming a pass/fail exam has shaken up the established status quo of the US residency application system in many ways. Most would agree (ourselves included) that too much emphasis was placed on the three-digit Step 1 score and that a more holistic review of an applicant’s experiences, personality, and academic qualifications would be ideal. Unfortunately, unless the number of programs applicants can apply to is limited in some way, or the amount of time program directors have to review applications is dramatically increased, screening tools are a practical necessity. The residency application process, particularly for competitive specialties, would not function without an objective tool to filter the ever-growing number of applications down to a more manageable pool. Thus, so long as Step 2 CK retains a three-digit score, it will likely be what program directors look to in order to fulfill this need, perhaps in combination with a student’s medical school ranking.
So, what does this mean for me?
For the vast majority of IMGs and those who attend lower-tier medical schools, we highly recommend taking Step 1 by January 25th, 2022, the last day on which the exam will still be reported with a numeric score. If you are interested in a competitive specialty and tend to perform well on standardized exams, it is also likely sensible for you to take Step 1 while it still has a three-digit score. Finally, regardless of whether you take Step 1 before or after the pass/fail transition this coming January, your Step 2 CK score is certainly going to matter.
Charles Zhang, MD is a PGY-2 ophthalmology resident at the University at Buffalo. He tutors for the USMLE Step 1, Step 2 CK, and Step 3 and serves as a residency application advisor for USMLE Pro.
Alyssa Ehrlich, MD is currently a PGY-2 psychiatry resident at Brigham and Women’s Hospital. She is a Step 2 CK and Step 3 tutor, residency application advisor, and the founder of USMLE Pro.