
NBME Score Interpretation: What the Research Says About Predicting Step 1 Performance

March 28, 2026 | 5 min read | By Dante

You just finished an NBME practice exam. You got a 62%. Is that passing? Should you schedule your exam or push it back two weeks?

The internet is full of anecdotal score correlations. Reddit threads where one person posts their NBME 28 score and fifteen commenters extrapolate in opposite directions. That is not useful. What IS useful is the published data on how well these exams actually predict Step 1 performance.

What Each Score Range Means

These ranges are based on aggregate data from CBSSA (Comprehensive Basic Science Self-Assessment) forms and their documented correlation with actual Step 1 outcomes.

Below 55%: Red Zone

Scores in this range correlate with a high probability of failing Step 1. At this level, there are likely significant content gaps across multiple organ systems. If you are scoring here during dedicated study, this is not the time to schedule your exam. It is the time to reassess your study approach entirely.

55-61%: Borderline

This is the danger zone. You are near the pass/fail threshold, and a bad day could tip you either way. Students in this range typically have foundational knowledge but struggle with application, reasoning under time pressure, or specific high-yield topics. A targeted intervention, not just more hours, is what changes outcomes here.

62-67%: Likely Passing

Most students scoring consistently in this range will pass. Foundational knowledge is in place. The focus at this level should shift toward QBank performance, weak-area targeting, and test-taking strategy refinement.

68-74%: Comfortable Pass

You are well above the passing threshold. At this point, your efforts should center on identifying and closing the remaining content gaps, refining test-day stamina through full-length practice exams, and reducing careless errors.

75%+: Strong Performance

Scores at this level indicate comprehensive mastery. Focus on maintaining your knowledge base through continued spaced repetition and identifying the handful of remaining weak spots. Avoid delaying the exam unnecessarily, as burnout becomes a bigger risk than under-preparation.
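If you log your scores in a spreadsheet or script, the ranges above can be turned into a simple lookup. The sketch below is only an illustration: the function name `score_band` is made up for this post, and it assumes that a fractional score such as 61.5% belongs to the lower band, which the ranges above do not specify.

```python
# Lower bounds of each band, taken from the ranges in this post.
# Assumption: a fractional score such as 61.5 falls in the lower band.
BANDS = [
    (75, "Strong Performance"),
    (68, "Comfortable Pass"),
    (62, "Likely Passing"),
    (55, "Borderline"),
]


def score_band(percent_correct: float) -> str:
    """Return the band name for a percent-correct score (0-100)."""
    if not 0 <= percent_correct <= 100:
        raise ValueError("percent_correct must be between 0 and 100")
    # Bands are listed from highest to lowest, so the first match wins.
    for lower_bound, name in BANDS:
        if percent_correct >= lower_bound:
            return name
    return "Red Zone"


print(score_band(62))  # Likely Passing
```

A band is a rough label, not a prediction. Treat it as a starting point for the questions in the rest of this post.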

Which Practice Exam Is Most Accurate?

Not all practice exams predict equally well. A study analyzing multiple self-assessment tools found that UWSA2 has the highest correlation with actual Step 1 scores, with an R-squared of 0.680 (PMC7198101). That means UWSA2 explains about 68% of the variance in Step 1 performance. For a single data point, that is remarkably strong.
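For readers who want the arithmetic behind that number, here is a short sketch. It assumes the reported R-squared comes from a simple linear fit with one predictor, in which case its square root is the size of the correlation coefficient. The variable names are illustrative only.

```python
import math

r_squared = 0.680  # reported for UWSA2 vs. Step 1

# Share of Step 1 score variance that UWSA2 does not account for.
unexplained = 1 - r_squared

# Assumption: single-predictor linear fit, so |r| = sqrt(R^2).
correlation = math.sqrt(r_squared)

print(f"Unexplained variance: {unexplained:.0%}")  # 32%
print(f"Correlation (|r|): {correlation:.2f}")     # 0.82
```

So a strong correlation of roughly 0.82 still leaves about a third of the variation in Step 1 scores unaccounted for.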

UWSA1, on the other hand, tends to overpredict by 10 to 15 points. Students routinely score lower on Step 1 than UWSA1 suggested they would. If UWSA1 is your primary benchmark, you may have a falsely reassuring picture.

The NBMEs (CBSSAs) fall in between. According to NBME's own published guidance (2023), CBSSA scores predict Step 1 within 13 points approximately 67% of the time when taken within one week of the exam. Bigach et al. (2020) found a tighter window: CBSSAs predicted Step 1 within roughly 8 points on average (PMC8368818). One detail from that study matters in practice: of the students whose Step 1 score fell more than one standard deviation from their CBSSA prediction, 88% scored above it. NBMEs rarely overpredict. If an NBME says you are passing, you probably are.

The practical ranking: UWSA2 > NBME > UWSA1.

Why Timing and Test Conditions Matter

A 2004 study (PubMed 15383390) found that CBSSA explained 62% of Step 1 score variation, but only when taken under timed, exam-like conditions. When students self-paced their practice exams (pausing the timer, looking things up, taking extended breaks), the predictive value dropped substantially.

This makes intuitive sense. If you pause an NBME to look up a lab value, you are no longer measuring what the exam measures. You are artificially inflating your score and generating a data point that tells you nothing about your readiness.

Every practice exam should replicate test-day conditions as closely as possible: timed blocks, no references, no phone, no extended breaks. If the score was not generated under these conditions, do not use it for decision-making.

Why You Need 3+ Data Points

A single NBME score is a snapshot, not a portrait. Even UWSA2, with its R-squared of 0.680, still leaves 32% of the variance unexplained. You could have had a good day or a bad one. The question selection might have aligned perfectly (or poorly) with your strengths.

The prediction becomes far more reliable when you aggregate multiple assessments. Three or more timed, full-length practice exams taken over the course of dedicated study give you a trend line, not just a single dot. You can see whether scores are climbing, plateauing, or declining. That trajectory matters more than any individual number.
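One way to put a number on that trajectory is a least-squares slope across your exams in the order you took them. This is a minimal sketch, not a validated method: it assumes the exams are roughly evenly spaced, and the `flat_tolerance` of 1 percentage point per exam is an arbitrary cutoff chosen for illustration.

```python
def trend_slope(scores: list[float]) -> float:
    """Least-squares slope of score against exam index (0, 1, 2, ...)."""
    n = len(scores)
    if n < 3:
        raise ValueError("need at least three scores to estimate a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(scores) / n
    numerator = sum((i - mean_x) * (s - mean_y) for i, s in enumerate(scores))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def describe_trend(scores: list[float], flat_tolerance: float = 1.0) -> str:
    """Label a score series; flat_tolerance is a hypothetical cutoff."""
    slope = trend_slope(scores)
    if slope > flat_tolerance:
        return "climbing"
    if slope < -flat_tolerance:
        return "declining"
    return "plateauing"


print(describe_trend([58, 63, 66]))  # climbing
```

With only three or four exams the slope is noisy, so read it as a direction, not a forecast.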

Ideally, your assessment sequence looks something like this: an early NBME to establish a baseline, one or two NBMEs in the middle of dedicated study to track progress, and UWSA2 about 7 to 10 days before your exam date as your final calibration point.

If your scores are consistently in the passing range across multiple assessments under timed conditions, the data says you are ready. If they are inconsistent or consistently borderline, the data says you need to address specific weaknesses before sitting for the real thing.
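That decision rule can be written down as a small check. The sketch below is one possible reading of it, not a clinical or official standard: the `PracticeExam` record and `readiness` function are invented for this example, the 62% floor is the lower bound of the "Likely Passing" range in this post, and any score not taken under timed, test-day conditions is simply discarded.

```python
from dataclasses import dataclass


@dataclass
class PracticeExam:
    name: str
    percent_correct: float
    timed: bool  # True only if taken under test-day conditions


PASSING_FLOOR = 62.0  # assumption: lower bound of "Likely Passing" in this post
MIN_EXAMS = 3         # the "3+ data points" rule of thumb


def readiness(exams: list[PracticeExam]) -> str:
    """Apply the rule of thumb to timed exams only."""
    usable = [e for e in exams if e.timed]
    if len(usable) < MIN_EXAMS:
        return "not enough data"
    if all(e.percent_correct >= PASSING_FLOOR for e in usable):
        return "ready"
    return "address weaknesses"


history = [
    PracticeExam("NBME (baseline)", 63, timed=True),
    PracticeExam("NBME (mid-dedicated)", 66, timed=True),
    PracticeExam("UWSA2", 69, timed=True),
]
print(readiness(history))  # ready
```

The exam names and scores in `history` are made-up sample data. An untimed score never counts toward the three, which is why a self-paced exam can leave you with "not enough data" even after four attempts.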

The Bottom Line

Practice exam scores are the single best predictor of Step 1 performance, but only when interpreted correctly. Use the right exams (UWSA2 and recent NBMEs), take them under real conditions, collect multiple data points, and pay attention to the trend. That combination gives you a genuinely reliable forecast of where you stand.

If you want a structured framework for tracking your readiness across all of these data points, the Study Blueprint walks through exactly when to take each assessment and how to interpret the results.

Not sure what your scores mean?

Book a free consult. Bring your NBME scores and I'll tell you exactly where you stand and what to do next.
