Ask a Test Scientist: Why Did My Quarterly Questions Score Change?


Last month, you may have received an email from the ABO indicating that your score on the 2019 Quarterly Questions assessment changed. You might be wondering why and how that happened.

Each year, the ABO reviews the performance of all questions in the Quarterly Questions activity during a process known as Key Validation. If a question is deemed “flawed” during this review, individuals are given credit for the item regardless of what answer they entered.

What is a flawed item? A flawed item might contain an answer that is debatable or controversial, or it might contain an important contextual or typographical error. If, for any reason, our team of subject matter experts determines that an item was not a fair question, we remove it from scoring and award credit.

Why do you check for errors after the assessment has gone live? Although each question goes through a thorough quality assurance process prior to publication, some items do not perform as expected, and diplomate feedback sometimes makes us aware of new information relevant to a question. Checking the items both before AND after the activity has launched helps to improve the fairness, validity, and reliability of the assessment.

Let’s look at a question that appeared in the cataract subspecialty section. It read:

A patient with bilateral visually significant cataract would like to minimize the use of glasses following surgery. Corneal topography of the left eye is shown (the right eye is similar). Which of the following is the most appropriate surgical plan to help this patient achieve the desired refractive outcome?

  A. Monofocal intraocular lenses for monovision with limbal relaxing incisions

  B. Non-toric monofocal intraocular lenses for monovision

  C. Paired non-toric multifocal intraocular lenses

  D. Paired toric multifocal intraocular lenses

This question was flagged for two reasons. First, almost 11% of diplomates did not enter an answer within the allotted 60 seconds. The overall average time to answer a question is about 25 seconds, and for the vast majority of items, between 0 and 1% of diplomates fail to submit an answer within the allotted time, so this item stood out. Second, of the diplomates who submitted an answer, only 54% answered correctly (choice B). Among a generally high-performing group, this low performance level indicates a possible problem with the question. Several diplomates commented that the vignette was too long to be answered within 60 seconds, and some opined that the history should have indicated that the patient had undergone prior radial keratotomy.
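For readers curious how this kind of screening could be automated, here is a minimal Python sketch of the two checks described above: the share of diplomates who did not answer in time, and the share of answers that were correct. The cutoff values, the class and function names, and the respondent counts are all assumptions made for illustration. The article does not state the ABO's actual flagging criteria or sample sizes. The counts were chosen only so that they reproduce the roughly 11% and 54% figures quoted above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cutoffs for illustration only; not the ABO's actual criteria.
MAX_OMIT_RATE = 0.05           # flag if more than 5% of diplomates timed out
MIN_PROPORTION_CORRECT = 0.60  # flag if fewer than 60% of answers were correct


@dataclass
class ItemStats:
    n_administered: int  # diplomates who were shown the item
    n_omitted: int       # diplomates who did not answer within the time limit
    n_correct: int       # diplomates who chose the keyed answer

    @property
    def omit_rate(self) -> float:
        return self.n_omitted / self.n_administered

    @property
    def proportion_correct(self) -> float:
        # Computed among diplomates who actually submitted an answer.
        answered = self.n_administered - self.n_omitted
        return self.n_correct / answered if answered else 0.0


def flag_reasons(stats: ItemStats) -> List[str]:
    """Return the reasons (if any) an item should be sent for expert review."""
    reasons = []
    if stats.omit_rate > MAX_OMIT_RATE:
        reasons.append("high omit rate")
    if stats.proportion_correct < MIN_PROPORTION_CORRECT:
        reasons.append("low proportion correct")
    return reasons


# Made-up counts that mirror the percentages quoted for the cataract item.
cataract_item = ItemStats(n_administered=1000, n_omitted=110, n_correct=481)
# Made-up counts for an item that behaves as most items do.
typical_item = ItemStats(n_administered=1000, n_omitted=5, n_correct=850)

print(flag_reasons(cataract_item))
print(flag_reasons(typical_item))
```

A flag in a sketch like this only nominates an item for human review. The decision about whether the item is actually flawed rests with the subject matter experts.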

After an item is flagged, it is sent to one or more subject matter experts for review along with any comments that diplomates submitted about the item. For this question, the reviewer concurred with the concerns raised by our diplomates. The item was deemed flawed, scored as correct for all diplomates who took it, and replaced with a new question for future assessments.
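The effect of this rescoring on an individual score can be illustrated with a short Python sketch. The item IDs, answer key, and responses below are hypothetical, and the function is only one plausible way to express "award credit to everyone who took the flawed item". It assumes that a diplomate who timed out on a flawed item also receives credit, which the article does not spell out.

```python
from typing import Dict, FrozenSet, Optional


def score(
    responses: Dict[str, Optional[str]],
    answer_key: Dict[str, str],
    flawed_items: FrozenSet[str] = frozenset(),
) -> int:
    """Count the items a diplomate gets credit for.

    responses maps each administered item ID to the chosen option,
    or to None if no answer was entered in time.
    """
    total = 0
    for item_id, choice in responses.items():
        # A flawed item earns credit regardless of the response entered.
        if item_id in flawed_items or choice == answer_key[item_id]:
            total += 1
    return total


# Hypothetical answer key and one diplomate's responses.
answer_key = {"Q1": "A", "Q2": "B", "Q3": "D"}
responses = {"Q1": "A", "Q2": "C", "Q3": None}

original_score = score(responses, answer_key)
rescored = score(responses, answer_key, flawed_items=frozenset({"Q2"}))
print(original_score, rescored)
```

Because a flawed item can only add credit, a score in this sketch can go up or stay the same after Key Validation, but it cannot go down.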


© 2020 American Board of Ophthalmology