Editorial Changes and Item Performance: Implications for Calibration and Pretesting

Authors
  • Heather Stoffel
  • Mark R. Raymond
  • S. Deniz Bucak
  • Steven A. Haist

Abstract

Previous research on the impact of text and formatting changes on test-item performance has produced mixed results. The issue is important because it is generally acknowledged that any change to an item requires that the item be recalibrated. The present study investigated the effects of seven classes of stylistic changes on item difficulty, discrimination, and response time for a subset of 65 items from a standardized test for physician licensure completed by 31,918 examinees in 2012. One of two versions of each item (original or revised) was randomly assigned to examinees such that each examinee saw only two experimental items, and each item was administered to approximately 480 examinees. The stylistic changes had little or no effect on item difficulty or discrimination; however, one class of edits, changing an item from an open lead-in (incomplete statement) to a closed lead-in (direct question), did result in slightly longer response times. Data for nonnative speakers of English were analyzed separately, with nearly identical results. These findings have implications for the conventional practice of re-pretesting (or recalibrating) items that have been subjected to minor editorial changes.

Keywords: Test Construction

How to Cite:

Stoffel, H., Raymond, M. R., Bucak, S. D., & Haist, S. A. (2014). “Editorial Changes and Item Performance: Implications for Calibration and Pretesting.” Practical Assessment, Research, and Evaluation, 19(1), 14. https://doi.org/10.7275/yn9j-qn49
