
How raters differ: A study of structured oral mathematics assessment

Author
  • Samuel Sollerman orcid logo (Stockholm University)

Abstract

This article explores where and why raters disagree in structured oral mathematics assessments. Drawing on Swedish data from 74 student performances across three national upper-secondary oral test formats, the study had six experienced raters evaluate student reasoning, communication, and method using shared rubrics. Despite the structured tasks, raters showed substantial disagreement, especially when evaluating reasoning and communication. Multiple agreement measures and Svensson’s method were used to identify both systematic and unsystematic patterns of divergence. Raters also reported high confidence in their scoring, and this confidence was often misaligned with their actual agreement. Using a typological framework for oral assessment, the study shows how format structure and interaction influence how raters interpret and score performances. These findings underscore the need for reliable assessment practices in systems increasingly focused on competencies and accountability. The article identifies four strategies for improving scoring consistency (enhanced rubrics, rater training, reflective tools, and collaborative assessment) and argues that reliable oral assessment is both important and achievable.
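The agreement analysis summarized above rests on standard measures for paired ordinal ratings. As a rough illustration only (not the study’s actual analysis or data), the sketch below computes two such measures for a single rater pair: raw percent agreement and quadratically weighted Cohen’s kappa, which gives partial credit for near-misses on the ordinal scale. The scores and the E/C/A grade labels are invented for the example; Svensson’s method itself, which further separates systematic from random disagreement, is not reproduced here.

```python
# A minimal sketch of two inter-rater agreement measures, assuming two
# raters score the same performances on a shared ordinal grade scale.
# All data below are hypothetical and for illustration only.

from collections import Counter

def percent_agreement(a, b):
    """Share of performances on which the two raters gave identical scores."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def weighted_kappa(a, b, categories):
    """Quadratically weighted Cohen's kappa for paired ordinal ratings."""
    k = len(categories)
    idx = {c: i for i, c in enumerate(categories)}
    n = len(a)
    # Quadratic agreement weights: full credit for exact matches,
    # decaying with squared distance on the ordinal scale.
    w = [[1 - ((i - j) ** 2) / ((k - 1) ** 2) for j in range(k)]
         for i in range(k)]
    # Observed weighted agreement from the joint distribution of scores.
    joint = Counter(zip(a, b))
    p_obs = sum(w[idx[x]][idx[y]] * cnt / n for (x, y), cnt in joint.items())
    # Expected weighted agreement from the raters' marginal distributions.
    ma, mb = Counter(a), Counter(b)
    p_exp = sum(
        w[idx[x]][idx[y]] * (ma[x] / n) * (mb[y] / n)
        for x in categories for y in categories
    )
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical scores (grade levels E < C < A) for ten performances.
rater_1 = ["E", "C", "C", "A", "E", "C", "A", "E", "C", "A"]
rater_2 = ["E", "C", "A", "A", "C", "C", "A", "E", "E", "A"]

print(f"percent agreement: {percent_agreement(rater_1, rater_2):.2f}")
print(f"weighted kappa:    {weighted_kappa(rater_1, rater_2, ['E', 'C', 'A']):.2f}")
```

On the invented data this prints an agreement of 0.70 but a weighted kappa of about 0.77, illustrating why the abstract stresses multiple measures: chance-corrected and weighted statistics can tell a different story than raw agreement alone.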

Keywords: Oral assessment, Inter-rater reliability, Mathematics education, Rater judgement, Structured assessment formats

How to Cite:

Sollerman, S. (2026) “How raters differ: A study of structured oral mathematics assessment”, Practical Assessment, Research, and Evaluation 31(1): 2. doi: https://doi.org/10.7275/pare.3268


Published on 2026-01-08

Peer Reviewed