Discourse Sensitivity in Attraction Effects: The Interplay Between Language Model Size and Training Data
Abstract
While work on the linguistic ability of language models (LMs) is driven by a variety of aims, one dominant motivation is using LMs to determine what linguistic knowledge can be learned from unstructured text. The current work evaluates LMs on discourse sensitivity: the capability to distinguish between content that is more relevant and important to the discourse and content that is less so. We ground our evaluation by leveraging an existing psycholinguistic study of the number agreement attraction effect, a well-studied measure of human language comprehension. Based on empirical findings on how discourse modulates the attraction effect in humans, we establish three tests that LMs should pass if they demonstrate discourse sensitivity. We evaluated a total of 25 models varying in (i) model size (small or large) and (ii) training type (dialogue-based, plain, and instruction-based). The models showed systematicity in discourse sensitivity, though in ways dissimilar to humans, either over-relying on structural cues or over-relying on discourse cues. Notably, the models that patterned most similarly to humans were predominantly smaller models and models trained on dialogue-targeted data. We discuss the implications of these findings and the insights they offer into human language processing.
How to Cite:
Kim, S. & Davis, F. (2025) "Discourse Sensitivity in Attraction Effects: The Interplay Between Language Model Size and Training Data", Society for Computation in Linguistics 8(1): 14. doi: https://doi.org/10.7275/scil.3156