Do LLMs Understand Anaphora Accessibility?
Abstract
We propose the task of anaphora accessibility as a diagnostic for assessing discourse understanding and, to this end, present an evaluation dataset inspired by theoretical research in dynamic semantics. We evaluate human and LLM performance on our dataset and find that LLMs and humans align on some tasks and diverge on others. This divergence can be explained by LLMs' reliance on specific lexical items during language comprehension, in contrast to human sensitivity to structural abstractions.
Keywords: dynamic semantics, anaphora accessibility, large language model, targeted evaluation
How to Cite:
Zhu, X., Zhou, Z., Charlow, S. & Frank, R. (2025) “Do LLMs Understand Anaphora Accessibility?”, Society for Computation in Linguistics 8(1): 36. doi: https://doi.org/10.7275/scil.3154