Generics are puzzling. Can language models find the missing piece?
Abstract
Generic sentences make general statements about the world without explicit quantification. Although they are central to everyday communication, giving a precise account of their meaning is challenging, partly because people use generics to describe properties that vary widely in how prevalent they are.
In this study, we use language models to examine the implicit quantification of generics and their dependence on context. We developed a dataset called ConGen, which contains 2,873 naturally occurring generic and quantified sentences in context. We also introduce a metric called p-acceptability, which is based on surprisal and is sensitive to quantification.
Our experiments show that generics are more sensitive to context than sentences with explicit quantifiers, and that about 20% of the generics we analyzed express weak generalizations. We also investigate how human biases and stereotypes are reflected in language models.
Keywords: generics, quantifiers, language models
How to Cite:
Cilleruelo, G., Allaway, E., Haddow, B. & Birch, A. (2025) “Generics are puzzling. Can language models find the missing piece?”, Society for Computation in Linguistics 8(1): 49. doi: https://doi.org/10.7275/scil.3263