Language Models and the Paradigmatic Axis
Abstract
The central role of large language models and of static and contextualized word embeddings in today's NLP research calls for accounts, from the linguist's point of view, of how these models process data. The goal of the present article is to frame language modeling objectives in structuralist terms: word embeddings are derived from models that attempt to quantify the probability of lexical items in a given context, and can therefore be understood as models of the paradigmatic axis. This reframing further allows us to demonstrate that, with some consideration given to how a word's context is formulated, training a simple model with a masked language modeling objective can yield paradigms that are both accurate and coherent from a theoretical linguistic perspective.
Keywords: language models, word embeddings, distributional semantics, structuralism
How to Cite:
Mickus, T., (2024) “Language Models and the Paradigmatic Axis”, Society for Computation in Linguistics 7(1), 63–74. doi: https://doi.org/10.7275/scil.2131
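To make the abstract's framing concrete, the following is a minimal sketch, not drawn from the article itself, of a masked language modeling objective: a toy model trained to assign probabilities to lexical items given their surrounding context. The corpus, window size, model dimensions, and the use of PyTorch are all illustrative assumptions rather than details of the author's setup.

```python
# Toy masked language modeling sketch (illustrative assumptions throughout):
# the model scores every vocabulary item as a candidate filler for a masked
# position, i.e. it quantifies the probability of lexical items in a context.
import torch
import torch.nn as nn

corpus = "the cat sat on the mat the dog sat on the rug".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

window = 2  # context words kept on each side of the masked position

# Build (context, target) pairs: mask one word, keep its neighbours.
pairs = []
for i, target in enumerate(corpus):
    context = corpus[max(0, i - window):i] + corpus[i + 1:i + 1 + window]
    if context:
        pairs.append(([idx[w] for w in context], idx[target]))

class MaskedLM(nn.Module):
    """Averages context embeddings and scores every vocabulary item."""
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, context_ids):
        # context_ids: LongTensor of shape (context_length,)
        hidden = self.emb(context_ids).mean(dim=0)
        return self.out(hidden)  # unnormalized scores over the vocabulary

model = MaskedLM(len(vocab))
opt = torch.optim.Adam(model.parameters(), lr=0.05)
loss_fn = nn.CrossEntropyLoss()

for _ in range(100):
    for context, target in pairs:
        opt.zero_grad()
        logits = model(torch.tensor(context))
        loss = loss_fn(logits.unsqueeze(0), torch.tensor([target]))
        loss.backward()
        opt.step()

# Query one slot: which words could fill "the ___ sat on"?
context = torch.tensor([idx[w] for w in ["the", "sat", "on"]])
probs = torch.softmax(model(context), dim=-1)
print(sorted(zip(vocab, probs.tolist()), key=lambda p: -p[1])[:3])
```

The items the model ranks highly for a fixed context are those substitutable in that position, which is the paradigmatic relation the article appeals to; the embedding matrix learned along the way is a by-product of this substitution-prediction task.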