Masked language models directly encode linguistic uncertainty
- Cassandra Jacobs (SUNY University at Buffalo)
- Ryan J. Hubbard (University of Illinois at Urbana-Champaign)
- Kara D. Federmeier (University of Illinois at Urbana-Champaign)
Abstract
Large language models (LLMs) have recently been used as models of psycholinguistic processing, usually by focusing on lexical or syntactic surprisal. However, this approach discards the representations of utterance meaning (e.g., hidden states) that LLMs use to predict upcoming words. The present work explores whether the hidden state representations of LLMs encode uncertainty relevant to human language processing. We assess this possibility using sentence contexts from Federmeier et al. (2007) that are either strongly or weakly predictive of a final word. Using a machine learning approach, we tested and confirmed that LLMs encode uncertainty in their hidden states.
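To make the probing approach concrete, the sketch below shows one way such an analysis could be set up: extract a masked language model's hidden state at the to-be-predicted position and train a simple linear classifier to decode sentence constraint. This is an illustrative sketch, not the authors' released code; the model (bert-base-uncased), the toy sentence frames, and the logistic-regression probe are assumptions, and the real study used strongly versus weakly constraining frames from Federmeier et al. (2007).

```python
# Illustrative sketch: probe a masked LM's hidden state at the [MASK] position
# for sentence constraint (strong vs. weak). Model, frames, and probe are
# placeholders, not the materials or pipeline from the paper.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

# Hypothetical items; label 1 = strongly constraining, 0 = weakly constraining.
frames = [
    ("He mailed the letter without a [MASK].", 1),
    ("She looked around the room for a [MASK].", 0),
]

features, labels = [], []
with torch.no_grad():
    for text, label in frames:
        inputs = tokenizer(text, return_tensors="pt")
        # Locate the masked (to-be-predicted) token position.
        mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
        hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)
        features.append(hidden[0, mask_pos].squeeze(0).numpy())
        labels.append(label)

# Linear probe decoding constraint from the hidden states; with a full item set
# one would cross-validate rather than report training fit.
probe = LogisticRegression(max_iter=1000)
probe.fit(features, labels)
print("training accuracy:", probe.score(features, labels))
```

Above-chance decoding under cross-validation with a sufficiently large item set would indicate that constraint (and hence uncertainty about the upcoming word) is linearly recoverable from the hidden states.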
Keywords: natural language processing, psycholinguistics, neural network, prediction
How to Cite:
Jacobs, C., Hubbard, R. J., & Federmeier, K. D. (2022). "Masked language models directly encode linguistic uncertainty", Society for Computation in Linguistics 5(1), 225-228. doi: https://doi.org/10.7275/znzq-3m28