Apparent Communicative Efficiency in the Lexicon is Emergent
- Spencer Caplan (University of Pennsylvania)
- Jordan Kodner (University of Pennsylvania)
- Charles Yang (University of Pennsylvania)
Abstract
Is language designed for communicative and functional efficiency? G. K. Zipf famously argued that shorter words are more frequent because they are easier to use, thereby resulting in the statistical law that bears his name. Yet, G. A. Miller showed that even a monkey randomly typing at a keyboard, and intermittently striking the space bar, would generate “words” with similar statistical properties. Recent quantitative analyses of human language lexicons (Piantadosi et al., 2012) have revived Zipf's functionalist hypothesis. Ambiguous words tend to be short, frequent, and easy to articulate in language production. Such statistical findings are commonly interpreted as evidence for pressure for efficiency, as the context of language use often provides cues to overcome lexical ambiguity. In this study, we update Miller's monkey thought experiment to incorporate empirically motivated phonological and semantic constraints on the creation of words. We claim that the appearance of communicative efficiency is a spandrel (Gould & Lewontin, 1979), as lexicons formed without the context of language use or reference to communication or efficiency exhibit comparable statistical properties. Furthermore, the updated monkey model provides a good fit for the growth trajectory of English as recorded in the Oxford English Dictionary. Focusing on the history of English words since 1900, we show that lexicons resulting from the monkey model provide a better embodiment of communicative efficiency than the actual lexicon of English. We conclude by arguing for the need to go beyond correlational statistics and to seek direct evidence for the mechanisms that underlie principles of language design.
Keywords: Words, Computational modeling, Communication, Information theory
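As an illustration of the baseline the abstract starts from, Miller's original random-typing thought experiment can be sketched in a few lines of Python. This is only a minimal sketch of the classic version, not the updated model with phonological and semantic constraints that the paper proposes. The function names (`monkey_words`, `mean_frequency_by_length`), the 26-letter alphabet, the space probability of 0.2, and the number of keystrokes are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def monkey_words(n_keystrokes, alphabet="abcdefghijklmnopqrstuvwxyz",
                 p_space=0.2, seed=0):
    """Simulate random typing: each keystroke is a space with probability
    p_space, otherwise a uniformly random letter. Returns the list of
    non-empty "words" delimited by spaces (illustrative parameters only)."""
    rng = random.Random(seed)
    words, current = [], []
    for _ in range(n_keystrokes):
        if rng.random() < p_space:
            # A space ends the current word; consecutive spaces are ignored.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(rng.choice(alphabet))
    return words


def mean_frequency_by_length(words):
    """Average token count per observed word type, grouped by word length."""
    counts = Counter(words)
    by_length = {}
    for word, count in counts.items():
        by_length.setdefault(len(word), []).append(count)
    return {length: sum(cs) / len(cs)
            for length, cs in sorted(by_length.items())}


words = monkey_words(200_000)
mean_freq = mean_frequency_by_length(words)

# Shorter "words" come out more frequent on average, with no notion of
# communication or efficiency anywhere in the generating process.
for length in (1, 2, 3):
    print(length, round(mean_freq[length], 2))
```

Under these assumed settings, shorter strings are more frequent simply because there are fewer possible short strings and each is more probable, which is the point of Miller's argument as summarized in the abstract.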
How to Cite:
Caplan, S., Kodner, J. & Yang, C., (2021) “Apparent Communicative Efficiency in the Lexicon is Emergent”, Society for Computation in Linguistics 4(1), 349-350. doi: https://doi.org/10.7275/ncy0-dk32