Extended Abstract

Topical advection as a baseline model for corpus-based lexical dynamics

Authors
  • Andres Karjus (Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, University of Edinburgh)
  • Richard A Blythe (Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, University of Edinburgh; School of Physics and Astronomy, University of Edinburgh)
  • Simon Kirby (Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, University of Edinburgh)
  • Kenny Smith (Centre for Language Evolution, School of Philosophy, Psychology and Language Sciences, University of Edinburgh)

Abstract

A central question in corpus-based research on evolutionary language dynamics is how to distinguish selection-driven linguistic change from neutral evolution, and from change stemming from language-external factors (cultural drift). A commonly used proxy for the popularity or selective fitness of a linguistic element is its corpus frequency. However, several recent works have pointed out that raw frequencies can often be misleading. We propose a model that controls for drift in contextual topics in corpora, the topical-cultural advection model, and demonstrate that this simple measure accounts for a considerable amount of the variability in word frequency changes in a corpus spanning two centuries of language use.
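The abstract does not spell out how the advection measure is computed, so the following is only an illustrative sketch of the general idea of controlling for topical drift. It assumes that a word's advection value can be taken as the weighted mean of the log frequency changes of its associated context (topic) words, and that subtracting this from the word's own log frequency change gives a topic-adjusted change. The function names, the weights, the subtraction step, and all numbers are hypothetical and are not taken from the paper.

```python
import math


def log_change(freq_then, freq_now):
    """Log ratio of a word's frequency between two time periods."""
    return math.log(freq_now / freq_then)


def advection(topic_weights, freq_then, freq_now):
    """Weighted mean log frequency change of a word's topic words.

    topic_weights maps each context word to an association weight
    (hypothetical; any non-negative association score could be used).
    """
    total = sum(topic_weights.values())
    weighted = sum(
        weight * log_change(freq_then[word], freq_now[word])
        for word, weight in topic_weights.items()
    )
    return weighted / total


# Toy per-million frequencies, invented purely for illustration.
freq_then = {"telegraph": 40.0, "wire": 20.0, "message": 50.0}
freq_now = {"telegraph": 10.0, "wire": 10.0, "message": 50.0}

# Assumed topic words of "telegraph" with assumed association weights.
weights = {"wire": 2.0, "message": 1.0}

target_change = log_change(freq_then["telegraph"], freq_now["telegraph"])
topic_change = advection(weights, freq_then, freq_now)

# The part of the target's change not accounted for by its topic.
residual = target_change - topic_change
print(target_change, topic_change, residual)
```

In this toy case the target word declines more steeply than its topic does, so part of the decline remains after the topical component is removed. A real analysis would need a principled choice of topic words and weights, which this sketch leaves open.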

Keywords: language dynamics, language evolution, topical advection, corpora

How to Cite:

Karjus, A., Blythe, R. A., Kirby, S. & Smith, K. (2018) “Topical advection as a baseline model for corpus-based lexical dynamics”, Society for Computation in Linguistics 1(1), 186-188. doi: https://doi.org/10.7275/R5RR1WFX

Published on
01 Jan 2018