Investigating Morphosyntactic Variation in African American English on Twitter

Authors
  • Tessa Masis (University of Massachusetts - Amherst)
  • Chloe Eggleston (University of Massachusetts - Amherst)
  • Lisa J Green (University of Massachusetts - Amherst)
  • Taylor Jones (Naval Postgraduate School)
  • Meghan Armstrong (University of Massachusetts - Amherst)
  • Brendan O'Connor (University of Massachusetts - Amherst)

Abstract

Early work on African American English (AAE) perpetuated myths that the language variety was uniform across regions and that it was spoken primarily by working-class men, because that work was conducted in inner-city areas and examined a specific set of linguistic features. These sociolinguistic myths negatively impacted not only the field of linguistics but also how the public viewed AAE. Since then, studies have looked at a broader range of geographical areas and demonstrated distinct local differences. Here we build on this line of research by analyzing relative incidences of 18 morphosyntactic features in relation to geographic and social factors, at scale. Our data is a corpus of 224M geotagged tweets, which is five orders of magnitude larger than the corpora used in previous social media studies of AAE. We use machine learning methods to automatically detect linguistic features and to identify common patterns of variation across the features. Our results show that, contrary to sociolinguistic myths of uniformity, there is clear variation in AAE across both geographic and social dimensions. Regionally, we see a distinct spatially contiguous southern core which aligns with national-level phonological and lexical variation in AAE, although it is less variable. Across social groups, there is higher AAE usage in the rural south and in Black-Hispanic contact communities, both of which are groups currently underrepresented in the literature. This work provides a significant advance in descriptive work on AAE morphosyntax, presenting the first national-level description and analysis of overall grammatical variation in AAE to answer key questions about that variation. More broadly, our methods demonstrate how machine learning tools can be applied to large-scale real-world data to help us gain a more representative understanding of language in marginalized communities.

Keywords: natural language processing, sociolinguistics, language variation, social media, machine learning, African American English

How to Cite:

Masis, T., Eggleston, C., Green, L. J., Jones, T., Armstrong, M. & O'Connor, B. (2023) “Investigating Morphosyntactic Variation in African American English on Twitter”, Society for Computation in Linguistics 6(1), 392-393. doi: https://doi.org/10.7275/zdg0-0914

Published on
01 Jun 2023