Paper

Computing Ellipsis Constructions: Comparing Classical NLP and LLM Approaches

Authors
  • Damir Cavar (Indiana University)
  • Zoran Tiganj (Indiana University)
  • Ludovic Veta Mompelat (University of Miami)
  • Billy Dickson (Indiana University)

Abstract

State-of-the-art (SOTA) Natural Language Processing (NLP) technology faces significant challenges with constructions that contain ellipses. Although ellipsis is theoretically well documented and understood, there are not enough cross-linguistic language resources to document, study, and ultimately engineer NLP solutions that can adequately analyze ellipsis constructions. This article describes the typological data set on ellipsis that we created, which currently covers seventeen languages. We demonstrate that SOTA parsers based on a variety of syntactic frameworks fail to parse sentences with ellipsis, and that probabilistic, neural, and Large Language Models (LLMs) fail as well. We present experiments that focus on detecting sentences with ellipsis, predicting the position of elided elements, and predicting elided surface forms in the appropriate positions. We show that cross-linguistic variation of ellipsis-related phenomena has different consequences for the architecture of NLP systems.

Keywords: ellipsis, LLMs, LFG, dependency parsing, constituent parser, lexical-functional grammar

How to Cite:

Cavar, D., Tiganj, Z., Mompelat, L. V. & Dickson, B. (2024) “Computing Ellipsis Constructions: Comparing Classical NLP and LLM Approaches”, Society for Computation in Linguistics 7(1), 217–226. doi: https://doi.org/10.7275/scil.2147

Published on: 24 Jun 2024
Peer Reviewed