Extended Abstract

Universal Dependencies and Semantics for English and Hebrew Child-directed Speech

Authors
  • Ida Szubert (University of Edinburgh)
  • Omri Abend (Hebrew University of Jerusalem)
  • Nathan Schneider (Georgetown University)
  • Samuel Gibbon (University of Edinburgh)
  • Sharon Goldwater (University of Edinburgh)
  • Mark Steedman (University of Edinburgh)

Abstract

While corpora of child speech and child-directed speech (CDS) have enabled major contributions to the study of child language acquisition, semantic annotation for such corpora is still scarce and lacks a uniform standard. We compile two CDS corpora—in English and Hebrew—with syntactic and semantic annotations. We employ a methodology that enforces a cross-linguistically consistent representation, building on recent advances in dependency representation and semantic parsing. Our semi-automatic syntactic annotation follows the Universal Dependencies standard (UD; de Marneffe et al., 2021), adapted to suit the CDS genre. To induce semantic forms, we develop an automatic method for transducing UD structures into sentential logical forms (LFs). The two representations have complementary strengths: UD structures are language-neutral and support direct annotation, whereas LFs are neutral as to the syntax-semantics interface, and transparently encode semantic distinctions.

Keywords: language acquisition, child-directed speech, corpus annotation, syntax-semantics interface, Universal Dependencies

How to Cite:

Szubert, I., Abend, O., Schneider, N., Gibbon, S., Goldwater, S. & Steedman, M. (2022) “Universal Dependencies and Semantics for English and Hebrew Child-directed Speech”, Society for Computation in Linguistics 5(1), 235–240. doi: https://doi.org/10.7275/2fhp-bf70

Published on
01 Feb 2022