Assessing the ability of Transformer-based Neural Models to represent structurally unbounded dependencies
- Jillian K Da Costa (University at Buffalo)
- Rui P Chaves (University at Buffalo)
Abstract
Filler-gap dependencies are among the most challenging syntactic constructions for computational models at large. Recently, Wilcox et al. (2018) and Wilcox et al. (2019b) provided some evidence suggesting that large-scale general-purpose LSTM RNNs have learned such long-distance filler-gap dependencies. In the present work we provide evidence that such models learn filler-gap dependencies only very imperfectly, despite being trained on massive amounts of data. Finally, we compare the LSTM RNN models with more modern state-of-the-art Transformer models, and find that these have poor-to-mixed degrees of success, despite their sheer size and low perplexity.
Keywords: GPT-2, BERT, XLNet, TransformerXL, Surprisal, Filler-gap Dependencies
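The surprisal metric listed among the keywords is the standard measure used in this line of work: the surprisal of a word is its negative log probability given its left context under the language model. As a minimal illustrative sketch (not code from the paper), the snippet below shows one common way to compute per-token surprisal with a pretrained GPT-2 model via the Hugging Face transformers library; the model name, example sentence, and variable names are assumptions for illustration only.

```python
# Illustrative sketch: per-token surprisal (in bits) from GPT-2.
# Surprisal of token t given its left context: -log2 P(w_t | w_<t).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Example sentence chosen for illustration; not an item from the paper's stimuli.
sentence = "I know what the lion devoured at sunrise."
ids = tokenizer(sentence, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

# Log-probabilities over the vocabulary at each position.
log_probs = torch.log_softmax(logits, dim=-1)

# Score each token against the distribution predicted from its preceding context,
# then convert from nats to bits.
target_ids = ids[0, 1:]
token_log_probs = log_probs[0, :-1, :].gather(1, target_ids.unsqueeze(1)).squeeze(1)
surprisals = -token_log_probs / torch.log(torch.tensor(2.0))

for tok, s in zip(tokenizer.convert_ids_to_tokens(target_ids), surprisals):
    print(f"{tok:>12s}  {s.item():6.2f} bits")
```

Higher surprisal at a critical region (e.g., a filled or unfilled gap site) is taken as evidence that the model found that continuation less expected given the preceding filler.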
How to Cite:
Da Costa, J. K., & Chaves, R. P. (2020). "Assessing the ability of Transformer-based Neural Models to represent structurally unbounded dependencies". Society for Computation in Linguistics 3(1), 189-198. doi: https://doi.org/10.7275/3sb6-4g20