Assessing the ability of Transformer-based Neural Models to represent structurally unbounded dependencies
- Jillian K Da Costa (University at Buffalo)
- Rui P Chaves (University at Buffalo)
Abstract
Filler-gap dependencies are among the most challenging syntactic constructions for computational models at large. Recently, Wilcox et al. (2018) and Wilcox et al. (2019b) provided some evidence suggesting that large-scale general-purpose LSTM RNNs have learned such long-distance filler-gap dependencies. In the present work we provide evidence that such models learn filler-gap dependencies only very imperfectly, despite being trained on massive amounts of data. Finally, we compare the LSTM RNN models with more modern state-of-the-art Transformer models, and find that these have poor-to-mixed degrees of success, despite their sheer size and low perplexity.
Keywords: GPT-2, BERT, XLNet, TransformerXL, Surprisal, Filler-gap Dependencies
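The surprisal metric listed among the keywords is the standard measure used in this line of work: the surprisal of a word is its negative log probability given its left context under the language model. As a minimal illustrative sketch (not code from the paper), the snippet below shows one common way to compute per-token surprisal with a pretrained GPT-2 model via the Hugging Face transformers library; the model name, example sentence, and variable names are assumptions for illustration only.

```python
# Illustrative sketch: per-token surprisal (in bits) from GPT-2.
# Surprisal of token t given its left context: -log2 P(w_t | w_<t).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Example sentence chosen for illustration; not an item from the paper's stimuli.
sentence = "I know what the lion devoured at sunrise."
ids = tokenizer(sentence, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits  # shape: (1, seq_len, vocab_size)

# Log-probabilities over the vocabulary at each position.
log_probs = torch.log_softmax(logits, dim=-1)

# Score each token against the distribution predicted from its preceding context,
# then convert from nats to bits.
target_ids = ids[0, 1:]
token_log_probs = log_probs[0, :-1, :].gather(1, target_ids.unsqueeze(1)).squeeze(1)
surprisals = -token_log_probs / torch.log(torch.tensor(2.0))

for tok, s in zip(tokenizer.convert_ids_to_tokens(target_ids), surprisals):
    print(f"{tok:>12s}  {s.item():6.2f} bits")
```

Higher surprisal at a critical region (e.g., a filled or unfilled gap site) is taken as evidence that the model found that continuation less expected given the preceding filler.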
How to Cite:
Da Costa, J. K., & Chaves, R. P. (2020). "Assessing the ability of Transformer-based Neural Models to represent structurally unbounded dependencies". Society for Computation in Linguistics 3(1), 189-198. doi: https://doi.org/10.7275/3sb6-4g20