Look at that! BERT can be easily distracted from paying attention to morphosyntax
- Rui P. Chaves (University at Buffalo)
- Stephanie N. Richter (University at Buffalo)
Abstract
Syntactic knowledge involves not only the ability to combine words and phrases, but also the capacity to relate different yet truth-preserving structural variations (e.g. passivization, inversion, topicalization, extraposition, clefting, etc.), as well as the ability to infer that these syntactic variations all adhere to common morphosyntactic rules, like subject-verb agreement. Although there is some evidence that BERT has rich syntactic knowledge, our adversarial approach suggests that it is not deployed in a robust and linguistically appropriate way. English BERT can be tricked into missing even quite simple syntactic generalizations when compared with GPT-2, underscoring the need for stronger priors and for linguistically controlled experiments in evaluation.
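The adversarial probes described in the abstract hinge on comparing a masked language model's probabilities for agreeing versus non-agreeing verb forms. The sketch below is a minimal illustration of that general scoring technique, not the authors' actual experimental code; it assumes the HuggingFace transformers library and the bert-base-uncased checkpoint, and the function name, candidate verbs, and example sentence are hypothetical choices for demonstration.

```python
# Minimal sketch (not the paper's code) of scoring subject-verb agreement
# with BERT's masked LM head, assuming HuggingFace transformers.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def verb_probabilities(sentence: str, verbs: list[str]) -> dict[str, float]:
    """Return P(verb | context) for each candidate verb at the [MASK] slot."""
    inputs = tokenizer(sentence, return_tensors="pt")
    # Locate the single masked position in the tokenized input.
    mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
    with torch.no_grad():
        logits = model(**inputs).logits[0, mask_index]
    probs = torch.softmax(logits, dim=-1)
    return {v: probs[tokenizer.convert_tokens_to_ids(v)].item() for v in verbs}

# A classic agreement-attraction probe: the head noun "key" is singular,
# so a morphosyntactically robust model should prefer "is" over "are".
print(verb_probabilities("The key to the cabinets [MASK] on the table.", ["is", "are"]))
```

For an autoregressive model such as GPT-2, the analogous comparison would use the surprisal (negative log probability) of each verb form given the left context rather than a masked-slot probability.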
Keywords: Attention, surprisal, subject-verb agreement, BERT, GPT-2, adversarial, syntactic paraphrase, filler-gap dependencies
How to Cite:
Chaves, R. P., & Richter, S. N. (2021). "Look at that! BERT can be easily distracted from paying attention to morphosyntax". Society for Computation in Linguistics 4(1), 28-38. doi: https://doi.org/10.7275/b92s-qd21