Structure Here, Bias There: Hierarchical Generalization by Jointly Learning Syntactic Transformations
- Karl Mulligan (Johns Hopkins University)
- Robert Frank (Yale University)
- Tal Linzen (New York University)
Abstract
When learning syntactic transformations, children consistently induce structure-dependent generalizations, even though the primary linguistic data may be consistent with both linear and hierarchical rules. What is the source of this inductive bias? In this paper, we use computational models to investigate the hypothesis that evidence for the structure-sensitivity of one syntactic transformation can bias the acquisition of another transformation in favor of a hierarchical rule. We train sequence-to-sequence models based on artificial neural networks to learn multiple syntactic transformations at the same time in a fragment of English; we hold out cases that disambiguate linear and hierarchical rules for one of those transformations, and then test for hierarchical generalization to these held-out sentence types. Consistent with our hypothesis, we find that multitask learning induces a hierarchical bias for certain combinations of tasks, and that this bias is stronger for transformations that share computational building blocks. At the same time, the bias is in general insufficient to lead the learner to categorically acquire the hierarchical generalization for the target transformation.
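The abstract does not specify the architecture or training data beyond "sequence-to-sequence models based on artificial neural networks," so the following is only a minimal sketch of the experimental logic under assumed choices: a PyTorch GRU encoder-decoder, task-prefix tokens (here `QUEST` and `DECL`) to signal which transformation to apply, toy sentences in place of the English fragment, and the names `Seq2Seq` and `greedy_decode`, all of which are illustrative rather than taken from the paper. The key idea it mirrors is that examples disambiguating linear from hierarchical rules for the target transformation are withheld from training and only appear at test time.

```python
# Minimal sketch (assumptions: GRU encoder-decoder, task-prefix tokens, toy data).
# The paper trains seq2seq networks jointly on several English transformations;
# here "QUEST" (question formation) and "DECL" stand in for them, and the
# hierarchical/linear split is only illustrative.
import random
import torch
import torch.nn as nn

# Toy parallel data: (task token + source sentence, target sentence).
# Disambiguating QUEST examples (relative clause on the subject) are withheld
# from training so that linear and hierarchical rules diverge only at test time.
train_pairs = [
    ("QUEST the newt does see the yak .", "does the newt see the yak ?"),
    ("DECL the newt that does sing does see the yak .",
     "the newt that does sing does see the yak ."),
]
test_pairs = [
    ("QUEST the newt that does sing does see the yak .",
     "does the newt that does sing see the yak ?"),  # hierarchical target
]

# Shared vocabulary over sources and targets.
tokens = {tok for s, t in train_pairs + test_pairs for tok in (s + " " + t).split()}
itos = ["<pad>", "<sos>", "<eos>"] + sorted(tokens)
stoi = {t: i for i, t in enumerate(itos)}

def encode(sent):
    return torch.tensor([stoi["<sos>"]] + [stoi[t] for t in sent.split()] + [stoi["<eos>"]])

class Seq2Seq(nn.Module):
    def __init__(self, vocab, dim=64):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.GRU(dim, dim, batch_first=True)
        self.dec = nn.GRU(dim, dim, batch_first=True)
        self.out = nn.Linear(dim, vocab)

    def forward(self, src, tgt_in):
        _, h = self.enc(self.emb(src))
        dec_out, _ = self.dec(self.emb(tgt_in), h)
        return self.out(dec_out)

model = Seq2Seq(len(itos))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Joint (multitask) training: examples from all transformations are interleaved.
for step in range(200):
    src_s, tgt_s = random.choice(train_pairs)
    src, tgt = encode(src_s).unsqueeze(0), encode(tgt_s).unsqueeze(0)
    logits = model(src, tgt[:, :-1])
    loss = loss_fn(logits.reshape(-1, len(itos)), tgt[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()

@torch.no_grad()
def greedy_decode(src_s, max_len=20):
    # Decode the held-out disambiguating sentence type: does the model move the
    # main-clause auxiliary (hierarchical) or the linearly first one?
    src = encode(src_s).unsqueeze(0)
    _, h = model.enc(model.emb(src))
    tok, out = torch.tensor([[stoi["<sos>"]]]), []
    for _ in range(max_len):
        dec_out, h = model.dec(model.emb(tok), h)
        tok = model.out(dec_out)[:, -1].argmax(-1, keepdim=True)
        if itos[tok.item()] == "<eos>":
            break
        out.append(itos[tok.item()])
    return " ".join(out)

for src_s, tgt_s in test_pairs:
    print(greedy_decode(src_s), "| hierarchical target:", tgt_s)
```

In the sketch, comparing the decoded output against the hierarchical target (versus a linearly derived alternative) plays the role of the paper's test for hierarchical generalization on held-out sentence types.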
Keywords: structure dependence, poverty of the stimulus, inductive bias, multitask learning
How to Cite:
Mulligan, K., Frank, R., & Linzen, T. (2021). "Structure Here, Bias There: Hierarchical Generalization by Jointly Learning Syntactic Transformations". Society for Computation in Linguistics 4(1), 125-135. doi: https://doi.org/10.7275/j0es-xf97