The Role of Linguistic Features in Domain Adaptation: TAG Parsing of Questions

Aarohi Srivastava; Robert Frank; Sarah Widder; David Chartash

doi:10.7275/7gvd-cq20

Paper

Authors

Aarohi Srivastava (Yale University)
Robert Frank (Yale University)
Sarah Widder (Yale University)
David Chartash (Yale University)

Abstract

The analysis of sentences outside the domain of the training data poses a challenge for contemporary syntactic parsing. The Penn Treebank corpus, commonly used for training constituency parsers, systematically undersamples certain syntactic structures. We examine parsing performance in Tree Adjoining Grammar (TAG) on one such structure: questions. To avoid the expensive process of hand-annotating a new training set that includes out-of-domain sentences, we explore an alternative method requiring considerably less annotation effort. Our method is based on three key ideas. First, pursuing the intuition that “supertagging is almost parsing” (Bangalore and Joshi, 1999), we decompose the parsing process into two distinct stages, supertagging and stapling. Second, following Rimell and Clark (2008), we train the supertagger on an extended dataset that includes questions and use the resulting supertags with an unmodified parser. Third, to maximize the improvements gained from additional training of the supertagger, we provide the parser with linguistically significant features that reflect commonalities across supertags. This novel combination of ideas improves question parsing accuracy by 13% LAS, suggesting that a parser can be adapted to a new domain with limited data through the careful integration of linguistic knowledge.

Keywords: Tree Adjoining Grammar, Syntactic Parsing, Domain Adaptation

How to Cite:

Srivastava, A., Frank, R., Widder, S. & Chartash, D. (2020) “The Role of Linguistic Features in Domain Adaptation: TAG Parsing of Questions”, Society for Computation in Linguistics 3(1), 423-434. doi: https://doi.org/10.7275/7gvd-cq20

Published on
2020-01-01

License

Creative Commons Attribution 4.0

Publication details

Pages: 423-434
Submitted on: 2019-10-13

File Checksums (MD5)

PDF: 661543d33634b3c4e981250a6e14e0d6
