Multiple alignments of inflectional paradigms
- Sacha Beniamine (University of Surrey)
- Matías Guzmán Naranjo (Université de Paris)
Abstract
Most models of inflectional morphology rely at their core on the identification of recurrent and diverging material across inflected forms. Across theoretical frameworks, this can be expressed in terms of morpheme segmentation, rules, processes, patterns or analogies.
Finding these recurrences in large structured lexicons is an important step in empirical computational morphology, where analyses are induced bottom-up from inflected forms. This can be done by aligning all the forms in each paradigm, an instance of Multiple Sequence Alignment (MSA), a task well known in other fields such as evolutionary biology and historical linguistics.
In this paper, we present the specific problems which arise when aligning inflected forms, provide a simple alignment format, define evaluation measures and compare two implemented methods on 13 inflectional lexicons. Our intent is to provide the conditions for the inter-operability of future systems, and for incremental improvements in this fundamental step for quantitative morphology.
Keywords: inflection, paradigms, stem, marker, multiple sequence alignment, MSA, alignment, LCS, longest common subsequence, quantitative, typology
How to Cite:
Beniamine, S. & Guzmán Naranjo, M. (2021) “Multiple alignments of inflectional paradigms”, Society for Computation in Linguistics 4(1), 216-227. doi: https://doi.org/10.7275/ymc0-p491