Unlocking finite-state morphological transducers: Derivational networks for Inuit-Yupik languages
Abstract
Derivational morphology is underrepresented in existing computational resources, and finite-state morphological transducers (FSMTs) are a promising untapped source of derivational data, especially for low-resource, morphologically complex languages. This study presents a method for extracting Universal Derivations-style networks from FSMTs and applies it to Greenlandic and Saint Lawrence Island (SLI) Yupik, two Inuit-Yupik languages known for their extreme synthesis and agglutination. Using available FSMTs and monolingual corpora, our approach identifies derivationally related forms by analyzing surface-attested words and recursively stripping or modifying morphemes to infer unseen but grammatically implied intermediate forms. The resulting networks include over 53,000 lexemes for SLI Yupik and over 127,000 for Greenlandic, with thousands of non-trivial derivational families and hundreds of unique derivational morphemes. Some individual families contain hundreds of morphemes, reflecting the rich derivational structure of these languages. These results highlight the potential of FSMTs for generating large-scale, empirically grounded derivational resources for typologically diverse languages.
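The recursive stripping step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each attested word has already been analyzed by an FSMT into a root followed by derivational morphemes (inflection removed), and it only strips morphemes, without modeling the "modifying" case. The function names `build_network` and `families` and the placeholder labels such as `ROOT1` and `+A` are hypothetical.

```python
from collections import defaultdict


def build_network(analyses):
    """Collect lexemes and derivational edges from segmented analyses.

    `analyses` is an iterable of tuples (root, deriv_1, deriv_2, ...).
    This is an assumed, simplified stand-in for FSMT output.
    """
    lexemes = set()
    edges = set()  # (parent_lexeme, child_lexeme, derivational_morpheme)
    for analysis in analyses:
        current = tuple(analysis)
        lexemes.add(current)
        # Strip the last derivational morpheme until only the root is left.
        # Each stripped form is an implied intermediate lexeme, even if it
        # never occurs on the surface in the corpus.
        while len(current) > 1:
            parent = current[:-1]
            edges.add((parent, current, current[-1]))
            lexemes.add(parent)
            current = parent
    return lexemes, edges


def families(lexemes):
    """Group lexemes into derivational families keyed by their root."""
    grouped = defaultdict(set)
    for lexeme in lexemes:
        grouped[lexeme[0]].add(lexeme)
    return dict(grouped)


# Toy input with placeholder morpheme labels (not real language data).
attested = [("ROOT1", "+A", "+B"), ("ROOT1", "+C"), ("ROOT2",)]
lexemes, edges = build_network(attested)
fams = families(lexemes)
```

In this toy input, the form `("ROOT1", "+A")` is never attested directly, but it is inferred as the parent of `("ROOT1", "+A", "+B")`, which mirrors the idea of unseen but grammatically implied intermediate forms.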
Keywords: morphology, derivational morphology, finite state transducers, finite state morphology, derivational networks, Universal Derivations, Inuit-Yupik languages, polysynthesis, agglutination, Greenlandic
How to Cite:
Haley, C. (2025) “Unlocking finite-state morphological transducers: Derivational networks for Inuit-Yupik languages”, Society for Computation in Linguistics 8(1): 40. doi: https://doi.org/10.7275/scil.3172