Vowel length contrasts in deep learning: Generative Adversarial Phonology and duration

Author
  • Wing Sze KAT (The Chinese University of Hong Kong)

Abstract

This paper presents a model for the unsupervised learning of identity-based patterns, focusing on vowel length contrasts in speech. The model analyzes acoustic durations derived from raw speech data of several Southeast Asian languages, using deep convolutional neural networks. The paper outlines four generative tests demonstrating that the network encodes both phonetic properties and abstract processes such as vowel substitution, which parallels language acquisition in Cantonese and other Southeast Asian languages. Manipulating the categorical variables of the Categorical InfoWaveGAN (ciwGAN) not only produces long vowels from short ones but also simulates the complex tone-vowel interactions prevalent in these languages. Our findings on intensity and Voice Onset Time (VOT) indicate that the deep convolutional network mirrors certain language acquisition processes typical of human learners and of individuals with Autism Spectrum Disorder (ASD). This suggests that the way the model learns from the data reflects how people acquire the languages from which the data are drawn, underscoring the model's potential for advancing our understanding of language learning.

Keywords: Generative Adversarial Phonology, Thai, Vietnamese, Cantonese

How to Cite:

KAT, W. (2026) “Vowel length contrasts in deep learning: Generative Adversarial Phonology and duration”, Proceedings of the Annual Meetings on Phonology 2(1). doi: https://doi.org/10.7275/amphonology.3906

Published on
2026-04-09

Peer Reviewed