Sound Analogies with Phoneme Embeddings
- Miikka P. Silfverberg (University of Colorado)
- Lingshuang Mao (University of Colorado)
- Mans Hulden (University of Colorado)
Abstract
Vector space models of words in NLP (word embeddings) have recently been shown to reliably encode semantic information, offering capabilities such as solving proportional analogy tasks like man:woman::king:queen. We study how well these distributional properties carry over to similarly learned phoneme embeddings, and whether phoneme vector spaces align with articulatory distinctive features, using several methods of obtaining such continuous-space representations. We demonstrate a statistically significant correlation between distinctive feature spaces and vector spaces learned with word-context PPMI+SVD and word2vec, showing that many distinctive feature contrasts are implicitly present in phoneme distributions. Furthermore, these distributed representations allow us to solve proportional analogy tasks with phonemes, such as "p" is to "b" as "t" is to "X", where the solution is X = "d". This effect is even stronger when a supervision signal is added: we extract phoneme representations from the embedding layer of a recurrent neural network trained to solve a word inflection task, i.e. a model that is made aware of word relatedness.
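Below is a minimal, illustrative sketch (not the authors' code) of the two ideas described in the abstract: learning phoneme embeddings from phoneme-in-word contexts with PPMI+SVD, and solving a proportional sound analogy such as p:b::t:? by vector arithmetic. The toy corpus, window size, and dimensionality are assumptions chosen only for illustration.

```python
# Sketch: PPMI+SVD phoneme embeddings and vector-offset sound analogies.
# Toy data and hyperparameters are assumptions, not the paper's setup.
import numpy as np

# Toy corpus: words as phoneme sequences, one symbol per phoneme.
corpus = ["pat", "bat", "tab", "dab", "pad", "bad", "tap", "dap",
          "pit", "bit", "tip", "dip", "pot", "bot", "top", "dop"]

phones = sorted({p for w in corpus for p in w})
idx = {p: i for i, p in enumerate(phones)}

# Co-occurrence counts of each phoneme with its neighbours in a +-1 window.
counts = np.zeros((len(phones), len(phones)))
for w in corpus:
    for i, p in enumerate(w):
        for j in range(max(0, i - 1), min(len(w), i + 2)):
            if j != i:
                counts[idx[p], idx[w[j]]] += 1

# Positive pointwise mutual information (PPMI).
total = counts.sum()
row = counts.sum(axis=1, keepdims=True)
col = counts.sum(axis=0, keepdims=True)
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((counts * total) / (row * col))
ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)

# Truncated SVD yields dense phoneme embeddings.
U, S, _ = np.linalg.svd(ppmi)
dim = 3  # small dimensionality for the toy data
emb = U[:, :dim] * S[:dim]

def solve_analogy(a, b, c):
    """Return the phoneme x whose vector is closest (cosine) to b - a + c."""
    target = emb[idx[b]] - emb[idx[a]] + emb[idx[c]]
    best, best_sim = None, -np.inf
    for p in phones:
        if p in (a, b, c):
            continue
        v = emb[idx[p]]
        sim = v @ target / (np.linalg.norm(v) * np.linalg.norm(target) + 1e-9)
        if sim > best_sim:
            best, best_sim = p, sim
    return best

# "p" is to "b" as "t" is to X; the voicing contrast suggests X = "d".
print(solve_analogy("p", "b", "t"))
```

On the realistic corpora studied in the paper, the same vector-offset procedure is what recovers contrasts such as voicing; the toy corpus here only mimics that distributional setup at a small scale.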
Keywords: phonology, embeddings, RNN, distributional similarity, word2vec
How to Cite:
Silfverberg, M. P., Mao, L. & Hulden, M., (2018) “Sound Analogies with Phoneme Embeddings”, Society for Computation in Linguistics 1(1), 136-144. doi: https://doi.org/10.7275/R5NZ85VD