11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Unsupervised Learning of Vowels from Continuous Speech Based on Self-Organized Phoneme Acquisition Model

Kouki Miyazawa (1), Hideaki Kikuchi (1), Reiko Mazuka (2)

(1) Waseda University, Japan
(2) RIKEN BSI, Japan

All normal humans can acquire their native phoneme systems naturally. However, it is unclear how infants learn the acoustic realization of each phoneme of their language. In recent studies, researchers have investigated phoneme acquisition using computational models. However, these studies have used read speech with a limited vocabulary as input and do not handle continuous speech. Therefore, we use natural speech to build a self-organization model that simulates this cognitive ability, and we analyze the information that is necessary for the acquisition of native vowels. Our model is designed to learn from natural continuous utterances and to estimate the number and boundaries of the vowel categories. In the simulation trial, we investigate the relationship between the amount of learning and the accuracy for the vowels in a single Japanese speaker's speech. The results show that the vowel recognition rate of our model is comparable to that of an adult.

Bibliographic reference. Miyazawa, Kouki / Kikuchi, Hideaki / Mazuka, Reiko (2010): "Unsupervised learning of vowels from continuous speech based on self-organized phoneme acquisition model", In INTERSPEECH-2010, 2914-2917.