Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker

Hidetsugu Uchida, Daisuke Saito, Nobuaki Minematsu


In this paper, we propose a method for predicting the articulatory movements of phonemes that a speaker has difficulty pronouncing correctly because they do not occur in that speaker's native language. Because the speaker cannot readily produce these unseen phonemes, conventional acoustic-to-articulatory mapping cannot be applied directly. We address this by using the speech structure of a reference speaker who can pronounce the unseen phonemes. Speech structure is a speech feature that represents only the linguistic information of an input utterance by suppressing non-linguistic information such as speaker identity. In the proposed method, the speech structure spanning the unseen phonemes and the other phonemes serves as a constraint, and the articulatory movements of the unseen phonemes are searched for in the articulatory space of the original speaker. Experiments using English short vowels show that the average prediction error was 1.02 mm.
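
The idea of a structure-constrained search can be illustrated with a toy sketch. The abstract does not say how the speech structure is computed or how the search is carried out, so everything here is an assumption made for illustration: the structure is stood in for by a plain matrix of pairwise distances between phonemes, the articulatory space is two-dimensional, the two speakers differ only by a global scale, and the search is an exhaustive grid search. The function names, parameters, and coordinates are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def distance_matrix(points):
    """Pairwise distance matrix, used here as a stand-in for a 'structure'."""
    return [[dist(p, q) for q in points] for p in points]


def predict_unseen_position(known, ref_structure, step=0.5, margin=5.0):
    """Grid-search the position of one unseen phoneme (toy assumption).

    known:         positions of the n phonemes the target speaker can produce
    ref_structure: (n+1) x (n+1) distance matrix from the reference speaker;
                   the last row/column belongs to the unseen phoneme
    """
    n = len(known)
    pairs = list(combinations(range(n), 2))

    # Assumed: the two speakers' structures differ only by a global scale,
    # estimated from the phonemes that both speakers can produce.
    scale = (sum(dist(known[i], known[j]) for i, j in pairs)
             / sum(ref_structure[i][j] for i, j in pairs))
    wanted = [scale * ref_structure[n][i] for i in range(n)]

    x_lo = min(p[0] for p in known) - margin
    x_hi = max(p[0] for p in known) + margin
    y_lo = min(p[1] for p in known) - margin
    y_hi = max(p[1] for p in known) + margin
    nx = int(round((x_hi - x_lo) / step)) + 1
    ny = int(round((y_hi - y_lo) / step)) + 1

    best, best_cost = None, math.inf
    for ix in range(nx):
        for iy in range(ny):
            cand = (x_lo + ix * step, y_lo + iy * step)
            # Cost: mismatch between candidate-to-known distances and the
            # distances implied by the reference structure.
            cost = sum((dist(cand, k) - w) ** 2 for k, w in zip(known, wanted))
            if cost < best_cost:
                best, best_cost = cand, cost
    return best


# Synthetic example (made-up coordinates, not data from the paper).
known = [(0.0, 0.0), (10.0, 0.0), (0.0, 8.0), (10.0, 8.0), (5.0, 4.0)]
true_unseen = (3.0, 6.0)
# Reference structure: same geometry at a different scale (assumed).
ref_structure = [[0.3 * d for d in row]
                 for row in distance_matrix(known + [true_unseen])]

predicted = predict_unseen_position(known, ref_structure)
error_mm = dist(predicted, true_unseen)
```

In this synthetic setting the reference structure matches the target geometry exactly up to scale, so the search can only be expected to recover the held-out point to within the grid resolution. Real speech structures and articulatory data would not satisfy these simplifying assumptions.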


DOI: 10.21437/Interspeech.2016-1138

Cite as

Uchida, H., Saito, D., Minematsu, N. (2016) Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker. Proc. Interspeech 2016, 450-454.

BibTeX
@inproceedings{Uchida+2016,
author={Hidetsugu Uchida and Daisuke Saito and Nobuaki Minematsu},
title={Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker},
year=2016,
booktitle={Interspeech 2016},
doi={10.21437/Interspeech.2016-1138},
url={http://dx.doi.org/10.21437/Interspeech.2016-1138},
pages={450--454}
}