INTERSPEECH 2010
11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Predicting Unseen Articulations from Multi-Speaker Articulatory Models

G. Ananthakrishnan (1), Pierre Badin (2), Julián Andrés Valdés Vargas (2), Olov Engwall (1)

(1) KTH, Sweden
(2) GIPSA, France

To study inter-speaker variability, this work assesses the generalization capabilities of data-based multi-speaker articulatory models. We use several three-mode factor analysis techniques to model the variations of midsagittal vocal tract contours obtained from MRI images of three French speakers articulating 73 vowels and consonants. A given speaker's articulations of phonemes absent from the training set are then predicted by inverting the models from measurements of the same phonemes articulated by the other speakers. On average, the prediction RMSE was 5.25 mm for tongue contours and 3.3 mm for 2D midsagittal vocal tract distances. This study also establishes a methodology for determining the optimal number of factors for such models.
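The prediction-by-inversion idea can be illustrated with a minimal sketch. It is not the paper's three-mode factor analysis: as a stand-in, it fits an ordinary PCA to vectors that concatenate all speakers' contour coordinates for each phoneme, estimates the factor scores of a held-out phoneme from the other speakers' coordinates by least squares, and reconstructs the target speaker's coordinates. All names (`fit_joint_pca`, `predict_unseen`, `rmse`), the array sizes, and the synthetic data are assumptions made for illustration only.

```python
import numpy as np

def fit_joint_pca(X, n_factors):
    """Fit PCA to X of shape (n_phonemes, n_features).

    Each row concatenates the contour coordinates of all speakers
    for one phoneme (a simplification of the three-mode setting).
    """
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal loadings, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_factors]

def predict_unseen(mean, loadings, observed, obs_idx, tgt_idx):
    """Invert the model on the observed speakers' columns, then
    reconstruct the target speaker's columns from the estimated scores."""
    A = loadings[:, obs_idx].T                      # (n_observed, n_factors)
    scores, *_ = np.linalg.lstsq(A, observed - mean[obs_idx], rcond=None)
    return mean[tgt_idx] + scores @ loadings[:, tgt_idx]

def rmse(pred, true):
    """Root-mean-square error over all coordinates."""
    return float(np.sqrt(np.mean((pred - true) ** 2)))

# Synthetic, noise-free data (hypothetical sizes): 3 speakers x 10 coordinates,
# 21 phonemes generated from 3 latent factors.
rng = np.random.default_rng(0)
n_speakers, n_coords, n_factors = 3, 10, 3
true_loadings = rng.normal(size=(n_factors, n_speakers * n_coords))
offset = rng.normal(size=n_speakers * n_coords)
data = rng.normal(size=(21, n_factors)) @ true_loadings + offset

train, held_out = data[:-1], data[-1]               # last phoneme is "unseen"
tgt_idx = np.arange(0, n_coords)                    # speaker 0 is the target
obs_idx = np.arange(n_coords, n_speakers * n_coords)

mean, loadings = fit_joint_pca(train, n_factors)
pred = predict_unseen(mean, loadings, held_out[obs_idx], obs_idx, tgt_idx)
err = rmse(pred, held_out[tgt_idx])
print(f"held-out RMSE: {err:.3e}")
```

Because the synthetic data are exactly low-rank and noise-free, the held-out error here is numerically negligible. With real contours the error depends on the number of factors retained, which is why a procedure for choosing that number matters.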

Bibliographic reference. Ananthakrishnan, G. / Badin, Pierre / Vargas, Julián Andrés Valdés / Engwall, Olov (2010): "Predicting unseen articulations from multi-speaker articulatory models", in INTERSPEECH-2010, 1588-1591.