8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

Speaker Adaptation of a Three-dimensional Tongue Model

Olov Engwall

KTH, Stockholm, Sweden

Magnetic resonance images of nine subjects have been collected to determine scaling factors that can adapt a 3D tongue model to new subjects. The aim is to define a small number of simple measures that allow automatic yet accurate scaling of the model. The scaling should be automatic in order to be useful in an application for articulation training, in which the model must replicate the user's articulators without involving the user in a complicated speaker adaptation procedure. It should also be accurate enough to allow correct acoustic-to-articulatory inversion. The evaluation shows that, based on four articulatory measures, the scaling technique can estimate a tongue shape that was not included in the training data with an accuracy of 1.5 mm in the midsagittal plane and 1.7 mm for the whole 3D tongue.

Bibliographic reference. Engwall, Olov (2004): "Speaker adaptation of a three-dimensional tongue model", in INTERSPEECH-2004, 465-468.