11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

Estimation Studies of Vocal Tract Shape Trajectory Using a Variable Length and Lossy Kelly-Lochbaum Model

Heikki Rasilo, Unto K. Laine, Okko Johannes Räsänen

Aalto University, Finland

This work demonstrates the use of a modified Kelly-Lochbaum (KL) vocal tract (VT) model in dynamic mapping from speech signals to articulatory configurations. The sixteen-section KL model is equipped with a variable-length segment for lip rounding and an accurate model of lip radiation impedance. Profiles for the eight Finnish vowels are used to form so-called anchor points in the articulatory and spectral domains. These profiles are modulated by cosine functions to produce clusters of vowel variants around the anchor points. The resulting profile and formant frequency data are stored in a codebook that is used in the trajectory estimation task: for each speech frame, the codebook proposes a number of profile candidates based on the observed formant frequencies. The final trajectory is estimated by minimizing the articulatory distance across all frames. The first trajectory estimation results are promising and agree well with the current phonetic literature.
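The final selection step, choosing one codebook candidate per frame so that the summed articulatory distance along the trajectory is minimal, can be illustrated with a small dynamic-programming sketch. The abstract does not state which search procedure or distance measure is used, so the Viterbi-style recursion, the Euclidean distance between profile vectors, and the names `euclidean` and `smoothest_path` are all assumptions made for illustration only.

```python
import math


def euclidean(a, b):
    """Assumed articulatory distance: Euclidean distance between two profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def smoothest_path(candidates):
    """Pick one candidate index per frame, minimizing the summed distance
    between consecutive chosen profiles.

    candidates[t] is a list of profile vectors proposed for frame t
    (hypothetical layout; in the paper these would come from the codebook).
    """
    if not candidates:
        return []

    # cost[i]: best accumulated distance of any path ending in candidate i
    cost = [0.0] * len(candidates[0])
    back = []  # back[t - 1][j]: best predecessor of candidate j in frame t

    for t in range(1, len(candidates)):
        prev = candidates[t - 1]
        new_cost, pointers = [], []
        for cur in candidates[t]:
            totals = [cost[i] + euclidean(prev[i], cur) for i in range(len(prev))]
            best = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best])
            pointers.append(best)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the last frame to the first.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for pointers in reversed(back):
        idx = pointers[idx]
        path.append(idx)
    path.reverse()
    return path


# Toy example: three frames, two 2-D candidate profiles per frame.
frames = [
    [[0.0, 0.0], [5.0, 5.0]],
    [[4.0, 4.0], [1.0, 1.0]],
    [[0.5, 0.5], [6.0, 6.0]],
]
print(smoothest_path(frames))  # → [0, 1, 0]
```

In the toy example the selected path stays near the origin in every frame, since any path through the distant candidates would accumulate a larger total distance. A real system would likely also weight how well each candidate matches the observed formants; that term is omitted here because the abstract mentions only the articulatory distance.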

