INTERSPEECH 2011
12th Annual Conference of the International Speech Communication Association

Florence, Italy
August 27-31, 2011

Method for Speech Inversion with Large Scale Statistical Evaluation

Heikki Rasilo, Unto K. Laine, Okko Räsänen, Toomas Altosaar

Aalto University, Finland

An articulatory model of speech production is created for the purpose of studying the links between speech production and perception. A computationally efficient method for speech inversion is proposed, using a two-pole predictor structure to maintain better articulatory dynamics than conventional dynamic programming methods. Preliminary tests of the effect of inversion are performed on 2500 Finnish syllables extracted from continuous speech, consisting of 125 different syllable classes. A cluster selectivity test shows that the syllables are clustered more reliably using the automatically obtained parametric representation of articulatory gestures than using the original formant representation that serves as the starting point for the inversion.

Bibliographic reference. Rasilo, Heikki / Laine, Unto K. / Räsänen, Okko / Altosaar, Toomas (2011): "Method for speech inversion with large scale statistical evaluation", in INTERSPEECH-2011, 2693-2696.