5th European Conference on Speech Communication and Technology

Rhodes, Greece
September 22-25, 1997

Towards Articulatory Speech Recognition: Learning Smooth Maps to Recover Articulator Information

Sam Roweis (1), Abeer Alwan (2)

(1) Computation and Neural Systems, California Institute of Technology, USA
(2) Department of Electrical Engineering, University of California, Los Angeles, CA, USA

We present a novel method for recovering articulator movements from speech acoustics based on a constrained form [9] of a hidden Markov model. The model attempts to explain sequences of high-dimensional data using smooth and slow trajectories in a latent variable space. The key insight is that this continuity constraint, when applied to speech, helps to solve the "ill-posed" problem of acoustic-to-articulatory mapping. By working with sequences of spectra rather than individual spectra in isolation, it is possible to choose between competing articulatory configurations for any given spectrum by selecting the configuration "closest" to those at nearby times. We present results of applying this algorithm to recover articulator movements from acoustics using data from the Wisconsin X-ray microbeam project [3]. We find that the recovered traces are highly correlated with the measured articulator movements under a single linear transform. Such recovered traces could potentially be used for speech recognition, an application we are currently investigating.
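The continuity idea in the abstract — disambiguating the one-to-many acoustic-to-articulatory inversion by preferring the candidate configuration closest to those at neighbouring times — can be sketched with a simple dynamic programme. This is an illustrative toy, not the paper's constrained-HMM algorithm: `smoothest_path` and its inputs are hypothetical names, and we assume each frame comes with a short list of competing articulatory configurations produced by some inverse map.

```python
import numpy as np

def smoothest_path(candidates):
    """Pick one candidate configuration per frame so that the chosen
    trajectory is as smooth (slowly varying) as possible.

    candidates: list of length T; entry t is a (k_t, d) array of
    competing d-dimensional articulatory configurations for frame t.
    Returns the chosen (T, d) trajectory.
    """
    T = len(candidates)
    # cost[t][j]: minimal cumulative squared jump to reach candidate j at frame t
    cost = [np.zeros(len(candidates[0]))]
    back = []
    for t in range(1, T):
        # pairwise squared distances between frame t-1 and frame t candidates
        d2 = ((candidates[t][None, :, :]
               - candidates[t - 1][:, None, :]) ** 2).sum(-1)
        total = cost[t - 1][:, None] + d2     # shape (k_{t-1}, k_t)
        back.append(total.argmin(0))          # best predecessor per candidate
        cost.append(total.min(0))
    # trace back the smoothest sequence of per-frame choices
    j = int(np.argmin(cost[-1]))
    choice = [j]
    for b in reversed(back):
        j = int(b[j])
        choice.append(j)
    choice.reverse()
    return np.stack([candidates[t][j] for t, j in enumerate(choice)])
```

For example, if each frame offers a slowly moving candidate and a wildly jumping distractor, the programme follows the slow branch — the same preference for "closest at nearby times" that the abstract invokes, though the paper realises it inside a latent-variable model rather than by explicit search.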

Bibliographic reference. Roweis, Sam / Alwan, Abeer (1997): "Towards articulatory speech recognition: learning smooth maps to recover articulator information", in EUROSPEECH-1997, 1227-1230.