Fourth European Conference on Speech Communication and Technology

Madrid, Spain
September 18-21, 1995

Maximum-Likelihood Estimation for Articulatory Speech Recognition Using a Stochastic Target Model

Gordon Ramsay, Li Deng

Dept. of Electrical & Computer Engineering, University of Waterloo, Ontario, Canada

A stochastic framework for articulatory speech recognition is presented. Utterances are described in terms of overlapping phonological units built into a Markov chain, where each state is identified with a set of acoustic/articulatory correlates represented by a target distribution on an articulatory space. Articulator motion is modelled by a Markov-modulated stochastic linear dynamical system, and observations of the articulatory state are generated in an acoustic space through a non-linear mapping. Procedures for state and parameter estimation are outlined based on the EM algorithm and extended Kalman filtering techniques, and illustrated using artificial data.
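The generative structure described in the abstract can be illustrated with a minimal scalar sketch. It is not the authors' implementation. The one-dimensional articulatory space, the fixed target means (the paper uses target distributions), the tanh mapping, and every name and numeric value below are assumptions chosen for illustration. The filter is conditioned on a known state sequence, and the EM parameter-estimation step is not shown.

```python
import math
import random

# All values below are illustrative assumptions, not parameters from the paper.
TARGETS = [-1.0, 0.5, 1.5]      # hypothetical target mean per phonological state
TRANS = [[0.9, 0.1, 0.0],       # left-to-right Markov chain over states
         [0.0, 0.9, 0.1],
         [0.0, 0.0, 1.0]]
A = 0.8                          # articulator "inertia" (0 < A < 1)
Q = 0.01                         # process noise variance
R = 0.05                         # observation noise variance


def h(x):
    """Assumed non-linear articulatory-to-acoustic mapping."""
    return math.tanh(x)


def h_prime(x):
    """Derivative of h, used to linearise the mapping in the EKF."""
    return 1.0 - math.tanh(x) ** 2


def simulate(n_steps, rng):
    """Sample states, articulatory positions and acoustic observations."""
    s, x = 0, TARGETS[0]
    states, xs, ys = [], [], []
    for _ in range(n_steps):
        # The Markov chain selects which target drives the dynamics.
        s = rng.choices(range(len(TARGETS)), weights=TRANS[s])[0]
        # Linear dynamics: move toward the current target, plus noise.
        x = A * x + (1.0 - A) * TARGETS[s] + rng.gauss(0.0, math.sqrt(Q))
        # Non-linear observation in the acoustic space, plus noise.
        y = h(x) + rng.gauss(0.0, math.sqrt(R))
        states.append(s)
        xs.append(x)
        ys.append(y)
    return states, xs, ys


def ekf(states, ys, x0=TARGETS[0], p0=1.0):
    """Scalar extended Kalman filter, conditioned on a known state sequence."""
    x_est, p = x0, p0
    estimates = []
    for s, y in zip(states, ys):
        # Predict: linear dynamics pulled toward the current state's target.
        x_pred = A * x_est + (1.0 - A) * TARGETS[s]
        p_pred = A * p * A + Q
        # Update: linearise h around the predicted position.
        H = h_prime(x_pred)
        k = p_pred * H / (H * p_pred * H + R)
        x_est = x_pred + k * (y - h(x_pred))
        p = (1.0 - k * H) * p_pred
        estimates.append(x_est)
    return estimates
```

Calling simulate(50, random.Random(0)) and passing the returned states and observations to ekf gives one filtered articulatory estimate per frame. In a full recogniser the state sequence would itself be unknown and would have to be estimated jointly with the articulatory trajectory.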

Bibliographic reference. Ramsay, Gordon / Deng, Li (1995): "Maximum-likelihood estimation for articulatory speech recognition using a stochastic target model", in EUROSPEECH-1995, 1401-1404.