A stochastic framework for articulatory speech recognition is presented. Utterances are described in terms of overlapping phonological units built into a Markov chain, where each state is identified with a set of acoustic/articulatory correlates represented by a target distribution on an articulatory space. Articulator motion is modelled by a Markov-modulated stochastic linear dynamical system, and observations of the articulatory state are generated in an acoustic space through a non-linear mapping. Procedures for state and parameter estimation, based on the EM algorithm and extended Kalman filtering techniques, are outlined and illustrated using artificial data.
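As an illustration of the generative model described above, the following Python sketch simulates a small Markov-modulated linear dynamical system with a non-linear observation map. All specifics are assumptions made for illustration and are not taken from the paper: the two-state transition matrix `P`, the target means `targets`, the first-order smoothing coefficient `a`, the noise levels `q` and `r`, and the use of `tanh` as a stand-in for the articulatory-to-acoustic mapping.

```python
import numpy as np


def simulate(n_steps, seed=0):
    """Simulate a toy Markov-modulated linear dynamical system.

    Every numerical value below is a hypothetical choice for illustration.
    Returns (states, xs, ys): discrete state indices, articulatory states,
    and acoustic observations.
    """
    rng = np.random.default_rng(seed)

    # Hypothetical Markov chain over two phonological states.
    P = np.array([[0.95, 0.05],
                  [0.05, 0.95]])

    # Hypothetical target mean for each state in a 2-D articulatory space.
    targets = np.array([[1.0, -1.0],
                        [-1.0, 0.5]])

    a = 0.9   # assumed smoothing coefficient of the linear dynamics
    q = 0.05  # assumed process noise standard deviation
    r = 0.1   # assumed observation noise standard deviation

    s = 0
    x = targets[s].copy()
    states, xs, ys = [], [], []

    for _ in range(n_steps):
        # Discrete state evolves as a Markov chain.
        s = int(rng.choice(2, p=P[s]))
        # Articulatory state moves linearly toward the current target.
        x = a * x + (1.0 - a) * targets[s] + q * rng.standard_normal(2)
        # Acoustic observation via a non-linear map (tanh is a stand-in).
        y = np.tanh(x) + r * rng.standard_normal(2)

        states.append(s)
        xs.append(x)
        ys.append(y)

    return np.array(states), np.array(xs), np.array(ys)


if __name__ == "__main__":
    states, xs, ys = simulate(200)
    print(states.shape, xs.shape, ys.shape)
```

In this sketch the discrete state selects the target toward which the continuous articulatory state relaxes, so the dynamics are linear given the state sequence. An extended Kalman filter applied to such a model would linearise the observation map around the current state estimate; for the `tanh` stand-in used here, the Jacobian is the diagonal matrix with entries 1 - tanh(x)^2.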
Bibliographic reference. Ramsay, Gordon / Deng, Li (1995): "Maximum-likelihood estimation for articulatory speech recognition using a stochastic target model", In EUROSPEECH-1995, 1401-1404.