8th International Conference on Spoken Language Processing

Jeju Island, Korea
October 4-8, 2004

A Theoretical Analysis of Speech Recognition based on Feature Trajectory Models

Yasuhiro Minami, Erik McDermott, Atsushi Nakamura, Shigeru Katagiri

NTT Corporation, Japan

In previous work, we proposed a new speech recognition technique that generates a smooth speech trajectory from hidden Markov models (HMMs) by maximizing likelihood subject to the constraints that exist between static and dynamic speech features. This paper presents a theoretical analysis of this method. We show that the approach used to generate the smoothed trajectory is equivalent to a Kalman filter. This result demonstrates that there is a strong relationship between the dynamics of delta features (and delta-delta features) in HMM-based speech recognition and Kalman filter dynamics.
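The constrained likelihood maximization mentioned in the abstract can be illustrated with a small sketch. It is not taken from the paper. It assumes the commonly used formulation in which the static trajectory c maximizes a Gaussian likelihood over stacked static and delta observations W c, which gives the normal equations (W' P W) c = W' P mu with diagonal precisions P. The delta window (a central difference with edge frames replicated), the one-dimensional feature, and the names `delta_window_matrix`, `solve_linear`, and `generate_trajectory` are all hypothetical choices made for illustration.

```python
# Minimal sketch (assumed formulation, not the authors' implementation):
# find the static trajectory c that maximizes a Gaussian likelihood of the
# stacked static + delta features W c, given per-frame means and variances.

def delta_window_matrix(T):
    """Return W (2T x T): row 2t selects c[t]; row 2t+1 computes the delta
    0.5 * (c[t+1] - c[t-1]), with out-of-range indices clamped to the edges.
    This delta window is an assumption chosen for illustration."""
    W = [[0.0] * T for _ in range(2 * T)]
    for t in range(T):
        W[2 * t][t] = 1.0
        W[2 * t + 1][min(t + 1, T - 1)] += 0.5
        W[2 * t + 1][max(t - 1, 0)] -= 0.5
    return W

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def generate_trajectory(static_mean, static_var, delta_mean, delta_var):
    """Solve (W' P W) c = W' P mu, where P holds the inverse variances."""
    T = len(static_mean)
    W = delta_window_matrix(T)
    mu, prec = [], []
    for t in range(T):
        # Interleave static and delta entries to match the row order of W.
        mu += [static_mean[t], delta_mean[t]]
        prec += [1.0 / static_var[t], 1.0 / delta_var[t]]
    rows = range(2 * T)
    A = [[sum(W[r][i] * prec[r] * W[r][j] for r in rows) for j in range(T)]
         for i in range(T)]
    b = [sum(W[r][i] * prec[r] * mu[r] for r in rows) for i in range(T)]
    return solve_linear(A, b)

if __name__ == "__main__":
    # Step-like static means with zero delta means: the small delta
    # variance pulls the trajectory toward a smoother transition.
    step = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
    print(generate_trajectory(step, [1.0] * 6, [0.0] * 6, [0.1] * 6))
```

In this sketch the delta variances control the smoothing: very large delta variances leave the trajectory close to the static means, while small ones enforce the delta means more strongly. A recursive, frame-by-frame solution of the same least-squares problem is the kind of computation that invites a Kalman-style interpretation, although the precise equivalence is established in the paper itself and is not demonstrated here.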


Bibliographic reference. Minami, Yasuhiro / McDermott, Erik / Nakamura, Atsushi / Katagiri, Shigeru (2004): "A theoretical analysis of speech recognition based on feature trajectory models", in INTERSPEECH-2004, 549-552.