Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

Nonstationary-State Hidden Markov Model with State-Dependent Time Warping: Application to Speech Recognition

D. Sun (1), L. Deng (2)

(1) Department of Applied Mathematics and Statistics, State University of New York, Stony Brook, NY, USA
(2) Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada

We present a new algorithm for estimating state-dependent polynomial coefficients in the nonstationary-state (or trended) hidden Markov model (HMM), which allows for linear time warping or scaling in individual model states. The need for state-dependent time warping arises from the observation that the multiple state-segmented speech data sequences used to train a single set of polynomial coefficients often vary appreciably in length due to speaking rate variation and other temporal factors. The algorithm is developed within a general framework that uses auxiliary parameters, which, while of no interest in themselves, provide a means of achieving maximal accuracy in estimating the polynomial coefficients of the model. Speech recognition experiments on the TIMIT database demonstrate the advantages of the time-warping trended HMMs over the regular trended HMMs.
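The motivation for time warping can be illustrated with a small numerical sketch. It is not the estimation algorithm of the paper: the auxiliary-parameter framework is not reproduced here, and the warping factor is simply assumed to rescale each segment onto a common reference duration. The names `design_matrix`, `fit_trend`, and `ref_len`, and the use of a scalar feature with ordinary least squares, are all hypothetical choices made for illustration.

```python
import numpy as np

def design_matrix(times, order):
    # Columns are 1, t, t^2, ..., t^order (polynomial trend regressors).
    return np.vander(times, N=order + 1, increasing=True)

def fit_trend(segments, order, warp):
    """Fit one set of polynomial coefficients to several segments of one state.

    Assumption: with warp=True, each segment's time axis is linearly scaled
    so that all segments span the same (mean) duration. The paper estimates
    the warping differently; this is only a stand-in.
    """
    ref_len = np.mean([len(s) for s in segments])  # hypothetical reference duration
    rows, targets = [], []
    for seg in segments:
        t = np.arange(len(seg), dtype=float)
        if warp:
            # Linear time scaling of this segment onto the reference duration.
            t = t * (ref_len - 1) / max(len(seg) - 1, 1)
        rows.append(design_matrix(t, order))
        targets.append(np.asarray(seg, dtype=float))
    X = np.vstack(rows)
    y = np.concatenate(targets)
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    sse = float(np.sum((X @ coeffs - y) ** 2))  # total squared fitting error
    return coeffs, sse

# Two synthetic segments with the same underlying trend but different lengths,
# mimicking a slow and a fast realization of the same state.
segments = [np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 9)]
coeffs_plain, sse_plain = fit_trend(segments, order=1, warp=False)
coeffs_warp, sse_warp = fit_trend(segments, order=1, warp=True)
print("SSE without warping:", sse_plain)
print("SSE with warping:   ", sse_warp)
```

In this toy setting, a single linear trend cannot fit both segments on the raw time axis because they rise at different rates, whereas after linear scaling of the time axis the two segments fall on the same line and the fitting error becomes negligible.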

Bibliographic reference.  Sun, D. / Deng, L. (1994): "Nonstationary-state hidden Markov model with state-dependent time warping: application to speech recognition", In ICSLP-1994, 243-246.