Third International Conference on Spoken Language Processing (ICSLP 94)
This paper proposes a method for explicitly modeling the time-varying discriminative power in phoneme-sized Hidden Markov Models (HMMs), which is expected to improve generalization ability by focusing on typical temporal segments of each class. Since HMM states generally cover stationary or homogeneous segments within a class, we use one weight parameter per state. In this way, each model obtains a distribution of the discriminative power over its states, which directly corresponds to a distribution of the influence of each state on the total emission score of a feature vector sequence. An algorithm based on a gradient descent method is used to estimate the additional HMM parameters. It has been found to be very important that the search for the optimal path within a model is not influenced by the weights of the states. To allow for this, an extended Viterbi-based time-synchronous continuous speech recognizer is proposed. The idea behind this strategy is to control the influence of each state on the total emission score by parameters containing a 'measure of classification relevance' wherever the models compete with each other. The modeling of, and the processing within, the individual units are based on the common maximum likelihood approach.
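The separation between path search and state weighting can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact form of the weighting, so the sketch assumes that each state weight scales the log emission score of the frames aligned to that state. The function names (`viterbi_path`, `weighted_emission_score`), the two-state model, and all numeric values are hypothetical.

```python
import numpy as np


def viterbi_path(log_emis, log_trans, log_init):
    """Standard, unweighted Viterbi search; returns the best state sequence."""
    T, S = log_emis.shape
    delta = log_init + log_emis[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        # cand[i, j]: best score ending in state i at t-1, then moving to j
        cand = delta[:, None] + log_trans
        back[t] = np.argmax(cand, axis=0)
        delta = cand[back[t], np.arange(S)] + log_emis[t]
    # Backtrack from the best final state
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


def weighted_emission_score(log_emis, path, state_weights):
    """Assumed weighting: scale each frame's log emission score by the
    weight of the state it is aligned to, then sum over the path."""
    return float(sum(state_weights[s] * log_emis[t, s]
                     for t, s in enumerate(path)))


# Hypothetical two-state model and four-frame observation sequence
log_emis = np.log(np.array([[0.9, 0.1],
                            [0.8, 0.2],
                            [0.1, 0.9],
                            [0.2, 0.8]]))
log_trans = np.log(np.array([[0.5, 0.5],
                             [0.1, 0.9]]))
log_init = np.log(np.array([0.9, 0.1]))

# The path is found without any reference to the state weights ...
path = viterbi_path(log_emis, log_trans, log_init)

# ... and the weights only enter when the model score is computed
uniform_score = weighted_emission_score(log_emis, path, np.array([1.0, 1.0]))
weighted_score = weighted_emission_score(log_emis, path, np.array([0.5, 2.0]))
```

Because `viterbi_path` never sees `state_weights`, changing the weights alters the score used when models compete but leaves the within-model alignment unchanged, which is the property the abstract describes as important.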
Bibliographic reference. Wolfertstetter, F. / Ruske, Günther (1994): "Discriminative state-weighting in hidden Markov models", In ICSLP-1994, 219-222.