8th Annual Conference of the International Speech Communication Association

Antwerp, Belgium
August 27-31, 2007

Applying Word Duration Constraints by Using Unrolled HMMs

Ning Ma, Jon Barker, Phil Green

University of Sheffield, UK

Conventional HMMs impose only weak duration constraints. In noisy conditions, the mismatch between corrupted speech signals and models trained on clean speech may cause the decoder to produce word matches with unrealistic durations. This paper presents a simple way to incorporate word duration constraints: each HMM is unrolled to form a lattice in which word duration probabilities can be applied directly to state transitions. The expanded HMMs remain compatible with conventional Viterbi decoding. Experiments on connected-digit recognition show that, with explicit duration constraints, the decoder generates word matches with more reasonable durations, and word error rates are significantly reduced across a broad range of noise conditions.
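
The unrolling idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a strictly left-to-right word model, ignores the ordinary HMM transition probabilities, and applies the word duration log-probability only when leaving the final state. The names `unroll_word`, `word_score`, `log_dur` and `log_b` are hypothetical. Each lattice node is a pair (state, frames elapsed in the word), so the duration of a word is known at the point where it is exited.

```python
import math


def unroll_word(n_states, log_dur):
    """Unroll a left-to-right word HMM over elapsed duration (illustrative only).

    n_states: number of emitting states in the word model (assumed).
    log_dur:  dict mapping duration d in frames -> log P(d) for this word.
    Returns (arcs, exits): arcs maps node (s, d) -> list of successor nodes,
    exits maps node (last_state, d) -> log P(d), applied on leaving the word.
    """
    max_d = max(log_dur)
    arcs, exits = {}, {}
    for d in range(1, max_d + 1):
        # State s cannot be reached before s + 1 frames have elapsed.
        for s in range(min(d, n_states)):
            succ = []
            if d < max_d:
                succ.append((s, d + 1))          # stay in the same state
                if s + 1 < n_states:
                    succ.append((s + 1, d + 1))  # advance to the next state
            arcs[(s, d)] = succ
            if s == n_states - 1 and d in log_dur:
                exits[(s, d)] = log_dur[d]
    return arcs, exits


def word_score(log_b, n_states, log_dur):
    """Viterbi score of one word spanning all frames of log_b.

    log_b[t][s] is the emission log-likelihood of state s at frame t (assumed
    given). Returns -inf if the word's duration is not allowed by log_dur.
    """
    arcs, exits = unroll_word(n_states, log_dur)
    n_frames = len(log_b)
    best = {(0, 1): log_b[0][0]}  # the word starts in state 0 with duration 1
    for t in range(1, n_frames):
        nxt = {}
        for node, score in best.items():
            for s, d in arcs.get(node, []):
                cand = score + log_b[t][s]
                if cand > nxt.get((s, d), -math.inf):
                    nxt[(s, d)] = cand
        best = nxt
    end = (n_states - 1, n_frames)
    if end in best and end in exits:
        return best[end] + exits[end]  # add the duration score on word exit
    return -math.inf
```

Because the unrolled nodes are ordinary states with ordinary arcs, the search is a standard Viterbi recursion. A duration that the duration model does not allow has no exit arc in the lattice, so a word match of that length cannot be produced.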


Bibliographic reference. Ma, Ning / Barker, Jon / Green, Phil (2007): "Applying word duration constraints by using unrolled HMMs", in INTERSPEECH-2007, 1066-1069.