Sixth European Conference on Speech Communication and Technology

Budapest, Hungary
September 5-9, 1999

Optimization of Dynamic Regimes in a Statistical Hidden Dynamic Model for Conversational Speech Recognition

Jeff Ma, Li Deng

Department of Electrical and Computer Engineering, University of Waterloo, Waterloo, Ontario, Canada

This paper reports our ongoing work aiming to improve the performance of a novel speech recognizer based on an underlying statistical hidden dynamic model of phonetic reduction in the production of conversational speech. We have developed a path-stack search algorithm which efficiently computes the likelihood of any observation utterance while optimizing the dynamic regimes in the speech model. The effectiveness of the algorithm is tested in simulation experiments. It is also tested on Switchboard data, where the dynamic regimes optimized by the search algorithm are compared with those from exhaustive search. Finally, we show speech recognition results on Switchboard data that demonstrate improvements in the recognizer's performance compared with using dynamic regimes set heuristically from the phone segmentation produced by a state-of-the-art HMM system.


Bibliographic reference.  Ma, Jeff / Deng, Li (1999): "Optimization of dynamic regimes in a statistical hidden dynamic model for conversational speech recognition", In EUROSPEECH'99, 1339-1342.