In this paper, we describe approaches for improving the search efficiency of a dynamic-programming-based one-pass decoder for dialogue applications. To allow the use of long-term language models (LMs) and cross-word acoustic models, efficient pruning techniques and fast methods for calculating emission probability density functions (pdfs) are required. This is particularly important for real-time and memory-constrained applications such as dialogue systems involving automatic speech recognition (ASR) and natural-language understanding. We propose an effective pruning technique that exploits the LM and cross-word context. We also present a fast distance calculation method that reduces the cost of state likelihood calculations in HMM-based systems. Experimental results on a natural-language call routing task indicate that the proposed techniques speed up the search process by a factor of 4 without loss of recognition accuracy. In addition, we present a technique for generating word graphs that incorporate cross-word context.
Cite as: Ortmanns, S., Reichl, W., Chou, W. (1999) An efficient decoding method for real time speech recognition. Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999), 499-502, doi: 10.21437/Eurospeech.1999-128
@inproceedings{ortmanns99_eurospeech,
  author={Stefan Ortmanns and Wolfgang Reichl and Wu Chou},
  title={{An efficient decoding method for real time speech recognition}},
  year=1999,
  booktitle={Proc. 6th European Conference on Speech Communication and Technology (Eurospeech 1999)},
  pages={499--502},
  doi={10.21437/Eurospeech.1999-128}
}