Sixth European Conference on Speech Communication and Technology
In this paper, we describe approaches for improving the search efficiency of a dynamic-programming-based one-pass decoder for dialogue applications. To allow the use of long-term language models (LMs) and cross-word acoustic models, efficient pruning techniques and fast methods for calculating emission probability density functions (pdfs) are required. This is particularly important for real-time and memory-constrained applications such as dialogue systems involving automatic speech recognition (ASR) and natural-language understanding. We propose an effective pruning technique that exploits the LM and cross-word context. We also present a fast distance calculation method to reduce the cost of state likelihood calculations in HMM-based systems. Experimental results on a natural-language call routing task indicate that the proposed techniques sped up the search process by a factor of 4 without loss in recognition accuracy. In addition, we present a technique for generating word graphs that incorporate cross-word context.
Bibliographic reference. Ortmanns, Stefan / Reichl, Wolfgang / Chou, Wu (1999): "An efficient decoding method for real time speech recognition", In EUROSPEECH'99, 499-502.