Third International Conference on Spoken Language Processing (ICSLP 94)

Yokohama, Japan
September 18-22, 1994

A Dynamic Network Decoder Design for Large Vocabulary Speech Recognition

V. Valtchev, J. J. Odell, Phil C. Woodland, Steve J. Young

Cambridge University Engineering Department, Cambridge, England, UK

Accuracy and speed are the main issues to consider when designing a large vocabulary speech recogniser. Recent experience with the Wall Street Journal (WSJ) corpus [5] has shown that high recognition accuracy requires detailed acoustic models used in conjunction with well-trained long-span language models. In this paper we present a two-pass decoder architecture which extends an earlier one-pass design [4]. The initial pass consists of a time-synchronous backward search in a pre-compiled network using simplified acoustic models and a null grammar. The forward pass can function as a stand-alone one-pass decoder capable of using cross-word context-dependent models and long-span language models. The capabilities of this framework are examined empirically in terms of recognition accuracy versus speed on the Wall Street Journal database.

Bibliographic reference. Valtchev, V. / Odell, J. J. / Woodland, Phil C. / Young, Steve J. (1994): "A dynamic network decoder design for large vocabulary speech recognition", In ICSLP-1994, 1351-1354.