EUROSPEECH 2003 - INTERSPEECH 2003
8th European Conference on Speech Communication and Technology

Geneva, Switzerland
September 1-4, 2003

Design of the CMU Sphinx-4 Decoder

Paul Lamere (1), Philip Kwok (1), William Walker (1), Evandro Gouvea (2), Rita Singh (2), Bhiksha Raj (3), Peter Wolf (3)

(1) Sun Microsystems Laboratories, USA
(2) Carnegie Mellon University, USA
(3) Mitsubishi Electric Research Laboratories, USA

Sphinx-4 is an open-source, HMM-based speech recognition system written in the Java™ programming language. The design of the Sphinx-4 decoder incorporates several new features in response to current demands on HMM-based large-vocabulary systems. New design aspects include graph construction for multilevel parallel decoding with multiple feature streams, without the use of compound HMMs; a generalized search algorithm that subsumes Viterbi decoding as a special case; token stack decoding for efficient maintenance of multiple paths during search; and a generalized language HMM graph, built from grammars and language models in multiple standard formats, that can potentially toggle between search structures such as flat search and tree search. This paper describes a few of these design aspects and reports preliminary performance measures for speed and accuracy.
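
To make the idea of keeping multiple paths per state concrete, the following is a minimal sketch of token-passing search with a bounded token stack. It is not taken from Sphinx-4: the class `TokenStackSketch`, the `Token` type, the `decode` method, and the `stackSize` parameter are all hypothetical names chosen for illustration, and the dense log-probability arrays stand in for a real search graph. In this sketch a stack capacity of 1 keeps only the best token per state, which is ordinary Viterbi decoding; it does not attempt to reproduce the paper's generalized search algorithm.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration of token-stack decoding; not Sphinx-4 code. */
public class TokenStackSketch {

    /** One partial path ending in a state, with its accumulated log score. */
    public static final class Token implements Comparable<Token> {
        public final int state;
        public final double logScore;
        public final Token predecessor;

        Token(int state, double logScore, Token predecessor) {
            this.state = state;
            this.logScore = logScore;
            this.predecessor = predecessor;
        }

        /** Orders tokens best (highest log score) first. */
        @Override
        public int compareTo(Token other) {
            return Double.compare(other.logScore, logScore);
        }

        /** Follows predecessor links to recover the state sequence. */
        public List<Integer> path() {
            List<Integer> states = new ArrayList<>();
            for (Token t = this; t != null; t = t.predecessor) {
                states.add(t.state);
            }
            Collections.reverse(states);
            return states;
        }
    }

    /**
     * Runs token passing, keeping at most stackSize tokens per state per frame.
     * All inputs are log probabilities: logInit[s], logTrans[from][to],
     * and logEmit[frame][s]. Returns the final-frame tokens, best first.
     */
    public static List<Token> decode(double[] logInit, double[][] logTrans,
                                     double[][] logEmit, int stackSize) {
        int numStates = logInit.length;

        // Frame 0: one token per state.
        List<List<Token>> stacks = new ArrayList<>();
        for (int s = 0; s < numStates; s++) {
            List<Token> stack = new ArrayList<>();
            stack.add(new Token(s, logInit[s] + logEmit[0][s], null));
            stacks.add(stack);
        }

        for (int t = 1; t < logEmit.length; t++) {
            List<List<Token>> next = new ArrayList<>();
            for (int to = 0; to < numStates; to++) {
                // Extend every surviving token into the destination state.
                List<Token> candidates = new ArrayList<>();
                for (int from = 0; from < numStates; from++) {
                    for (Token tok : stacks.get(from)) {
                        double score = tok.logScore + logTrans[from][to] + logEmit[t][to];
                        candidates.add(new Token(to, score, tok));
                    }
                }
                // Keep only the stackSize best tokens for this state.
                Collections.sort(candidates);
                int keep = Math.min(stackSize, candidates.size());
                next.add(new ArrayList<>(candidates.subList(0, keep)));
            }
            stacks = next;
        }

        List<Token> finals = new ArrayList<>();
        for (List<Token> stack : stacks) {
            finals.addAll(stack);
        }
        Collections.sort(finals);
        return finals;
    }
}
```

Calling `decode` with `stackSize` set to 1 returns one token per state, and the first of them carries the Viterbi path. A larger `stackSize` retains lower-scoring alternatives that end in the same state, at the cost of more tokens to extend and sort in each frame.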

Bibliographic reference. Lamere, Paul / Kwok, Philip / Walker, William / Gouvea, Evandro / Singh, Rita / Raj, Bhiksha / Wolf, Peter (2003): "Design of the CMU Sphinx-4 decoder", in EUROSPEECH-2003, 1181-1184.