EUROSPEECH 2003 - INTERSPEECH 2003
Sphinx-4 is an open source HMM-based speech recognition system written in the Java™ programming language. The design of the Sphinx-4 decoder incorporates several new features in response to current demands on HMM-based large vocabulary systems. New design aspects include graph construction for multilevel parallel decoding with multiple feature streams without the use of compound HMMs; a generalized search algorithm that subsumes Viterbi decoding as a special case; token stack decoding for efficient maintenance of multiple paths during search; and the construction of a generalized language HMM graph from grammars and language models in multiple standard formats, which can potentially toggle between search structures such as flat and tree. This paper describes a few of these design aspects and reports some preliminary performance measures for speed and accuracy.
Bibliographic reference. Lamere, Paul / Kwok, Philip / Walker, William / Gouvea, Evandro / Singh, Rita / Raj, Bhiksha / Wolf, Peter (2003): "Design of the CMU sphinx-4 decoder", In EUROSPEECH-2003, 1181-1184.