ASR2000 - Automatic Speech Recognition: Challenges for the new Millenium

September 18-20, 2000
Paris, France

A Decoder for Finite-State Structured Search Spaces

Diamantino Caseiro and Isabel Trancoso

Speech Processing Group, INESC/IST, Lisbon, Portugal

The theory of weighted finite-state transducers (WFSTs) allows great flexibility in integrating multiple sources of information early in speech decoding. In this paper, we describe a decoder that relies on the algebra of WFSTs to integrate multiple sources of information in a one-pass search. The system has two modes of operation: a time-synchronous mode, for use with problem-specific finite-state grammars or with a word-loop grammar for large-vocabulary tasks; and a time-asynchronous mode, in which it operates as a stack decoder, also for large-vocabulary recognition. Both modes of operation are decoupled from the language model. Experiments on lattice rescoring tasks showed that the error rate is the same as that of established state-of-the-art decoders. Furthermore, experiments with explicit cross-word pronunciation rules showed that it is feasible to include new knowledge sources early in the decoding process. We also found that time-synchronous search with a word-loop grammar outperforms the stack decoder mode of operation by a factor of 10.
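
As an illustration of how the WFST algebra can combine knowledge sources, the following minimal sketch composes a toy word lattice with a toy grammar and extracts the best path, in the spirit of the lattice rescoring task mentioned above. It is not the decoder described in the paper: the dictionary layout, the function names `compose` and `best_path`, the example words, and all costs are assumptions made for illustration, and the composition is restricted to epsilon-free machines in the tropical semiring (costs add along a path, and the lowest total cost wins).

```python
import heapq
from collections import deque
from itertools import count

# Hypothetical representation: arcs are (src, input, output, cost, dst);
# "finals" maps each final state to its final cost.

def compose(a, b):
    """Epsilon-free composition in the tropical semiring (costs add)."""
    start = (a["start"], b["start"])
    arcs, finals = [], {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals[(qa, qb)] = a["finals"][qa] + b["finals"][qb]
        for (sa, i, mid, wa, da) in a["arcs"]:
            if sa != qa:
                continue
            for (sb, mid2, o, wb, db) in b["arcs"]:
                # Match the output label of a with the input label of b.
                if sb != qb or mid2 != mid:
                    continue
                dst = (da, db)
                arcs.append(((qa, qb), i, o, wa + wb, dst))
                if dst not in seen:
                    seen.add(dst)
                    queue.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}

def best_path(fst):
    """Dijkstra search for the lowest-cost accepting path (costs >= 0)."""
    tie = count()  # tie-breaker so the heap never compares states
    heap = [(0.0, next(tie), fst["start"], ())]
    done = set()
    while heap:
        cost, _, state, words = heapq.heappop(heap)
        if state is None:  # marker pushed when a final state was left
            return cost, list(words)
        if state in done:
            continue
        done.add(state)
        if state in fst["finals"]:
            total = cost + fst["finals"][state]
            heapq.heappush(heap, (total, next(tie), None, words))
        for (src, _i, o, w, dst) in fst["arcs"]:
            if src == state and dst not in done:
                heapq.heappush(heap, (cost + w, next(tie), dst, words + (o,)))
    return None

# Toy word lattice with made-up acoustic costs.
lattice = {"start": 0, "finals": {2: 0.0}, "arcs": [
    (0, "the", "the", 0.5, 1),
    (1, "cat", "cat", 2.0, 2),
    (1, "cap", "cap", 1.5, 2),
]}
# Toy grammar with made-up language model costs.
grammar = {"start": 0, "finals": {2: 0.0}, "arcs": [
    (0, "the", "the", 0.25, 1),
    (1, "cat", "cat", 0.25, 2),
    (1, "cap", "cap", 1.5, 2),
]}

acoustic_only = best_path(lattice)
rescored = best_path(compose(lattice, grammar))
```

In this toy setup the lattice alone prefers "the cap", while the composed machine prefers "the cat", because the grammar cost is added to the acoustic cost on every matched arc. A practical decoder would also need epsilon handling, pruning, and on-demand expansion, none of which this sketch attempts.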



Bibliographic reference.  Caseiro, Diamantino / Trancoso, Isabel (2000): "A decoder for finite-state structured search spaces", In ASR-2000, 35-39.