The theory of weighted finite-state transducers (WFSTs) allows great flexibility in the early use of multiple sources of information in speech decoders. In this paper, we describe a decoder that relies on the algebra of WFSTs to integrate multiple sources of information in a one-pass search. The system has two modes of operation: a time-synchronous mode, for use with problem-specific finite-state grammars or with a word-loop grammar for large-vocabulary tasks; and a time-asynchronous mode, in which it acts as a stack decoder, also for large-vocabulary recognition. Both modes of operation are decoupled from the language model. Lattice rescoring experiments showed that the error rate is the same as that of established state-of-the-art decoders. Furthermore, experiments with explicit cross-word pronunciation rules showed that it is feasible to include new knowledge sources early in the decoding process. We also found that time-synchronous search with a word-loop grammar outperforms the stack decoder mode of operation by a factor of 10.
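The way WFST algebra combines knowledge sources can be illustrated with composition, the standard operation for chaining two transducers. The following is a minimal sketch, not the decoder described in this paper: it assumes epsilon-free transducers, non-negative weights in the tropical semiring (weights add along a path, the lowest total wins), and a hypothetical dictionary layout (`start`, `finals`, `arcs`) chosen only for this illustration. The toy transducers `A` and `B` are made up.

```python
import heapq
import itertools
from collections import defaultdict, deque


def compose(a, b):
    """Compose two epsilon-free WFSTs; arc weights add (tropical semiring)."""
    start = (a["start"], b["start"])
    arcs = defaultdict(list)
    finals = {}
    queue = deque([start])
    seen = {start}
    while queue:
        state = queue.popleft()
        qa, qb = state
        # A pair state is final only if both components are final.
        if qa in a["finals"] and qb in b["finals"]:
            finals[state] = a["finals"][qa] + b["finals"][qb]
        for (i, mid, w1, na) in a["arcs"].get(qa, []):
            for (mid2, o, w2, nb) in b["arcs"].get(qb, []):
                # Match the output label of `a` with the input label of `b`.
                if mid == mid2:
                    nxt = (na, nb)
                    arcs[state].append((i, o, w1 + w2, nxt))
                    if nxt not in seen:
                        seen.add(nxt)
                        queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": dict(arcs)}


def best_path(t):
    """Dijkstra search for the lowest-cost accepting path; returns (cost, outputs)."""
    counter = itertools.count()  # tie-breaker so states are never compared
    heap = [(0.0, next(counter), t["start"], [])]
    done = set()
    while heap:
        cost, _, state, out = heapq.heappop(heap)
        if state is None:  # sentinel: a final state with its final weight applied
            return cost, out
        if state in done:
            continue
        done.add(state)
        if state in t["finals"]:
            heapq.heappush(heap, (cost + t["finals"][state], next(counter), None, out))
        for (_i, o, w, nxt) in t["arcs"].get(state, []):
            if nxt not in done:
                heapq.heappush(heap, (cost + w, next(counter), nxt, out + [o]))
    return None  # no accepting path


# Toy knowledge sources (illustrative only). Arcs are (input, output, weight, next_state).
A = {"start": 0, "finals": {1: 0.0},
     "arcs": {0: [("a", "x", 1.0, 1), ("a", "y", 0.5, 1)]}}
B = {"start": 0, "finals": {1: 0.0},
     "arcs": {0: [("x", "X", 0.5, 1), ("y", "Y", 3.0, 1)]}}

combined = compose(A, B)
result = best_path(combined)
```

Taken alone, `A` prefers the output `y` (weight 0.5 versus 1.0), but after composition with `B` the cheapest path emits `X`, because the weights of both sources are summed before the search makes its choice. This is the sense in which composing knowledge sources ahead of a single search lets all of them influence the decision early.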
Cite as: Caseiro, D., Trancoso, I. (2000) A decoder for finite-state structured search spaces. Proc. ASR2000 - Automatic Speech Recognition: Challenges for the New Millenium, 35-39
@inproceedings{caseiro00_asr,
  author={Diamantino Caseiro and Isabel Trancoso},
  title={{A decoder for finite-state structured search spaces}},
  year=2000,
  booktitle={Proc. ASR2000 - Automatic Speech Recognition: Challenges for the New Millenium},
  pages={35--39}
}