When using Weighted Finite State Transducers (WFSTs) in speech recognition, on-the-fly composition approaches have been proposed as a method of reducing memory consumption and increasing flexibility during decoding. We have recently implemented several fast on-the-fly techniques in our decoding engine, namely avoiding dead-end states, dynamic pushing, and state sharing. The goal of this paper is to provide a unified study of how the different on-the-fly techniques and online composition combinations affect speech recognition performance. The evaluations were performed on a large spontaneous speech recognition task. The results show that when on-the-fly composition is used with a fully dynamically composed language model component, performance degrades substantially, even when dead-end states are avoided. We then show that in these cases recognition performance can be dramatically improved by adding dynamic pushing and state sharing.
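As a rough illustration of the general idea of on-the-fly composition, the following sketch composes two small epsilon-free transducers lazily, expanding a pair state only when the search reaches it. It is not the decoder evaluated in the paper: the classes `Wfst` and `LazyComposition`, the function `best_path`, and the toy transducers are all hypothetical, and the sketch does not implement dead-end avoidance, dynamic pushing, or state sharing.

```python
from collections import defaultdict


class Wfst:
    """Tiny epsilon-free WFST in the tropical semiring (weights add along a path)."""

    def __init__(self, start, finals):
        self.start = start
        self.finals = dict(finals)      # state -> final weight
        self.arcs = defaultdict(list)   # state -> [(ilabel, olabel, weight, next)]

    def add_arc(self, src, ilabel, olabel, weight, dst):
        self.arcs[src].append((ilabel, olabel, weight, dst))


class LazyComposition:
    """Composition A o B whose pair states (qa, qb) are expanded only on request."""

    def __init__(self, a, b):
        self.a, self.b = a, b
        self.start = (a.start, b.start)
        self._cache = {}                # arcs of the states expanded so far

    def final_weight(self, state):
        qa, qb = state
        if qa in self.a.finals and qb in self.b.finals:
            return self.a.finals[qa] + self.b.finals[qb]
        return None                     # not a final state

    def arcs(self, state):
        if state not in self._cache:
            qa, qb = state
            out = []
            # Match output labels of A against input labels of B.
            for i, mid_a, w1, na in self.a.arcs[qa]:
                for mid_b, o, w2, nb in self.b.arcs[qb]:
                    if mid_a == mid_b:
                        out.append((i, o, w1 + w2, (na, nb)))
            self._cache[state] = out
        return self._cache[state]

    def expanded_states(self):
        return len(self._cache)


def best_path(comp, inputs):
    """Viterbi-style search that touches only states reachable on `inputs`."""
    frontier = {comp.start: (0.0, [])}
    for sym in inputs:
        nxt = {}
        for state, (cost, out) in frontier.items():
            for i, o, w, dst in comp.arcs(state):
                if i != sym:
                    continue
                if dst not in nxt or cost + w < nxt[dst][0]:
                    nxt[dst] = (cost + w, out + [o])
        frontier = nxt
    best = None
    for state, (cost, out) in frontier.items():
        fw = comp.final_weight(state)
        if fw is not None and (best is None or cost + fw < best[0]):
            best = (cost + fw, out)
    return best


# Toy transducers (hypothetical): A maps a/b/c/d to x/y/z, B maps x/y/z to X/Y/Z.
a = Wfst(start=0, finals={2: 0.0})
a.add_arc(0, "a", "x", 1.0, 1)
a.add_arc(0, "a", "y", 0.5, 1)
a.add_arc(1, "b", "z", 0.25, 2)
a.add_arc(0, "c", "x", 0.0, 3)   # branch the input below never takes
a.add_arc(3, "d", "z", 0.0, 2)

b = Wfst(start=0, finals={2: 0.0})
b.add_arc(0, "x", "X", 0.5, 1)
b.add_arc(0, "y", "Y", 2.0, 1)
b.add_arc(1, "z", "Z", 0.25, 2)

comp = LazyComposition(a, b)
result = best_path(comp, ["a", "b"])
```

In this toy case the full composition has four reachable pair states, but decoding the input `a b` expands only two of them, which is the memory saving that motivates composing at decode time. The nested loop over arc pairs is the simplest possible matching strategy and was chosen for readability rather than speed.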
Bibliographic reference. Oonishi, Tasuku / Dixon, Paul R. / Iwano, Koji / Furui, Sadaoki (2008): "Implementation and evaluation of fast on-the-fly WFST composition algorithms", In INTERSPEECH-2008, 2110-2113.