5th International Conference on Spoken Language Processing
We propose an efficient two-pass search algorithm for large-vocabulary continuous speech recognition (LVCSR). Instead of a conventional word graph, the preliminary first pass generates a "word trellis index" that keeps track of all word hypotheses surviving within the beam at every time frame. Because the index represents all found word boundaries non-deterministically, we can (1) obtain accurate sentence-dependent hypotheses in the second pass, and (2) avoid the expensive word-pair approximation in the first pass. The second pass performs an efficient stack decoding search, consulting the index both as a predicted word list and as a source of heuristics. Experimental results on a 5,000-word Japanese dictation task show that, compared with the word-graph method, the trellis-based method runs with less than 1/10 of the memory cost while maintaining high accuracy. Finally, by handling inter-word context dependency, we achieved a word error rate of 5.6%.
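As a rough illustration of the kind of structure a word trellis index could be, the following minimal sketch stores the word hypotheses that survived the first-pass beam, grouped by time frame, so that a second pass can look up candidate words and their scores. The abstract does not specify the layout, so the class names, the fields, the choice to key entries by end frame, and the example words and scores are all hypothetical assumptions, not the authors' implementation.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TrellisEntry:
    """One surviving first-pass word hypothesis (hypothetical layout)."""
    word: str
    begin_frame: int
    end_frame: int
    score: float  # assumed: accumulated first-pass log score at end_frame


class WordTrellisIndex:
    """Keeps every surviving hypothesis per frame instead of merging
    word boundaries, so alternative boundaries for a word stay available."""

    def __init__(self):
        self._by_end = defaultdict(list)

    def add(self, entry):
        self._by_end[entry.end_frame].append(entry)

    def words_ending_at(self, frame):
        # Predicted word list for the second pass, best score first.
        entries = self._by_end.get(frame, [])
        return sorted(entries, key=lambda e: e.score, reverse=True)

    def best_score(self, word, frame):
        # Heuristic lookup: best first-pass score of `word` ending at `frame`.
        scores = [e.score for e in self._by_end.get(frame, []) if e.word == word]
        return max(scores) if scores else None


# Illustrative data only: the same word kept with two different boundaries.
index = WordTrellisIndex()
index.add(TrellisEntry("kyou", 0, 12, -35.0))
index.add(TrellisEntry("kyouto", 0, 12, -41.5))
index.add(TrellisEntry("kyou", 3, 12, -38.2))

candidates = [e.word for e in index.words_ending_at(12)]
```

In this sketch, both boundary choices for "kyou" remain in the index, which mirrors the non-deterministic representation of word boundaries described above; a stack decoder could then pick whichever boundary best fits the sentence hypothesis it is extending.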
Bibliographic reference. Lee, Akinobu / Kawahara, Tatsuya / Doshita, Shuji (1998): "An efficient two-pass search algorithm using word trellis index", In ICSLP-1998, paper 0655.