5th International Conference on Spoken Language Processing
We describe two new techniques for reducing word lattice sizes without eliminating hypotheses. The first technique is an algorithm to reduce the size of non-deterministic bigram word lattices by merging redundant nodes. On bigram word lattices generated from Hub4 Broadcast News speech, this reduces lattice sizes by half on average. The second technique is an improved algorithm for expanding lattices with trigram language models. Backed-off trigram probabilities are encoded without node duplication by factoring the probabilities into bigram probabilities and backoff weights, and duplicating nodes only for explicit trigrams. Experiments on Broadcast News show that this method reduces trigram lattice sizes by a factor of 6, and reduces expansion time by more than a factor of 10. Compared to conventionally expanded lattices, recognition with the compactly expanded lattices was also found to be 40% faster, without affecting recognition accuracy.
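The factoring idea behind the second technique can be sketched in a few lines. Under a backed-off trigram model, the probability of a word given its two predecessors is either an explicit trigram probability or, when none is stored, the backoff weight of the bigram history times the bigram probability; because only the explicit-trigram case depends jointly on both history words, only those histories require duplicated lattice nodes. The toy log-probabilities and function name below are illustrative assumptions, not the paper's actual data or code:

```python
# Hypothetical toy backed-off trigram LM (log10 scores); values are
# made up for illustration only.
bigram_logp = {("the", "cat"): -0.5, ("cat", "sat"): -0.7, ("cat", "ran"): -0.9}
backoff_logw = {("the", "cat"): -0.2}          # log backoff weight bow(w1, w2)
trigram_logp = {("the", "cat", "sat"): -0.3}   # explicit trigrams only

def trigram_score(w1, w2, w3):
    """Backed-off trigram log-probability.

    Uses the explicit trigram score when one is stored; otherwise
    factors the score into a backoff weight for the history (w1, w2)
    plus the bigram score of (w2, w3).
    """
    if (w1, w2, w3) in trigram_logp:
        return trigram_logp[(w1, w2, w3)]
    return backoff_logw.get((w1, w2), 0.0) + bigram_logp[(w2, w3)]
```

In lattice terms, the backoff weight can be placed once on the arc leaving the history node and the bigram score on the shared arcs into each successor word, so backed-off transitions pass through a single shared node; only the history `("the", "cat")` followed by `"sat"` has an explicit trigram and would need a duplicated node carrying its own score.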
Bibliographic reference: Weng, Fuliang / Stolcke, Andreas / Sankar, Ananth (1998): "Efficient lattice representation and generation", in ICSLP-1998, paper 0136.