In a large-vocabulary continuous speech recognition task, the search for the "best" (in the maximum-a-posteriori sense) word sequence is the most time-consuming part of the system. End-of-word hypotheses are created at almost every time frame, and with a stochastic language model every lexicon entry is an admissible successor candidate. The search cost can be reduced considerably by a "fast match" module which scores the word candidates according to their acoustic feasibility ahead of the current time frame. Only the fraction of the words with favourable fast-match scores is processed further in the detailed match, where the likelihood of a segment of acoustics given the word model is computed. We derive a novel word selection strategy which is "consistent" in the sense that it introduces no additional decoding errors and which still reduces the search space by a factor of 2-3 compared to standard Viterbi beam search. Giving up the consistency requirement, pruning strategies can be deduced which reduce the search effort significantly further: the size of the word startup list is reduced to 2%-4% of its original size with a modest increase in error rate of 1%-2%.
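The word-startup pruning idea described above can be sketched as follows. This is a minimal illustration, not the paper's method: `fast_match_score` stands in for an acoustic look-ahead score computed over frames ahead of the current one, and `keep_fraction` is a hypothetical pruning parameter playing the role of the 2%-4% startup-list reduction.

```python
import heapq

def fast_match_prune(candidates, fast_match_score, keep_fraction=0.03):
    """Keep only the fraction of word candidates with the best
    (lowest-cost) fast-match scores; only these survivors would be
    started up in the detailed match.  `fast_match_score` is an
    illustrative stand-in for a real acoustic look-ahead score."""
    n_keep = max(1, int(len(candidates) * keep_fraction))
    scored = [(fast_match_score(w), w) for w in candidates]
    # nsmallest keeps the n_keep candidates with the lowest cost.
    return [w for _, w in heapq.nsmallest(n_keep, scored)]

# Toy usage: a dummy cost that simply favours shorter words.
vocab = ["the", "there", "their", "a", "an", "and", "to", "too"]
startup = fast_match_prune(vocab, fast_match_score=len, keep_fraction=0.25)
# startup now holds the 2 cheapest candidates out of 8.
```

In an actual decoder, the pruned list would be recomputed at every end-of-word hypothesis, so the fast match must itself be cheap relative to the detailed match it avoids.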
Bibliographic reference. Haeb-Umbach, Reinhold / Ney, Hermann (1991): "A look-ahead search technique for large vocabulary continuous speech recognition", In EUROSPEECH-1991, 495-498.