September 22-25, 1997
Current speech recognition systems usually use word-based trigram language models. More elaborate models are applied to word lattices or N-best lists in a rescoring pass that follows acoustic decoding. In this paper we consider techniques for handling class-based language models in the lattice rescoring framework of our JANUS large-vocabulary speech recognizer. We demonstrate how to interpolate with a Part-of-Speech (POS) tag-based language model as an example of a class-based model in which a word can be a member of many different classes. Here the actual class membership of a word in the lattice becomes a hidden event of the A-algorithm used for rescoring. A forward type of algorithm is defined as an extension of the lattice rescorer to handle these hidden events in a mathematically sound fashion. Applying this combined Viterbi and forward rescoring procedure to the German Spontaneous Scheduling Task (GSST) yields some improvement in word accuracy. Above all, the rescoring procedure enables the use of any fuzzy/stochastic class definition for recognition units, such as definitions that might be determined through automatic clustering algorithms in the future.
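The difference between treating class membership as a hidden event (forward summation) and committing to the single best tag sequence (Viterbi) can be illustrated with a minimal sketch. It is not the JANUS rescorer: it scores one word sequence rather than a lattice, assumes a tag bigram model, and assumes linear interpolation with a word-based score. All names, tags, and probabilities below are hypothetical toy values.

```python
def class_sequence_score(words, emit, trans, init, combine=sum):
    """Score a word sequence under a tag-based model with hidden tags.

    emit[c][w]  = P(w | c)      (assumed emission table)
    trans[p][c] = P(c | p)      (assumed tag bigram table)
    init[c]     = P(c)          (assumed initial tag distribution)
    combine=sum gives the forward score (sum over all tag sequences);
    combine=max gives the Viterbi score (best single tag sequence).
    """
    # Initialise with the first word.
    alpha = {c: init[c] * emit[c].get(words[0], 0.0) for c in init}
    # Propagate over the remaining words, combining over predecessor tags.
    for w in words[1:]:
        alpha = {
            c: emit[c].get(w, 0.0)
            * combine(alpha[p] * trans[p][c] for p in alpha)
            for c in init
        }
    return combine(alpha.values())


def interpolate(p_word, p_class, lam):
    """Linear interpolation of a word-based and a class-based score
    (an assumed scheme; lam is a hypothetical weight in [0, 1])."""
    return lam * p_word + (1.0 - lam) * p_class


# Hypothetical toy model with two tags, N (noun) and V (verb).
init = {"N": 0.6, "V": 0.4}
emit = {"N": {"plan": 0.3, "meeting": 0.7},
        "V": {"plan": 0.8, "meet": 0.2}}
trans = {"N": {"N": 0.3, "V": 0.7},
         "V": {"N": 0.8, "V": 0.2}}

words = ["plan", "meeting"]
forward = class_sequence_score(words, emit, trans, init, combine=sum)
viterbi = class_sequence_score(words, emit, trans, init, combine=max)
mixed = interpolate(p_word=0.1, p_class=forward, lam=0.5)
```

Because "plan" can be either a noun or a verb in this toy model, the forward score collects probability mass from both taggings, so it is never smaller than the Viterbi score of the single best tagging.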
Bibliographic reference. Geutner, Petra (1997): "Fuzzy class rescoring: a part-of-speech language model", In EUROSPEECH-1997, 2743-2746.