5th International Conference on Spoken Language Processing

Sydney, Australia
November 30 - December 4, 1998

Using X-Gram For Efficient Speech Recognition

Antonio Bonafonte, José B. Mariño

Universitat Politècnica de Catalunya, Spain

X-grams are a generalization of n-grams in which the number of preceding conditioning words differs from case to case and is determined from the training data. X-grams reduce perplexity with respect to trigrams and require fewer parameters. This paper considers the representation of x-grams as finite-state automata. This representation leads to a new model, the non-deterministic x-gram, an approximation that is much more efficient at the cost of a small degradation in modeling capability. Experiments on a continuous speech recognition task show that, for each ending word, the number of transitions is reduced from 1222 (the size of the lexicon) to around 66.
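The idea of a conditioning history whose length depends on the training data can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not state the criterion used to choose the history length, so the sketch assumes a simple count threshold. The names `train_xgram`, `select_history`, `prob`, `max_order` and `min_count` are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def train_xgram(sentences, max_order=3, min_count=2):
    """Count next words for every history of up to max_order - 1 words.

    Assumption (not from the paper): a history is kept as a model state
    only if it was observed at least min_count times.
    """
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(words)):
            for k in range(max_order):
                if i - k < 0:
                    break
                history = tuple(words[i - k:i])  # the k words before words[i]
                counts[history][words[i]] += 1
    # The empty history is always kept so that every lookup succeeds.
    return {
        h: c for h, c in counts.items()
        if len(h) == 0 or sum(c.values()) >= min_count
    }


def select_history(model, context):
    """Return the longest suffix of context that survives as a state."""
    context = tuple(context)
    for start in range(len(context) + 1):
        suffix = context[start:]
        if suffix in model:
            return suffix
    return ()


def prob(model, context, word):
    """Unsmoothed relative frequency of word given the selected history."""
    next_words = model[select_history(model, context)]
    total = sum(next_words.values())
    return next_words[word] / total if total else 0.0
```

With training sentences such as "a b c", "a b d" and "x b c", the context ("a", "b") is frequent enough to be kept as a two-word history, whereas ("x", "b") falls back to the one-word history ("b",). The number of conditioning words therefore varies from case to case.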


