7th International Conference on Spoken Language Processing
September 16-20, 2002
In this paper, we propose an extension of the backoff word n-gram language model that allows better likelihood estimation of unseen events. Instead of using the (n-1)-gram to estimate the probability of an unseen n-gram, the proposed approach uses a class hierarchy to define a context that is more general than the unseen n-gram but more specific than the (n-1)-gram. Each node in the hierarchy is a class containing all the words of its descendant nodes (classes); hence, the closer a node is to the root, the more general the corresponding class. Performance is evaluated in terms of both test perplexity and word error rate (WER) on a simplified WSJ database. Experiments show an improvement of more than 26% in perplexity on unseen events.
Bibliographic reference. Zitouni, Imed / Siohan, Olivier / Kuo, Hong-Kwang Jeff / Lee, Chin-Hui (2002): "Backoff hierarchical class n-gram language modelling for automatic speech recognition systems", In ICSLP-2002, 885-888.